Sample records for protein structures provide

  1. Gaia: automated quality assessment of protein structure models.

    PubMed

    Kota, Pradeep; Ding, Feng; Ramachandran, Srinivas; Dokholyan, Nikolay V

    2011-08-15

    Increasing use of structural modeling for understanding structure-function relationships in proteins has led to the need to ensure that the protein models being used are of acceptable quality. Quality of a given protein structure can be assessed by comparing various intrinsic structural properties of the protein to those observed in high-resolution protein structures. In this study, we present tools to compare a given structure to high-resolution crystal structures. We assess packing by calculating the total void volume, the percentage of unsatisfied hydrogen bonds, the number of steric clashes and the scaling of the accessible surface area. We assess covalent geometry by determining bond lengths, angles, dihedrals and rotamers. The statistical parameters for the above measures, obtained from high-resolution crystal structures, enable us to provide a quality score that points to specific areas where a given protein structural model needs improvement. We provide these tools that appraise protein structures in the form of a web server, Gaia (http://chiron.dokhlab.org). Gaia evaluates the packing and covalent geometry of a given protein structure and provides quantitative comparison of the given structure to high-resolution crystal structures. Contact: dokh@unc.edu. Supplementary data are available at Bioinformatics online.

  2. Classification of proteins: available structural space for molecular modeling.

    PubMed

    Andreeva, Antonina

    2012-01-01

    The wealth of available protein structural data provides unprecedented opportunity to study and better understand the underlying principles of protein folding and protein structure evolution. A key to achieving this lies in the ability to analyse these data and to organize them in a coherent classification scheme. Over the past years several protein classifications have been developed that aim to group proteins based on their structural relationships. Some of these classification schemes explore the concept of structural neighbourhood (structural continuum), whereas others utilize the notion of protein evolution and thus provide a discrete rather than continuum view of protein structure space. This chapter presents a strategy for classification of proteins with known three-dimensional structure. Steps in the classification process along with basic definitions are introduced. Examples illustrating some fundamental concepts of protein folding and evolution with a special focus on the exceptions to them are presented.

  3. Exploring Human Diseases and Biological Mechanisms by Protein Structure Prediction and Modeling.

    PubMed

    Wang, Juexin; Luttrell, Joseph; Zhang, Ning; Khan, Saad; Shi, NianQing; Wang, Michael X; Kang, Jing-Qiong; Wang, Zheng; Xu, Dong

    2016-01-01

    Protein structure prediction and modeling provide a tool for understanding protein functions by computationally constructing protein structures from amino acid sequences and analyzing them. With help from protein prediction tools and web servers, users can obtain the three-dimensional protein structure models and gain knowledge of functions from the proteins. In this chapter, we will provide several examples of such studies. As an example, structure modeling methods were used to investigate the relation between mutation-caused misfolding of protein and human diseases including epilepsy and leukemia. Protein structure prediction and modeling were also applied in nucleotide-gated channels and their interaction interfaces to investigate their roles in brain and heart cells. In molecular mechanism studies of plants, rice salinity tolerance mechanism was studied via structure modeling on crucial proteins identified by systems biology analysis; trait-associated protein-protein interactions were modeled, which sheds some light on the roles of mutations in soybean oil/protein content. In the age of precision medicine, we believe protein structure prediction and modeling will play more and more important roles in investigating biomedical mechanism of diseases and drug design.

  4. CASTp 3.0: computed atlas of surface topography of proteins.

    PubMed

    Tian, Wei; Chen, Chang; Lei, Xue; Zhao, Jieling; Liang, Jie

    2018-06-01

    Geometric and topological properties of protein structures, including surface pockets, interior cavities and cross channels, are of fundamental importance for proteins to carry out their functions. Computed Atlas of Surface Topography of proteins (CASTp) is a web server that provides online services for locating, delineating and measuring these geometric and topological properties of protein structures. It has been widely used since its inception in 2003. In this article, we present the latest version of the web server, CASTp 3.0. CASTp 3.0 continues to provide reliable and comprehensive identifications and quantifications of protein topography. In addition, it now provides: (i) imprints of the negative volumes of pockets, cavities and channels, (ii) topographic features of biological assemblies in the Protein Data Bank, (iii) improved visualization of protein structures and pockets, and (iv) more intuitive structural and annotated information, including information of secondary structure, functional sites, variant sites and other annotations of protein residues. The CASTp 3.0 web server is freely accessible at http://sts.bioe.uic.edu/castp/.

  5. Data-assisted protein structure modeling by global optimization in CASP12.

    PubMed

    Joo, Keehyoung; Heo, Seungryong; Joung, InSuk; Hong, Seung Hwan; Lee, Sung Jong; Lee, Jooyoung

    2018-03-01

    In CASP12, two types of data-assisted protein structure modeling were tested. Either SAXS experimental data or cross-linking experimental data was provided for a selected number of CASP12 targets that the CASP12 predictor could utilize for better protein structure modeling. We devised two separate energy terms for SAXS data and cross-linking data to drive the model structures into more native-like structures that satisfied the given experimental data as much as possible. In CASP11, we successfully performed protein structure modeling using simulated sparse and ambiguously assigned NOE data and/or correct residue-residue contact information, where the only energy term that folded the protein into its native structure was the term that originated from the given experimental data. However, the two types of experimental data provided in CASP12 were far from sufficient to fold the target protein into its native structure because SAXS data provides only the overall shape of the molecule and the cross-linking contact information provides only very low-resolution distance information. For this reason, we combined the SAXS or cross-linking energy term with our regular modeling energy function that includes both the template energy term and the de novo energy terms. By optimizing the newly formulated energy function, we obtained protein models that fit better with provided SAXS data than the X-ray structure of the target. However, the improvement of the model relative to the one modeled without the SAXS data was not significant. Consistent structural improvement was achieved by incorporating cross-linking data into the protein structure modeling. © 2018 Wiley Periodicals, Inc.

  6. Structure-based barcoding of proteins.

    PubMed

    Metri, Rahul; Jerath, Gaurav; Kailas, Govind; Gacche, Nitin; Pal, Adityabarna; Ramakrishnan, Vibin

    2014-01-01

    A reduced representation in the format of a barcode has been developed to provide an overview of the topological nature of a given protein structure from a 3D coordinate file. The molecular structure of a protein coordinate file from the Protein Data Bank is first expressed in terms of an alpha-numero code and further converted to a barcode image. The barcode representation can be used to compare and contrast different proteins based on their structure. The utility of this method has been exemplified by comparing structural barcodes of proteins that belong to the same fold family, and across different folds. In addition to this, we have attempted to provide an illustration of (i) the structural changes often seen in a given protein molecule upon interaction with ligands and (ii) modifications in the overall topology of a given protein during evolution. The program is fully downloadable from the website http://www.iitg.ac.in/probar/. © 2013 The Protein Society.

  7. Some of the most interesting CASP11 targets through the eyes of their authors.

    PubMed

    Kryshtafovych, Andriy; Moult, John; Baslé, Arnaud; Burgin, Alex; Craig, Timothy K; Edwards, Robert A; Fass, Deborah; Hartmann, Marcus D; Korycinski, Mateusz; Lewis, Richard J; Lorimer, Donald; Lupas, Andrei N; Newman, Janet; Peat, Thomas S; Piepenbrink, Kurt H; Prahlad, Janani; van Raaij, Mark J; Rohwer, Forest; Segall, Anca M; Seguritan, Victor; Sundberg, Eric J; Singh, Abhimanyu K; Wilson, Mark A; Schwede, Torsten

    2016-09-01

    The Critical Assessment of protein Structure Prediction (CASP) experiment would not have been possible without the prediction targets provided by the experimental structural biology community. In this article, selected crystallographers providing targets for the CASP11 experiment discuss the functional and biological significance of the target proteins, highlight their most interesting structural features, and assess whether these features were correctly reproduced in the predictions submitted to CASP11. Proteins 2016; 84(Suppl 1):34-50. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.

  8. Expressing the human proteome for affinity proteomics: optimising expression of soluble protein domains and in vivo biotinylation.

    PubMed

    Keates, Tracy; Cooper, Christopher D O; Savitsky, Pavel; Allerston, Charles K; Phillips, Claire; Hammarström, Martin; Daga, Neha; Berridge, Georgina; Mahajan, Pravin; Burgess-Brown, Nicola A; Müller, Susanne; Gräslund, Susanne; Gileadi, Opher

    2012-06-15

    The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome. Copyright © 2011 Elsevier B.V. All rights reserved.

  9. Expressing the human proteome for affinity proteomics: optimising expression of soluble protein domains and in vivo biotinylation

    PubMed Central

    Keates, Tracy; Cooper, Christopher D.O.; Savitsky, Pavel; Allerston, Charles K.; Phillips, Claire; Hammarström, Martin; Daga, Neha; Berridge, Georgina; Mahajan, Pravin; Burgess-Brown, Nicola A.; Müller, Susanne; Gräslund, Susanne; Gileadi, Opher

    2012-01-01

    The generation of affinity reagents to large numbers of human proteins depends on the ability to express the target proteins as high-quality antigens. The Structural Genomics Consortium (SGC) focuses on the production and structure determination of human proteins. In a 7-year period, the SGC has deposited crystal structures of >800 human protein domains, and has additionally expressed and purified a similar number of protein domains that have not yet been crystallised. The targets include a diversity of protein domains, with an attempt to provide high coverage of protein families. The family approach provides an excellent basis for characterising the selectivity of affinity reagents. We present a summary of the approaches used to generate purified human proteins or protein domains, a test case demonstrating the ability to rapidly generate new proteins, and an optimisation study on the modification of >70 proteins by biotinylation in vivo. These results provide a unique synergy between large-scale structural projects and the recent efforts to produce a wide coverage of affinity reagents to the human proteome. PMID:22027370

  10. Pre-calculated protein structure alignments at the RCSB PDB website.

    PubMed

    Prlic, Andreas; Bliven, Spencer; Rose, Peter W; Bluhm, Wolfgang F; Bizon, Chris; Godzik, Adam; Bourne, Philip E

    2010-12-01

    With the continuous growth of the RCSB Protein Data Bank (PDB), providing an up-to-date systematic structure comparison of all protein structures poses an ever growing challenge. Here, we present a comparison tool for calculating both 1D protein sequence and 3D protein structure alignments. This tool supports various applications at the RCSB PDB website. First, a structure alignment web service calculates pairwise alignments. Second, a stand-alone application runs alignments locally and visualizes the results. Third, pre-calculated 3D structure comparisons for the whole PDB are provided and updated on a weekly basis. These three applications allow users to discover novel relationships between proteins available either at the RCSB PDB or provided by the user. A web user interface is available at http://www.rcsb.org/pdb/workbench/workbench.do. The source code is available under the LGPL license from http://www.biojava.org. A source bundle, prepared for local execution, is available from http://source.rcsb.org. Contact: andreas@sdsc.edu; pbourne@ucsd.edu.

  11. Structural Genomics of Protein Phosphatases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Almo, S.; Bonanno, J.; Sauder, J.

    The New York SGX Research Center for Structural Genomics (NYSGXRC) of the NIGMS Protein Structure Initiative (PSI) has applied its high-throughput X-ray crystallographic structure determination platform to systematic studies of all human protein phosphatases and protein phosphatases from biomedically-relevant pathogens. To date, the NYSGXRC has determined structures of 21 distinct protein phosphatases: 14 from human, 2 from mouse, 2 from the pathogen Toxoplasma gondii, 1 from Trypanosoma brucei, the parasite responsible for African sleeping sickness, and 2 from the principal mosquito vector of malaria in Africa, Anopheles gambiae. These structures provide insights into both normal and pathophysiologic processes, including transcriptional regulation, regulation of major signaling pathways, neural development, and type 1 diabetes. In conjunction with the contributions of other international structural genomics consortia, these efforts promise to provide an unprecedented database and materials repository for structure-guided experimental and computational discovery of inhibitors for all classes of protein phosphatases.

  12. Catalytic site identification—a web server to identify catalytic site structural matches throughout PDB

    PubMed Central

    Kirshner, Daniel A.; Nilmeier, Jerome P.; Lightstone, Felice C.

    2013-01-01

    The catalytic site identification web server provides the innovative capability to find structural matches to a user-specified catalytic site among all Protein Data Bank proteins rapidly (in less than a minute). The server also can examine a user-specified protein structure or model to identify structural matches to a library of catalytic sites. Finally, the server provides a database of pre-calculated matches between all Protein Data Bank proteins and the library of catalytic sites. The database has been used to derive a set of hypothesized novel enzymatic function annotations. In all cases, matches and putative binding sites (protein structure and surfaces) can be visualized interactively online. The website can be accessed at http://catsid.llnl.gov. PMID:23680785

  13. Catalytic site identification--a web server to identify catalytic site structural matches throughout PDB.

    PubMed

    Kirshner, Daniel A; Nilmeier, Jerome P; Lightstone, Felice C

    2013-07-01

    The catalytic site identification web server provides the innovative capability to find structural matches to a user-specified catalytic site among all Protein Data Bank proteins rapidly (in less than a minute). The server also can examine a user-specified protein structure or model to identify structural matches to a library of catalytic sites. Finally, the server provides a database of pre-calculated matches between all Protein Data Bank proteins and the library of catalytic sites. The database has been used to derive a set of hypothesized novel enzymatic function annotations. In all cases, matches and putative binding sites (protein structure and surfaces) can be visualized interactively online. The website can be accessed at http://catsid.llnl.gov.

  14. An Interactive Introduction to Protein Structure

    ERIC Educational Resources Information Center

    Lee, W. Theodore

    2004-01-01

    To improve student understanding of protein structure and the significance of noncovalent interactions in protein structure and function, students are assigned a project to write a paper complemented with computer-generated images. The assignment provides an opportunity for students to select a protein structure that is of interest and detail…

  15. Elfin: An algorithm for the computational design of custom three-dimensional structures from modular repeat protein building blocks.

    PubMed

    Yeh, Chun-Ting; Brunette, T J; Baker, David; McIntosh-Smith, Simon; Parmeggiani, Fabio

    2018-02-01

    Computational protein design methods have enabled the design of novel protein structures, but they are often still limited to small proteins and symmetric systems. To expand the size of designable proteins while controlling the overall structure, we developed Elfin, a genetic algorithm for the design of novel proteins with custom shapes using structural building blocks derived from experimentally verified repeat proteins. By combining building blocks with compatible interfaces, it is possible to rapidly build non-symmetric large structures (>1000 amino acids) that match three-dimensional geometric descriptions provided by the user. A run time of about 20min on a laptop computer for a 3000 amino acid structure makes Elfin accessible to users with limited computational resources. Protein structures with controlled geometry will allow the systematic study of the effect of spatial arrangement of enzymes and signaling molecules, and provide new scaffolds for functional nanomaterials. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Quantitative Protein Topography Analysis and High-Resolution Structure Prediction Using Hydroxyl Radical Labeling and Tandem-Ion Mass Spectrometry (MS)

    PubMed Central

    Kaur, Parminder; Kiselar, Janna; Yang, Sichun; Chance, Mark R.

    2015-01-01

    Hydroxyl radical footprinting-based MS for protein structure assessment has the goal of understanding ligand-induced conformational changes and macromolecular interactions, for example, protein tertiary and quaternary structure, but the structural resolution provided by typical peptide-level quantification is limiting. In this work, we present experimental strategies using tandem-MS fragmentation to increase the spatial resolution of the technique to the single-residue level to provide a high-precision tool for molecular biophysics research. Overall, in this study we demonstrated an eightfold increase in structural resolution compared with peptide-level assessments. In addition, to provide a quantitative analysis of residue-based solvent accessibility and protein topography as a basis for high-resolution structure prediction, we illustrate strategies of data transformation using the relative reactivity of side chains as a normalization strategy and predict side-chain surface area from the footprinting data. We tested the methods by examination of Ca2+-calmodulin, showing highly significant correlations between surface area and side-chain contact predictions for individual side chains and the crystal structure. Tandem-ion-based hydroxyl radical footprinting-MS provides quantitative high-resolution protein topology information in solution that can fill existing gaps in structure determination for large proteins and macromolecular complexes. PMID:25687570

  17. Replica exchange molecular dynamics simulation of structure variation from α/4β-fold to 3α-fold protein.

    PubMed

    Lazim, Raudah; Mei, Ye; Zhang, Dawei

    2012-03-01

    Replica exchange molecular dynamics (REMD) simulation provides an efficient conformational sampling tool for the study of protein folding. In this study, we explore the mechanism directing the structure variation from α/4β-fold protein to 3α-fold protein after mutation by conducting REMD simulation on 42 replicas with temperatures ranging from 270 K to 710 K. The simulation began from a protein possessing the primary structure of GA88 but the tertiary structure of GB88, two G proteins with high sequence identity. Despite the large Cα root mean square deviation (RMSD) of the folded protein (4.34 Å at 270 K and 4.75 Å at 304 K), a variation in tertiary structure was observed. Together with the analysis of secondary structure assignment, cluster analysis and principal component analysis, it provides insights into the folding and unfolding pathways of the 3α-fold protein and the α/4β-fold protein, respectively, paving the way toward an understanding of the events that occur during conformational variation.
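
    As a rough illustration of the replica setup described above, the sketch below builds a 42-replica temperature ladder between 270 K and 710 K. The abstract does not say how the temperatures were spaced; geometric (constant-ratio) spacing is assumed here because it is a common choice for REMD, and the function name geometric_ladder is hypothetical.

```python
def geometric_ladder(t_min, t_max, n_replicas):
    """Return n_replicas temperatures from t_min to t_max with a constant ratio.

    Assumption: geometric spacing. The abstract only gives the range
    (270 K to 710 K) and the replica count (42), not the spacing rule.
    """
    if n_replicas < 2:
        raise ValueError("need at least two replicas")
    if t_min <= 0 or t_max <= t_min:
        raise ValueError("require 0 < t_min < t_max")
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** i for i in range(n_replicas)]


# Values taken from the abstract: 42 replicas, 270 K to 710 K.
ladder = geometric_ladder(270.0, 710.0, 42)
for index, temperature in enumerate(ladder[:6]):
    print(f"replica {index}: {temperature:.1f} K")
```

    Under this assumed spacing, the sixth rung of the ladder (index 5) lands close to the 304 K reported in the abstract, which is suggestive of, but does not confirm, a geometric ladder in the original work.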

  18. Evaluation of variability in high-resolution protein structures by global distance scoring.

    PubMed

    Anzai, Risa; Asami, Yoshiki; Inoue, Waka; Ueno, Hina; Yamada, Koya; Okada, Tetsuji

    2018-01-01

    Systematic analysis of the statistical and dynamical properties of proteins is critical to understanding cellular events. Extraction of biologically relevant information from a set of high-resolution structures is important because it can provide mechanistic details behind the functional properties of protein families, enabling rational comparison between families. Most of the current structural comparisons are pairwise-based, which hampers the global analysis of increasing contents in the Protein Data Bank. Additionally, pairing of protein structures introduces uncertainty with respect to reproducibility because it frequently accompanies other settings for superimposition. This study introduces intramolecular distance scoring for the global analysis of proteins, for each of which at least several high-resolution structures are available. As a pilot study, we have tested 300 human proteins and showed that the method is comprehensively used to overview advances in each protein and protein family at the atomic level. This method, together with the interpretation of the model calculations, provides new criteria for understanding specific structural variation in a protein, enabling global comparison of the variability in proteins from different species.

  19. MEGADOCK-Web: an integrated database of high-throughput structure-based protein-protein interaction predictions.

    PubMed

    Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka

    2018-05-08

    Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times as many PPI predictions as previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. MEGADOCK-Web provides a large number of comprehensive PPI predictions based on docking calculations with biochemical pathways and enables users to easily and quickly assess PPI feasibilities by archiving PPI predictions. MEGADOCK-Web also promotes the discovery of new PPIs and protein functions and is freely available for use at http://www.bi.cs.titech.ac.jp/megadock-web/.
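
    The two figures quoted in this abstract can be cross-checked with a line of arithmetic. The sketch below assumes that "all possible combinations" means unordered pairs of distinct chains (no self-pairs); that reading is an inference from the numbers, not something the abstract states. The variable names are illustrative.

```python
import math

# Figures quoted in the abstract.
n_chains = 7528
reported_predictions = 28_331_628

# Assumption: one prediction per unordered pair of distinct chains,
# i.e. "n choose 2" = n * (n - 1) / 2.
unordered_pairs = math.comb(n_chains, 2)

print("unordered pairs of distinct chains:", unordered_pairs)
print("matches reported count:", unordered_pairs == reported_predictions)
```

    The pair count under this assumption equals the reported number of predicted PPIs, so the database size is consistent with an all-against-all docking run that excludes self-pairs.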

  20. BindML/BindML+: Detecting Protein-Protein Interaction Interface Propensity from Amino Acid Substitution Patterns.

    PubMed

    Wei, Qing; La, David; Kihara, Daisuke

    2017-01-01

    Prediction of protein-protein interaction sites in a protein structure provides important information for elucidating the mechanism of protein function and can also be useful in guiding modeling or design procedures for protein complex structures. Since prediction methods essentially assess the propensity of amino acids that are likely to be part of a protein docking interface, they can help in designing protein-protein interactions. Here, we introduce the BindML and BindML+ protein-protein interaction site prediction methods. BindML predicts protein-protein interaction sites by identifying mutation patterns found in known protein-protein complexes using phylogenetic substitution models. BindML+ is an extension of BindML for distinguishing permanent and transient types of protein-protein interaction sites. We developed an interactive web server that provides a convenient interface to assist in structural visualization of protein-protein interaction site predictions. The input to the web server is a tertiary structure of interest. BindML and BindML+ are available at http://kiharalab.org/bindml/ and http://kiharalab.org/bindml/plus/.

  1. Self assembling proteins

    DOEpatents

    Yeates, Todd O.; Padilla, Jennifer; Colovos, Chris

    2004-06-29

    Novel fusion proteins capable of self-assembling into regular structures, as well as nucleic acids encoding the same, are provided. The subject fusion proteins comprise at least two oligomerization domains rigidly linked together, e.g. through an alpha helical linking group. Also provided are regular structures comprising a plurality of self-assembled fusion proteins of the subject invention, and methods for producing the same. The subject fusion proteins find use in the preparation of a variety of nanostructures, where such structures include: cages, shells, double-layer rings, two-dimensional layers, three-dimensional crystals, filaments, and tubes.

  2. The Structural Biology Knowledgebase: a portal to protein structures, sequences, functions, and methods.

    PubMed

    Gabanyi, Margaret J; Adams, Paul D; Arnold, Konstantin; Bordoli, Lorenza; Carter, Lester G; Flippen-Andersen, Judith; Gifford, Lida; Haas, Juergen; Kouranov, Andrei; McLaughlin, William A; Micallef, David I; Minor, Wladek; Shah, Raship; Schwede, Torsten; Tao, Yi-Ping; Westbrook, John D; Zimmerman, Matthew; Berman, Helen M

    2011-07-01

    The Protein Structure Initiative's Structural Biology Knowledgebase (SBKB, URL: http://sbkb.org) is an open web resource designed to turn the products of the structural genomics and structural biology efforts into knowledge that can be used by the biological community to understand living systems and disease. Here we will present examples of how to use the SBKB to enable biological research. For example, a protein sequence or Protein Data Bank (PDB) structure ID search will provide a list of related protein structures in the PDB, associated biological descriptions (annotations), homology models, structural genomics protein target status, experimental protocols, and the ability to order available DNA clones from the PSI:Biology-Materials Repository. A text search will find publication and technology reports resulting from the PSI's high-throughput research efforts. Web tools that aid in research, including a system that accepts protein structure requests from the community, will also be described. Created in collaboration with the Nature Publishing Group, the Structural Biology Knowledgebase monthly update also provides a research library, editorials about new research advances, news, and an events calendar to present a broader view of structural genomics and structural biology.

  3. CCProf: exploring conformational change profile of proteins

    PubMed Central

    Chang, Che-Wei; Chou, Chai-Wei; Chang, Darby Tien-Hao

    2016-01-01

    In many biological processes, proteins have important interactions with various molecules such as proteins, ions or ligands. Many proteins undergo conformational changes upon these interactions, where regions with large conformational changes are critical to the interactions. This work presents the CCProf platform, which provides conformational changes of entire proteins, named conformational change profile (CCP) in the context. CCProf aims to be a platform where users can study potential causes of novel conformational changes. It provides 10 biological features, including conformational change, potential binding target site, secondary structure, conservation, disorder propensity, hydropathy propensity, sequence domain, structural domain, phosphorylation site and catalytic site. All of this information is integrated into a well-aligned view, so that researchers can visually capture important relationships between different biological features. CCProf contains 986 187 protein structure pairs for 3123 proteins. In addition, CCProf provides a 3D view in which users can see the protein structures before and after conformational changes as well as binding targets that induce conformational changes. All information (e.g. CCP, binding targets and protein structures) shown in CCProf, including intermediate data, is available for download to expedite further analyses. Database URL: http://zoro.ee.ncku.edu.tw/ccprof/ PMID:27016699

  4. Comprehensive inventory of protein complexes in the Protein Data Bank from consistent classification of interfaces

    DOE PAGES

    Bordner, Andrew J.; Gorin, Andrey A.

    2008-05-12

    Protein-protein interactions are ubiquitous and essential for cellular processes. High-resolution X-ray crystallographic structures of protein complexes can elucidate the details of their function and provide a basis for many computational and experimental approaches. Here we demonstrate that existing annotations of protein complexes, including those provided by the Protein Data Bank (PDB) itself, contain a significant fraction of incorrect annotations. Results: We have developed a method for identifying protein complexes in the PDB X-ray structures by a four-step procedure: (1) comprehensively collecting all protein-protein interfaces; (2) clustering similar protein-protein interfaces together; (3) estimating the probability that each cluster is relevant based on a diverse set of properties; and (4) finally combining these scores for each entry in order to predict the complex structure. Unlike previous annotation methods, consistent prediction of complexes with identical or almost identical protein content is ensured. The resulting clusters of biologically relevant interfaces provide a reliable catalog of evolutionarily conserved protein-protein interactions.

  5. Visualizing and Clustering Protein Similarity Networks: Sequences, Structures, and Functions.

    PubMed

    Mai, Te-Lun; Hu, Geng-Ming; Chen, Chi-Ming

    2016-07-01

    Research in the recent decade has demonstrated the usefulness of protein network knowledge in furthering the study of molecular evolution of proteins, understanding the robustness of cells to perturbation, and annotating new protein functions. In this study, we aimed to provide a general clustering approach to visualize the sequence-structure-function relationship of protein networks, and investigate possible causes for inconsistency in the protein classifications based on sequences, structures, and functions. Such visualization of protein networks could facilitate our understanding of the overall relationship among proteins and help researchers comprehend various protein databases. As a demonstration, we clustered 1437 enzymes by their sequences and structures using the minimum span clustering (MSC) method. The general structure of this protein network was delineated at two clustering resolutions, and the second level MSC clustering was found to be highly similar to existing enzyme classifications. The clusterings of these enzymes based on sequence, structure, and function information are consistent with one another. For proteases, the Jaccard's similarity coefficient is 0.86 between sequence and function classifications, 0.82 between sequence and structure classifications, and 0.78 between structure and function classifications. From our clustering results, we discussed possible examples of divergent evolution and convergent evolution of enzymes. Our clustering approach provides a panoramic view of the sequence-structure-function network of proteins, helps visualize the relation between related proteins intuitively, and is useful in predicting the structure and function of newly determined protein sequences.
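
    The Jaccard coefficients quoted in this abstract compare two classifications of the same proteins. The abstract does not say how the coefficient was computed, so the following is only a minimal sketch under one common assumption: the pair-counting form, where two proteins count as "together" if they share a class, and the coefficient is the number of pairs together in both classifications divided by the number of pairs together in at least one. The function names and the toy labels are hypothetical.

```python
from itertools import combinations


def co_clustered_pairs(labels):
    """Return the set of item pairs that share a class label.

    `labels` maps an item name to its class label (hypothetical input format).
    """
    return {
        (a, b)
        for a, b in combinations(sorted(labels), 2)
        if labels[a] == labels[b]
    }


def jaccard_between_classifications(labels_a, labels_b):
    """Pair-counting Jaccard coefficient between two classifications.

    Assumes both dictionaries classify the same set of items.
    """
    if set(labels_a) != set(labels_b):
        raise ValueError("both classifications must cover the same items")
    pairs_a = co_clustered_pairs(labels_a)
    pairs_b = co_clustered_pairs(labels_b)
    union = pairs_a | pairs_b
    if not union:
        # No pair is grouped in either classification: treat as identical.
        return 1.0
    return len(pairs_a & pairs_b) / len(union)


# Toy data, invented for illustration only.
by_sequence = {"e1": "A", "e2": "A", "e3": "B", "e4": "B"}
by_function = {"e1": "X", "e2": "X", "e3": "X", "e4": "Y"}
print(jaccard_between_classifications(by_sequence, by_function))  # prints 0.25
```

    A value of 1.0 means the two classifications group every pair of proteins identically; values near 0 mean they rarely agree on which proteins belong together.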

  6. PDBStat: a universal restraint converter and restraint analysis software package for protein NMR.

    PubMed

    Tejero, Roberto; Snyder, David; Mao, Binchen; Aramini, James M; Montelione, Gaetano T

    2013-08-01

    The heterogeneous array of software tools used in the process of protein NMR structure determination presents organizational challenges in the structure determination and validation processes, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including accurate use of data in different data formats required by software carrying out similar tasks, continue to confound the efforts of novices and experts alike. These important issues need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting between protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, as well as multiple methods for analysis of the consistency of atomic positions indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software, and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter in allowing the use of multiple structure generation methods with the same restraint data for consensus analysis of protein NMR structures and the underlying restraint data.

  7. PDBStat: A Universal Restraint Converter and Restraint Analysis Software Package for Protein NMR

    PubMed Central

    Tejero, Roberto; Snyder, David; Mao, Binchen; Aramini, James M.; Montelione, Gaetano T

    2013-01-01

    The heterogeneous array of software tools used in the process of protein NMR structure determination presents organizational challenges in the structure determination and validation processes, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including accurate use of data in different data formats required by software carrying out similar tasks, continue to confound the efforts of novices and experts alike. These important issues need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting between protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, as well as multiple methods for analysis of the consistency of atomic positions indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software, and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter in allowing the use of multiple structure generation methods with the same restraint data for consensus analysis of protein NMR structures and the underlying restraint data. PMID:23897031

  8. Unraveling the meaning of chemical shifts in protein NMR.

    PubMed

    Berjanskii, Mark V; Wishart, David S

    2017-11-01

    Chemical shifts are among the most informative parameters in protein NMR. They provide a wealth of information about protein secondary and tertiary structure, protein flexibility, and protein-ligand binding. In this report, we review the progress in interpreting and utilizing protein chemical shifts that has occurred over the past 25 years, with a particular focus on the large body of work arising from our group and other Canadian NMR laboratories. More specifically, this review focuses on describing, assessing, and providing some historical context for various chemical shift-based methods to: (1) determine protein secondary and super-secondary structure; (2) derive protein torsion angles; (3) assess protein flexibility; (4) predict residue accessible surface area; (5) refine 3D protein structures; (6) determine 3D protein structures and (7) characterize intrinsically disordered proteins. This review also briefly covers some of the methods that we previously developed to predict chemical shifts from 3D protein structures and/or protein sequence data. It is hoped that this review will help to increase awareness of the considerable utility of NMR chemical shifts in structural biology and facilitate more widespread adoption of chemical shift-based methods by NMR spectroscopists, structural biologists, protein biophysicists, and biochemists worldwide. This article is part of a Special Issue entitled: Biophysics in Canada, edited by Lewis Kay, John Baenziger, Albert Berghuis and Peter Tieleman. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Some of the most interesting CASP11 targets through the eyes of their authors

    PubMed Central

    Kryshtafovych, Andriy; Moult, John; Baslé, Arnaud; Burgin, Alex; Craig, Timothy K.; Edwards, Robert A.; Fass, Deborah; Hartmann, Marcus D.; Korycinski, Mateusz; Lewis, Richard J.; Lorimer, Donald; Lupas, Andrei N.; Newman, Janet; Peat, Thomas S.; Piepenbrink, Kurt H.; Prahlad, Janani; van Raaij, Mark J.; Rohwer, Forest; Segall, Anca M.; Seguritan, Victor; Sundberg, Eric J.; Singh, Abhimanyu K.; Wilson, Mark A.

    2015-01-01

    ABSTRACT The Critical Assessment of protein Structure Prediction (CASP) experiment would not have been possible without the prediction targets provided by the experimental structural biology community. In this article, selected crystallographers providing targets for the CASP11 experiment discuss the functional and biological significance of the target proteins, highlight their most interesting structural features, and assess whether these features were correctly reproduced in the predictions submitted to CASP11. Proteins 2016; 84(Suppl 1):34–50. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc. PMID:26473983

  10. Controllable assembly and disassembly of nanoparticle systems via protein and DNA agents

    DOEpatents

    Lee, Soo-Kwan; Gang, Oleg; van der Lelie, Daniel

    2014-05-20

    The invention relates to the use of peptides, proteins, and other oligomers to provide a means by which normally quenched nanoparticle fluorescence may be recovered upon detection of a target molecule. Further, the inventive technology provides a structure and method to carry out detection of target molecules without the need to label the target molecules before detection. In another aspect, a method for forming arbitrarily shaped two- and three-dimensional protein-mediated nanoparticle structures and the resulting structures are described. Proteins mediating structure formation may themselves be functionalized with a variety of useful moieties, including catalytic functional groups.

  11. Protein docking by the interface structure similarity: how much structure is needed?

    PubMed

    Sinha, Rohita; Kundrotas, Petras J; Vakser, Ilya A

    2012-01-01

    The increasing availability of co-crystallized protein-protein complexes provides an opportunity to use template-based modeling for protein-protein docking. Structure alignment techniques are useful in detection of remote target-template similarities. The size of the structure involved in the alignment is important for the success in modeling. This paper describes a systematic large-scale study to find the optimal definition/size of the interfaces for the structure alignment-based docking applications. The results showed that structural areas corresponding to the cutoff values <12 Å across the interface inadequately represent structural details of the interfaces. With the increase of the cutoff beyond 12 Å, the success rate for the benchmark set of 99 protein complexes, did not increase significantly for higher accuracy models, and decreased for lower-accuracy models. The 12 Å cutoff was optimal in our interface alignment-based docking, and a likely best choice for the large-scale (e.g., on the scale of the entire genome) applications to protein interaction networks. The results provide guidelines for the docking approaches, including high-throughput applications to modeled structures.
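
    The interface definition discussed in this abstract depends on a distance cutoff across the interface. As a minimal sketch, assuming an interface is defined by keeping every point of one chain that lies within the cutoff of any point of the partner chain (the paper's exact definition may differ), the selection can be written as follows. The function name and the coordinates are hypothetical; only the 12 Å default comes from the abstract.

```python
import math


def interface_indices(coords_a, coords_b, cutoff=12.0):
    """Indices of points in coords_a within `cutoff` angstroms of coords_b.

    Each coordinate is an (x, y, z) tuple in angstroms. The 12.0 default
    mirrors the cutoff reported as optimal in the abstract; everything else
    here is an illustrative assumption.
    """
    return [
        i
        for i, point in enumerate(coords_a)
        if any(math.dist(point, other) <= cutoff for other in coords_b)
    ]


# Toy coordinates, invented for illustration only.
chain_a = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
chain_b = [(5.0, 0.0, 0.0)]
print(interface_indices(chain_a, chain_b))  # prints [0]
```

    This brute-force loop is quadratic in the number of points; for genome-scale use a spatial index would be the usual choice, but the selection rule itself stays the same.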

  12. SInCRe—structural interactome computational resource for Mycobacterium tuberculosis

    PubMed Central

    Metri, Rahul; Hariharaputran, Sridhar; Ramakrishnan, Gayatri; Anand, Praveen; Raghavender, Upadhyayula S.; Ochoa-Montaño, Bernardo; Higueruelo, Alicia P.; Sowdhamini, Ramanathan; Chandra, Nagasuma R.; Blundell, Tom L.; Srinivasan, Narayanaswamy

    2015-01-01

    We have developed an integrated database for Mycobacterium tuberculosis H37Rv (Mtb) that collates information on protein sequences, domain assignments, functional annotation and 3D structural information along with protein–protein and protein–small molecule interactions. SInCRe (Structural Interactome Computational Resource) is developed out of the CamBan (Cambridge and Bangalore) collaboration. The motivation for development of this database is to provide an integrated platform to allow easy access to and interpretation of data and results obtained by all the groups in CamBan in the field of Mtb informatics. In-house algorithms and databases developed independently by various academic groups in CamBan are used to generate Mtb-specific datasets and are integrated in this database to provide a structural dimension to studies on tuberculosis. The SInCRe database readily provides information on identification of functional domains, genome-scale modelling of structures of Mtb proteins and characterization of the small-molecule binding sites within Mtb. The resource also provides structure-based function annotation, information on small-molecule binders including FDA (Food and Drug Administration)-approved drugs, protein–protein interactions (PPIs) and natural compounds that potentially bind to pathogen proteins and result in weakening or elimination of host–pathogen protein–protein interactions. Together they provide prerequisites for identification of off-target binding. Database URL: http://proline.biochem.iisc.ernet.in/sincre PMID:26130660

  13. Predicting protein-protein interactions on a proteome scale by matching evolutionary and structural similarities at interfaces using PRISM.

    PubMed

    Tuncbag, Nurcan; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem

    2011-08-11

    Prediction of protein-protein interactions at the structural level on the proteome scale is important because it allows prediction of protein function, helps drug discovery and takes steps toward genome-wide structural systems biology. We provide a protocol (termed PRISM, protein interactions by structural matching) for large-scale prediction of protein-protein interactions and assembly of protein complex structures. The method consists of two components: rigid-body structural comparisons of target proteins to known template protein-protein interfaces and flexible refinement using a docking energy function. The PRISM rationale follows our observation that globally different protein structures can interact via similar architectural motifs. PRISM predicts binding residues by using structural similarity and evolutionary conservation of putative binding residue 'hot spots'. Ultimately, PRISM could help to construct cellular pathways and functional, proteome-scale annotation. PRISM is implemented in Python and runs in a UNIX environment. The program accepts Protein Data Bank-formatted protein structures and is available at http://prism.ccbb.ku.edu.tr/prism_protocol/.

  14. Dynameomics: data-driven methods and models for utilizing large-scale protein structure repositories for improving fragment-based loop prediction.

    PubMed

    Rysavy, Steven J; Beck, David A C; Daggett, Valerie

    2014-11-01

    Protein function is intimately linked to protein structure and dynamics, yet experimentally determined structures frequently omit regions within a protein due to indeterminate data, which is often due to protein dynamics. We propose that atomistic molecular dynamics simulations provide a diverse sampling of biologically relevant structures for these missing segments (and beyond) to improve structural modeling and structure prediction. Here we make use of the Dynameomics data warehouse, which contains simulations of representatives of essentially all known protein folds. We developed novel computational methods to efficiently identify, rank and retrieve small peptide structures, or fragments, from this database. We also created a novel data model to analyze and compare large repositories of structural data, such as contained within the Protein Data Bank and the Dynameomics data warehouse. Our evaluation compares these structural repositories for improving loop predictions and analyzes the utility of our methods and models. Using a standard set of loop structures, containing 510 loops, 30 for each loop length from 4 to 20 residues, we find that the inclusion of Dynameomics structures in fragment-based methods improves the quality of the loop predictions without being dependent on sequence homology. Depending on loop length, ∼25-75% of the best predictions came from the Dynameomics set, resulting in lower main chain root-mean-square deviations for all fragment lengths using the combined fragment library. We also provide specific cases where Dynameomics fragments provide better predictions for NMR loop structures than fragments from crystal structures. Online access to these fragment libraries is available at http://www.dynameomics.org/fragments. © 2014 The Protein Society.
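
    Loop predictions in this abstract are scored by main chain root-mean-square deviation (RMSD). As a minimal sketch, assuming the predicted and reference loops are already superimposed and list the same atoms in the same order, the RMSD is the square root of the mean squared distance between matching atoms. The function name and coordinates are hypothetical, and the superposition step is deliberately left out.

```python
import math


def rmsd(coords_pred, coords_ref):
    """RMSD in angstroms between two equally ordered lists of (x, y, z) tuples.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_pred) != len(coords_ref) or not coords_pred:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(
        math.dist(p, q) ** 2 for p, q in zip(coords_pred, coords_ref)
    )
    return math.sqrt(squared_sum / len(coords_pred))


# Toy coordinates, invented for illustration only.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(round(rmsd(predicted, reference), 3))  # prints 1.414
```

    A lower value means the predicted loop lies closer to the reference, which is the sense in which the combined fragment library is reported to improve predictions.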

  15. DichroMatch at the protein circular dichroism data bank (DM@PCDDB): A web-based tool for identifying protein nearest neighbors using circular dichroism spectroscopy.

    PubMed

    Whitmore, Lee; Mavridis, Lazaros; Wallace, B A; Janes, Robert W

    2018-01-01

    Circular dichroism spectroscopy is a well-used, but simple method in structural biology for providing information on the secondary structure and folds of proteins. DichroMatch (DM@PCDDB) is an online tool that is newly available in the Protein Circular Dichroism Data Bank (PCDDB), which takes advantage of the wealth of spectral and metadata deposited therein, to enable identification of spectral nearest neighbors of a query protein based on four different methods of spectral matching. DM@PCDDB can potentially provide novel information about structural relationships between proteins and can be used in comparison studies of protein homologs and orthologs. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  16. Amaranth, quinoa and chia protein isolates: Physicochemical and structural properties.

    PubMed

    López, Débora N; Galante, Micaela; Robson, María; Boeris, Valeria; Spelzini, Darío

    2018-04-01

    An increasing use of vegetable protein is required to support the production of protein-rich foods which can replace animal proteins in the human diet. Amaranth, chia and quinoa seeds contain proteins which have biological and functional properties that provide nutritional benefits due to their reasonably well-balanced amino acid content. This review analyses these vegetable proteins and focuses on recent research on protein classification and isolation as well as structural characterization by means of fluorescence spectroscopy, surface hydrophobicity and differential scanning calorimetry. Isolation procedures have a profound influence on the structural properties of the proteins and, therefore, on their in vitro digestibility. The present article provides a comprehensive overview of the properties and characterization of these proteins. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. G2S: a web-service for annotating genomic variants on 3D protein structures.

    PubMed

    Wang, Juexin; Sheridan, Robert; Sumer, S Onur; Schultz, Nikolaus; Xu, Dong; Gao, Jianjiong

    2018-06-01

    Accurately mapping and annotating genomic locations on 3D protein structures is a key step in structure-based analysis of genomic variants detected by recent large-scale sequencing efforts. There are several mapping resources currently available, but none of them provides a web API (Application Programming Interface) that supports programmatic access. We present G2S, a real-time web API that provides automated mapping of genomic variants on 3D protein structures. G2S can align genomic locations of variants, protein locations, or protein sequences to protein structures and retrieve the mapped residues from structures. G2S API uses REST-inspired design and it can be used by various clients such as web browsers, command terminals, programming languages and other bioinformatics tools for bringing 3D structures into genomic variant analysis. The webserver and source codes are freely available at https://g2s.genomenexus.org. g2s@genomenexus.org. Supplementary data are available at Bioinformatics online.

  18. Use of designed sequences in protein structure recognition.

    PubMed

    Kumar, Gayatri; Mudgal, Richa; Srinivasan, Narayanaswamy; Sandhya, Sankaran

    2018-05-09

    Knowledge of the protein structure is a pre-requisite for improved understanding of molecular function. The gap in the sequence-structure space has increased in the post-genomic era. Grouping related protein sequences into families can aid in narrowing the gap. In the Pfam database, structure description is provided for part or full-length proteins of 7726 families. For the remaining 52% of the families, information on 3-D structure is not yet available. We use computationally designed sequences that are intermediately related to two protein domain families, which are already known to share the same fold. These strategically designed sequences enable detection of distant relationships and here, we have employed them for the purpose of structure recognition of protein families of yet unknown structure. We first measured the success rate of our approach using a dataset of protein families of known fold and achieved a success rate of 88%. Next, for 1392 families of yet unknown structure, we made structural assignments for part/full length of the proteins. Fold associations for 423 domains of unknown function (DUFs) are provided as a step towards functional annotation. The results indicate that knowledge-based filling of gaps in protein sequence space is a lucrative approach for structure recognition. Such sequences assist in traversal through protein sequence space and effectively function as 'linkers', where natural linkers between distant proteins are unavailable. This article was reviewed by Oliviero Carugo, Christine Orengo and Srikrishna Subramanian.

  19. SFG analysis of surface bound proteins: a route towards structure determination.

    PubMed

    Weidner, Tobias; Castner, David G

    2013-08-14

    The surface of a material is rapidly covered with proteins once that material is placed in a biological environment. The structure and function of these bound proteins play a key role in the interactions and communications of the material with the biological environment. Thus, it is crucial to gain a molecular level understanding of surface bound protein structure. While X-ray diffraction and solution phase NMR methods are well established for determining the structure of proteins in the crystalline or solution phase, there is not a corresponding single technique that can provide the same level of structural detail about proteins at surfaces or interfaces. However, recent advances in sum frequency generation (SFG) vibrational spectroscopy have significantly increased our ability to obtain structural information about surface bound proteins and peptides. A multi-technique approach of combining SFG with (1) protein engineering methods to selectively introduce mutations and isotopic labels, (2) other experimental methods such as time-of-flight secondary ion mass spectrometry (ToF-SIMS) and near edge X-ray absorption fine structure (NEXAFS) to provide complementary information, and (3) molecular dynamic (MD) simulations to extend the molecular level experimental results is a particularly promising route for structural characterization of surface bound proteins and peptides. By using model peptides and small proteins with well-defined structures, methods have been developed to determine the orientation of both backbone and side chains to the surface.

  20. SFG analysis of surface bound proteins: A route towards structure determination

    PubMed Central

    Weidner, Tobias; Castner, David G.

    2013-01-01

    The surface of a material is rapidly covered with proteins once that material is placed in a biological environment. The structure and function of these bound proteins play a key role in the interactions and communications of the material with the biological environment. Thus, it is crucial to gain a molecular level understanding of surface bound protein structure. While X-ray diffraction and solution phase NMR methods are well established for determining the structure of proteins in the crystalline or solution phase, there is not a corresponding single technique that can provide the same level of structural detail about proteins at surfaces or interfaces. However, recent advances in sum frequency generation (SFG) vibrational spectroscopy have significantly increased our ability to obtain structural information about surface bound proteins and peptides. A multi-technique approach of combining SFG with (1) protein engineering methods to selectively introduce mutations and isotopic labels, (2) other experimental methods such as time-of-flight secondary ion mass spectrometry (ToF-SIMS) and near edge x-ray absorption fine structure (NEXAFS) to provide complementary information, and (3) molecular dynamic (MD) simulations to extend the molecular level experimental results is a particularly promising route for structural characterization of surface bound proteins and peptides. By using model peptides and small proteins with well-defined structures, methods have been developed to determine the orientation of both backbone and side chains to the surface. PMID:23727992

  1. Structure-Based Characterization of Multiprotein Complexes

    PubMed Central

    Wiederstein, Markus; Gruber, Markus; Frank, Karl; Melo, Francisco; Sippl, Manfred J.

    2014-01-01

    Summary Multiprotein complexes govern virtually all cellular processes. Their 3D structures provide important clues to their biological roles, especially through structural correlations among protein molecules and complexes. The detection of such correlations generally requires comprehensive searches in databases of known protein structures by means of appropriate structure-matching techniques. Here, we present a high-speed structure search engine capable of instantly matching large protein oligomers against the complete and up-to-date database of biologically functional assemblies of protein molecules. We use this tool to reveal unseen structural correlations on the level of protein quaternary structure and demonstrate its general usefulness for efficiently exploring complex structural relationships among known protein assemblies. PMID:24954616

  2. An Augmented Pocketome: Detection and Analysis of Small-Molecule Binding Pockets in Proteins of Known 3D Structure.

    PubMed

    Bhagavat, Raghu; Sankar, Santhosh; Srinivasan, Narayanaswamy; Chandra, Nagasuma

    2018-03-06

    Protein-ligand interactions form the basis of most cellular events. Identifying ligand binding pockets in proteins will greatly facilitate rationalizing and predicting protein function. Ligand binding sites are unknown for many proteins of known three-dimensional (3D) structure, creating a gap in our understanding of protein structure-function relationships. To bridge this gap, we detect pockets in proteins of known 3D structures, using computational techniques. This augmented pocketome (PocketDB) consists of 249,096 pockets, which is about seven times larger than what is currently known. We deduce possible ligand associations for about 46% of the newly identified pockets. The augmented pocketome, when subjected to clustering based on similarities among pockets, yielded 2,161 site types, which are associated with 1,037 ligand types, together providing fold-site-type-ligand-type associations. The PocketDB resource facilitates a structure-based function annotation, delineation of the structural basis of ligand recognition, and provides functional clues for domains of unknown functions, allosteric proteins, and druggable pockets. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. Modularity of Protein Folds as a Tool for Template-Free Modeling of Structures.

    PubMed

    Vallat, Brinda; Madrid-Aliste, Carlos; Fiser, Andras

    2015-08-01

    Predicting the three-dimensional structure of proteins from their amino acid sequences remains a challenging problem in molecular biology. While the current structural coverage of proteins is almost exclusively provided by template-based techniques, the modeling of the rest of the protein sequences increasingly requires template-free methods. However, template-free modeling methods are much less reliable and are usually applicable to smaller proteins, leaving much space for improvement. We present here a novel computational method that uses a library of supersecondary structure fragments, known as Smotifs, to model protein structures. The library of Smotifs has saturated over time, providing a theoretical foundation for efficient modeling. The method relies on weak sequence signals from remotely related protein structures to create a library of Smotif fragments specific to the target protein sequence. This Smotif library is exploited in a fragment assembly protocol to sample decoys, which are assessed by a composite scoring function. Since the Smotif fragments are larger in size compared to the ones used in other fragment-based methods, the proposed modeling algorithm, SmotifTF, can employ an exhaustive sampling during decoy assembly. SmotifTF successfully predicts the overall fold of the target proteins in about 50% of the test cases and performs competitively when compared to other state-of-the-art prediction methods, especially when sequence signal to remote homologs is diminishing. Smotif-based modeling is complementary to current prediction methods and provides a promising direction in addressing the structure prediction problem, especially when targeting larger proteins for modeling.

  4. Visualisation of variable binding pockets on protein surfaces by probabilistic analysis of related structure sets.

    PubMed

    Ashford, Paul; Moss, David S; Alex, Alexander; Yeap, Siew K; Povia, Alice; Nobeli, Irene; Williams, Mark A

    2012-03-14

    Protein structures provide a valuable resource for rational drug design. For a protein with no known ligand, computational tools can predict surface pockets that are of suitable size and shape to accommodate a complementary small-molecule drug. However, pocket prediction against single static structures may miss features of pockets that arise from proteins' dynamic behaviour. In particular, ligand-binding conformations can be observed as transiently populated states of the apo protein, so it is possible to gain insight into ligand-bound forms by considering conformational variation in apo proteins. This variation can be explored by considering sets of related structures: computationally generated conformers, solution NMR ensembles, multiple crystal structures, homologues or homology models. It is non-trivial to compare pockets, either from different programs or across sets of structures. For a single structure, difficulties arise in defining particular pocket's boundaries. For a set of conformationally distinct structures the challenge is how to make reasonable comparisons between them given that a perfect structural alignment is not possible. We have developed a computational method, Provar, that provides a consistent representation of predicted binding pockets across sets of related protein structures. The outputs are probabilities that each atom or residue of the protein borders a predicted pocket. These probabilities can be readily visualised on a protein using existing molecular graphics software. We show how Provar simplifies comparison of the outputs of different pocket prediction algorithms, of pockets across multiple simulated conformations and between homologous structures. 
We demonstrate the benefits of use of multiple structures for protein-ligand and protein-protein interface analysis on a set of complexes and consider three case studies in detail: i) analysis of a kinase superfamily highlights the conserved occurrence of surface pockets at the active and regulatory sites; ii) a simulated ensemble of unliganded Bcl2 structures reveals extensions of a known ligand-binding pocket not apparent in the apo crystal structure; iii) visualisations of interleukin-2 and its homologues highlight conserved pockets at the known receptor interfaces and regions whose conformation is known to change on inhibitor binding. Through post-processing of the output of a variety of pocket prediction software, Provar provides a flexible approach to the analysis and visualization of the persistence or variability of pockets in sets of related protein structures.
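
    The per-residue probabilities described in this abstract can be illustrated with a minimal sketch. It is not Provar's implementation: it assumes that each structure in the set has already been reduced to the set of residue numbers lining any predicted pocket, and that residue numbering is shared across structures. The function name pocket_lining_probability is hypothetical.

```python
from collections import Counter


def pocket_lining_probability(pocket_residues_per_structure):
    """Return, for each residue, the fraction of structures in which it
    borders a predicted pocket.

    Assumption: each element of the input is an iterable of residue
    numbers (shared numbering across structures) that line at least one
    predicted pocket in that structure.
    """
    n_structures = len(pocket_residues_per_structure)
    if n_structures == 0:
        return {}
    counts = Counter()
    for residues in pocket_residues_per_structure:
        # A residue counts at most once per structure.
        counts.update(set(residues))
    return {res: counts[res] / n_structures for res in sorted(counts)}


if __name__ == "__main__":
    ensemble = [{10, 11, 12}, {11, 12}, {12, 40}]
    for residue, p in pocket_lining_probability(ensemble).items():
        print(residue, round(p, 2))
```

    Values near 1 mark residues that line a pocket in almost every structure, and intermediate values mark pockets that appear only in some conformations.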

  5. Protein Structure Determination using Metagenome sequence data

    PubMed Central

    Ovchinnikov, Sergey; Park, Hahnbeom; Varghese, Neha; Huang, Po-Ssu; Pavlopoulos, Georgios A.; Kim, David E.; Kamisetty, Hetunandan; Kyrpides, Nikos C.; Baker, David

    2017-01-01

    Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families, and that metagenome sequence data more than triples the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact based structure matching and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the PDB. This approach provides the representative models for large protein families originally envisioned as the goal of the protein structure initiative at a fraction of the cost. PMID:28104891

  6. Deformation and Failure of Protein Materials in Physiologically Extreme Conditions and Disease

    DTIC Science & Technology

    2009-03-01

    resonance (NMR) spectroscopy and X-ray crystallography have advanced our ability to identify 3D protein structures57. Site-specific studies using NMR, a... ray crystallography, providing structural and temporal information about mechanisms of deformation and assembly (for example in intermediate...tens of thousands of 3D atomistic protein structures, identifying the structure of numerous proteins from varying species sources60. X-ray

  7. StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase.

    PubMed

    Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L

    2011-06-02

    Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
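
    Quantifying residue frequency at aligned positions can be sketched as follows. This is an illustration only, not the StralSV algorithm: it assumes the structure-derived local alignments have already been written out as equal-length strings with "-" for gaps, and the function name residue_frequencies is hypothetical.

```python
from collections import Counter


def residue_frequencies(aligned_fragments, gap="-"):
    """Per-position residue frequencies from equal-length aligned fragments.

    Assumption: the fragments come from structure-based local alignments
    and are already padded with the gap character where no residue aligns.
    Gaps are ignored when computing frequencies.
    """
    if not aligned_fragments:
        return []
    length = len(aligned_fragments[0])
    if any(len(frag) != length for frag in aligned_fragments):
        raise ValueError("all aligned fragments must have the same length")
    profile = []
    for i in range(length):
        column = [frag[i] for frag in aligned_fragments if frag[i] != gap]
        counts = Counter(column)
        total = len(column)
        profile.append({res: n / total for res, n in counts.items()} if total else {})
    return profile


if __name__ == "__main__":
    for position, freqs in enumerate(residue_frequencies(["GDD", "GDN", "G-D"]), start=1):
        print(position, freqs)
```

    A position dominated by one residue suggests conservation, while a flat distribution or a residue that deviates from the consensus flags positions worth a closer look.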

  8. WEBnm@ v2.0: Web server and services for comparing protein flexibility.

    PubMed

    Tiwari, Sandhya P; Fuglebakk, Edvin; Hollup, Siv M; Skjærven, Lars; Cragnolini, Tristan; Grindhaug, Svenn H; Tekle, Kidane M; Reuter, Nathalie

    2014-12-30

    Normal mode analysis (NMA) using elastic network models is a reliable and cost-effective computational method to characterise protein flexibility and, by extension, protein dynamics. Further insight into the dynamics-function relationship can be gained by comparing protein motions between protein homologs and functional classifications. This can be achieved by comparing normal modes obtained from sets of evolutionarily related proteins. We have developed an automated tool for comparative NMA of a set of pre-aligned protein structures. The user can submit a sequence alignment in the FASTA format and the corresponding coordinate files in the Protein Data Bank (PDB) format. The computed normalised squared atomic fluctuations and atomic deformation energies of the submitted structures can be easily compared on graphs provided by the web user interface. The web server provides pairwise comparison of the dynamics of all proteins included in the submitted set using two measures: the Root Mean Squared Inner Product and the Bhattacharyya Coefficient. The Comparative Analysis has been implemented on our web server for NMA, WEBnm@, which also provides recently upgraded functionality for NMA of single protein structures. This includes new visualisations of protein motion, visualisation of inter-residue correlations and the analysis of conformational change using the overlap analysis. In addition, programmatic access to WEBnm@ is now available through a SOAP-based web service. WEBnm@ is available at http://apps.cbu.uib.no/webnma. WEBnm@ v2.0 is an online tool offering unique capability for comparative NMA on multiple protein structures. Along with a convenient web interface, powerful computing resources, and several methods for mode analyses, WEBnm@ facilitates the assessment of protein flexibility within protein families and superfamilies. These analyses can give a good view of how the structures move and how the flexibility is conserved over the different structures.
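
    One of the two comparison measures named above, the Root Mean Squared Inner Product (RMSIP), can be sketched in a few lines. This is a generic textbook form, not the WEBnm@ code: it assumes both mode sets contain the same number of unit-length, mutually orthogonal vectors defined over the same aligned coordinates, and the function name rmsip is hypothetical.

```python
import math


def rmsip(modes_a, modes_b):
    """Root Mean Squared Inner Product between two sets of normal modes.

    Assumptions: each mode is a sequence of floats over the same aligned
    coordinates, the modes within each set are orthonormal, and both sets
    have the same number of modes (often the 10 lowest-frequency ones).
    """
    if not modes_a or not modes_b:
        raise ValueError("both mode sets must be non-empty")
    dim = len(modes_a[0])
    if any(len(mode) != dim for mode in list(modes_a) + list(modes_b)):
        raise ValueError("all modes must have the same dimension")
    total = 0.0
    for u in modes_a:
        for v in modes_b:
            dot = sum(x * y for x, y in zip(u, v))
            total += dot * dot
    # Normalise by the number of modes in the first set.
    return math.sqrt(total / len(modes_a))


if __name__ == "__main__":
    set_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    set_b = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(rmsip(set_a, set_a), rmsip(set_a, set_b))
```

    Under these assumptions the value is 1 when the two sets span the same subspace and 0 when they are orthogonal.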

  9. X-ray laser diffraction for structure determination of the rhodopsin-arrestin complex

    NASA Astrophysics Data System (ADS)

    Zhou, X. Edward; Gao, Xiang; Barty, Anton; Kang, Yanyong; He, Yuanzheng; Liu, Wei; Ishchenko, Andrii; White, Thomas A.; Yefanov, Oleksandr; Han, Gye Won; Xu, Qingping; de Waal, Parker W.; Suino-Powell, Kelly M.; Boutet, Sébastien; Williams, Garth J.; Wang, Meitian; Li, Dianfan; Caffrey, Martin; Chapman, Henry N.; Spence, John C. H.; Fromme, Petra; Weierstall, Uwe; Stevens, Raymond C.; Cherezov, Vadim; Melcher, Karsten; Xu, H. Eric

    2016-04-01

    Serial femtosecond X-ray crystallography (SFX) using an X-ray free electron laser (XFEL) is a recent advancement in structural biology for solving crystal structures of challenging membrane proteins, including G-protein coupled receptors (GPCRs), which often only produce microcrystals. An XFEL delivers highly intense X-ray pulses of femtosecond duration short enough to enable the collection of single diffraction images before significant radiation damage to crystals sets in. Here we report the deposition of the XFEL data and provide further details on crystallization, XFEL data collection and analysis, structure determination, and the validation of the structural model. The rhodopsin-arrestin crystal structure solved with SFX represents the first near-atomic resolution structure of a GPCR-arrestin complex, provides structural insights into understanding of arrestin-mediated GPCR signaling, and demonstrates the great potential of this SFX-XFEL technology for accelerating crystal structure determination of challenging proteins and protein complexes.

  10. X-ray laser diffraction for structure determination of the rhodopsin-arrestin complex.

    PubMed

    Zhou, X Edward; Gao, Xiang; Barty, Anton; Kang, Yanyong; He, Yuanzheng; Liu, Wei; Ishchenko, Andrii; White, Thomas A; Yefanov, Oleksandr; Han, Gye Won; Xu, Qingping; de Waal, Parker W; Suino-Powell, Kelly M; Boutet, Sébastien; Williams, Garth J; Wang, Meitian; Li, Dianfan; Caffrey, Martin; Chapman, Henry N; Spence, John C H; Fromme, Petra; Weierstall, Uwe; Stevens, Raymond C; Cherezov, Vadim; Melcher, Karsten; Xu, H Eric

    2016-04-12

    Serial femtosecond X-ray crystallography (SFX) using an X-ray free electron laser (XFEL) is a recent advancement in structural biology for solving crystal structures of challenging membrane proteins, including G-protein coupled receptors (GPCRs), which often only produce microcrystals. An XFEL delivers highly intense X-ray pulses of femtosecond duration short enough to enable the collection of single diffraction images before significant radiation damage to crystals sets in. Here we report the deposition of the XFEL data and provide further details on crystallization, XFEL data collection and analysis, structure determination, and the validation of the structural model. The rhodopsin-arrestin crystal structure solved with SFX represents the first near-atomic resolution structure of a GPCR-arrestin complex, provides structural insights into understanding of arrestin-mediated GPCR signaling, and demonstrates the great potential of this SFX-XFEL technology for accelerating crystal structure determination of challenging proteins and protein complexes.

  11. X-ray laser diffraction for structure determination of the rhodopsin-arrestin complex

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, X. Edward; Gao, Xiang; Barty, Anton

    Here, serial femtosecond X-ray crystallography (SFX) using an X-ray free electron laser (XFEL) is a recent advancement in structural biology for solving crystal structures of challenging membrane proteins, including G-protein coupled receptors (GPCRs), which often only produce microcrystals. An XFEL delivers highly intense X-ray pulses of femtosecond duration short enough to enable the collection of single diffraction images before significant radiation damage to crystals sets in. Here we report the deposition of the XFEL data and provide further details on crystallization, XFEL data collection and analysis, structure determination, and the validation of the structural model. The rhodopsin-arrestin crystal structure solved with SFX represents the first near-atomic resolution structure of a GPCR-arrestin complex, provides structural insights into understanding of arrestin-mediated GPCR signaling, and demonstrates the great potential of this SFX-XFEL technology for accelerating crystal structure determination of challenging proteins and protein complexes.

  12. X-ray laser diffraction for structure determination of the rhodopsin-arrestin complex

    PubMed Central

    Zhou, X. Edward; Gao, Xiang; Barty, Anton; Kang, Yanyong; He, Yuanzheng; Liu, Wei; Ishchenko, Andrii; White, Thomas A.; Yefanov, Oleksandr; Han, Gye Won; Xu, Qingping; de Waal, Parker W.; Suino-Powell, Kelly M.; Boutet, Sébastien; Williams, Garth J.; Wang, Meitian; Li, Dianfan; Caffrey, Martin; Chapman, Henry N.; Spence, John C.H.; Fromme, Petra; Weierstall, Uwe; Stevens, Raymond C.; Cherezov, Vadim; Melcher, Karsten; Xu, H. Eric

    2016-01-01

    Serial femtosecond X-ray crystallography (SFX) using an X-ray free electron laser (XFEL) is a recent advancement in structural biology for solving crystal structures of challenging membrane proteins, including G-protein coupled receptors (GPCRs), which often only produce microcrystals. An XFEL delivers highly intense X-ray pulses of femtosecond duration short enough to enable the collection of single diffraction images before significant radiation damage to crystals sets in. Here we report the deposition of the XFEL data and provide further details on crystallization, XFEL data collection and analysis, structure determination, and the validation of the structural model. The rhodopsin-arrestin crystal structure solved with SFX represents the first near-atomic resolution structure of a GPCR-arrestin complex, provides structural insights into understanding of arrestin-mediated GPCR signaling, and demonstrates the great potential of this SFX-XFEL technology for accelerating crystal structure determination of challenging proteins and protein complexes. PMID:27070998

  13. X-ray laser diffraction for structure determination of the rhodopsin-arrestin complex

    DOE PAGES

    Zhou, X. Edward; Gao, Xiang; Barty, Anton; ...

    2016-04-12

    Here, serial femtosecond X-ray crystallography (SFX) using an X-ray free electron laser (XFEL) is a recent advancement in structural biology for solving crystal structures of challenging membrane proteins, including G-protein coupled receptors (GPCRs), which often only produce microcrystals. An XFEL delivers highly intense X-ray pulses of femtosecond duration short enough to enable the collection of single diffraction images before significant radiation damage to crystals sets in. Here we report the deposition of the XFEL data and provide further details on crystallization, XFEL data collection and analysis, structure determination, and the validation of the structural model. The rhodopsin-arrestin crystal structure solved with SFX represents the first near-atomic resolution structure of a GPCR-arrestin complex, provides structural insights into understanding of arrestin-mediated GPCR signaling, and demonstrates the great potential of this SFX-XFEL technology for accelerating crystal structure determination of challenging proteins and protein complexes.

  14. Tandem Repeat Proteins Inspired By Squid Ring Teeth

    NASA Astrophysics Data System (ADS)

    Pena-Francesch, Abdon

    Proteins are large biomolecules consisting of long chains of amino acids that hierarchically assemble into complex structures, and provide a variety of building blocks for biological materials. The repetition of structural building blocks is a natural evolutionary strategy for increasing the complexity and stability of protein structures. However, the relationship between amino acid sequence, structure, and material properties of protein systems remains unclear due to the lack of control over the protein sequence and the intricacies of the assembly process. In order to investigate the repetition of protein building blocks, a recently discovered protein from squids is examined as an ideal protein system. Squid ring teeth are predatory appendages located inside the suction cups that provide a strong grasp of prey, and are solely composed of a group of proteins with tandem repetition of building blocks. The objective of this thesis is to understand, for the first time, the relationship between sequence, structure and properties in repetitive protein materials inspired by squid ring teeth. Specifically, this work focuses on squid-inspired structural proteins with tandem repeat units in their sequence (i.e., repetition of alternating building blocks) that are physically cross-linked via beta-sheet structures. The research work presented here tests the hypothesis that, in these systems, increasing the number of building blocks in the polypeptide chain decreases the protein network defects and improves the material properties. Hence, the sequence, nanostructure, and properties (thermal, mechanical, and conducting) of tandem repeat squid-inspired protein materials are examined. Spectroscopic structural analysis, advanced materials characterization, and entropic elasticity theory are combined to elucidate the structure and material properties of these repetitive proteins.
This approach is applied not only to native squid proteins but also to squid-inspired synthetic polypeptides that allow for a fine control of the sequence and network morphology. The results provided in this work establish a clear dependence between the repetitive building blocks, the network morphology, and the properties of squid-inspired repetitive protein materials. Increasing the number of tandem repeat units in SRT-inspired proteins led to more effective protein networks with superior properties. Through increasing tandem repetition and optimization of network morphology, highly efficient protein materials capable of withstanding deformations up to 400% of their original length, with MPa-GPa modulus, high energy absorption (50 MJ m^-3), peak proton conductivity of 3.7 mS cm^-1 (at pH 7, highest reported to date for biological materials), and peak thermal conductivity of 1.4 W m^-1 K^-1 (which exceeds that of most polymer materials) were developed. These findings introduce new design rules in the engineering of proteins based on tandem repetition and morphology control, and provide a novel framework for tailoring and optimizing the properties of protein-based materials.

  15. Dynameomics: Data-driven methods and models for utilizing large-scale protein structure repositories for improving fragment-based loop prediction

    PubMed Central

    Rysavy, Steven J; Beck, David AC; Daggett, Valerie

    2014-01-01

    Protein function is intimately linked to protein structure and dynamics, yet experimentally determined structures frequently omit regions within a protein due to indeterminate data, which is often due to protein dynamics. We propose that atomistic molecular dynamics simulations provide a diverse sampling of biologically relevant structures for these missing segments (and beyond) to improve structural modeling and structure prediction. Here we make use of the Dynameomics data warehouse, which contains simulations of representatives of essentially all known protein folds. We developed novel computational methods to efficiently identify, rank and retrieve small peptide structures, or fragments, from this database. We also created a novel data model to analyze and compare large repositories of structural data, such as contained within the Protein Data Bank and the Dynameomics data warehouse. Our evaluation compares these structural repositories for improving loop predictions and analyzes the utility of our methods and models. Using a standard set of loop structures, containing 510 loops, 30 for each loop length from 4 to 20 residues, we find that the inclusion of Dynameomics structures in fragment-based methods improves the quality of the loop predictions without being dependent on sequence homology. Depending on loop length, ∼25–75% of the best predictions came from the Dynameomics set, resulting in lower main chain root-mean-square deviations for all fragment lengths using the combined fragment library. We also provide specific cases where Dynameomics fragments provide better predictions for NMR loop structures than fragments from crystal structures. Online access to these fragment libraries is available at http://www.dynameomics.org/fragments. PMID:25142412
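
    The main chain root-mean-square deviation used above to rank loop predictions can be sketched as below. This is a generic illustration rather than the authors' pipeline: it assumes the predicted and reference loops list the same main chain atoms in the same order and are already placed in a common frame, so no superposition is performed. The function name rmsd is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate lists.

    Assumptions: each entry is an (x, y, z) tuple for the same main chain
    atom in both loops, and both loops are already in a common reference
    frame (no superposition is done here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 2.0)]
    print(rmsd(reference, predicted))
    # Size of the benchmark described in the abstract: 30 loops for each
    # length from 4 to 20 residues.
    print(30 * len(range(4, 21)))
```

    The second print reproduces the benchmark size stated in the abstract: 17 loop lengths times 30 loops gives 510.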

  16. HDAPD: a web tool for searching the disease-associated protein structures

    PubMed Central

    2010-01-01

    Background The structures of disease-associated proteins are important for structure-based drug design against a particular disease. Until now, protein structures have usually been searched by PDB id or sequence information. However, in the HDAPD database presented here, the structure of a disease-associated protein can be searched directly by entering the name of the associated disease. Description A search in HDAPD can be initiated by entering keywords for a disease, protein name, protein type, or PDB id. The protein sequence can be presented in FASTA format and directly copied for a BLAST search. HDAPD is also interfaced with Jmol so that users can view and manipulate a protein structure. Gene ontology data such as cellular components, molecular functions, and biological processes are provided through a hyperlink to Gene Ontology (GO). Further, HDAPD provides a link to the KEGG map, which shows where the protein is placed and its relationship with other proteins in a metabolic pathway. The latest literature for the protein retrieved from PubMed (titles, journals, authors, and abstracts) is also presented as a list of controllable length. Conclusions Since the HDAPD data content can be routinely updated through a PHP-MySQL web page, the database is useful for finding the structures of disease-associated proteins that may play important roles in disease development, in support of structure-based drug design against those diseases. PMID:20158919

  17. Mixture models for protein structure ensembles.

    PubMed

    Hirsch, Michael; Habeck, Michael

    2008-10-01

    Protein structure ensembles provide important insight into the dynamics and function of a protein and contain information that is not captured with a single static structure. However, it is not clear a priori to what extent the variability within an ensemble is caused by internal structural changes. Additional variability results from overall translations and rotations of the molecule. And most experimental data do not provide information to relate the structures to a common reference frame. To report meaningful values of intrinsic dynamics, structural precision, conformational entropy, etc., it is therefore important to disentangle local from global conformational heterogeneity. We consider the task of disentangling local from global heterogeneity as an inference problem. We use probabilistic methods to infer from the protein ensemble missing information on reference frames and stable conformational sub-states. To this end, we model a protein ensemble as a mixture of Gaussian probability distributions of either entire conformations or structural segments. We learn these models from a protein ensemble using the expectation-maximization algorithm. Our first model can be used to find multiple conformers in a structure ensemble. The second model partitions the protein chain into locally stable structural segments or core elements and less structured regions typically found in loops. Both models are simple to implement and contain only a single free parameter: the number of conformers or structural segments. Our models can be used to analyse experimental ensembles, molecular dynamics trajectories and conformational change in proteins. The Python source code for protein ensemble analysis is available from the authors upon request.
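
    The expectation-maximization step mentioned in this abstract can be illustrated on a much simpler problem. The sketch below fits a mixture of one-dimensional Gaussians, for example to a single inter-atomic distance measured across an ensemble. It is not the authors' model, which works on whole conformations or structural segments and also infers reference frames. The function name and its parameters are hypothetical.

```python
import math


def em_gaussian_mixture_1d(data, initial_means, n_iter=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means.

    Assumption: the number of components (the single free parameter in the
    abstract's models) is fixed by len(initial_means).
    """
    k = len(initial_means)
    means = list(initial_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            if total == 0.0:
                resp.append([1.0 / k] * k)  # guard against numerical underflow
            else:
                resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0.0:
                continue
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(min_var, var)
    return weights, means, variances


if __name__ == "__main__":
    distances = [0.0, 0.1, -0.1, 10.0, 10.1, 9.9]
    print(em_gaussian_mixture_1d(distances, [1.0, 9.0]))
```

    With two well-separated clusters the fitted means settle on the cluster centres, which is the one-dimensional analogue of finding distinct conformers in an ensemble.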

  18. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system.

    PubMed

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    2015-11-19

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
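
    The position weight matrices used above to summarise binding affinities can be illustrated with a minimal log-odds construction. This is a generic sketch, not the database's own procedure: it assumes equal-length aligned DNA binding sites, a uniform background of 0.25 per base and a pseudocount of 1, all of which are illustrative choices. The function names are hypothetical.

```python
import math

ALPHABET = "ACGT"


def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log2-odds position weight matrix from aligned binding sites.

    Assumptions: all sites have the same length and contain only A, C, G
    and T; the pseudocount and uniform background are illustrative.
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm


def score(pwm, sequence):
    """Sum of per-position log-odds for a sequence as long as the matrix."""
    if len(sequence) != len(pwm):
        raise ValueError("sequence length must match the matrix length")
    return sum(column[base] for column, base in zip(pwm, sequence))


if __name__ == "__main__":
    matrix = position_weight_matrix(["ACGT", "ACGA", "ACGT"])
    print(score(matrix, "ACGT"), score(matrix, "TTTT"))
```

    Each column of the matrix corresponds to one base pair of the site, which is the level at which the database relates affinity data to structural data.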

  19. Structure-based characterization of multiprotein complexes.

    PubMed

    Wiederstein, Markus; Gruber, Markus; Frank, Karl; Melo, Francisco; Sippl, Manfred J

    2014-07-08

    Multiprotein complexes govern virtually all cellular processes. Their 3D structures provide important clues to their biological roles, especially through structural correlations among protein molecules and complexes. The detection of such correlations generally requires comprehensive searches in databases of known protein structures by means of appropriate structure-matching techniques. Here, we present a high-speed structure search engine capable of instantly matching large protein oligomers against the complete and up-to-date database of biologically functional assemblies of protein molecules. We use this tool to reveal unseen structural correlations on the level of protein quaternary structure and demonstrate its general usefulness for efficiently exploring complex structural relationships among known protein assemblies. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  20. PhyreStorm: A Web Server for Fast Structural Searches Against the PDB.

    PubMed

    Mezulis, Stefans; Sternberg, Michael J E; Kelley, Lawrence A

    2016-02-22

    The identification of structurally similar proteins can provide a range of biological insights, and accordingly, the alignment of a query protein to a database of experimentally determined protein structures is a technique commonly used in the fields of structural and evolutionary biology. The PhyreStorm Web server has been designed to provide comprehensive, up-to-date and rapid structural comparisons against the Protein Data Bank (PDB) combined with a rich and intuitive user interface. It is intended that this facility will give biologists inexpert in bioinformatics access to a powerful tool for exploring protein structure relationships beyond what can be achieved by sequence analysis alone. By partitioning the PDB into similar structures, PhyreStorm is able to quickly discard the majority of structures that cannot possibly align well to a query protein, reducing the number of alignments required by an order of magnitude. PhyreStorm is capable of finding 93±2% of all highly similar (TM-score>0.7) structures in the PDB for each query structure, usually in less than 60 s. PhyreStorm is available at http://www.sbg.bio.ic.ac.uk/phyrestorm/. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  1. Optimization of protein-protein docking for predicting Fc-protein interactions.

    PubMed

    Agostino, Mark; Mancera, Ricardo L; Ramsland, Paul A; Fernández-Recio, Juan

    2016-11-01

    The antibody crystallizable fragment (Fc) is recognized by effector proteins as part of the immune system. Pathogens produce proteins that bind Fc in order to subvert or evade the immune response. The structural characterization of the determinants of Fc-protein association is essential to improve our understanding of the immune system at the molecular level and to develop new therapeutic agents. Furthermore, Fc-binding peptides and proteins are frequently used to purify therapeutic antibodies. Although several structures of Fc-protein complexes are available, numerous others have not yet been determined. Protein-protein docking could be used to investigate Fc-protein complexes; however, improved approaches are necessary to efficiently model such cases. In this study, a docking-based structural bioinformatics approach is developed for predicting the structures of Fc-protein complexes. Based on the available set of X-ray structures of Fc-protein complexes, three regions of the Fc, loosely corresponding to three turns within the structure, were defined as containing the essential features for protein recognition and used as restraints to filter the initial docking search. Rescoring the filtered poses with an optimal scoring strategy provided a success rate of approximately 80% of the test cases examined within the top ranked 20 poses, compared to approximately 20% by the initial unrestrained docking. The developed docking protocol provides a significant improvement over the initial unrestrained docking and will be valuable for predicting the structures of currently undetermined Fc-protein complexes, as well as in the design of peptides and proteins that target Fc. Copyright © 2016 John Wiley & Sons, Ltd.

  2. Discrete Molecular Dynamics Approach to the Study of Disordered and Aggregating Proteins.

    PubMed

    Emperador, Agustí; Orozco, Modesto

    2017-03-14

    We present a refinement of the Coarse Grained PACSAB force field for Discrete Molecular Dynamics (DMD) simulations of proteins in aqueous conditions. Like the original version, the refined method provides a good representation of the structure and dynamics of folded proteins, but it provides much better representations of a variety of unfolded proteins, including some very large ones that are impossible to analyze by atomistic simulation methods. The PACSAB/DMD method also accurately reproduces aggregation properties, providing good pictures of the structural ensembles of proteins showing a folded core and an intrinsically disordered region. The combination of accuracy and speed makes the method presented here a good alternative for the exploration of unstructured protein systems.

  3. SDSL-ESR-based protein structure characterization.

    PubMed

    Strancar, Janez; Kavalenka, Aleh; Urbancic, Iztok; Ljubetic, Ajasja; Hemminga, Marcus A

    2010-03-01

    As proteins are key molecules in living cells, knowledge about their structure can provide important insights and applications in science, biotechnology, and medicine. However, many protein structures are still a big challenge for existing high-resolution structure-determination methods, as can be seen in the number of protein structures published in the Protein Data Bank. This is especially the case for less-ordered, more hydrophobic and more flexible protein systems. The lack of efficient methods for structure determination calls for urgent development of a new class of biophysical techniques. This work attempts to address this problem with a novel combination of site-directed spin labelling electron spin resonance spectroscopy (SDSL-ESR) and protein structure modelling, which is coupled by restriction of the conformational spaces of the amino acid side chains. Comparison of the application to four different protein systems enables us to generalize the new method and to establish a general procedure for determination of protein structure.

  4. DNAproDB: an interactive tool for structural analysis of DNA–protein complexes

    PubMed Central

    Sagendorf, Jared M.

    2017-01-01

    Many biological processes are mediated by complex interactions between DNA and proteins. Transcription factors, various polymerases, nucleases and histones recognize and bind DNA with different levels of binding specificity. To understand the physical mechanisms that allow proteins to recognize DNA and achieve their biological functions, it is important to analyze structures of DNA–protein complexes in detail. DNAproDB is a web-based interactive tool designed to help researchers study these complexes. DNAproDB provides an automated structure-processing pipeline that extracts structural features from DNA–protein complexes. The extracted features are organized in structured data files, which are easily parsed with any programming language or viewed in a browser. We processed a large number of DNA–protein complexes retrieved from the Protein Data Bank and created the DNAproDB database to store this data. Users can search the database by combining features of the DNA, protein or DNA–protein interactions at the interface. Additionally, users can upload their own structures for processing privately and securely. DNAproDB provides several interactive and customizable tools for creating visualizations of the DNA–protein interface at different levels of abstraction that can be exported as high quality figures. All functionality is documented and freely accessible at http://dnaprodb.usc.edu. PMID:28431131

  5. Unraveling protein catalysis through neutron diffraction

    NASA Astrophysics Data System (ADS)

    Myles, Dean

    Neutron scattering and diffraction are exquisitely sensitive to the location, concentration and dynamics of hydrogen atoms in materials and provide a powerful tool for the characterization of structure-function and interfacial relationships in biological systems. Modern neutron scattering facilities offer access to a sophisticated, non-destructive suite of instruments for biophysical characterization that provide spatial and dynamic information spanning from Angstroms to microns and from picoseconds to microseconds, respectively. Applications range from atomic-resolution analysis of individual hydrogen atoms in enzymes, through to multi-scale analysis of hierarchical structures and assemblies in biological complexes, membranes and in living cells. Here we describe how the precise location of protein and water hydrogen atoms using neutron diffraction provides a more complete description of the atomic and electronic structures of proteins, enabling key questions concerning enzyme reaction mechanisms, molecular recognition and binding and protein-water interactions to be addressed. Current work is focused on understanding how molecular structure and dynamics control function in photosynthetic, cell signaling and DNA repair proteins. We will highlight recent studies that provide detailed understanding of the physiochemical mechanisms through which proteins recognize ligands and catalyze reactions, and help to define and understand the key principles involved.

  6. SARS-unique fold in the Rousettus bat coronavirus HKU9.

    PubMed

    Hammond, Robert G; Tan, Xuan; Johnson, Margaret A

    2017-09-01

    The coronavirus nonstructural protein 3 (nsp3) is a multifunctional protein that comprises multiple structural domains. This protein assists viral polyprotein cleavage, host immune interference, and may play other roles in genome replication or transcription. Here, we report the solution NMR structure of a protein from the "SARS-unique region" of the bat coronavirus HKU9. The protein contains a frataxin fold or double-wing motif, which is an α + β fold that is associated with protein/protein interactions, DNA binding, and metal ion binding. High structural similarity to the human severe acute respiratory syndrome (SARS) coronavirus nsp3 is present. A possible functional site that is conserved among some betacoronaviruses has been identified using bioinformatics and biochemical analyses. This structure provides strong experimental support for the recent proposal advanced by us and others that the "SARS-unique" region is not unique to the human SARS virus, but is conserved among several different phylogenetic groups of coronaviruses and provides essential functions. © 2017 The Protein Society.

  7. BeStSel: a web server for accurate protein secondary structure prediction and fold recognition from the circular dichroism spectra.

    PubMed

    Micsonai, András; Wien, Frank; Bulyáki, Éva; Kun, Judit; Moussong, Éva; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2018-06-11

    Circular dichroism (CD) spectroscopy is a widely used method to study the protein secondary structure. However, for decades, the general opinion was that the correct estimation of β-sheet content is challenging because of the large spectral and structural diversity of β-sheets. Recently, we showed that the orientation and twisting of β-sheets account for the observed spectral diversity, and developed a new method to estimate accurately the secondary structure (PNAS, 112, E3095). BeStSel web server provides the Beta Structure Selection method to analyze the CD spectra recorded by conventional or synchrotron radiation CD equipment. Both normalized and measured data can be uploaded to the server either as a single spectrum or series of spectra. The originality of BeStSel is that it carries out a detailed secondary structure analysis providing information on eight secondary structure components including parallel-β structure and antiparallel β-sheets with three different groups of twist. Based on these, it predicts the protein fold down to the topology/homology level of the CATH protein fold classification. The server also provides a module to analyze the structures deposited in the PDB for BeStSel secondary structure contents in relation to Dictionary of Secondary Structure of Proteins data. The BeStSel server is freely accessible at http://bestsel.elte.hu.

  8. The simulation approach to lipid-protein interactions.

    PubMed

    Paramo, Teresa; Garzón, Diana; Holdbrook, Daniel A; Khalid, Syma; Bond, Peter J

    2013-01-01

    The interactions between lipids and proteins are crucial for a range of biological processes, from the folding and stability of membrane proteins to signaling and metabolism facilitated by lipid-binding proteins. However, high-resolution structural details concerning functional lipid/protein interactions are scarce due to barriers in both experimental isolation of native lipid-bound complexes and subsequent biophysical characterization. The molecular dynamics (MD) simulation approach provides a means to complement available structural data, yielding dynamic, structural, and thermodynamic data for a protein embedded within a physiologically realistic, modelled lipid environment. In this chapter, we provide a guide to current methods for setting up and running simulations of membrane proteins and soluble, lipid-binding proteins, using standard atomistically detailed representations, as well as simplified, coarse-grained models. In addition, we outline recent studies that illustrate the power of the simulation approach in the context of biologically relevant lipid/protein interactions.

  9. Fast computational methods for predicting protein structure from primary amino acid sequence

    DOEpatents

    Agarwal, Pratul Kumar [Knoxville, TN]

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  10. Analysis of Functional Dynamics of Modular Multidomain Proteins by SAXS and NMR.

    PubMed

    Thompson, Matthew K; Ehlinger, Aaron C; Chazin, Walter J

    2017-01-01

    Multiprotein machines drive virtually all primary cellular processes. Modular multidomain proteins are widely distributed within these dynamic complexes because they provide the flexibility needed to remodel structure as well as rapidly assemble and disassemble components of the machinery. Understanding the functional dynamics of modular multidomain proteins is a major challenge confronting structural biology today because their structure is not fixed in time. Small-angle X-ray scattering (SAXS) and nuclear magnetic resonance (NMR) spectroscopy have proven particularly useful for the analysis of the structural dynamics of modular multidomain proteins because they provide highly complementary information for characterizing the architectural landscape accessible to these proteins. SAXS provides a global snapshot of all architectural space sampled by a molecule in solution. Furthermore, SAXS is sensitive to conformational changes, organization and oligomeric states of protein assemblies, and the existence of flexibility between globular domains in multiprotein complexes. The power of NMR to characterize dynamics provides uniquely complementary information to the global snapshot of the architectural ensemble provided by SAXS because it can directly measure domain motion. In particular, NMR parameters can be used to define the diffusion of domains within modular multidomain proteins, connecting the amplitude of interdomain motion to the architectural ensemble derived from SAXS. Our laboratory has been studying the roles of modular multidomain proteins involved in human DNA replication using SAXS and NMR. Here, we present the procedure for acquiring and analyzing SAXS and NMR data, using DNA primase and replication protein A as examples. © 2017 Elsevier Inc. All rights reserved.

  11. Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions.

    PubMed

    Najibi, Seyed Morteza; Maadooliat, Mehdi; Zhou, Lan; Huang, Jianhua Z; Gao, Xin

    2017-01-01

    Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric spline which is more efficient compared to existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.
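
    The angular representation mentioned in the record above rests on backbone dihedral angles (the φ/ψ pairs plotted in a Ramachandran plot). The following is a minimal, generic sketch of how one signed dihedral angle can be computed from four atom positions. It is not the authors' density-estimation method; the function name `dihedral` and the toy coordinates are illustrative assumptions.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points.

    Generic geometry helper (hypothetical name), not code from the cited paper.
    """
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    def scale(a, s):
        return tuple(x * s for x in a)

    b0 = sub(p0, p1)          # points from the central bond back to p0
    b1 = sub(p2, p1)          # central bond
    b2 = sub(p3, p2)          # points from the central bond on to p3

    norm = math.sqrt(dot(b1, b1))
    b1n = scale(b1, 1.0 / norm)

    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = sub(b0, scale(b1n, dot(b0, b1n)))
    w = sub(b2, scale(b1n, dot(b2, b1n)))

    x = dot(v, w)
    y = dot(cross(b1n, v), w)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (assumed, not real atom positions): a planar cis arrangement.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
```

    Because the result is an angle, values near +180° and -180° describe nearly the same conformation, which is why methods for such data have to respect its circular nature.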

  12. DWARF – a data warehouse system for analyzing protein families

    PubMed Central

    Fischer, Markus; Thai, Quan K; Grieb, Melanie; Pleiss, Jürgen

    2006-01-01

    Background: The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformatics approaches to the analysis of large protein families. Description: The data warehouse system DWARF integrates data on sequence, structure, and functional annotation for protein fold families. The underlying relational data model consists of three major sections representing entities related to the protein (biochemical function, source organism, classification to homologous families and superfamilies), the protein sequence (position-specific annotation, mutant information), and the protein structure (secondary structure information, superimposed tertiary structure). Tools for extracting, transforming and loading data from publicly available resources (ExPDB, GenBank, DSSP) are provided to populate the database. The data can be accessed by an interface for searching and browsing, and by analysis tools that operate on annotation, sequence, or structure. We applied DWARF to the family of α/β-hydrolases to host the Lipase Engineering database. Release 2.3 contains 6138 sequences and 167 experimentally determined protein structures, which are assigned to 37 superfamilies and 103 homologous families. Conclusion: DWARF has been designed for constructing databases of large structurally related protein families and for evaluating their sequence-structure-function relationships by a systematic analysis of sequence, structure and functional annotation. It has been applied to predict biochemical properties from sequence, and serves as a valuable tool for protein engineering. PMID:17094801

  13. Membrane-spanning α-helical barrels as tractable protein-design targets.

    PubMed

    Niitsu, Ai; Heal, Jack W; Fauland, Kerstin; Thomson, Andrew R; Woolfson, Derek N

    2017-08-05

    The rational (de novo) design of membrane-spanning proteins lags behind that for water-soluble globular proteins. This is due to gaps in our knowledge of membrane-protein structure, and experimental difficulties in studying such proteins compared to water-soluble counterparts. One limiting factor is the small number of experimentally determined three-dimensional structures for transmembrane proteins. By contrast, many tens of thousands of globular protein structures provide a rich source of 'scaffolds' for protein design, and the means to garner sequence-to-structure relationships to guide the design process. The α-helical coiled coil is a protein-structure element found in both globular and membrane proteins, where it cements a variety of helix-helix interactions and helical bundles. Our deep understanding of coiled coils has enabled a large number of successful de novo designs. For one class, the α-helical barrels (that is, symmetric bundles of five or more helices with central accessible channels), there are both water-soluble and membrane-spanning examples. Recent computational designs of water-soluble α-helical barrels with five to seven helices have advanced the design field considerably. Here we identify and classify analogous and more complicated membrane-spanning α-helical barrels from the Protein Data Bank. These provide tantalizing but tractable targets for protein engineering and de novo protein design. This article is part of the themed issue 'Membrane pores: from structure and assembly, to medicine and technology'. © 2017 The Author(s).

  14. The effect of increasing levels of fish oil-containing structured triglycerides on protein metabolism in parenterally fed rats stressed by burn plus endotoxin.

    PubMed

    Gollaher, C J; Fechner, K; Karlstad, M; Babayan, V K; Bistrian, B R

    1993-01-01

    This report investigates the effect of various levels of medium-chain/fish oil structured triglycerides on protein and energy metabolism in hypermetabolic rats. Male Sprague-Dawley rats (192 to 226 g) were continuously infused with isovolemic diets that provided 200 kcal/kg per day and 2 g of amino acid nitrogen per kilogram per day. The percentage of nonnitrogen calories as structured triglyceride was varied: no fat, 5%, 15%, or 30%. A 30% long-chain triglyceride diet was also provided as a control to compare the protein-sparing abilities of these two types of fat. Nitrogen excretion, plasma albumin, plasma triglycerides, and whole-body and liver and muscle protein kinetics were determined after 3 days of feeding. Whole-body protein breakdown, flux, and oxidation were similar in all groups. The 15% structured triglyceride diet maximized whole-body protein synthesis (p < .05). Liver fractional synthetic rate was significantly greater in animals receiving 5% of nonprotein calories as structured triglyceride (p < .05). Muscle fractional synthetic rate was unchanged. Plasma triglycerides were markedly elevated in the 30% structured triglyceride-fed rats. The 30% structured triglyceride diet maintained plasma albumin levels better than those diets containing no fat, 5% medium-chain triglyceride/fish oil structured triglyceride, or 30% long-chain triglycerides. Nitrogen excretion was lower in animals receiving 30% of nonnitrogen calories as a structured triglyceride than in those receiving 30% as long-chain triglycerides, but this difference did not reach statistical significance (p = .1). These data suggest that protein metabolism is optimized when structured triglyceride is provided at relatively low dietary fat intakes.

  15. PDBsum: Structural summaries of PDB entries.

    PubMed

    Laskowski, Roman A; Jabłońska, Jagoda; Pravda, Lukáš; Vařeková, Radka Svobodová; Thornton, Janet M

    2018-01-01

    PDBsum is a web server providing structural information on the entries in the Protein Data Bank (PDB). The analyses are primarily image-based and include protein secondary structure, protein-ligand and protein-DNA interactions, PROCHECK analyses of structural quality, and many others. The 3D structures can be viewed interactively in RasMol, PyMOL, and a JavaScript viewer called 3Dmol.js. Users can upload their own PDB files and obtain a set of password-protected PDBsum analyses for each. The server is freely accessible to all at: http://www.ebi.ac.uk/pdbsum. © 2017 The Protein Society.

  16. Understanding protein evolution: from protein physics to Darwinian selection.

    PubMed

    Zeldovich, Konstantin B; Shakhnovich, Eugene I

    2008-01-01

    Efforts in whole-genome sequencing and structural proteomics start to provide a global view of the protein universe, the set of existing protein structures and sequences. However, approaches based on the selection of individual sequences have not been entirely successful at the quantitative description of the distribution of structures and sequences in the protein universe because evolutionary pressure acts on the entire organism, rather than on a particular molecule. In parallel to this line of study, studies in population genetics and phenomenological molecular evolution established a mathematical framework to describe the changes in genome sequences in populations of organisms over time. Here, we review both microscopic (physics-based) and macroscopic (organism-level) models of protein-sequence evolution and demonstrate that bridging the two scales provides the most complete description of the protein universe starting from clearly defined, testable, and physiologically relevant assumptions.

  17. Recent Advances and Applications in Synchrotron X-Ray Protein Footprinting for Protein Structure and Dynamics Elucidation.

    PubMed

    Gupta, Sayan; Feng, Jun; Chance, Mark; Ralston, Corie

    2016-01-01

    Synchrotron X-ray Footprinting is a powerful in situ hydroxyl radical labeling method for analysis of protein structure, interactions, folding and conformation change in solution. In this method, water is ionized by high flux density broad band synchrotron X-rays to produce a steady-state concentration of hydroxyl radicals, which then react with solvent accessible side-chains. The resulting stable modification products are analyzed by liquid chromatography coupled to mass spectrometry. A comparative reactivity rate between known and unknown states of a protein provides local as well as global information on structural changes, which is then used to develop structural models for protein function and dynamics. In this review we describe the XF-MS method, its unique capabilities and its recent technical advances at the Advanced Light Source. We provide a comparison of other hydroxyl radical and mass spectrometry based methods with XF-MS. We also discuss some of the latest developments in its usage for studying bound water, transmembrane proteins and photosynthetic protein components, and the synergy of the method with other synchrotron based structural biology methods.

  18. Efficient Relaxation of Protein-Protein Interfaces by Discrete Molecular Dynamics Simulations.

    PubMed

    Emperador, Agusti; Solernou, Albert; Sfriso, Pedro; Pons, Carles; Gelpi, Josep Lluis; Fernandez-Recio, Juan; Orozco, Modesto

    2013-02-12

    Protein-protein interactions are responsible for the transfer of information inside the cell and represent one of the most interesting research fields in structural biology. Unfortunately, after decades of intense research, experimental approaches still have difficulties in providing 3D structures for the hundreds of thousands of interactions formed between the different proteins in a living organism. The use of theoretical approaches like docking aims to complement experimental efforts to represent the structure of the protein interactome. However, we cannot ignore that current methods have limitations due to problems of sampling of the protein-protein conformational space and the lack of accuracy of available force fields. Cases that are especially difficult for prediction are those in which complex formation implies a non-negligible change in the conformation of the interacting proteins, i.e., those cases where protein flexibility plays a key role in protein-protein docking. In this work, we present a new approach to treat flexibility in docking by global structural relaxation based on ultrafast discrete molecular dynamics. On a standard benchmark of protein complexes, the method provides a general improvement over the results obtained by rigid docking. The method is especially efficient in cases with large conformational changes upon binding, in which structure relaxation with discrete molecular dynamics leads to a predictive success rate double that obtained with state-of-the-art rigid-body docking.

  19. Protein Structure Prediction by Protein Threading

    NASA Astrophysics Data System (ADS)

    Xu, Ying; Liu, Zhijie; Cai, Liming; Xu, Dong

    The seminal work of Bowie, Lüthy, and Eisenberg (Bowie et al., 1991) on "the inverse protein folding problem" laid the foundation of protein structure prediction by protein threading. By using simple measures for fitness of different amino acid types to local structural environments defined in terms of solvent accessibility and protein secondary structure, the authors derived a simple and yet profoundly novel approach to assessing if a protein sequence fits well with a given protein structural fold. Their follow-up work (Elofsson et al., 1996; Fischer and Eisenberg, 1996; Fischer et al., 1996a,b) and the work by Jones, Taylor, and Thornton (Jones et al., 1992) on protein fold recognition led to the development of a new brand of powerful tools for protein structure prediction, which we now term "protein threading." These computational tools have played a key role in extending the utility of all the experimentally solved structures by X-ray crystallography and nuclear magnetic resonance (NMR), providing structural models and functional predictions for many of the proteins encoded in the hundreds of genomes that have been sequenced up to now.

  20. High-throughput Crystallography for Structural Genomics

    PubMed Central

    Joachimiak, Andrzej

    2009-01-01

    Protein X-ray crystallography recently celebrated its 50th anniversary. The structures of myoglobin and hemoglobin determined by Kendrew and Perutz provided the first glimpses into the complex protein architecture and chemistry. Since then, the field of structural molecular biology has experienced extraordinary progress and now over 53,000 protein structures have been deposited into the Protein Data Bank. In the past decade many advances in macromolecular crystallography have been driven by world-wide structural genomics efforts. This was made possible because of third-generation synchrotron sources, structure phasing approaches using anomalous signal and cryo-crystallography. Complementary progress in molecular biology, proteomics, hardware and software for crystallographic data collection, structure determination and refinement, computer science, databases, robotics and automation improved and accelerated many processes. These advancements provide the robust foundation for structural molecular biology and assure strong contribution to science in the future. In this report we focus mainly on reviewing structural genomics high-throughput X-ray crystallography technologies and their impact. PMID:19765976

  1. Self-assembling enzymes and the origins of the cytoskeleton

    PubMed Central

    Barry, Rachael; Gitai, Zemer

    2011-01-01

    The bacterial cytoskeleton is composed of a complex and diverse group of proteins that self-assemble into linear filaments. These filaments support and organize cellular architecture and provide a dynamic network controlling transport and localization within the cell. Here, we review recent discoveries related to a newly appreciated class of self-assembling proteins that expand our view of the bacterial cytoskeleton and provide potential explanations for its evolutionary origins. Specifically, several types of metabolic enzymes can form structures similar to established cytoskeletal filaments and, in some cases, these structures have been repurposed for structural uses independent of their normal role. The behaviors of these enzymes suggest that some modern cytoskeletal proteins may have evolved from dual-role proteins with catalytic and structural functions. PMID:22014508

  2. PDBFlex: exploring flexibility in protein structures

    PubMed Central

    Hrabe, Thomas; Li, Zhanwen; Sedova, Mayya; Rotkiewicz, Piotr; Jaroszewski, Lukasz; Godzik, Adam

    2016-01-01

    The PDBFlex database, available freely and with no login requirements at http://pdbflex.org, provides information on flexibility of protein structures as revealed by the analysis of variations between depositions of different structural models of the same protein in the Protein Data Bank (PDB). PDBFlex collects information on all instances of such depositions, identifying them by a 95% sequence identity threshold, performs analysis of their structural differences and clusters them according to their structural similarities for easy analysis. The PDBFlex contains tools and viewers enabling in-depth examination of structural variability including: 2D-scaling visualization of RMSD distances between structures of the same protein, graphs of average local RMSD in the aligned structures of protein chains, graphical presentation of differences in secondary structure and observed structural disorder (unresolved residues), difference distance maps between all sets of coordinates and 3D views of individual structures and simulated transitions between different conformations, the latter displayed using JSMol visualization software. PMID:26615193
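
    The RMSD values that the record above refers to can be illustrated with a short sketch. It assumes two coordinate lists of equal length that are already superimposed and paired atom by atom. It does not perform the structural alignment or clustering that PDBFlex itself carries out, and the function name `rmsd` is a hypothetical helper, not part of any PDBFlex interface.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired, pre-aligned coordinate sets.

    Each argument is a sequence of (x, y, z) tuples; atom i in coords_a is
    assumed to correspond to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same length")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")

    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

# Toy example (assumed coordinates): every atom is displaced by 1 unit along z.
model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
deviation = rmsd(model_1, model_2)
```

    In practice the two structures would first be optimally superimposed, since RMSD computed on unaligned coordinates mixes conformational differences with rigid-body placement.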

  3. NIAS-Server: Neighbors Influence of Amino acids and Secondary Structures in Proteins.

    PubMed

    Borguesan, Bruno; Inostroza-Ponta, Mario; Dorn, Márcio

    2017-03-01

    The exponential growth in the number of experimentally determined three-dimensional protein structures provides new and relevant knowledge about the conformation of amino acids in proteins. Only a few probability densities of amino acids are publicly available for use in structure validation and prediction methods. NIAS (Neighbors Influence of Amino acids and Secondary structures) is a web-based tool used to extract information about conformational preferences of amino acid residues and secondary structures in experimentally determined protein templates. This information is useful, for example, to characterize folds and local motifs in proteins and molecular folding, and can help solve complex problems such as protein structure prediction and protein design. The NIAS-Server and supplementary data are available at http://sbcb.inf.ufrgs.br/nias.

  4. De Novo Protein Structure Prediction

    NASA Astrophysics Data System (ADS)

    Hung, Ling-Hong; Ngan, Shing-Chung; Samudrala, Ram

    An unparalleled amount of sequence data is being made available from large-scale genome sequencing efforts. The data provide a shortcut to the determination of the function of a gene of interest, as long as there is an existing sequenced gene with similar sequence and of known function. This has spurred structural genomic initiatives with the goal of determining as many protein folds as possible (Brenner and Levitt, 2000; Burley, 2000; Brenner, 2001; Heinemann et al., 2001). The purpose of this is twofold: First, the structure of a gene product can often lead to direct inference of its function. Second, since the function of a protein is dependent on its structure, direct comparison of the structures of gene products can be more sensitive than the comparison of sequences of genes for detecting homology. Presently, structural determination by crystallography and NMR techniques is still slow and expensive in terms of manpower and resources, despite attempts to automate the processes. Computer structure prediction algorithms, while not providing the accuracy of the traditional techniques, are extremely quick and inexpensive and can provide useful low-resolution data for structure comparisons (Bonneau and Baker, 2001). Given the immense number of structures which the structural genomic projects are attempting to solve, there would be a considerable gain even if the computer structure prediction approach were applicable to a subset of proteins.

  5. Quality assessment of protein model-structures based on structural and functional similarities.

    PubMed

    Konopka, Bogumil M; Nebel, Jean-Christophe; Kotulska, Malgorzata

    2012-09-21

    Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the Worldwide Protein Data Bank and the number of sequenced proteins constantly broadens. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model it from a sequence or partial structural information, e.g. contact maps. Consequently, biologists have the ability to automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. GOBA (Gene Ontology-Based Assessment) is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved 0.74 and 0.8 as a mean Pearson correlation to the observed quality of models in our CASP8 and CASP9-based validation sets. GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models.

  6. Dynamic New World: Refining Our View of Protein Structure, Function and Evolution

    PubMed Central

    Mannige, Ranjan V.

    2014-01-01

    Proteins are crucial to the functioning of all lifeforms. Traditional understanding posits that a single protein occupies a single structure (“fold”), which performs a single function. This view is radically challenged with the recognition that high structural dynamism—the capacity to be extra “floppy”—is more prevalent in functional proteins than previously assumed. As reviewed here, this dynamic take on proteins affects our understanding of protein “structure”, function, and evolution, and even gives us a glimpse into protein origination. Specifically, this review will discuss historical developments concerning protein structure, and important new relationships between dynamism and aspects of protein sequence, structure, binding modes, binding promiscuity, evolvability, and origination. Along the way, suggestions will be provided for how key parts of textbook definitions—that so far have excluded membership to intrinsically disordered proteins (IDPs)—could be modified to accommodate our more dynamic understanding of proteins. PMID:28250374

  7. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
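
    The position weight matrix summaries mentioned in this record can be illustrated with a minimal Python sketch. It builds a log-odds matrix from a few aligned sites and scores a candidate site against it. The example sites, the pseudocount of 1 and the uniform background of 0.25 are illustrative assumptions and are not taken from the database; the function names are hypothetical.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Return one {base: log2-odds} dict per position of the aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for i in range(length):
        # Start every count at the pseudocount so unseen bases stay finite.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm

def score(pwm, site):
    """Sum the per-position log-odds of a candidate site."""
    return sum(column[base] for column, base in zip(pwm, site))

# Hypothetical aligned binding sites, made up for illustration only.
SITES = ["TTGACA", "TTGACT", "TTTACA", "TTGATA"]
PWM = build_pwm(SITES)
```

    A site resembling the aligned examples, such as score(PWM, "TTGACA"), scores higher than an unrelated one such as score(PWM, "CCCCCC").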

  8. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

    DOE PAGES

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    2015-11-19

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.

  9. Probing Protein Structure and Folding in the Gas Phase by Electron Capture Dissociation

    NASA Astrophysics Data System (ADS)

    Schennach, Moritz; Breuker, Kathrin

    2015-07-01

    The established methods for the study of atom-detailed protein structure in the condensed phases, X-ray crystallography and nuclear magnetic resonance spectroscopy, have recently been complemented by new techniques by which nearly or fully desolvated protein structures are probed in gas-phase experiments. Electron capture dissociation (ECD) is unique among these as it provides residue-specific, although indirect, structural information. In this Critical Insight article, we discuss the development of ECD for the structural probing of gaseous protein ions, its potential, and limitations.

  10. From non-random molecular structure to life and mind

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1989-01-01

    The evolutionary hierarchy molecular structure-->macromolecular structure-->protobiological structure-->biological structure-->biological functions has been traced by experiments. The sequence always moves through protein. Extension of the experiments traces the formation of nucleic acids instructed by proteins. The proteins themselves were, in this picture, instructed by the self-sequencing of precursor amino acids. While the sequence indicated explains the thread of the emergence of life, protein in cellular membrane also provides the only known material basis for the emergence of mind in the context of emergence of life.

  11. Patent protection for structural genomics-related inventions.

    PubMed

    Vinarov, Sara D

    2003-01-01

    Recently there have been some important developments with respect to the patentability of inventions in the field of structural genomics. The leaders of the European Patent Office (EPO), Japan Patent Office (JPO) and the United States Patent Office (USPTO) came together for a trilateral meeting to conduct a comparative study on protein 3-dimensional (3-D) structure related claims in an effort to come to a mutual understanding about the examination of such inventions. The three patent offices were presented with eight different cases: 1) 3-D structural data of a protein per se; 2) computer-readable storage medium encoded with structural data of a protein; 3) protein defined by its tertiary structure; 4) crystals of known proteins; 5) binding pockets and protein domains; 6) and 7) are both directed to in silico screening methods directed to a specific protein; and 8) pharmacophores. The preliminary conclusions reached at the trilateral meeting provide clarity regarding the types of inventions that may be patentable given a specific set of scientific facts in a patent application. Therefore, the guidance provided by this study will help inventors, attorneys and other patent practitioners who file for patent protection on structural genomics-based inventions both here and abroad comply with the patentability requirements of each office.

  12. A structural-alphabet-based strategy for finding structural motifs across protein families

    PubMed Central

    Wu, Chih Yuan; Chen, Yao Chi; Lim, Carmay

    2010-01-01

    Proteins with insignificant sequence and overall structure similarity may still share locally conserved contiguous structural segments; i.e. structural/3D motifs. Most methods for finding 3D motifs require a known motif to search for other similar structures or functionally/structurally crucial residues. Here, without requiring a query motif or essential residues, a fully automated method for discovering 3D motifs of various sizes across protein families with different folds based on a 16-letter structural alphabet is presented. It was applied to structurally non-redundant proteins bound to DNA, RNA, obligate/non-obligate proteins as well as free DNA-binding proteins (DBPs) and proteins with known structures but unknown function. Its usefulness was illustrated by analyzing the 3D motifs found in DBPs. A non-specific motif was found with a ‘corner’ architecture that confers a stable scaffold and enables diverse interactions, making it suitable for binding not only DNA but also RNA and proteins. Furthermore, DNA-specific motifs present ‘only’ in DBPs were discovered. The motifs found can provide useful guidelines in detecting binding sites and computational protein redesign. PMID:20525797

  13. G23D: Online tool for mapping and visualization of genomic variants on 3D protein structures.

    PubMed

    Solomon, Oz; Kunik, Vered; Simon, Amos; Kol, Nitzan; Barel, Ortal; Lev, Atar; Amariglio, Ninette; Somech, Raz; Rechavi, Gidi; Eyal, Eran

    2016-08-26

    Evaluation of the possible implications of genomic variants is an increasingly important task in the current high throughput sequencing era. Structural information however is still not routinely exploited during this evaluation process. The main reasons can be attributed to the partial structural coverage of the human proteome and the lack of tools which conveniently convert genomic positions, which are the frequent output of genomic pipelines, to proteins and structure coordinates. We present G23D, a tool for conversion of human genomic coordinates to protein coordinates and protein structures. G23D allows mapping of genomic positions/variants on evolutionary related (and not only identical) protein three dimensional (3D) structures as well as on theoretical models. By doing so it significantly extends the space of variants for which structural insight is feasible. To facilitate interpretation of the variant consequence, pathogenic variants, functional sites and polymorphism sites are displayed on protein sequence and structure diagrams alongside the input variants. G23D also provides modeling of the mutant structure, analysis of intra-protein contacts and instant access to functional predictions and predictions of thermo-stability changes. G23D is available at http://www.sheba-cancer.org.il/G23D . G23D extends the fraction of variants for which structural analysis is applicable and provides better and faster accessibility for structural data to biologists and geneticists who routinely work with genomic information.

  14. Quantification of the Influence of Protein-Protein Interactions on Adsorbed Protein Structure and Bioactivity

    PubMed Central

    Wei, Yang; Thyparambil, Aby A.; Latour, Robert A.

    2013-01-01

    While protein-surface interactions have been widely studied, relatively little is understood at this time regarding how protein-surface interaction effects are influenced by protein-protein interactions and how these effects combine with the internal stability of a protein to influence its adsorbed-state structure and bioactivity. The objectives of this study were to develop a method to study these combined effects under widely varying protein-protein interaction conditions using hen egg-white lysozyme (HEWL) adsorbed on silica glass, poly(methyl methacrylate) (PMMA), and high-density polyethylene (HDPE) as our model systems. In order to vary protein-protein interaction effects over a wide range, HEWL was first adsorbed to each surface type under widely varying protein solution concentrations for 2 h to saturate the surface, followed by immersion in pure buffer solution for 15 h to equilibrate the adsorbed protein layers in the absence of additionally adsorbing protein. Periodic measurements were made at selected time points of the areal density of the adsorbed protein layer as an indicator of the level of protein-protein interaction effects within the layer, and these values were then correlated with measurements of the adsorbed protein’s secondary structure and bioactivity. The results from these studies indicate that protein-protein interaction effects help stabilize the structure of HEWL adsorbed on silica glass, have little influence on the structural behavior of HEWL on HDPE, and actually serve to destabilize HEWL’s structure on PMMA. The bioactivity of HEWL on silica glass and HDPE was found to decrease in direct proportion to the degree of adsorption-induced protein unfolding. A direct correlation between bioactivity and the conformational state of adsorbed HEWL was less apparent on PMMA, thus suggesting that other factors influenced HEWL’s bioactivity on this surface, such as the accessibility of HEWL’s bioactive site being blocked by neighboring proteins or the surface itself. The developed methods provide an effective means to characterize the influence of protein-protein interaction effects and provide new molecular-level insights into how protein-protein interaction effects combine with protein-surface interaction and internal protein stability effects to influence the structure and bioactivity of adsorbed protein. PMID:23751416

  15. ELM: the status of the 2010 eukaryotic linear motif resource

    PubMed Central

    Gould, Cathryn M.; Diella, Francesca; Via, Allegra; Puntervoll, Pål; Gemünd, Christine; Chabanis-Davidson, Sophie; Michael, Sushama; Sayadi, Ahmed; Bryne, Jan Christian; Chica, Claudia; Seiler, Markus; Davey, Norman E.; Haslam, Niall; Weatheritt, Robert J.; Budd, Aidan; Hughes, Tim; Paś, Jakub; Rychlewski, Leszek; Travé, Gilles; Aasland, Rein; Helmer-Citterich, Manuela; Linding, Rune; Gibson, Toby J.

    2010-01-01

    Linear motifs are short segments of multidomain proteins that provide regulatory functions independently of protein tertiary structure. Much of intracellular signalling passes through protein modifications at linear motifs. Many thousands of linear motif instances, most notably phosphorylation sites, have now been reported. Although clearly very abundant, linear motifs are difficult to predict de novo in protein sequences due to the difficulty of obtaining robust statistical assessments. The ELM resource at http://elm.eu.org/ provides an expanding knowledge base, currently covering 146 known motifs, with annotation that includes >1300 experimentally reported instances. ELM is also an exploratory tool for suggesting new candidates of known linear motifs in proteins of interest. Information about protein domains, protein structure and native disorder, cellular and taxonomic contexts is used to reduce or deprecate false positive matches. Results are graphically displayed in a ‘Bar Code’ format, which also displays known instances from homologous proteins through a novel ‘Instance Mapper’ protocol based on PHI-BLAST. ELM server output provides links to the ELM annotation as well as to a number of remote resources. Using the links, researchers can explore the motifs, proteins, complex structures and associated literature to evaluate whether candidate motifs might be worth experimental investigation. PMID:19920119
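
    Scanning a protein of interest for candidate motif instances can be sketched in a few lines of Python, assuming the motif is written as a regular expression. The pattern P..P and the sequence below are made up for illustration and are not entries from the ELM resource; the helper find_motif is hypothetical. A lookahead is used so that overlapping candidates are also reported.

```python
import re

def find_motif(sequence, pattern):
    """Return (start, matched_text) for every match, including overlapping ones.

    The pattern is wrapped in a zero-width lookahead so the scan advances one
    residue at a time instead of jumping past each match.
    """
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]

# Illustrative pattern and sequence; neither comes from the ELM resource.
PATTERN = r"P..P"
SEQUENCE = "MAPPLPPRSTPAAPK"
HITS = find_motif(SEQUENCE, PATTERN)
```

    Matches found this way are only candidates; as the record notes, context such as domains, structure and native disorder is needed to filter false positives.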

  16. Mass spectrometry: Raw protein from the top down

    NASA Astrophysics Data System (ADS)

    Breuker, Kathrin

    2018-02-01

    Mass spectrometry is a powerful technique for analysing proteins, yet linking higher-order protein structure to amino acid sequence and post-translational modifications is far from simple. Now, a native top-down method has been developed that can provide information on higher-order protein structure and different proteoforms at the same time.

  17. Modeling disordered protein interactions from biophysical principles

    PubMed Central

    Christoffer, Charles; Terashi, Genki

    2017-01-01

    Disordered protein-protein interactions (PPIs), those involving a folded protein and an intrinsically disordered protein (IDP), are prevalent in the cell, including important signaling and regulatory pathways. IDPs do not adopt a single dominant structure in isolation but often become ordered upon binding. To aid understanding of the molecular mechanisms of disordered PPIs, it is crucial to obtain the tertiary structure of the PPIs. However, experimental methods have difficulty in solving disordered PPIs and existing protein-protein and protein-peptide docking methods are not able to model them. Here we present a novel computational method, IDP-LZerD, which models the conformation of a disordered PPI by considering the biophysical binding mechanism of an IDP to a structured protein, whereby a local segment of the IDP initiates the interaction and subsequently the remaining IDP regions explore and coalesce around the initial binding site. On a dataset of 22 disordered PPIs with IDPs up to 69 amino acids, successful predictions were made for 21 bound and 18 unbound receptors. The successful modeling provides additional support for biophysical principles. Moreover, the new technique significantly expands the capability of protein structure modeling and provides crucial insights into the molecular mechanisms of disordered PPIs. PMID:28394890

  18. Template-Based Modeling of Protein-RNA Interactions.

    PubMed

    Zheng, Jinfang; Kundrotas, Petras J; Vakser, Ilya A; Liu, Shiyong

    2016-09-01

    Protein-RNA complexes formed by specific recognition between RNA and RNA-binding proteins play an important role in biological processes. More than a thousand of such proteins in human are curated and many novel RNA-binding proteins are to be discovered. Due to limitations of experimental approaches, computational techniques are needed for characterization of protein-RNA interactions. Although much progress has been made, adequate methodologies reliably providing atomic resolution structural details are still lacking. Although protein-RNA free docking approaches proved to be useful, in general, the template-based approaches provide higher quality of predictions. Templates are key to building a high quality model. Sequence/structure relationships were studied based on a representative set of binary protein-RNA complexes from PDB. Several approaches were tested for pairwise target/template alignment. The analysis revealed a transition point between random and correct binding modes. The results showed that structural alignment is better than sequence alignment in identifying good templates, suitable for generating protein-RNA complexes close to the native structure, and outperforms free docking, successfully predicting complexes where the free docking fails, including cases of significant conformational change upon binding. A template-based protein-RNA interaction modeling protocol PRIME was developed and benchmarked on a representative set of complexes.

  19. Adjusting protein graphs based on graph entropy.

    PubMed

    Peng, Sheng-Lung; Tsay, Yu-Wei

    2014-01-01

    Measuring protein structural similarity attempts to establish a relationship of equivalence between polymer structures based on their conformations. In several recent studies, researchers have explored protein-graph remodeling, instead of looking for a minimum superimposition for pairwise proteins. When graphs are used to represent structured objects, the problem of measuring object similarity becomes one of computing the similarity between graphs. Graph theory provides an alternative perspective as well as efficiency. Once a protein graph has been created, its structural stability must be verified. Therefore, a criterion is needed to determine if a protein graph can be used for structural comparison. In this paper, we propose a measurement for protein graph remodeling based on graph entropy. We extend the concept of graph entropy to determine whether a graph is suitable for representing a protein. The experimental results suggest that when applied, graph entropy helps a conformational on protein graph modeling. Furthermore, it indirectly contributes to protein structural comparison if a protein graph is solid.

  20. Adjusting protein graphs based on graph entropy

    PubMed Central

    2014-01-01

    Measuring protein structural similarity attempts to establish a relationship of equivalence between polymer structures based on their conformations. In several recent studies, researchers have explored protein-graph remodeling, instead of looking for a minimum superimposition for pairwise proteins. When graphs are used to represent structured objects, the problem of measuring object similarity becomes one of computing the similarity between graphs. Graph theory provides an alternative perspective as well as efficiency. Once a protein graph has been created, its structural stability must be verified. Therefore, a criterion is needed to determine if a protein graph can be used for structural comparison. In this paper, we propose a measurement for protein graph remodeling based on graph entropy. We extend the concept of graph entropy to determine whether a graph is suitable for representing a protein. The experimental results suggest that when applied, graph entropy helps a conformational on protein graph modeling. Furthermore, it indirectly contributes to protein structural comparison if a protein graph is solid. PMID:25474347

  1. Sequential protein unfolding through a carbon nanotube pore

    NASA Astrophysics Data System (ADS)

    Xu, Zhonghe; Zhang, Shuang; Weber, Jeffrey K.; Luan, Binquan; Zhou, Ruhong; Li, Jingyuan

    2016-06-01

    An assortment of biological processes, like protein degradation and the transport of proteins across membranes, depend on protein unfolding events mediated by nanopore interfaces. In this work, we exploit fully atomistic simulations of an artificial, CNT-based nanopore to investigate the nature of ubiquitin unfolding. With one end of the protein subjected to an external force, we observe non-canonical unfolding behaviour as ubiquitin is pulled through the pore opening. Secondary structural elements are sequentially detached from the protein and threaded into the nanotube; interestingly, the remaining part maintains native-like characteristics. The constraints of the nanopore interface thus facilitate the formation of stable "unfoldon" motifs above the nanotube aperture that can exist in the absence of specific native contacts with the other secondary structure. Destruction of these unfoldons gives rise to distinct force peaks in our simulations, providing us with a sensitive probe for studying the kinetics of serial unfolding events. Our detailed analysis of nanopore-mediated protein unfolding events not only provides insight into how related processes might proceed in the cell, but also serves to deepen our understanding of structural arrangements which form the basis for protein conformational stability. Electronic supplementary information (ESI) available. See DOI: 10.1039/c6nr00410e

  2. Zebra: a web server for bioinformatic analysis of diverse protein families.

    PubMed

    Suplatov, Dmitry; Kirilin, Evgeny; Takhaveev, Vakil; Svedas, Vytas

    2014-01-01

    During evolution of proteins from a common ancestor, one functional property can be preserved while others can vary leading to functional diversity. A systematic study of the corresponding adaptive mutations provides a key to one of the most challenging problems of modern structural biology - understanding the impact of amino acid substitutions on protein function. The subfamily-specific positions (SSPs) are conserved within functional subfamilies but are different between them and, therefore, seem to be responsible for functional diversity in protein superfamilies. Consequently, a corresponding method to perform the bioinformatic analysis of sequence and structural data has to be implemented in the common laboratory practice to study the structure-function relationship in proteins and develop novel protein engineering strategies. This paper describes Zebra web server - a powerful remote platform that implements a novel bioinformatic analysis algorithm to study diverse protein families. It is the first application that provides specificity determinants at different levels of functional classification, therefore addressing complex functional diversity of large superfamilies. Statistical analysis is implemented to automatically select a set of highly significant SSPs to be used as hotspots for directed evolution or rational design experiments and analyzed studying the structure-function relationship. Zebra results are provided in two ways - (1) as a single all-in-one parsable text file and (2) as PyMol sessions with structural representation of SSPs. Zebra web server is available at http://biokinet.belozersky.msu.ru/zebra .
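
    The definition of subfamily-specific positions given in this record (conserved within functional subfamilies but different between them) can be illustrated with a deliberately strict Python sketch. It is not the Zebra algorithm, which the record describes as relying on statistical analysis; the function name, the requirement of perfect conservation and the toy alignment are all assumptions made for illustration.

```python
def subfamily_specific_positions(subfamilies):
    """Return 0-based alignment columns that are identical within every
    subfamily but not identical across all subfamilies.

    subfamilies maps a subfamily name to a list of aligned, equal-length
    sequences. Perfect conservation is a simplifying assumption.
    """
    n_columns = len(next(iter(subfamilies.values()))[0])
    hits = []
    for i in range(n_columns):
        # Set of residues seen at column i in each subfamily.
        per_family = [{seq[i] for seq in seqs} for seqs in subfamilies.values()]
        conserved_within = all(len(residues) == 1 for residues in per_family)
        consensus = {next(iter(residues)) for residues in per_family}
        if conserved_within and len(consensus) > 1:
            hits.append(i)
    return hits

# Toy alignment with two hypothetical subfamilies.
ALIGNMENT = {
    "A": ["MKLSD", "MKLSE"],
    "B": ["MRLTD", "MRLTN"],
}
SSPS = subfamily_specific_positions(ALIGNMENT)
```

    In the toy alignment, columns 1 (K versus R) and 3 (S versus T) qualify, while the last column is rejected because it varies inside each subfamily.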

  3. A General Method for Targeted Quantitative Cross-Linking Mass Spectrometry.

    PubMed

    Chavez, Juan D; Eng, Jimmy K; Schweppe, Devin K; Cilia, Michelle; Rivera, Keith; Zhong, Xuefei; Wu, Xia; Allen, Terrence; Khurgel, Moshe; Kumar, Akhilesh; Lampropoulos, Athanasios; Larsson, Mårten; Maity, Shuvadeep; Morozov, Yaroslav; Pathmasiri, Wimal; Perez-Neut, Mathew; Pineyro-Ruiz, Coriness; Polina, Elizabeth; Post, Stephanie; Rider, Mark; Tokmina-Roszyk, Dorota; Tyson, Katherine; Vieira Parrine Sant'Ana, Debora; Bruce, James E

    2016-01-01

    Chemical cross-linking mass spectrometry (XL-MS) provides protein structural information by identifying covalently linked proximal amino acid residues on protein surfaces. The information gained by this technique is complementary to other structural biology methods such as x-ray crystallography, NMR and cryo-electron microscopy[1]. The extension of traditional quantitative proteomics methods with chemical cross-linking can provide information on the structural dynamics of protein structures and protein complexes. The identification and quantitation of cross-linked peptides remains challenging for the general community, requiring specialized expertise ultimately limiting more widespread adoption of the technique. We describe a general method for targeted quantitative mass spectrometric analysis of cross-linked peptide pairs. We report the adaptation of the widely used, open source software package Skyline, for the analysis of quantitative XL-MS data as a means for data analysis and sharing of methods. We demonstrate the utility and robustness of the method with a cross-laboratory study and present data that is supported by and validates previously published data on quantified cross-linked peptide pairs. This advance provides an easy to use resource so that any lab with access to a LC-MS system capable of performing targeted quantitative analysis can quickly and accurately measure dynamic changes in protein structure and protein interactions.

  4. A discrete search algorithm for finding the structure of protein backbones and side chains.

    PubMed

    Sallaume, Silas; Martins, Simone de Lima; Ochi, Luiz Satoru; Da Silva, Warley Gramacho; Lavor, Carlile; Liberti, Leo

    2013-01-01

    Some information about protein structure can be obtained by using Nuclear Magnetic Resonance (NMR) techniques, but they provide only a sparse set of distances between atoms in a protein. The Molecular Distance Geometry Problem (MDGP) consists in determining the three-dimensional structure of a molecule using a set of known distances between some atoms. Recently, a Branch and Prune (BP) algorithm was proposed to calculate the backbone of a protein, based on a discrete formulation for the MDGP. We present an extension of the BP algorithm that can calculate not only the protein backbone, but the whole three-dimensional structure of proteins.
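
    The branch-and-prune idea behind a discrete formulation of the MDGP can be sketched as follows. This is a toy illustration and not the authors' implementation. It assumes that the first three atoms are already placed, that each later atom has known distances to its three immediate predecessors (which leaves two mirror-image candidate positions), and that any additional known distance is used to prune candidates. All names, coordinates and the tolerance are hypothetical.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a): return math.sqrt(dot(a, a))
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def candidates(p1, p2, p3, d1, d2, d3):
    """Both points at distances d1, d2, d3 from non-collinear p1, p2, p3."""
    d = norm(sub(p2, p1))
    ex = scale(sub(p2, p1), 1.0 / d)
    i = dot(ex, sub(p3, p1))
    ey_raw = sub(sub(p3, p1), scale(ex, i))
    ey = scale(ey_raw, 1.0 / norm(ey_raw))
    ez = cross(ex, ey)
    j = dot(ey, sub(p3, p1))
    x = (d1**2 - d2**2 + d**2) / (2 * d)
    y = (d1**2 - d3**2 + i**2 + j**2) / (2 * j) - i * x / j
    z = math.sqrt(max(d1**2 - x**2 - y**2, 0.0))
    base = add(p1, add(scale(ex, x), scale(ey, y)))
    return [add(base, scale(ez, z)), sub(base, scale(ez, z))]

def branch_and_prune(coords, dist, n, tol=1e-6):
    """Extend the placed atoms in coords to n atoms; return all embeddings
    consistent with the known distances dist[(a, b)], a < b."""
    k = len(coords)
    if k == n:
        return [list(coords)]
    solutions = []
    options = candidates(coords[k-3], coords[k-2], coords[k-1],
                         dist[(k-3, k)], dist[(k-2, k)], dist[(k-1, k)])
    for cand in options:  # branch on the two mirror images
        # Prune: every extra known distance to an earlier atom must hold.
        feasible = all(abs(norm(sub(cand, coords[a])) - value) <= tol
                       for (a, b), value in dist.items() if b == k and a < k - 3)
        if feasible:
            solutions.extend(branch_and_prune(coords + [cand], dist, n, tol))
    return solutions

# Made-up coordinates used only to generate a consistent set of distances.
TRUE = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0),
        (3.2, 1.6, 1.0), (3.9, 0.4, 1.6)]
DIST = {(a, b): norm(sub(TRUE[a], TRUE[b]))
        for b in range(3, 5) for a in range(b - 3, b)}
DIST[(0, 4)] = norm(sub(TRUE[0], TRUE[4]))  # extra distance used for pruning
SOLUTIONS = branch_and_prune(TRUE[:3], DIST, 5)
```

    Without the extra distance the search would return four embeddings; with it, only the original structure and its mirror image through the plane of the first three atoms survive, which shows how sparse additional distances cut the search tree.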

  5. The Structures of Life

    ERIC Educational Resources Information Center

    National Institute of General Medical Sciences (NIGMS), 2007

    2007-01-01

    This booklet reveals how structural biology provides insight into health and disease and is useful in developing new medications. It contains a general introduction to proteins, coverage of the techniques used to determine protein structures, and a chapter on structure-based drug design. The booklet features "Student Snapshots," designed to…

  6. The structure of the bacteriophage PRD1 spike sheds light on the evolution of viral capsid architecture.

    PubMed

    Merckel, Michael C; Huiskonen, Juha T; Bamford, Dennis H; Goldman, Adrian; Tuma, Roman

    2005-04-15

    Comparisons of bacteriophage PRD1 and adenovirus protein structures and virion architectures have been instrumental in unraveling an evolutionary relationship and have led to a proposal of a phylogeny-based virus classification. The structure of the PRD1 spike protein P5 provides further insight into the evolution of viral proteins. The crystallized P5 fragment comprises two structural domains: a globular knob and a fibrous shaft. The head folds into a ten-stranded jelly roll beta barrel, which is structurally related to the tumor necrosis factor (TNF) and the PRD1 coat protein domains. The shaft domain is a structural counterpart to the adenovirus spike shaft. The structural relationships between PRD1, TNF, and adenovirus proteins suggest that the vertex proteins may have originated from an ancestral TNF-like jelly roll coat protein via a combination of gene duplication and deletion.

  7. SSEP: secondary structural elements of proteins

    PubMed Central

    Shanthi, V.; Selvarani, P.; Kiran Kumar, Ch.; Mohire, C. S.; Sekar, K.

    2003-01-01

    SSEP is a comprehensive resource for accessing information related to the secondary structural elements present in the 25 and 90% non-redundant protein chains. The database contains 1771 protein chains from 1670 protein structures and 6182 protein chains from 5425 protein structures in 25 and 90% non-redundant protein chains, respectively. The current version provides information about the α-helical segments and β-strand fragments of varying lengths. In addition, it also contains the information about 3₁₀-helix, β- and ν-turns and hairpin loops. The free graphics program RASMOL has been interfaced with the search engine to visualize the three-dimensional structures of the user-queried secondary structural fragment. The database is updated regularly and is available through Bioinformatics web server at http://cluster.physics.iisc.ernet.in/ssep/ or http://144.16.71.148/ssep/. PMID:12824336

  8. The proteome: structure, function and evolution

    PubMed Central

    Fleming, Keiran; Kelley, Lawrence A; Islam, Suhail A; MacCallum, Robert M; Muller, Arne; Pazos, Florencio; Sternberg, Michael J.E

    2006-01-01

    This paper reports two studies to model the inter-relationships between protein sequence, structure and function. First, an automated pipeline to provide a structural annotation of proteomes in the major genomes is described. The results are stored in a database at Imperial College, London (3D-GENOMICS) that can be accessed at www.sbg.bio.ic.ac.uk. Analysis of the assignments to structural superfamilies provides evolutionary insights. 3D-GENOMICS is being integrated with related proteome annotation data at University College London and the European Bioinformatics Institute in a project known as e-protein (http://www.e-protein.org/). The second topic is motivated by the developments in structural genomics projects in which the structure of a protein is determined prior to knowledge of its function. We have developed a new approach PHUNCTIONER that uses the gene ontology (GO) classification to supervise the extraction of the sequence signal responsible for protein function from a structure-based sequence alignment. Using GO we can obtain profiles for a range of specificities described in the ontology. In the region of low sequence similarity (around 15%), our method is more accurate than assignment from the closest structural homologue. The method is also able to identify the specific residues associated with the function of the protein family. PMID:16524832

  9. Objective identification of residue ranges for the superposition of protein structures

    PubMed Central

    2011-01-01

    Background: The automation of objectively selecting amino acid residue ranges for structure superpositions is important for meaningful and consistent protein structure analyses. So far there is no widely-used standard for choosing these residue ranges for experimentally determined protein structures, where the manual selection of residue ranges or the use of suboptimal criteria remain commonplace. Results: We present an automated and objective method for finding amino acid residue ranges for the superposition and analysis of protein structures, in particular for structure bundles resulting from NMR structure calculations. The method is implemented in an algorithm, CYRANGE, that yields, without protein-specific parameter adjustment, appropriate residue ranges in most commonly occurring situations, including low-precision structure bundles, multi-domain proteins, symmetric multimers, and protein complexes. Residue ranges are chosen to comprise as many residues of a protein domain as possible, up to the point where increasing their number would lead to a steep rise in the RMSD value. Residue ranges are determined by first clustering residues into domains based on the distance variance matrix, and then refining for each domain the initial choice of residues by excluding residues one by one until the relative decrease of the RMSD value becomes insignificant. A penalty for the opening of gaps favours contiguous residue ranges in order to obtain a result that is as simple as possible, but not simpler. Results are given for a set of 37 proteins and compared with those of commonly used protein structure validation packages. We also provide residue ranges for 6351 NMR structures in the Protein Data Bank. Conclusions: The CYRANGE method is capable of automatically determining residue ranges for the superposition of protein structure bundles for a large variety of protein structures. The method correctly identifies ordered regions. Global structure superpositions based on the CYRANGE residue ranges allow a clear presentation of the structure, and unnecessary small gaps within the selected ranges are absent. In the majority of cases, the residue ranges from CYRANGE contain fewer gaps and cover considerably larger parts of the sequence than those from other methods without significantly increasing the RMSD values. CYRANGE thus provides an objective and automatic method for standardizing the choice of residue ranges for the superposition of protein structures. PMID:21592348
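
    The abstract does not define the distance variance matrix in detail. The sketch below assumes one plausible reading: for each residue pair, the variance across the conformers of a bundle of the distance between one representative atom per residue (for example Cα). The function name and the use of the population variance are assumptions, and the clustering and refinement steps of CYRANGE are not reproduced.

```python
import math

def distance_variance_matrix(bundle):
    """bundle: list of conformers; each conformer is a list with one
    (x, y, z) coordinate per residue, in the same residue order.
    Returns an n x n matrix of inter-residue distance variances."""
    n = len(bundle[0])
    m = len(bundle)
    var = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Distance between residues i and j in every conformer.
            dists = [math.dist(conf[i], conf[j]) for conf in bundle]
            mean = sum(dists) / m
            v = sum((x - mean) ** 2 for x in dists) / m  # population variance
            var[i][j] = var[j][i] = v
    return var
```

    Residue pairs that move together rigidly give entries near zero, while pairs spanning a flexible linker or two independently moving domains give large entries, which is the kind of signal a domain clustering step could use.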

  10. Fold independent structural comparisons of protein-ligand binding sites for exploring functional relationships.

    PubMed

    Gold, Nicola D; Jackson, Richard M

    2006-02-03

    The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that, given a known protein-ligand binding site, allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.

  11. Insights into Fanconi Anaemia from the structure of human FANCE

    PubMed Central

    Nookala, Ravi K.; Hussain, Shobbir; Pellegrini, Luca

    2007-01-01

    Fanconi Anaemia (FA) is a cancer predisposition disorder characterized by spontaneous chromosome breakage and high cellular sensitivity to genotoxic agents. In response to DNA damage, a multi-subunit assembly of FA proteins, the FA core complex, monoubiquitinates the downstream FANCD2 protein. The FANCE protein plays an essential role in the FA process of DNA repair as the FANCD2-binding component of the FA core complex. Here we report a crystallographic and biological study of human FANCE. The first structure of a FA protein reveals the presence of a repeated helical motif that provides a template for the structural rationalization of other proteins defective in Fanconi Anaemia. The portion of FANCE defined by our crystallographic analysis is sufficient for interaction with FANCD2, yielding structural information into the mode of FANCD2 recruitment to the FA core complex. Disease-associated mutations disrupt the FANCE–FANCD2 interaction, providing structural insight into the molecular mechanisms of FA pathogenesis. PMID:17308347

  12. GAP Final Technical Report 12-14-04

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Andrew J. Bordner, PhD, Senior Research Scientist

    2004-12-14

    The Genomics Annotation Platform (GAP) was designed to develop new tools for high throughput functional annotation and characterization of protein sequences and structures resulting from genomics and structural proteomics, benchmarking and application of those tools. Furthermore, this platform integrated the genomic scale sequence and structural analysis and prediction tools with the advanced structure prediction and bioinformatics environment of ICM. The development of GAP was primarily oriented towards the annotation of new biomolecular structures using both structural and sequence data. Even though the amount of protein X-ray crystal data is growing exponentially, the volume of sequence data is growing even more rapidly. This trend was exploited by leveraging the wealth of sequence data to provide functional annotation for protein structures. The additional information provided by GAP is expected to assist the majority of the commercial users of ICM, who are involved in drug discovery, in identifying promising drug targets as well as in devising strategies for the rational design of therapeutics directed at the protein of interest. The GAP also provided valuable tools for biochemistry education and structural genomics centers. In addition, GAP incorporates many novel prediction and analysis methods not available in other molecular modeling packages. This development led to signing the first Molsoft agreement in the structural genomics annotation area with the University of Oxford Structural Genomics Center. This commercial agreement validated the Molsoft efforts under the GAP project and provided the basis for further development of the large scale functional annotation platform.

  13. 3Drefine: an interactive web server for efficient protein structure refinement

    PubMed Central

    Bhattacharya, Debswapna; Nowotny, Jackson; Cao, Renzhi; Cheng, Jianlin

    2016-01-01

    3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of the hydrogen bonding network combined with atomic-level energy minimization on the optimized model using composite physics- and knowledge-based force fields for efficient protein structure refinement. The method has been extensively evaluated on blind CASP experiments as well as on large-scale and diverse benchmark datasets and exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through a text or file input submission, email notification, and a provided example submission, and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet that is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/. PMID:27131371

  14. The neuronal porosome complex in health and disease

    PubMed Central

    Naik, Akshata R; Lewis, Kenneth T

    2015-01-01

    Cup-shaped secretory portals at the cell plasma membrane called porosomes mediate the precision release of intravesicular material from cells. Membrane-bound secretory vesicles transiently dock and fuse at the base of porosomes facing the cytosol to expel pressurized intravesicular contents from the cell during secretion. The structure, isolation, composition, and functional reconstitution of the neuronal porosome complex have greatly progressed, providing a molecular understanding of its function in health and disease. Neuronal porosomes are 15 nm cup-shaped lipoprotein structures composed of nearly 40 proteins, compared to the 120 nm nuclear pore complex composed of >500 protein molecules. Membrane proteins compose the porosome complex, making it practically impossible to solve its atomic structure. However, atomic force microscopy and small-angle X-ray solution scattering studies have provided three-dimensional structural details of the native neuronal porosome at sub-nanometer resolution, providing insights into the molecular mechanism of its function. The participation of several porosome proteins previously implicated in neurotransmission and neurological disorders further attests to the crosstalk between porosome proteins and their coordinated involvement in release of neurotransmitter at the synapse. PMID:26264442

  15. Design of structurally distinct proteins using strategies inspired by evolution

    DOE PAGES

    Jacobs, T. M.; Williams, B.; Williams, T.; ...

    2016-05-06

    Natural recombination combines pieces of preexisting proteins to create new tertiary structures and functions. In this paper, we describe a computational protocol, called SEWING, which is inspired by this process and builds new proteins from connected or disconnected pieces of existing structures. Helical proteins designed with SEWING contain structural features absent from other de novo designed proteins and, in some cases, remain folded at more than 100°C. High-resolution structures of the designed proteins CA01 and DA05R1 were solved by x-ray crystallography (2.2 angstrom resolution) and nuclear magnetic resonance, respectively, and there was excellent agreement with the design models. Finally, this method provides a new strategy to rapidly create large numbers of diverse and designable protein scaffolds.

  16. The structure of human ADP-ribosylhydrolase 3 (ARH3) provides insights into the reversibility of protein ADP-ribosylation

    PubMed Central

    Mueller-Dieckmann, Christoph; Kernstock, Stefan; Lisurek, Michael; von Kries, Jens Peter; Haag, Friedrich; Weiss, Manfred S.; Koch-Nolte, Friedrich

    2006-01-01

    Posttranslational modifications are used by cells from all kingdoms of life to control enzymatic activity and to regulate protein function. For many cellular processes, including DNA repair, spindle function, and apoptosis, reversible mono- and polyADP-ribosylation constitutes a very important regulatory mechanism. Moreover, many pathogenic bacteria secrete toxins which ADP-ribosylate human proteins, causing diseases such as whooping cough, cholera, and diphtheria. Whereas the 3D structures of numerous ADP-ribosylating toxins and related mammalian enzymes have been elucidated, virtually nothing is known about the structure of protein de-ADP-ribosylating enzymes. Here, we report the 3D structure of human ADP-ribosylhydrolase 3 (hARH3). The molecular architecture of hARH3 constitutes the archetype of an all-α-helical protein fold and provides insights into the reversibility of protein ADP-ribosylation. Two magnesium ions flanked by highly conserved amino acids pinpoint the active-site crevice. Recombinant hARH3 binds free ADP-ribose with micromolar affinity and efficiently de-ADP-ribosylates poly- but not monoADP-ribosylated proteins. Docking experiments indicate a possible binding mode for ADP-ribose polymers and suggest a reaction mechanism. Our results underscore the importance of endogenous ADP-ribosylation cycles and provide a basis for structure-based design of ADP-ribosylhydrolase inhibitors. PMID:17015823

  17. Gi- and Gs-coupled GPCRs show different modes of G-protein binding.

    PubMed

    Van Eps, Ned; Altenbach, Christian; Caro, Lydia N; Latorraca, Naomi R; Hollingsworth, Scott A; Dror, Ron O; Ernst, Oliver P; Hubbell, Wayne L

    2018-03-06

    More than two decades ago, the activation mechanism for the membrane-bound photoreceptor and prototypical G protein-coupled receptor (GPCR) rhodopsin was uncovered. Upon light-induced changes in ligand-receptor interaction, movement of specific transmembrane helices within the receptor opens a crevice at the cytoplasmic surface, allowing for coupling of heterotrimeric guanine nucleotide-binding proteins (G proteins). The general features of this activation mechanism are conserved across the GPCR superfamily. Nevertheless, GPCRs have selectivity for distinct G-protein family members, but the mechanism of selectivity remains elusive. Structures of GPCRs in complex with the stimulatory G protein, Gs, and an accessory nanobody to stabilize the complex have been reported, providing information on the intermolecular interactions. However, to reveal the structural selectivity filters, it will be necessary to determine GPCR-G protein structures involving other G-protein subtypes. In addition, it is important to obtain structures in the absence of a nanobody that may influence the structure. Here, we present a model for a rhodopsin-G protein complex derived from intermolecular distance constraints between the activated receptor and the inhibitory G protein, Gi, using electron paramagnetic resonance spectroscopy and spin-labeling methodologies. Molecular dynamics simulations demonstrated the overall stability of the modeled complex. In the rhodopsin-Gi complex, Gi engages rhodopsin in a manner distinct from previous GPCR-Gs structures, providing insight into specificity determinants. Copyright © 2018 the Author(s). Published by PNAS.

  18. Crystal Structure of Menin Reveals Binding Site for Mixed Lineage Leukemia (MLL) Protein

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Murai, Marcelo J.; Chruszcz, Maksymilian; Reddy, Gireesh

    2014-10-02

    Menin is a tumor suppressor protein that is encoded by the MEN1 (multiple endocrine neoplasia 1) gene and controls cell growth in endocrine tissues. Importantly, menin also serves as a critical oncogenic cofactor of MLL (mixed lineage leukemia) fusion proteins in acute leukemias. Direct association of menin with MLL fusion proteins is required for MLL fusion protein-mediated leukemogenesis in vivo, and this interaction has been validated as a new potential therapeutic target for development of novel anti-leukemia agents. Here, we report the first crystal structure of menin homolog from Nematostella vectensis. Due to a very high sequence similarity, the Nematostella menin is a close homolog of human menin, and these two proteins likely have very similar structures. Menin is predominantly an α-helical protein with the protein core comprising three tetratricopeptide motifs that are flanked by two α-helical bundles and covered by a β-sheet motif. A very interesting feature of menin structure is the presence of a large central cavity that is highly conserved between Nematostella and human menin. By employing site-directed mutagenesis, we have demonstrated that this cavity constitutes the binding site for MLL. Our data provide a structural basis for understanding the role of menin as a tumor suppressor protein and as an oncogenic co-factor of MLL fusion proteins. It also provides essential structural information for development of inhibitors targeting the menin-MLL interaction as a novel therapeutic strategy in MLL-related leukemias.

  19. Structural Assembly of Multidomain Proteins and Protein Complexes Guided by the Overall Rotational Diffusion Tensor

    PubMed Central

    Ryabov, Yaroslav; Fushman, David

    2008-01-01

    We present a simple and robust approach that uses the overall rotational diffusion tensor as a structural constraint for domain positioning in multidomain proteins and protein-protein complexes. This method offers the possibility to use NMR relaxation data for detailed structure characterization of such systems provided the structures of individual domains are available. The proposed approach extends the concept of using long-range information contained in the overall rotational diffusion tensor. In contrast to the existing approaches, we use both the principal axes and principal values of protein’s rotational diffusion tensor to determine not only the orientation but also the relative positioning of the individual domains in a protein. This is achieved by finding the domain arrangement in a molecule that provides the best possible agreement with all components of the overall rotational diffusion tensor derived from experimental data. The accuracy of the proposed approach is demonstrated for two protein systems with known domain arrangement and parameters of the overall tumbling: the HIV-1 protease homodimer and Maltose Binding Protein. The accuracy of the method and its sensitivity to domain positioning is also tested using computer-generated data for three protein complexes, for which the experimental diffusion tensors are not available. In addition, the proposed method is applied here to determine, for the first time, the structure of both open and closed conformations of Lys48-linked di-ubiquitin chain, where domain motions render impossible accurate structure determination by other methods. The proposed method opens new avenues for improving structure characterization of proteins in solution. PMID:17550252

  20. NMRe: a web server for NMR protein structure refinement with high-quality structure validation scores.

    PubMed

    Ryu, Hyojung; Lim, GyuTae; Sung, Bong Hyun; Lee, Jinhyuk

    2016-02-15

    Protein structure refinement is a necessary step for the study of protein function. In particular, some nuclear magnetic resonance (NMR) structures are of lower quality than X-ray crystallographic structures. Here, we present NMRe, a web-based server for NMR structure refinement. The previously developed knowledge-based energy function STAP (Statistical Torsion Angle Potential) was used for NMRe refinement. With STAP, NMRe provides two refinement protocols using two types of distance restraints. If a user provides NOE (Nuclear Overhauser Effect) data, the refinement is performed with the NOE distance restraints as a conventional NMR structure refinement. Additionally, NMRe generates NOE-like distance restraints based on the inter-hydrogen distances derived from the input structure. The efficiency of NMRe refinement was validated on 20 NMR structures. Most of the quality assessment scores of the refined NMR structures were better than those of the original structures. The refinement results are provided as a three-dimensional structure view, a secondary structure scheme, and numerical and graphical structure validation scores. NMRe is available at http://psb.kobic.re.kr/nmre/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  1. Structures of membrane proteins

    PubMed Central

    Vinothkumar, Kutti R.; Henderson, Richard

    2010-01-01

    In reviewing the structures of membrane proteins determined up to the end of 2009, we present in words and pictures the most informative examples from each family. We group the structures together according to their function and architecture to provide an overview of the major principles and variations on the most common themes. The first structures, determined 20 years ago, were those of naturally abundant proteins with limited conformational variability, and each membrane protein structure determined was a major landmark. With the advent of complete genome sequences and efficient expression systems, there has been an explosion in the rate of membrane protein structure determination, with many classes represented. New structures are published every month and more than 150 unique membrane protein structures have been determined. This review analyses the reasons for this success, discusses the challenges that still lie ahead, and presents a concise summary of the key achievements with illustrated examples selected from each class. PMID:20667175

  2. Sequence, structure and function relationships in flaviviruses as assessed by evolutive aspects of its conserved non-structural protein domains.

    PubMed

    da Fonseca, Néli José; Lima Afonso, Marcelo Querino; Pedersolli, Natan Gonçalves; de Oliveira, Lucas Carrijo; Andrade, Dhiego Souto; Bleicher, Lucas

    2017-10-28

    Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and zika fever. Their genomes encode a polyprotein which, after cleavage, results in three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis as detected in multiple sequence alignments, usually reporting positions which are strictly necessary for the structure and/or function of all members in a protein family or which are involved in a specific sub-class feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis on all flaviviruses non-structural proteins, with results mapped on all well-annotated available sequences. A literature review on the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. Also, we provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as a supplementary material, so that particularities in different viruses can be easily analyzed. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. The synthesis of recombinant membrane proteins in yeast for structural studies.

    PubMed

    Routledge, Sarah J; Mikaliunaite, Lina; Patel, Anjana; Clare, Michelle; Cartwright, Stephanie P; Bawa, Zharain; Wilks, Martin D B; Low, Floren; Hardy, David; Rothnie, Alice J; Bill, Roslyn M

    2016-02-15

    Historically, recombinant membrane protein production has been a major challenge meaning that many fewer membrane protein structures have been published than those of soluble proteins. However, there has been a recent, almost exponential increase in the number of membrane protein structures being deposited in the Protein Data Bank. This suggests that empirical methods are now available that can ensure the required protein supply for these difficult targets. This review focuses on methods that are available for protein production in yeast, which is an important source of recombinant eukaryotic membrane proteins. We provide an overview of approaches to optimize the expression plasmid, host cell and culture conditions, as well as the extraction and purification of functional protein for crystallization trials in preparation for structural studies. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  4. Comprehensive inventory of protein complexes in the Protein Data Bank from consistent classification of interfaces.

    PubMed

    Bordner, Andrew J; Gorin, Andrey A

    2008-05-12

    Protein-protein interactions are ubiquitous and essential for all cellular processes. High-resolution X-ray crystallographic structures of protein complexes can reveal the details of their function and provide a basis for many computational and experimental approaches. Differentiation between biological and non-biological contacts and reconstruction of the intact complex is a challenging computational problem. A successful solution can provide additional insights into the fundamental principles of biological recognition and reduce errors in many algorithms and databases utilizing interaction information extracted from the Protein Data Bank (PDB). We have developed a method for identifying protein complexes in the PDB X-ray structures by a four step procedure: (1) comprehensively collecting all protein-protein interfaces; (2) clustering similar protein-protein interfaces together; (3) estimating the probability that each cluster is relevant based on a diverse set of properties; and (4) combining these scores for each PDB entry in order to predict the complex structure. The resulting clusters of biologically relevant interfaces provide a reliable catalog of evolutionary conserved protein-protein interactions. These interfaces, as well as the predicted protein complexes, are available from the Protein Interface Server (PInS) website (see Availability and requirements section). Our method demonstrates an almost two-fold reduction of the annotation error rate as evaluated on a large benchmark set of complexes validated from the literature. We also estimate relative contributions of each interface property to the accurate discrimination of biologically relevant interfaces and discuss possible directions for further improving the prediction method.

  5. STUDIES OF METABOLITE-PROTEIN INTERACTIONS: A REVIEW

    PubMed Central

    Matsuda, Ryan; Bi, Cong; Anguizola, Jeanethe; Sobansky, Matthew; Rodriquez, Elliot; Badilla, John Vargas; Zheng, Xiwei; Hage, Benjamin; Hage, David S.

    2014-01-01

    The study of metabolomics can provide valuable information about biochemical pathways and processes at the molecular level. There have been many reports that have examined the structure, identity and concentrations of metabolites in biological systems. However, the binding of metabolites with proteins is also of growing interest. This review examines past reports that have looked at the binding of various types of metabolites with proteins. An overview of the techniques that have been used to characterize and study metabolite-protein binding is first provided. This is followed by examples of studies that have investigated the binding of hormones, fatty acids, drugs or other xenobiotics, and their metabolites with transport proteins and receptors. These examples include reports that have considered the structure of the resulting solute-protein complexes, the nature of the binding sites, the strength of these interactions, the variations in these interactions with solute structure, and the kinetics of these reactions. The possible effects of metabolic diseases on these processes, including the impact of alterations in the structure and function of proteins, are also considered. PMID:24321277

  6. Modeling Protein Expression and Protein Signaling Pathways

    PubMed Central

    Telesca, Donatello; Müller, Peter; Kornblau, Steven M.; Suchard, Marc A.; Ji, Yuan

    2015-01-01

    High-throughput functional proteomic technologies provide a way to quantify the expression of proteins of interest. Statistical inference centers on identifying the activation state of proteins and their patterns of molecular interaction formalized as dependence structure. Inference on dependence structure is particularly important when proteins are selected because they are part of a common molecular pathway. In that case, inference on dependence structure reveals properties of the underlying pathway. We propose a probability model that represents molecular interactions at the level of hidden binary latent variables that can be interpreted as indicators for active versus inactive states of the proteins. The proposed approach exploits available expert knowledge about the target pathway to define an informative prior on the hidden conditional dependence structure. An important feature of this prior is that it provides an instrument to explicitly anchor the model space to a set of interactions of interest, favoring a local search approach to model determination. We apply our model to reverse-phase protein array data from a study on acute myeloid leukemia. Our inference identifies relevant subpathways in relation to the unfolding of the biological process under study. PMID:26246646

  7. Neutron protein crystallography: A complementary tool for locating hydrogens in proteins.

    PubMed

    O'Dell, William B; Bodenheimer, Annette M; Meilleur, Flora

    2016-07-15

    Neutron protein crystallography is a powerful tool for investigating protein chemistry because it directly locates hydrogen atom positions in a protein structure. The visibility of hydrogen and deuterium atoms arises from the strong interaction of neutrons with the nuclei of these isotopes. Positions can be unambiguously assigned from diffraction at resolutions typical of protein crystals. Neutrons have the additional benefit to structural biology of not inducing radiation damage in protein crystals. The same crystal could be measured multiple times for parametric studies. Here, we review the basic principles of neutron protein crystallography. The information that can be gained from a neutron structure is presented in balance with practical considerations. Methods to produce isotopically-substituted proteins and to grow large crystals are provided in the context of neutron structures reported in the literature. Available instruments for data collection and software for data processing and structure refinement are described along with technique-specific strategies including joint X-ray/neutron structure refinement. Examples are given to illustrate, ultimately, the unique scientific value of neutron protein crystal structures. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. Geometry motivated alternative view on local protein backbone structures.

    PubMed

    Zacharias, Jan; Knapp, Ernst Walter

    2013-11-01

    We present an alternative to the classical Ramachandran plot (R-plot) to display local protein backbone structure. Instead of the (φ, ψ) backbone angles, which relate to the chemical architecture of polypeptides, generic helical parameters are used: the rotation or twist angle ϑ and the helical rise parameter d. Plots with these parameters provide a different view of the nature of local protein backbone structures. They allow local structures to be displayed in polar (d, ϑ)-coordinates, which is not possible for an R-plot, where structural regimes connected by periodicity appear disconnected. There are other advantages as well, such as a clear discrimination of the handedness of a local structure and a larger spread of the different local structure domains; the latter can yield a better separation of different local secondary structure motifs. Compared with the R-plot, we are not aware of any major disadvantage in classifying local polypeptide structures with the (d, ϑ)-plot, except that it requires some elementary computations. To facilitate usage of the new (d, ϑ)-plot for protein structures we provide a web application (http://agknapp.chemie.fu-berlin.de/secsass), which shows the (d, ϑ)-plot side-by-side with the R-plot. © 2013 The Protein Society.
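
    The "elementary computations" mentioned in the abstract above can be illustrated with a generic geometric construction. The sketch below is not taken from the paper; it assumes that four consecutive points (for example Cα positions) lie on a regular helix and recovers a twist angle and a signed rise from them. The function names and the ideal α-helix values used in the usage note (radius 2.3 Å, 100° per residue, 1.5 Å rise) are illustrative assumptions.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def helical_parameters(p0, p1, p2, p3):
    """Return (twist in degrees, signed rise) for four consecutive helix points.

    Hypothetical helper, not the authors' code. For points on a regular
    helix, p[k-1] + p[k+1] - 2*p[k] points from p[k] toward the helix axis.
    """
    v1 = tuple(a + c - 2.0 * b for a, b, c in zip(p0, p1, p2))
    v2 = tuple(a + c - 2.0 * b for a, b, c in zip(p1, p2, p3))
    axis = _cross(v1, v2)
    length = _norm(axis)
    if length == 0.0:
        # Collinear points or a twist of exactly 0 or 180 degrees.
        raise ValueError("helix axis is undefined for these points")
    axis = tuple(x / length for x in axis)
    cos_twist = _dot(v1, v2) / (_norm(v1) * _norm(v2))
    twist = math.degrees(math.acos(max(-1.0, min(1.0, cos_twist))))
    # The sign of the rise encodes handedness: positive for right-handed.
    rise = _dot(_sub(p2, p1), axis)
    return twist, rise

def ideal_helix(radius, twist_deg, rise, n=4):
    """Generate n points on an ideal helix along z (for illustration only)."""
    t = math.radians(twist_deg)
    return [(radius * math.cos(k * t), radius * math.sin(k * t), k * rise)
            for k in range(n)]
```

    Calling `helical_parameters(*ideal_helix(2.3, 100.0, 1.5))` recovers the twist and rise that were used to build the points. In this convention the twist is reported in the range 0° to 180° and a left-handed helix shows up as a negative rise, which may differ from the sign convention used in the (d, ϑ)-plot itself.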

  9. Method for protein structure alignment

    DOEpatents

    Blankenbecler, Richard; Ohlsson, Mattias; Peterson, Carsten; Ringner, Markus

    2005-02-22

    This invention provides a method for protein structure alignment. More particularly, the present invention provides a method for identification, classification and prediction of protein structures. The present invention involves two key ingredients. First, an energy or cost function formulation of the problem simultaneously in terms of binary (Potts) assignment variables and real-valued atomic coordinates. Second, a minimization of the energy or cost function by an iterative method, where in each iteration (1) a mean field method is employed for the assignment variables and (2) exact rotation and/or translation of atomic coordinates is performed, weighted with the corresponding assignment variables.
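
    The mean field step for the assignment variables described above can be sketched in a simplified form. The example below is an assumption-laden illustration, not the patented method: it uses a softmax over squared distances, a common form of the mean-field solution for Potts variables normalized over one index, and it omits gap costs and the rotation/translation step. The function name, the coordinates and the temperature values are hypothetical.

```python
import math

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_field_assignments(coords_a, coords_b, temperature):
    """Soft assignment matrix v[i][j] of atom i in chain A to atom j in chain B.

    Hypothetical sketch: v[i][j] is proportional to exp(-cost_ij / T), with
    cost_ij the squared distance, and each row is normalized to sum to 1.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    assignments = []
    for a in coords_a:
        costs = [squared_distance(a, b) for b in coords_b]
        lowest = min(costs)  # subtract the minimum for numerical stability
        weights = [math.exp(-(c - lowest) / temperature) for c in costs]
        total = sum(weights)
        assignments.append([w / total for w in weights])
    return assignments

# Illustrative coordinates (in Å), invented for this example.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
chain_b = [(0.1, 0.0, 0.0), (3.7, 0.0, 0.0), (7.6, 0.0, 0.0)]
sharp = mean_field_assignments(chain_a, chain_b, temperature=0.1)
soft = mean_field_assignments(chain_a, chain_b, temperature=1e6)
```

    At low temperature each row concentrates on the nearest atom, and at high temperature the rows approach a uniform distribution. In an iterative scheme of the kind the abstract outlines, these soft assignments would then weight the superposition of the atomic coordinates before the next update.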

  10. Trends in structural coverage of the protein universe and the impact of the Protein Structure Initiative.

    PubMed

    Khafizov, Kamil; Madrid-Aliste, Carlos; Almo, Steven C; Fiser, Andras

    2014-03-11

    The exponential growth of protein sequence data provides an ever-expanding body of unannotated and misannotated proteins. The National Institutes of Health-supported Protein Structure Initiative and related worldwide structural genomics efforts facilitate functional annotation of proteins through structural characterization. Recently there have been profound changes in the taxonomic composition of sequence databases, which are effectively redefining the scope and contribution of these large-scale structure-based efforts. The faster-growing bacterial genomic entries have overtaken the eukaryotic entries over the last 5 y, but also have become more redundant. Despite the enormous increase in the number of sequences, the overall structural coverage of proteins--including proteins for which reliable homology models can be generated--on the residue level has increased from 30% to 40% over the last 10 y. Structural genomics efforts contributed ∼50% of this new structural coverage, despite determining only ∼10% of all new structures. Based on current trends, it is expected that ∼55% structural coverage (the level required for significant functional insight) will be achieved within 15 y, whereas without structural genomics efforts, realizing this goal will take approximately twice as long.

  11. Structure and orientation of interfacial proteins determined by sum frequency generation vibrational spectroscopy: method and application.

    PubMed

    Ye, Shuji; Wei, Feng; Li, Hongchun; Tian, Kangzhen; Luo, Yi

    2013-01-01

    In situ and real-time characterization of the molecular structure and orientation of proteins at interfaces is essential for understanding the nature of interfacial protein interactions. Such work should provide important clues for controlling biointerfaces in a desired manner. Sum frequency generation vibrational spectroscopy (SFG-VS) has been demonstrated to be a powerful technique for studying interfacial structures and interactions at the molecular level. This paper first systematically introduces the methods for calculating the Raman polarizability tensor, infrared transition dipole moment, and SFG molecular hyperpolarizability tensor elements of proteins/peptides with the secondary structures of α-helix, 3₁₀-helix, antiparallel β-sheet, and parallel β-sheet, as well as the methodology for determining the orientation of interfacial protein secondary structures using SFG amide I spectra. Recent progress in determining protein structure and orientation at different interfaces by SFG-VS is then reviewed, which provides a molecular-level understanding of the structures and interactions of interfacial proteins, especially of the driving forces behind such interactions. Although this review focuses on the analysis of amide I spectra, it is expected to offer a basic approach for the spectral analysis of amide III SFG signals and of other complicated molecular systems such as RNA and DNA. Copyright © 2013 Elsevier Inc. All rights reserved.

  12. JavaProtein Dossier: a novel web-based data visualization tool for comprehensive analysis of protein structure

    PubMed Central

    Neshich, Goran; Rocchia, Walter; Mancini, Adauto L.; Yamagishi, Michel E. B.; Kuser, Paula R.; Fileto, Renato; Baudet, Christian; Pinto, Ivan P.; Montagner, Arnaldo J.; Palandrani, Juliana F.; Krauchenco, Joao N.; Torres, Renato C.; Souza, Savio; Togawa, Roberto C.; Higa, Roberto H.

    2004-01-01

    JavaProtein Dossier (JPD) is a new concept, database and visualization tool providing one of the largest collections of the physicochemical parameters describing proteins' structure, stability, function and interaction with other macromolecules. By collecting as many descriptors/parameters as possible within a single database, we can achieve a better use of the available data and information. Furthermore, data grouping allows us to generate different parameters with the potential to provide new insights into the sequence–structure–function relationship. In JPD, residue selection can be performed according to multiple criteria. JPD can simultaneously display and analyze all the physicochemical parameters of any pair of structures, using precalculated structural alignments, allowing direct parameter comparison at corresponding amino acid positions among homologous structures. In order to focus on the physicochemical (and consequently pharmacological) profile of proteins, visualization tools (showing the structure and structural parameters) also had to be optimized. Our response to this challenge was the use of Java technology with its exceptional level of interactivity. JPD is freely accessible (within the Gold Sting Suite) at http://sms.cbi.cnptia.embrapa.br, http://mirrors.rcsb.org/SMS, http://trantor.bioc.columbia.edu/SMS and http://www.es.embnet.org/SMS/ (Option: JavaProtein Dossier). PMID:15215458

  13. Structure of the SPRY domain of the human RNA helicase DDX1, a putative interaction platform within a DEAD-box protein

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kellner, Julian N.; Meinhart, Anton, E-mail: anton.meinhart@mpimf-heidelberg.mpg.de

    The structure of the SPRY domain of the human RNA helicase DDX1 was determined at 2.0 Å resolution. The SPRY domain provides a putative protein–protein interaction platform within DDX1 that differs from other SPRY domains in its structure and conserved regions. The human RNA helicase DDX1 in the DEAD-box family plays an important role in RNA processing and has been associated with HIV-1 replication and tumour progression. Whereas previously described DEAD-box proteins have a structurally conserved core, DDX1 shows a unique structural feature: a large SPRY-domain insertion in its RecA-like consensus fold. SPRY domains are known to function as protein–protein interaction platforms. Here, the crystal structure of the SPRY domain of human DDX1 (hDSPRY) is reported at 2.0 Å resolution. The structure reveals two layers of concave, antiparallel β-sheets that stack onto each other and a third β-sheet beneath the β-sandwich. A comparison with SPRY-domain structures from other eukaryotic proteins showed that the general β-sandwich fold is conserved; however, differences were detected in the loop regions, which were identified in other SPRY domains to be essential for interaction with cognate partners. In contrast, in hDSPRY these loop regions are not strictly conserved across species. Interestingly, though, a conserved patch of positive surface charge is found that may replace the connecting loops as a protein–protein interaction surface. The data presented here comprise the first structural information on DDX1 and provide insights into the unique domain architecture of this DEAD-box protein. By providing the structure of a putative interaction domain of DDX1, this work will serve as a basis for further studies of the interaction network within the hetero-oligomeric complexes of DDX1 and of its recruitment to the HIV-1 Rev protein as a viral replication factor.

  14. What determines the spectrum of protein native state structures?

    PubMed

    Lezon, Timothy R; Banavar, Jayanth R; Lesk, Arthur M; Maritan, Amos

    2006-05-01

    We present a brief summary of the key factors underlying protein structure, as developed in the investigations of Pauling, Ramachandran, and Rose. We then outline a simplified physical model of proteins that focuses on geometry and symmetry. Although this model superficially appears unrelated to the detailed chemical descriptions commonly applied to proteins, we show that it captures the essential elements of the chemistry and provides a unified framework for understanding the common characteristics of folded proteins. We suggest that the spectrum of protein native state structures is determined by geometry and symmetry and the role of the sequence is to choose its native state structure from this predetermined menu. © 2006 Wiley-Liss, Inc.

  15. Structural hot spots for the solubility of globular proteins

    PubMed Central

    Ganesan, Ashok; Siekierska, Aleksandra; Beerten, Jacinte; Brams, Marijke; Van Durme, Joost; De Baets, Greet; Van der Kant, Rob; Gallardo, Rodrigo; Ramakers, Meine; Langenberg, Tobias; Wilkinson, Hannah; De Smet, Frederik; Ulens, Chris; Rousseau, Frederic; Schymkowitz, Joost

    2016-01-01

    Natural selection shapes protein solubility to physiological requirements, and recombinant applications that require higher protein concentrations are often problematic. This raises the question of whether the solubility of natural protein sequences can be improved. Here we show an anti-correlation between the number of aggregation prone regions (APRs) in a protein sequence and its solubility, suggesting that mutational suppression of APRs provides a simple strategy to increase protein solubility. We show that mutations at specific positions within a protein structure can act as APR suppressors without affecting protein stability. These hot spots for protein solubility are both structure and sequence dependent but can be computationally predicted. We demonstrate this by reducing the aggregation of human α-galactosidase and protective antigen of Bacillus anthracis through mutation. Our results indicate that many proteins possess hot spots that allow protein solubility to be adapted independently of structure and function. PMID:26905391

  16. Quality assessment of protein model-structures based on structural and functional similarities

    PubMed Central

    2012-01-01

    Background: Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the World Wide Protein Data Bank and the number of sequenced proteins is constantly widening. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists have the ability to automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. Results: GOBA (Gene Ontology-Based Assessment) is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using the standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. Conclusions: The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved 0.74 and 0.8 as a mean Pearson correlation to the observed quality of models in our CASP8 and CASP9-based validation sets. GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models. PMID:22998498
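
    The validation metric quoted above, a Pearson correlation between predicted scores and observed model quality, can be made concrete with a short sketch. This is the textbook formula rather than code from GOBA; the function name and the example numbers are invented for illustration and do not come from the paper.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical predicted scores and observed qualities for five models
# of one target (made-up values, for illustration only).
predicted = [0.42, 0.55, 0.61, 0.70, 0.83]
observed = [0.35, 0.60, 0.58, 0.66, 0.90]
r = pearson_correlation(predicted, observed)
```

    A per-target value such as `r` would be computed for each prediction target and then averaged over targets to give a mean correlation of the kind reported in the abstract.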

  17. Structure of the Newcastle disease virus F protein in the post-fusion conformation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Swanson, Kurt; Wen, Xiaolin; Leser, George P.

    2010-11-17

    The paramyxovirus F protein is a class I viral membrane fusion protein which undergoes a significant refolding transition during virus entry. Previous studies of the Newcastle disease virus, human parainfluenza virus 3 and parainfluenza virus 5 F proteins revealed differences in the pre- and post-fusion structures. The NDV Queensland (Q) F structure lacked structural elements observed in the other two structures, which are key to the refolding and fusogenic activity of F. Here we present the NDV Australia-Victoria (AV) F protein post-fusion structure and provide EM evidence for its folding to a pre-fusion form. The NDV AV F structure contains heptad repeat elements missing in the previous NDV Q F structure, forming a post-fusion six-helix bundle (6HB) similar to the post-fusion hPIV3 F structure. Electrostatic and temperature factor analysis of the F structures points to regions of these proteins that may be functionally important in their membrane fusion activity.

  18. Template-Based Modeling of Protein-RNA Interactions

    PubMed Central

    Zheng, Jinfang; Kundrotas, Petras J.; Vakser, Ilya A.

    2016-01-01

    Protein-RNA complexes formed by specific recognition between RNA and RNA-binding proteins play an important role in biological processes. More than a thousand such proteins in humans have been curated, and many novel RNA-binding proteins remain to be discovered. Due to limitations of experimental approaches, computational techniques are needed for characterization of protein-RNA interactions. Although much progress has been made, adequate methodologies that reliably provide atomic-resolution structural details are still lacking. Although protein-RNA free docking approaches have proved useful, template-based approaches in general provide higher-quality predictions. Templates are key to building a high quality model. Sequence/structure relationships were studied based on a representative set of binary protein-RNA complexes from PDB. Several approaches were tested for pairwise target/template alignment. The analysis revealed a transition point between random and correct binding modes. The results showed that structural alignment is better than sequence alignment in identifying good templates, suitable for generating protein-RNA complexes close to the native structure, and outperforms free docking, successfully predicting complexes where free docking fails, including cases of significant conformational change upon binding. A template-based protein-RNA interaction modeling protocol, PRIME, was developed and benchmarked on a representative set of complexes. PMID:27662342

  19. PDBsum new things.

    PubMed

    Laskowski, Roman A

    2009-01-01

    PDBsum (http://www.ebi.ac.uk/pdbsum) provides summary information about each experimentally determined structural model in the Protein Data Bank (PDB). Here we describe some of its most recent features, including figures from the structure's key reference, citation data, Pfam domain diagrams, topology diagrams and protein-protein interactions. Furthermore, it now accepts users' own PDB format files and generates a private set of analyses for each uploaded structure.

  20. ssbio: a Python framework for structural systems biology.

    PubMed

    Mih, Nathan; Brunk, Elizabeth; Chen, Ke; Catoiu, Edward; Sastry, Anand; Kavvas, Erol; Monk, Jonathan M; Zhang, Zhen; Palsson, Bernhard O

    2018-06-15

    Working with protein structures at the genome-scale has been challenging in a variety of ways. Here, we present ssbio, a Python package that provides a framework to easily work with structural information in the context of genome-scale network reconstructions, which can contain thousands of individual proteins. The ssbio package provides an automated pipeline to construct high quality genome-scale models with protein structures (GEM-PROs), wrappers to popular third-party programs to compute associated protein properties, and methods to visualize and annotate structures directly in Jupyter notebooks, thus lowering the barrier of linking 3D structural data with established systems workflows. ssbio is implemented in Python and available to download under the MIT license at http://github.com/SBRG/ssbio. Documentation and Jupyter notebook tutorials are available at http://ssbio.readthedocs.io/en/latest/. Interactive notebooks can be launched using Binder at https://mybinder.org/v2/gh/SBRG/ssbio/master?filepath=Binder.ipynb. Supplementary data are available at Bioinformatics online.

  1. CCBuilder: an interactive web-based tool for building, designing and assessing coiled-coil protein assemblies.

    PubMed

    Wood, Christopher W; Bruning, Marc; Ibarra, Amaurys Á; Bartlett, Gail J; Thomson, Andrew R; Sessions, Richard B; Brady, R Leo; Woolfson, Derek N

    2014-11-01

    The ability to accurately model protein structures at the atomistic level underpins efforts to understand protein folding, to engineer natural proteins predictably and to design proteins de novo. Homology-based methods are well established and produce impressive results. However, these are limited to structures presented by and resolved for natural proteins. Addressing this problem more widely and deriving truly ab initio models requires mathematical descriptions for protein folds; the means to decorate these with natural, engineered or de novo sequences; and methods to score the resulting models. We present CCBuilder, a web-based application that tackles the problem for a defined but large class of protein structure, the α-helical coiled coils. CCBuilder generates coiled-coil backbones, builds side chains onto these frameworks and provides a range of metrics to measure the quality of the models. Its straightforward graphical user interface provides broad functionality that allows users to build and assess models, in which helix geometry, coiled-coil architecture and topology and protein sequence can be varied rapidly. We demonstrate the utility of CCBuilder by assembling models for 653 coiled-coil structures from the PDB, which cover >96% of the known coiled-coil types, and by generating models for rarer and de novo coiled-coil structures. CCBuilder is freely available, without registration, at http://coiledcoils.chm.bris.ac.uk/app/cc_builder/. © The Author 2014. Published by Oxford University Press.

  2. Contact-assisted protein structure modeling by global optimization in CASP11.

    PubMed

    Joo, Keehyoung; Joung, InSuk; Cheng, Qianyi; Lee, Sung Jong; Lee, Jooyoung

    2016-09-01

    We have applied the conformational space annealing method to the contact-assisted protein structure modeling in CASP11. For Tp targets, where predicted residue-residue contact information was provided, the contact energy term in the form of the Lorentzian function was implemented together with the physical energy terms used in our template-free modeling of proteins. Although we observed some structural improvement of Tp models over the models predicted without the Tp information, the improvement was not substantial on average. This is partly due to the inaccuracy of the provided contact information, where only about 18% of it was correct. For Ts targets, where the information of ambiguous NOE (Nuclear Overhauser Effect) restraints was provided, we formulated the modeling in terms of the two-tier optimization problem, which covers: (1) the assignment of NOE peaks and (2) the three-dimensional (3D) model generation based on the assigned NOEs. Although solving the problem in a direct manner appears to be intractable at first glance, we demonstrate through CASP11 that remarkably accurate protein 3D modeling is possible by brute force optimization of a relevant energy function. For 19 Ts targets of the average size of 224 residues, generated protein models were of about 3.6 Å Cα atom accuracy. Even greater structural improvement was observed when additional Tc contact information was provided. For 20 out of the total 24 Tc targets, we were able to generate protein structures which were better than the best model from the rest of the CASP11 groups in terms of GDT-TS. Proteins 2016; 84(Suppl 1):189-199. © 2015 Wiley Periodicals, Inc.
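
    The GDT-TS measure used for the comparison above can be illustrated with a simplified sketch. GDT-TS is conventionally the average, over cutoffs of 1, 2, 4 and 8 Å, of the percentage of Cα atoms within that distance of their reference positions. The version below assumes a single, already-computed superposition, whereas the full measure searches over many superpositions for each cutoff; the function name and the distances are hypothetical.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-TS (0 to 100) from per-residue Cα deviations in Å.

    Assumes the model has already been superimposed on the reference once;
    this is a sketch of the scoring step only, not the full GDT search.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    fractions = [
        sum(1 for d in distances if d <= cutoff) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)

# Made-up deviations for a five-residue toy model.
example_distances = [0.5, 1.5, 3.0, 6.0, 10.0]
score = gdt_ts(example_distances)
```

    For the toy input, 20%, 40%, 60% and 80% of the residues fall within the four cutoffs, so the score is their average. Because the real measure optimizes the superposition separately for each cutoff, a single-superposition value like this one can only underestimate it.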

  3. Structures of BIR domains from human NAIP and cIAP2.

    PubMed

    Herman, Maria Dolores; Moche, Martin; Flodin, Susanne; Welin, Martin; Trésaugues, Lionel; Johansson, Ida; Nilsson, Martina; Nordlund, Pär; Nyman, Tomas

    2009-11-01

    The inhibitor of apoptosis (IAP) family of proteins contains key modulators of apoptosis and inflammation that interact with caspases through baculovirus IAP-repeat (BIR) domains. Overexpression of IAP proteins frequently occurs in cancer cells, thus counteracting the activated apoptotic program. The IAP proteins have therefore emerged as promising targets for cancer therapy. In this work, X-ray crystallography was used to determine the first structures of BIR domains from human NAIP and cIAP2. Both structures harbour an N-terminal tetrapeptide in the conserved peptide-binding groove. The structures reveal that these two proteins bind the tetrapeptides in a similar mode as do other BIR domains. Detailed interactions are described for the P1'-P4' side chains of the peptide, providing a structural basis for peptide-specific recognition. An arginine side chain in the P3' position reveals favourable interactions with its hydrophobic moiety in the binding pocket, while hydrophobic residues in the P2' and P4' pockets make similar interactions to those seen in other BIR domain-peptide complexes. The structures also reveal how a serine in the P1' position is accommodated in the binding pockets of NAIP and cIAP2. In addition to shedding light on the specificity determinants of these two proteins, the structures should now also provide a framework for future structure-based work targeting these proteins.

  4. Structures of BIR domains from human NAIP and cIAP2

    PubMed Central

    Herman, Maria Dolores; Moche, Martin; Flodin, Susanne; Welin, Martin; Trésaugues, Lionel; Johansson, Ida; Nilsson, Martina; Nordlund, Pär; Nyman, Tomas

    2009-01-01

    The inhibitor of apoptosis (IAP) family of proteins contains key modulators of apoptosis and inflammation that interact with caspases through baculovirus IAP-repeat (BIR) domains. Overexpression of IAP proteins frequently occurs in cancer cells, thus counteracting the activated apoptotic program. The IAP proteins have therefore emerged as promising targets for cancer therapy. In this work, X-ray crystallography was used to determine the first structures of BIR domains from human NAIP and cIAP2. Both structures harbour an N-terminal tetrapeptide in the conserved peptide-binding groove. The structures reveal that these two proteins bind the tetrapeptides in a similar mode as do other BIR domains. Detailed interactions are described for the P1′–P4′ side chains of the peptide, providing a structural basis for peptide-specific recognition. An arginine side chain in the P3′ position reveals favourable interactions with its hydrophobic moiety in the binding pocket, while hydrophobic residues in the P2′ and P4′ pockets make similar interactions to those seen in other BIR domain–peptide complexes. The structures also reveal how a serine in the P1′ position is accommodated in the binding pockets of NAIP and cIAP2. In addition to shedding light on the specificity determinants of these two proteins, the structures should now also provide a framework for future structure-based work targeting these proteins. PMID:19923725

  5. Co-operative intra-protein structural response due to protein-protein complexation revealed through thermodynamic quantification: study of MDM2-p53 binding

    NASA Astrophysics Data System (ADS)

    Samanta, Sudipta; Mukherjee, Sanchita

    2017-10-01

    Activation of the p53 protein protects the organism from the propagation of cells whose damaged DNA carries oncogenic mutations. In normal cells, the activity of p53 is controlled by its interaction with MDM2. Because disrupting this well-understood interaction has emerged as an important objective for cancer therapy, it facilitates the design of ligands that could potentially disrupt or prevent the complexation. However, thermodynamic quantification of the p53-peptide-induced structural changes of the MDM2 protein remains to be explored. This study attempts to understand the conformational free energy and entropy costs of complex formation from histograms of dihedral angles generated from molecular dynamics simulations. Residue-specific quantification shows that hydrophobic residues of the protein contribute most to the conformational thermodynamic changes. Thermodynamic quantification of the structural changes of the protein reveals that p53 binding provides a source of inter-element cooperativity among the protein's secondary structural elements, where the most strongly affected structural elements (α2 and α4), found at the binding site of the protein, affect faraway structural elements (β1 and Loop1). The communication perhaps involves the formation of a water-mediated hydrogen-bonded network. Further, we infer that in the inhibitory F19A mutation of p53, although Phe19 is important in the recognition process, it contributes less prominently to the stability of the complex. Collectively, this study provides a detailed microscopic understanding of the interactions within the protein complex and explores mutation sites, which will contribute further to engineering protein function and binding affinity.
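
    The idea of estimating entropy from histograms of dihedral angles can be sketched as follows. The example uses the Gibbs formula S = -k_B Σ p_i ln p_i over histogram bins and reports the result in units of k_B. It is an illustrative assumption about the general approach, not the authors' exact protocol; the bin count, function name and angle samples are hypothetical.

```python
import math

def dihedral_entropy(angles_deg, n_bins=36):
    """Histogram-based entropy (in units of k_B) of one dihedral angle.

    Angles are in degrees and are wrapped into [-180, 180) before binning.
    Hypothetical sketch: S / k_B = -sum(p_i * ln p_i) over occupied bins.
    """
    if not angles_deg:
        raise ValueError("need at least one angle sample")
    width = 360.0 / n_bins
    counts = [0] * n_bins
    for angle in angles_deg:
        shifted = (angle + 180.0) % 360.0
        index = min(int(shifted / width), n_bins - 1)
        counts[index] += 1
    total = len(angles_deg)
    entropy = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log(p)
    return entropy

# Made-up samples: a dihedral that is flexible in the free protein
# and locked into one rotamer in the complex.
free_samples = [-175.0 + 10.0 * k for k in range(36)]
bound_samples = [-63.0, -61.5, -60.2, -64.8, -62.1]
delta_s = dihedral_entropy(bound_samples) - dihedral_entropy(free_samples)
```

    Here `delta_s` is negative, reflecting the loss of conformational freedom on binding in the toy data. Summing such per-dihedral differences over the residues of a structural element would give a residue- or element-level estimate, under the assumption that the dihedrals can be treated independently.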

  6. Structural Transition and Antibody Binding of EBOV GP and ZIKV E Proteins from Pre-Fusion to Fusion-Initiation State.

    PubMed

    Lappala, Anna; Nishima, Wataru; Miner, Jacob; Fenimore, Paul; Fischer, Will; Hraber, Peter; Zhang, Ming; McMahon, Benjamin; Tung, Chang-Shung

    2018-05-10

    Membrane fusion proteins are responsible for viral entry into host cells—a crucial first step in viral infection. These proteins undergo large conformational changes from pre-fusion to fusion-initiation structures, and, despite differences in viral genomes and disease etiology, many fusion proteins are arranged as trimers. Structural information for both pre-fusion and fusion-initiation states is critical for understanding virus neutralization by the host immune system. In the case of Ebola virus glycoprotein (EBOV GP) and Zika virus envelope protein (ZIKV E), pre-fusion state structures have been identified experimentally, but only partial structures of fusion-initiation states have been described. While the fusion-initiation structure is in an energetically unfavorable state that is difficult to solve experimentally, the existing structural information combined with computational approaches enabled the modeling of fusion-initiation state structures of both proteins. These structural models provide an improved understanding of four different neutralizing antibodies in the prevention of viral host entry.

  7. Time to face the fats: what can mass spectrometry reveal about the structure of lipids and their interactions with proteins?

    PubMed

    Brown, Simon H J; Mitchell, Todd W; Oakley, Aaron J; Pham, Huong T; Blanksby, Stephen J

    2012-09-01

    Since the 1950s, X-ray crystallography has been the mainstay of structural biology, providing detailed atomic-level structures that continue to revolutionize our understanding of protein function. From recent advances in this discipline, a picture has emerged of intimate and specific interactions between lipids and proteins that has driven renewed interest in the structure of lipids themselves and raised intriguing questions as to the specificity and stoichiometry in lipid-protein complexes. Herein we demonstrate some of the limitations of crystallography in resolving critical structural features of ligated lipids and thus determining how these motifs impact protein binding. As a consequence, mass spectrometry must play an important and complementary role in unraveling the complexities of lipid-protein interactions. We evaluate recent advances and highlight ongoing challenges towards the twin goals of (1) complete structure elucidation of low, abundant, and structurally diverse lipids by mass spectrometry alone, and (2) assignment of stoichiometry and specificity of lipid interactions within protein complexes.

  8. Time to Face the Fats: What Can Mass Spectrometry Reveal about the Structure of Lipids and Their Interactions with Proteins?

    NASA Astrophysics Data System (ADS)

    Brown, Simon H. J.; Mitchell, Todd W.; Oakley, Aaron J.; Pham, Huong T.; Blanksby, Stephen J.

    2012-09-01

    Since the 1950s, X-ray crystallography has been the mainstay of structural biology, providing detailed atomic-level structures that continue to revolutionize our understanding of protein function. From recent advances in this discipline, a picture has emerged of intimate and specific interactions between lipids and proteins that has driven renewed interest in the structure of lipids themselves and raised intriguing questions as to the specificity and stoichiometry in lipid-protein complexes. Herein we demonstrate some of the limitations of crystallography in resolving critical structural features of ligated lipids and thus determining how these motifs impact protein binding. As a consequence, mass spectrometry must play an important and complementary role in unraveling the complexities of lipid-protein interactions. We evaluate recent advances and highlight ongoing challenges towards the twin goals of (1) complete structure elucidation of low, abundant, and structurally diverse lipids by mass spectrometry alone, and (2) assignment of stoichiometry and specificity of lipid interactions within protein complexes.

  9. The Structures of Life.

    ERIC Educational Resources Information Center

    National Inst. of General Medical Sciences (NIH), Bethesda, MD.

    This booklet, geared toward an advanced high school or early college-level audience, explains how structural biology provides insight into health and disease and is useful in developing new medications. This publication contains a general introduction to proteins, coverage of the techniques used to determine protein structures, and a chapter on…

  10. Modularity in protein structures: study on all-alpha proteins.

    PubMed

    Khan, Taushif; Ghosh, Indira

    2015-01-01

    Modularity is known as one of the most important features of the robust and efficient design of proteins. The architecture and topology of proteins play a vital role by providing the robust scaffolds needed to support an organism's growth and survival under constant evolutionary pressure. These complex biomolecules can be represented by several layers of modular architecture, but it is pivotal to understand and explore the smallest biologically relevant structural component. In the present study, we have developed a component-based method that uses a protein's secondary structures and their arrangements (i.e. patterns) to investigate its structural space. Our results on all-alpha proteins show that the known structural space is highly populated with a limited set of structural patterns. We have also noticed that these frequently observed structural patterns are present as modules or "building blocks" in large proteins (i.e. higher secondary structure content). From structural descriptor analysis, observed patterns are found to be within similar deviation; however, frequent patterns are found to occur distinctly in diverse functions, e.g. in enzymatic classes and reactions. In this study, we introduce a simple approach to explore protein structural space using combinatorial- and graph-based geometry methods, which can be used to describe modularity in protein structures. Moreover, the analysis indicates that protein function seems to be the driving force that shapes the known structure space.

  11. Rapid analysis of protein backbone resonance assignments using cryogenic probes, a distributed Linux-based computing architecture, and an integrated set of spectral analysis tools.

    PubMed

    Monleón, Daniel; Colson, Kimberly; Moseley, Hunter N B; Anklin, Clemens; Oswald, Robert; Szyperski, Thomas; Montelione, Gaetano T

    2002-01-01

    Rapid data collection, spectral referencing, processing by time domain deconvolution, peak picking and editing, and assignment of NMR spectra are necessary components of any efficient integrated system for protein NMR structure analysis. We have developed a set of software tools designated AutoProc, AutoPeak, and AutoAssign, which function together with the data processing and peak-picking programs NMRPipe and Sparky, to provide an integrated software system for rapid analysis of protein backbone resonance assignments. In this paper we demonstrate that these tools, together with high-sensitivity triple resonance NMR cryoprobes for data collection and a Linux-based computer cluster architecture, can be combined to provide nearly complete backbone resonance assignments and secondary structures (based on chemical shift data) for a 59-residue protein in less than 30 hours of data collection and processing time. In this optimum case of a small protein providing excellent spectra, extensive backbone resonance assignments could also be obtained using less than 6 hours of data collection and processing time. These results demonstrate the feasibility of high throughput triple resonance NMR for determining resonance assignments and secondary structures of small proteins, and the potential for applying NMR in large scale structural proteomics projects.

  12. Structural basis of viral invasion: lessons from paramyxovirus F

    PubMed Central

    Lamb, Robert A.; Jardetzky, Theodore S.

    2007-01-01

    Summary: The structures of glycoproteins that mediate enveloped virus entry into cells have revealed dramatic structural changes that accompany membrane fusion and provided mechanistic insights into this process. The group of class I viral fusion proteins includes the influenza hemagglutinin, paramyxovirus F, HIV env and other mechanistically related fusogens, but these proteins are unrelated in sequence and exhibit clearly distinct structural features. Recently determined crystal structures of the paramyxovirus F protein in two conformations, representing prefusion and postfusion states, reveal a novel protein architecture that undergoes large-scale, irreversible refolding during membrane fusion, extending our understanding of this diverse group of membrane fusion machines. PMID:17870467

  13. 3Drefine: an interactive web server for efficient protein structure refinement.

    PubMed

    Bhattacharya, Debswapna; Nowotny, Jackson; Cao, Renzhi; Cheng, Jianlin

    2016-07-08

    3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of hydrogen bonding network combined with atomic-level energy minimization on the optimized model using a composite physics and knowledge-based force fields for efficient protein structure refinement. The method has been extensively evaluated on blind CASP experiments as well as on large-scale and diverse benchmark datasets and exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through a text or file input submission, email notification, provided example submission and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet that is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  14. Improved protein surface comparison and application to low-resolution protein structure data.

    PubMed

    Sael, Lee; Kihara, Daisuke

    2010-12-14

    Recent advancements of experimental techniques for determining protein tertiary structures raise significant challenges for protein bioinformatics. With the number of known structures of unknown function expanding at a rapid pace, an urgent task is to provide reliable clues to their biological function on a large scale. Conventional approaches for structure comparison are not suitable for a real-time database search due to their slow speed. Moreover, a new challenge has arisen from recent techniques such as electron microscopy (EM), which provide low-resolution structure data. Previously, we have introduced a method for protein surface shape representation using the 3D Zernike descriptors (3DZDs). The 3DZD enables fast structure database searches, taking advantage of its rotation invariance and compact representation. The search results for protein surfaces represented with the 3DZD have shown good agreement with the existing structure classifications, but some discrepancies were also observed. Three surface representations are examined: the new representation of backbone atoms, the originally devised all-atom surface representation, and the combination of the all-atom surface with the backbone representation. All representations are encoded with the 3DZD. Also, we have investigated the applicability of the 3DZD for searching protein EM density maps of varying resolutions. The surface representations are evaluated on structure retrieval using two existing classifications, SCOP and the CE-based classification. Overall, the 3DZDs representing backbone atoms show better retrieval performance than the original all-atom surface representation. The performance further improved when the two representations are combined. Moreover, we observed that the 3DZD is also powerful in comparing low-resolution structures obtained by electron microscopy.

  15. Homochiral stereochemistry: the missing link of structure to energetics in protein folding.

    PubMed

    Kumar, Anil; Ramakrishnan, Vibin; Ranbhor, Ranjit; Patel, Kirti; Durani, Susheel

    2009-12-24

    The notion is tested that homochiral stereochemistry being ubiquitous to protein structure could be critical to protein folding as well, causing it to become frustrated energetically providing the basis for its solvent- and sequence-mediated control. The proof in support of the notion is found in a consensus of experiment and computation according to which suitable oligopeptides are in their folding-unfolding equilibria, at both macrostate and microstate levels, susceptible to dielectric because of the conflict of peptide-chain electrostatics with interpeptide hydrogen bonds when the structure is poly-L but not when it is alternating-L,D. The argument is thus made that homochiral stereochemistry may in protein folding provide the unifying basis for its solvent- and sequence-mediated control based on screening of peptide-chain electrostatics under conflict with folding of the chain due to homochiral stereochemistry. Dielectric is brought into spotlight as the effect comparatively obscure but presumably critical to the folding in protein structure for its control.

  16. Looking at the Disordered Proteins through the Computational Microscope.

    PubMed

    Das, Payel; Matysiak, Silvina; Mittal, Jeetain

    2018-05-23

    Intrinsically disordered proteins (IDPs) have attracted wide interest over the past decade due to their surprising prevalence in the proteome and versatile roles in cell physiology and pathology. A large selection of IDPs has been identified as potential targets for therapeutic intervention. Characterizing the structure-function relationship of disordered proteins is therefore an essential but daunting task, as these proteins can adopt transient structure, necessitating a new paradigm for connecting structural disorder to function. Molecular simulation has emerged as a natural complement to experiments for atomic-level characterizations and mechanistic investigations of this intriguing class of proteins. The diverse range of length and time scales involved in IDP function requires performing simulations at multiple levels of resolution. In this Outlook, we focus on summarizing available simulation methods, along with a few interesting example applications. We also provide an outlook on how these simulation methods can be further improved in order to provide a more accurate description of IDP structure, binding, and assembly.

  17. A New Protein Architecture for Processing Alkylation Damaged DNA: The Crystal Structure of DNA Glycosylase AlkD

    PubMed Central

    Rubinson, Emily H.; Metz, Audrey H.; O'Quin, Jami; Eichman, Brandt F.

    2013-01-01

    Summary: DNA glycosylases safeguard the genome by locating and excising chemically modified bases from DNA. AlkD is a recently discovered bacterial DNA glycosylase that removes positively charged methylpurines from DNA, and was predicted to adopt a protein fold distinct from other DNA repair proteins. The crystal structure of Bacillus cereus AlkD presented here shows that the protein is composed exclusively of helical HEAT-like repeats, which form a solenoid perfectly shaped to accommodate a DNA duplex on the concave surface. Structural analysis of the variant HEAT repeats in AlkD provides a rationale for how this protein scaffolding motif has been modified to bind DNA. We report 7mG excision and DNA binding activities of AlkD mutants, along with a comparison of alkylpurine DNA glycosylase structures. Together, these data provide important insight into the requirements for alkylation repair within DNA and suggest that AlkD utilizes a novel strategy to manipulate DNA in its search for alkylpurine bases. PMID:18585735

  18. Fitting Multimeric Protein Complexes into Electron Microscopy Maps Using 3D Zernike Descriptors

    PubMed Central

    Esquivel-Rodríguez, Juan; Kihara, Daisuke

    2012-01-01

    A novel computational method for fitting high-resolution structures of multiple proteins into a cryoelectron microscopy map is presented. The method named EMLZerD generates a pool of candidate multiple protein docking conformations of component proteins, which are later compared with a provided electron microscopy (EM) density map to select the ones that fit well into the EM map. The comparison of docking conformations and the EM map is performed using the 3D Zernike descriptor (3DZD), a mathematical series expansion of three-dimensional functions. The 3DZD provides a unified representation of the surface shape of multimeric protein complex models and EM maps, which allows a convenient, fast quantitative comparison of the three dimensional structural data. Out of 19 multimeric complexes tested, near native complex structures with a root mean square deviation of less than 2.5 Å were obtained for 14 cases while medium range resolution structures with correct topology were computed for the additional 5 cases. PMID:22417139

  19. Fitting multimeric protein complexes into electron microscopy maps using 3D Zernike descriptors.

    PubMed

    Esquivel-Rodríguez, Juan; Kihara, Daisuke

    2012-06-14

    A novel computational method for fitting high-resolution structures of multiple proteins into a cryoelectron microscopy map is presented. The method named EMLZerD generates a pool of candidate multiple protein docking conformations of component proteins, which are later compared with a provided electron microscopy (EM) density map to select the ones that fit well into the EM map. The comparison of docking conformations and the EM map is performed using the 3D Zernike descriptor (3DZD), a mathematical series expansion of three-dimensional functions. The 3DZD provides a unified representation of the surface shape of multimeric protein complex models and EM maps, which allows a convenient, fast quantitative comparison of the three-dimensional structural data. Out of 19 multimeric complexes tested, near native complex structures with a root-mean-square deviation of less than 2.5 Å were obtained for 14 cases while medium range resolution structures with correct topology were computed for the additional 5 cases.
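
    The 2.5 Å threshold in the two EMLZerD records above is a root-mean-square deviation (RMSD) between a fitted model and the native complex. As a minimal sketch of that metric only, and not of EMLZerD or the 3DZD comparison itself, the Python below computes RMSD for two coordinate lists. It assumes the structures are already superimposed and that atoms are paired by index; the function name `rmsd` and the toy coordinates are illustrative, not taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) points.

    Assumption: the structures are already superimposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy data (hypothetical): three atoms, one displaced by 2 Å along y.
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
native = [(0.0, 0.0, 0.0), (1.5, 2.0, 0.0), (3.0, 0.0, 0.0)]

value = rmsd(model, native)
print(f"RMSD = {value:.3f} Å; near-native by the 2.5 Å cutoff: {value < 2.5}")
```

    In practice an optimal superposition step would normally precede this calculation; it is omitted here to keep the sketch short.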

  20. G Protein-Coupled Receptor Rhodopsin: A Prospectus

    PubMed Central

    Filipek, Sławomir; Stenkamp, Ronald E.; Teller, David C.; Palczewski, Krzysztof

    2006-01-01

    Rhodopsin is a retinal photoreceptor protein of bipartite structure consisting of the transmembrane protein opsin and a light-sensitive chromophore 11-cis-retinal, linked to opsin via a protonated Schiff base. Studies on rhodopsin have unveiled many structural and functional features that are common to a large and pharmacologically important group of proteins from the G protein-coupled receptor (GPCR) superfamily, of which rhodopsin is the best-studied member. In this work, we focus on structural features of rhodopsin as revealed by many biochemical and structural investigations. In particular, the high-resolution structure of bovine rhodopsin provides a template for understanding how GPCRs work. We describe the sensitivity and complexity of rhodopsin that lead to its important role in vision. PMID:12471166

  1. Structural Basis for the Acyltransferase Activity of Lecithin:Retinol Acyltransferase-like Proteins

    PubMed Central

    Golczak, Marcin; Kiser, Philip D.; Sears, Avery E.; Lodowski, David T.; Blaner, William S.; Palczewski, Krzysztof

    2012-01-01

    Lecithin:retinol acyltransferase-like proteins, also referred to as HRAS-like tumor suppressors, comprise a vertebrate subfamily of papain-like or NlpC/P60 thiol proteases that function as phospholipid-metabolizing enzymes. HRAS-like tumor suppressor 3, a representative member of this group, plays a key role in regulating triglyceride accumulation and energy expenditure in adipocytes and therefore constitutes a novel pharmacological target for treatment of metabolic disorders causing obesity. Here, we delineate a catalytic mechanism common to lecithin:retinol acyltransferase-like proteins and provide evidence for their alternative robust lipid-dependent acyltransferase enzymatic activity. We also determined high resolution crystal structures of HRAS-like tumor suppressor 2 and 3 to gain insight into their active site architecture. Based on this structural analysis, two conformational states of the catalytic Cys-113 were identified that differ in reactivity and thus could define the catalytic properties of these two proteins. Finally, these structures provide a model for the topology of these enzymes and allow identification of the protein-lipid bilayer interface. This study contributes to the enzymatic and structural understanding of HRAS-like tumor suppressor enzymes. PMID:22605381

  2. Protein 3D Structure Computed from Evolutionary Sequence Variation

    PubMed Central

    Sheridan, Robert; Hopf, Thomas A.; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris

    2011-01-01

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7–4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein structures, new strategies in protein and drug design, and the identification of functional genetic variants in normal and disease genomes. PMID:22163331

  3. Trends in structural coverage of the protein universe and the impact of the Protein Structure Initiative

    PubMed Central

    Khafizov, Kamil; Madrid-Aliste, Carlos; Almo, Steven C.; Fiser, Andras

    2014-01-01

    The exponential growth of protein sequence data provides an ever-expanding body of unannotated and misannotated proteins. The National Institutes of Health-supported Protein Structure Initiative and related worldwide structural genomics efforts facilitate functional annotation of proteins through structural characterization. Recently there have been profound changes in the taxonomic composition of sequence databases, which are effectively redefining the scope and contribution of these large-scale structure-based efforts. The faster-growing bacterial genomic entries have overtaken the eukaryotic entries over the last 5 y, but also have become more redundant. Despite the enormous increase in the number of sequences, the overall structural coverage of proteins—including proteins for which reliable homology models can be generated—on the residue level has increased from 30% to 40% over the last 10 y. Structural genomics efforts contributed ∼50% of this new structural coverage, despite determining only ∼10% of all new structures. Based on current trends, it is expected that ∼55% structural coverage (the level required for significant functional insight) will be achieved within 15 y, whereas without structural genomics efforts, realizing this goal will take approximately twice as long. PMID:24567391
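
    The figures quoted in the record above can be checked with simple arithmetic. The sketch below assumes that coverage keeps growing linearly at the last decade's pace; that assumption is ours for illustration and is not stated as the authors' projection method. All variable names are hypothetical. Under it, the 15-year estimate and the "approximately twice as long" statement both follow, and structural genomics structures turn out to add roughly nine times more new coverage per structure than the rest.

```python
# Figures quoted in the abstract (percent of residues with structural coverage).
coverage_10_years_ago = 30.0
coverage_now = 40.0
coverage_target = 55.0

# Structural genomics (SG) shares quoted in the abstract.
sg_share_of_new_coverage = 0.50
sg_share_of_new_structures = 0.10

# Assumption (illustrative only): linear growth at the last decade's rate.
rate_with_sg = (coverage_now - coverage_10_years_ago) / 10.0        # points per year
rate_without_sg = rate_with_sg * (1.0 - sg_share_of_new_coverage)   # SG contribution removed

years_with_sg = (coverage_target - coverage_now) / rate_with_sg
years_without_sg = (coverage_target - coverage_now) / rate_without_sg

# New coverage contributed per new structure, in relative units.
sg_yield = sg_share_of_new_coverage / sg_share_of_new_structures
other_yield = (1.0 - sg_share_of_new_coverage) / (1.0 - sg_share_of_new_structures)
yield_ratio = sg_yield / other_yield

print(f"years to reach target with SG:    {years_with_sg:.0f}")
print(f"years to reach target without SG: {years_without_sg:.0f}")
print(f"coverage per structure, SG vs. other efforts: {yield_ratio:.1f}x")
```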

  4. Why fibrous proteins are romantic.

    PubMed

    Cohen, C

    1998-01-01

    Here I give a personal account of the great history of fibrous protein structure. I describe how Astbury first recognized the essential simplicity of fibrous proteins and their paradigmatic role in protein structure. The poor diffraction patterns yielded by these proteins were then deciphered by Pauling, Crick, Ramachandran and others (in part by model building) to reveal alpha-helical coiled coils, beta-sheets, and the collagen triple helical coiled coil, all characterized by different local sequence periodicities. Longer-range sequence periodicities (or "magic numbers") present in diverse fibrous proteins, such as collagen, tropomyosin, paramyosin, myosin, and were then shown to account for the characteristic axial repeats observed in filaments of these proteins. More recently, analysis of fibrous protein structure has been extended in many cases to atomic resolution, and some systems, such as "leucine zippers," are providing a deeper understanding of protein design than similar studies of globular proteins. In the last sections, I provide some dramatic examples of fibrous protein dynamics. One example is the so-called "spring-loaded" mechanism for viral fusion by the hemagglutinin protein of influenza. Another is the possible conformational changes in prion proteins, implicated in "mad cow disease," which may be related to similar transitions in a variety of globular and fibrous proteins. Copyright 1998 Academic Press.

  5. AIDA: ab initio domain assembly for automated multi-domain protein structure prediction and domain–domain interaction prediction

    PubMed Central

    Xu, Dong; Jaroszewski, Lukasz; Li, Zhanwen; Godzik, Adam

    2015-01-01

    Motivation: Most proteins consist of multiple domains, independent structural and evolutionary units that are often reshuffled in genomic rearrangements to form new protein architectures. Template-based modeling methods can often detect homologous templates for individual domains, but templates that could be used to model the entire query protein are often not available. Results: We have developed a fast docking algorithm ab initio domain assembly (AIDA) for assembling multi-domain protein structures, guided by the ab initio folding potential. This approach can be extended to discontinuous domains (i.e. domains with ‘inserted’ domains). When tested on experimentally solved structures of multi-domain proteins, the relative domain positions were accurately found among top 5000 models in 86% of cases. AIDA server can use domain assignments provided by the user or predict them from the provided sequence. The latter approach is particularly useful for automated protein structure prediction servers. The blind test consisting of 95 CASP10 targets shows that domain boundaries could be successfully determined for 97% of targets. Availability and implementation: The AIDA package as well as the benchmark sets used here are available for download at http://ffas.burnham.org/AIDA/. Contact: adam@sanfordburnham.org Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25701568

  6. Improved in-cell structure determination of proteins at near-physiological concentration

    PubMed Central

    Ikeya, Teppei; Hanashima, Tomomi; Hosoya, Saori; Shimazaki, Manato; Ikeda, Shiro; Mishima, Masaki; Güntert, Peter; Ito, Yutaka

    2016-01-01

    Investigating three-dimensional (3D) structures of proteins in living cells by in-cell nuclear magnetic resonance (NMR) spectroscopy opens an avenue towards understanding the structural basis of their functions and physical properties under physiological conditions inside cells. In-cell NMR provides data at atomic resolution non-invasively, and has been used to detect protein-protein interactions, thermodynamics of protein stability, the behavior of intrinsically disordered proteins, etc. in cells. However, so far only a single de novo 3D protein structure could be determined based on data derived only from in-cell NMR. Here we introduce methods that enable in-cell NMR protein structure determination for a larger number of proteins at concentrations that approach physiological ones. The new methods comprise (1) advances in the processing of non-uniformly sampled NMR data, which reduces the measurement time for the intrinsically short-lived in-cell NMR samples, (2) automatic chemical shift assignment for obtaining an optimal resonance assignment, and (3) structure refinement with Bayesian inference, which makes it possible to calculate accurate 3D protein structures from sparse data sets of conformational restraints. As an example application we determined the structure of the B1 domain of protein G at about 250 μM concentration in living E. coli cells. PMID:27910948

  7. Predicting the helix packing of globular proteins by self-correcting distance geometry.

    PubMed

    Mumenthaler, C; Braun, W

    1995-05-01

    A new self-correcting distance geometry method for predicting the three-dimensional structure of small globular proteins was assessed with a test set of 8 helical proteins. With the knowledge of the amino acid sequence and the helical segments, our completely automated method calculated the correct backbone topology of six proteins. The accuracy of the predicted structures ranged from 2.3 Å to 3.1 Å for the helical segments compared to the experimentally determined structures. For two proteins, the predicted constraints were not restrictive enough to yield a conclusive prediction. The method can be applied to all small globular proteins, provided the secondary structure is known from NMR analysis or can be predicted with high reliability.

  8. Structure-function insights of membrane and soluble proteins revealed by electron crystallography.

    PubMed

    Dreaden, Tina M; Devarajan, Bharanidharan; Barry, Bridgette A; Schmidt-Krey, Ingeborg

    2013-01-01

    Electron crystallography is emerging as an important method in solving protein structures. While it has found extensive applications in the understanding of membrane protein structure and function at a wide range of resolutions, from revealing oligomeric arrangements to atomic models, electron crystallography has also provided invaluable information on the soluble α/β-tubulin which could not be obtained by any other method to date. Examples of critical insights from selected structures of membrane proteins as well as α/β-tubulin are described here, demonstrating the vast potential of electron crystallography that is only beginning to unfold.

  9. From protein structure to function via single crystal optical spectroscopy

    PubMed Central

    Ronda, Luca; Bruno, Stefano; Bettati, Stefano; Storici, Paola; Mozzarelli, Andrea

    2015-01-01

    The more than 100,000 protein structures determined by X-ray crystallography provide a wealth of information for the characterization of biological processes at the molecular level. However, several crystallographic “artifacts,” including conformational selection, crystallization conditions and radiation damage, may affect the quality and the interpretation of the electron density maps, thus limiting the relevance of structure determinations. Moreover, for most of these structures, no functional data have been obtained in the crystalline state, thus posing serious questions on their validity in inferring protein mechanisms. In order to solve these issues, spectroscopic methods have been applied for the determination of equilibrium and kinetic properties of proteins in the crystalline state. These methods are UV-vis spectrophotometry, spectrofluorimetry, IR, EPR, Raman, and resonance Raman spectroscopy. Some of these approaches have been implemented with on-line instruments at X-ray synchrotron beamlines. Here, we provide an overview of investigations predominantly carried out in our laboratory by single crystal polarized absorption UV-vis microspectrophotometry, the most applied technique for the functional characterization of proteins in the crystalline state. Studies on hemoglobins, pyridoxal 5′-phosphate dependent enzymes and green fluorescent protein in the crystalline state have addressed key biological issues, leading to either straightforward structure-function correlations or limitations to structure-based mechanisms. PMID:25988179

  10. Genome Pool Strategy for Structural Coverage of Protein Families

    PubMed Central

    Jaroszewski, Lukasz; Slabinski, Lukasz; Wooley, John; Deacon, Ashley M.; Lesley, Scott A.; Wilson, Ian A.; Godzik, Adam

    2010-01-01

    As noticed by generations of structural biologists, closely homologous proteins may have substantially different crystallization properties and propensities. These observations can be used to systematically introduce additional dimensionality into crystallization trials by targeting homologous proteins from multiple genomes in a “genome pool” strategy. Through extensive use of our recently introduced “crystallization feasibility score” (Slabinski et al., 2007a), we can explain that the genome pool strategy works well because the crystallization feasibility scores are surprisingly broad within families of homologous proteins, with most families containing a range of optimal to very difficult targets. We also show that some families can be regarded as relatively “easy”, where a significant number of proteins are predicted to have optimal crystallization features, and others are “very difficult”, where almost none are predicted to result in a crystal structure. Thus, such variable distributions of crystallizability preferences lead to uneven structural coverage of known families, with “easier” or “optimal” families having several times more solved structures than “very difficult” ones. Nevertheless, this latter category can be successfully targeted by increasing the number of genomes that are used to select targets from a given family. On average, adding 10 new genomes to the “genome pool” provides more promising targets for 7 “very difficult” families. In contrast, our crystallization feasibility score does not indicate that any specific microbial genomes can be readily classified as “easier” or “very difficult” with respect to providing suitable candidates for crystallization and structure determination. Finally, our analyses show that specific physicochemical properties of the protein sequence favor successful outcomes for structure determination and, hence, the group of proteins with known 3D structures is systematically different from the general pool of known proteins. We, therefore, assess the structural consequences of these differences in protein sequence and protein biophysical properties. PMID:19000818
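
    The reasoning behind the genome pool strategy can be illustrated with a small calculation. The sketch below is not the crystallization feasibility score from the abstract; it assumes, purely for illustration, that each homolog crystallizes independently with the same hypothetical probability `p_success`, and shows how the chance of obtaining at least one structure grows as more genomes contribute homologs.

```python
def prob_at_least_one_hit(p_success: float, n_homologs: int) -> float:
    """Chance that at least one of n homologs crystallizes.

    Assumption (not from the abstract): every homolog succeeds
    independently with the same probability p_success.
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must lie in [0, 1]")
    if n_homologs < 0:
        raise ValueError("n_homologs must be non-negative")
    return 1.0 - (1.0 - p_success) ** n_homologs


if __name__ == "__main__":
    # Hypothetical "very difficult" family: 5% success chance per homolog.
    for n in (1, 5, 10, 20):
        print(n, round(prob_at_least_one_hit(0.05, n), 3))
```

    Under these assumptions even a low per-target success rate adds up quickly when many genomes are sampled, which is the intuition the abstract describes for "very difficult" families.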

  11. Structural deformation upon protein-protein interaction: A structural alphabet approach

    PubMed Central

    Martin, Juliette; Regad, Leslie; Lecornet, Hélène; Camproux, Anne-Claude

    2008-01-01

    Background: In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. Results: In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described using a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Conclusion: Our study provides qualitative information about induced fit. These results could be of help for flexible docking. PMID:18307769
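
    The central statistic of this abstract, the fraction of structural letters modified upon binding, can be sketched as follows. The function name, the toy letter strings and the interface positions are hypothetical; the sketch only assumes that the bound and unbound forms have already been encoded as equal-length strings over a structural alphabet.

```python
def fraction_modified(unbound: str, bound: str, positions=None) -> float:
    """Fraction of positions whose structural letter differs between forms.

    `positions` optionally restricts the comparison to a subset of
    indices, e.g. interface residues (hypothetical input).
    """
    if len(unbound) != len(bound):
        raise ValueError("encodings must have the same length")
    indices = range(len(unbound)) if positions is None else list(positions)
    if len(indices) == 0:
        raise ValueError("no positions to compare")
    changed = sum(1 for i in indices if unbound[i] != bound[i])
    return changed / len(indices)


if __name__ == "__main__":
    unbound = "AABBCDDE"   # toy structural-letter strings
    bound = "AABCCDEE"
    print(fraction_modified(unbound, bound))             # whole chain
    print(fraction_modified(unbound, bound, [3, 6, 7]))  # toy interface
```

    Computing the same fraction on a control set of repeated structures of the same unbound protein gives the baseline that the abstract uses to separate induced fit from experimental error and natural flexibility.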

  12. Structural deformation upon protein-protein interaction: a structural alphabet approach.

    PubMed

    Martin, Juliette; Regad, Leslie; Lecornet, Hélène; Camproux, Anne-Claude

    2008-02-28

    In a number of protein-protein complexes, the 3D structures of bound and unbound partners significantly differ, supporting the induced fit hypothesis for protein-protein binding. In this study, we explore the induced fit modifications on a set of 124 proteins available in both bound and unbound forms, in terms of local structure. The local structure is described using a structural alphabet of 27 structural letters that allows a detailed description of the backbone. Using a control set to distinguish induced fit from experimental error and natural protein flexibility, we show that the fraction of structural letters modified upon binding is significantly greater than in the control set (36% versus 28%). This proportion is even greater in the interface regions (41%). Interface regions preferentially involve coils. Our analysis further reveals that some structural letters in coil are not favored in the interface. We show that certain structural letters in coil are particularly subject to modifications at the interface, and that the severity of structural change also varies. This information is used to derive a structural letter substitution matrix that summarizes the local structural changes observed in our data set. We also illustrate the usefulness of our approach to identify common binding motifs in unrelated proteins. Our study provides qualitative information about induced fit. These results could be of help for flexible docking.

  13. Sequence co-evolution gives 3D contacts and structures of protein complexes

    PubMed Central

    Hopf, Thomas A; Schärfe, Charlotta P I; Rodrigues, João P G L M; Green, Anna G; Kohlbacher, Oliver; Sander, Chris; Bonvin, Alexandre M J J; Marks, Debora S

    2014-01-01

    Protein–protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein–protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein–protein interaction networks and used for interaction predictions at residue resolution. DOI: http://dx.doi.org/10.7554/eLife.03430.001 PMID:25255213
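
    As a rough illustration of what "correlated evolutionary sequence changes" means, the sketch below scores a pair of alignment columns by mutual information. This is a deliberately simpler stand-in, not the evolutionary coupling analysis used in the paper; the toy columns are hypothetical, and each list position is assumed to come from the same organism in both proteins.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b) -> float:
    """Mutual information (in bits) between two alignment columns."""
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


if __name__ == "__main__":
    # Column from protein A and two candidate columns from protein B.
    col_a = list("KKKEEE")
    covarying = list("DDDRRR")    # changes whenever col_a changes
    unrelated = list("DRDRDR")    # varies, but not together with col_a
    print(mutual_information(col_a, covarying))
    print(mutual_information(col_a, unrelated))
```

    Column pairs with high scores are candidates for residues in contact across the interface. Plain mutual information is easily confounded by phylogeny and indirect correlations, so it should be read only as a conceptual sketch.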

  14. A new definition and properties of the similarity value between two protein structures.

    PubMed

    Saberi Fathi, S M

    2016-10-01

    Knowledge regarding the 3D structure of a protein provides useful information about the protein's functional properties. In particular, structural similarity between proteins can be used as a good predictor of functional similarity. One method that uses the 3D geometrical structure of proteins in order to compare them is the similarity value (SV). In this paper, we introduce a new definition of the SV measure for comparing two proteins. To this end, we consider the mass of the protein's atoms and concentrate on the number of the protein's atoms to be compared. This defines a new measure, called the weighted similarity value (WSV), adding physical properties to geometrical properties. We also show that our results are in good agreement with the results obtained by TM-score and DaliLite. WSV can be of use in protein classification and in drug discovery.

  15. Mixing and Matching Detergents for Membrane Protein NMR Structure Determination

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Columbus, Linda; Lipfert, Jan; Jambunathan, Kalyani

    2009-10-21

    One major obstacle to membrane protein structure determination is the selection of a detergent micelle that mimics the native lipid bilayer. Currently, detergents are selected by exhaustive screening because the effects of protein-detergent interactions on protein structure are poorly understood. In this study, the structure and dynamics of an integral membrane protein in different detergents are investigated by nuclear magnetic resonance (NMR) and electron paramagnetic resonance (EPR) spectroscopy and small-angle X-ray scattering (SAXS). The results suggest that matching of the micelle dimensions to the protein's hydrophobic surface avoids exchange processes that reduce the completeness of the NMR observations. Based on these dimensions, several mixed micelles were designed that improved the completeness of NMR observations. These findings provide a basis for the rational design of mixed micelles that may advance membrane protein structure determination by NMR.

  16. Detection of functionally important regions in "hypothetical proteins" of known structure.

    PubMed

    Nimrod, Guy; Schushan, Maya; Steinberg, David M; Ben-Tal, Nir

    2008-12-10

    Structural genomics initiatives provide ample structures of "hypothetical proteins" (i.e., proteins of unknown function) at an ever increasing rate. However, without function annotation, this structural goldmine is of little use to biologists who are interested in particular molecular systems. To this end, we used (an improved version of) the PatchFinder algorithm for the detection of functional regions on the protein surface, which could mediate its interactions with, e.g., substrates, ligands, and other proteins. Examination, using a data set of annotated proteins, showed that PatchFinder outperforms similar methods. We collected 757 structures of hypothetical proteins and their predicted functional regions in the N-Func database. Inspection of several of these regions demonstrated that they are useful for function prediction. For example, we suggested an interprotein interface and a putative nucleotide-binding site. A web-server implementation of PatchFinder and the N-Func database are available at http://patchfinder.tau.ac.il/.

  17. Structure of synaptophysin: a hexameric MARVEL-domain channel protein.

    PubMed

    Arthur, Christopher P; Stowell, Michael H B

    2007-06-01

    Synaptophysin I (SypI) is an archetypal member of the MARVEL-domain family of integral membrane proteins and one of the first synaptic vesicle proteins to be identified and cloned. Almost all MARVEL-domain proteins are involved in membrane apposition and vesicle-trafficking events, but their precise role in these processes is unclear. We have purified mammalian SypI and determined its three-dimensional (3D) structure by using electron microscopy and single-particle 3D reconstruction. The hexameric structure resembles an open basket with a large pore and tenuous interactions within the cytosolic domain. The structure suggests a model for Synaptophysin's role in fusion and recycling that is regulated by known interactions with the SNARE machinery. This 3D structure of a MARVEL-domain protein provides a structural foundation for understanding the role of these important proteins in a variety of biological processes.

  18. Ensemble-based evaluation for protein structure models.

    PubMed

    Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

    2016-06-15

    Comparing protein tertiary structures is a fundamental procedure in structural biology and protein bioinformatics. Structure comparison is important particularly for evaluating computational protein structure models. Most of the model structure evaluation methods perform rigid body superimposition of a structure model to its crystal structure and measure the difference of the corresponding residue or atom positions between them. However, these methods neglect intrinsic flexibility of proteins by treating the native structure as a rigid molecule. Because different parts of proteins have different levels of flexibility, for example, exposed loop regions are usually more flexible than the core region of a protein structure, disagreement of a model with the native structure needs to be evaluated differently depending on the flexibility of residues in a protein. We propose a score named FlexScore for comparing protein structures that considers the flexibility of each residue in the native state of proteins. Flexibility information may be extracted from experiments such as NMR or molecular dynamics simulation. FlexScore considers an ensemble of conformations of a protein described as a multivariate Gaussian distribution of atomic displacements and compares a query computational model with the ensemble. We compare FlexScore with other commonly used structure similarity scores over various examples. FlexScore agrees with experts' intuitive assessment of computational models and provides information on the practical usefulness of models. Availability: https://bitbucket.org/mjamroz/flexscore Contact: dkihara@purdue.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  19. Ensemble-based evaluation for protein structure models

    PubMed Central

    Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

    2016-01-01

    Motivation: Comparing protein tertiary structures is a fundamental procedure in structural biology and protein bioinformatics. Structure comparison is important particularly for evaluating computational protein structure models. Most of the model structure evaluation methods perform rigid body superimposition of a structure model to its crystal structure and measure the difference of the corresponding residue or atom positions between them. However, these methods neglect intrinsic flexibility of proteins by treating the native structure as a rigid molecule. Because different parts of proteins have different levels of flexibility, for example, exposed loop regions are usually more flexible than the core region of a protein structure, disagreement of a model with the native structure needs to be evaluated differently depending on the flexibility of residues in a protein. Results: We propose a score named FlexScore for comparing protein structures that considers the flexibility of each residue in the native state of proteins. Flexibility information may be extracted from experiments such as NMR or molecular dynamics simulation. FlexScore considers an ensemble of conformations of a protein described as a multivariate Gaussian distribution of atomic displacements and compares a query computational model with the ensemble. We compare FlexScore with other commonly used structure similarity scores over various examples. FlexScore agrees with experts’ intuitive assessment of computational models and provides information on the practical usefulness of models. Availability and implementation: https://bitbucket.org/mjamroz/flexscore Contact: dkihara@purdue.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307633
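
    The idea of judging a model's deviation relative to per-residue flexibility can be sketched in a few lines. The code below is not FlexScore: it replaces the multivariate Gaussian of the abstract with a much cruder assumption of one isotropic variance per residue, and it assumes the ensemble and the model are already superimposed. All names and coordinates are hypothetical.

```python
from math import sqrt


def flexibility_scaled_deviation(ensemble, model, floor=1e-6):
    """Per-residue deviation of `model` from the ensemble mean position,
    divided by that residue's positional spread in the ensemble.

    ensemble: list of conformations, each a list of (x, y, z) per residue.
    model:    list of (x, y, z) per residue, same residue order.
    floor:    small variance added to avoid division by zero.
    """
    n_conf = len(ensemble)
    scores = []
    for i, model_xyz in enumerate(model):
        mean = [sum(conf[i][k] for conf in ensemble) / n_conf for k in range(3)]
        variance = sum(
            sum((conf[i][k] - mean[k]) ** 2 for k in range(3))
            for conf in ensemble
        ) / n_conf
        distance = sqrt(sum((model_xyz[k] - mean[k]) ** 2 for k in range(3)))
        scores.append(distance / sqrt(variance + floor))
    return scores


if __name__ == "__main__":
    # Residue 0 is rigid (core-like), residue 1 is flexible (loop-like).
    ensemble = [
        [(-0.1, 0.0, 0.0), (-1.0, 0.0, 0.0)],
        [(0.1, 0.0, 0.0), (1.0, 0.0, 0.0)],
    ]
    model = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]  # both 1 unit off the mean
    print(flexibility_scaled_deviation(ensemble, model))
```

    The same absolute error is penalized far more at the rigid residue than at the flexible one, which is the qualitative behavior the abstract argues for.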

  20. Fast protein tertiary structure retrieval based on global surface shape similarity.

    PubMed

    Sael, Lee; Li, Bin; La, David; Fang, Yi; Ramani, Karthik; Rustamov, Raif; Kihara, Daisuke

    2008-09-01

    Characterization and identification of similar tertiary structures of proteins provide rich information for investigating function and evolution. The importance of structure similarity searches is increasing as structure databases continue to expand, partly due to the structural genomics projects. A crucial drawback of conventional protein structure comparison methods, which compare structures by their main-chain orientation or the spatial arrangement of secondary structure, is that a database search is too slow to be done in real-time. Here we introduce a global surface shape representation by three-dimensional (3D) Zernike descriptors, which represent a protein structure compactly as a series expansion of 3D functions. With this simplified representation, a search against a few thousand structures takes less than a minute. To investigate the agreement between the surface representation defined by the 3D Zernike descriptor and the conventional main-chain based representation, a benchmark was performed against a protein classification generated by the combinatorial extension algorithm. Despite the different representation, the 3D Zernike descriptor retrieved proteins of the same conformation defined by combinatorial extension in 89.6% of the cases within the top five closest structures. The real-time protein structure search by 3D Zernike descriptor will open up new possibilities for large-scale global and local protein surface shape comparison. 2008 Wiley-Liss, Inc.
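
    Once each structure is reduced to a fixed-length descriptor vector, retrieval becomes a nearest-neighbor search. The sketch below assumes the descriptors are already computed and compares them by Euclidean distance, which is an assumption since the abstract does not name the metric. The identifiers and the three-component vectors are toy values; real 3D Zernike descriptors are longer.

```python
import heapq
from math import dist


def top_k_similar(query, database, k=5):
    """Return the k database identifiers whose descriptor vectors are
    closest to `query` by Euclidean distance, nearest first."""
    return heapq.nsmallest(
        k, database, key=lambda entry_id: dist(query, database[entry_id])
    )


if __name__ == "__main__":
    # Hypothetical descriptor database: identifier -> descriptor vector.
    database = {
        "structA": [0.10, 0.80, 0.30],
        "structB": [0.12, 0.79, 0.33],
        "structC": [0.90, 0.10, 0.50],
        "structD": [0.15, 0.70, 0.25],
    }
    query = [0.11, 0.80, 0.31]
    print(top_k_similar(query, database, k=2))
```

    Because every comparison is a single vector distance rather than a structural alignment, a linear scan over a few thousand entries is cheap, consistent with the search times the abstract reports.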

  1. SCOPe: Manual Curation and Artifact Removal in the Structural Classification of Proteins - extended Database.

    PubMed

    Chandonia, John-Marc; Fox, Naomi K; Brenner, Steven E

    2017-02-03

    SCOPe (Structural Classification of Proteins-extended, http://scop.berkeley.edu) is a database of relationships between protein structures that extends the Structural Classification of Proteins (SCOP) database. SCOP is an expert-curated ordering of domains from the majority of proteins of known structure in a hierarchy according to structural and evolutionary relationships. SCOPe classifies the majority of protein structures released since SCOP development concluded in 2009, using a combination of manual curation and highly precise automated tools, aiming to have the same accuracy as fully hand-curated SCOP releases. SCOPe also incorporates and updates the ASTRAL compendium, which provides several databases and tools to aid in the analysis of the sequences and structures of proteins classified in SCOPe. SCOPe continues high-quality manual classification of new superfamilies, a key feature of SCOP. Artifacts such as expression tags are now separated into their own class, in order to distinguish them from the homology-based annotations in the remainder of the SCOPe hierarchy. SCOPe 2.06 contains 77,439 Protein Data Bank entries, double the 38,221 structures classified in SCOP. Copyright © 2016 The Author(s). Published by Elsevier Ltd. All rights reserved.

  2. Protein backbone engineering as a strategy to advance foldamers toward the frontier of protein-like tertiary structure.

    PubMed

    Reinert, Zachary E; Horne, W Seth

    2014-11-28

    A variety of non-biological structural motifs have been incorporated into the backbone of natural protein sequences. In parallel work, diverse unnatural oligomers of de novo design (termed "foldamers") have been developed that fold in defined ways. In this Perspective article, we survey foundational studies on protein backbone engineering, with a focus on alterations made in the context of complex tertiary folds. We go on to summarize recent work illustrating the potential promise of these methods to provide a general framework for the construction of foldamer mimics of protein tertiary structures.

  3. Natural antigenic differences in the functionally equivalent extracellular DNABII proteins of bacterial biofilms provide a means for targeted biofilm therapeutics

    PubMed Central

    Rocco, Christopher J.; Davey, Mary Ellen; Bakaletz, Lauren O.; Goodman, Steven D.

    2016-01-01

    SUMMARY Bacteria that persist in the oral cavity exist within complex biofilm communities. A hallmark of biofilms is the presence of an extracellular polymeric substance (EPS), which consists of polysaccharides, extracellular DNA (eDNA), and proteins, including the DNABII family of proteins. The removal of DNABII proteins from a biofilm results in the loss of structural integrity of the eDNA and the collapse of the biofilm structure. We examined the role of DNABII proteins in the biofilm structure of the periodontal pathogen Porphyromonas gingivalis and the oral commensal Streptococcus gordonii. Co-aggregation with oral streptococci is thought to facilitate the establishment of P. gingivalis within the biofilm community. We demonstrate that DNABII proteins are present in the EPS of both S. gordonii and P. gingivalis biofilms, and that these biofilms can be disrupted through the addition of antisera derived against their respective DNABII proteins. We provide evidence that both eDNA and DNABII proteins are limiting in S. gordonii but not in P. gingivalis biofilms. In addition, these proteins are capable of complementing one another functionally. We also found that while antisera derived against most DNABII proteins are capable of binding a wide variety of DNABII proteins, the P. gingivalis DNABII proteins are antigenically distinct. The presence of DNABII proteins in the EPS of these biofilms and the antigenic uniqueness of the P. gingivalis proteins provide an opportunity to develop therapies that are targeted to remove P. gingivalis and biofilms that contain P. gingivalis from the oral cavity. PMID:26988714

  4. StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zemla, A; Lang, D; Kostova, T

    2010-11-29

    Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory, but they still cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.
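
    The residue-frequency step described in this abstract can be illustrated with a minimal sketch. It is not the StralSV algorithm: it assumes the structure-based local alignments have already been produced and are given as equal-length strings with `-` for gaps. The toy fragments and function names are hypothetical.

```python
from collections import Counter


def residue_frequencies(aligned_fragments):
    """Per-position residue frequencies over aligned fragments.

    Gaps ('-') are ignored. Returns one dict per alignment column
    mapping residue -> relative frequency.
    """
    if not aligned_fragments:
        raise ValueError("need at least one fragment")
    length = len(aligned_fragments[0])
    if any(len(fragment) != length for fragment in aligned_fragments):
        raise ValueError("fragments must have equal length")
    profile = []
    for column in zip(*aligned_fragments):
        counts = Counter(residue for residue in column if residue != "-")
        total = sum(counts.values())
        profile.append(
            {res: n / total for res, n in counts.items()} if total else {}
        )
    return profile


def unusual_positions(reference, profile, threshold=0.2):
    """Positions where the reference residue is rare among aligned fragments."""
    return [
        i for i, res in enumerate(reference)
        if profile[i] and profile[i].get(res, 0.0) < threshold
    ]


if __name__ == "__main__":
    fragments = ["GDD", "GDE", "GND", "G-D"]  # toy aligned fragments
    profile = residue_frequencies(fragments)
    print(profile)
    print(unusual_positions("GNE", profile, threshold=0.4))
```

    Positions where one residue dominates point to conservation, while a reference residue with low frequency at its position is a candidate for the "unusual or unexpected" residues mentioned in the abstract.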

  5. The history of the CATH structural classification of protein domains.

    PubMed

    Sillitoe, Ian; Dawson, Natalie; Thornton, Janet; Orengo, Christine

    2015-12-01

    This article presents a historical review of the protein structure classification database CATH. Together with the SCOP database, CATH remains comprehensive and reasonably up-to-date with the now more than 100,000 protein structures in the PDB. We review the expansion of the CATH and SCOP resources to capture predicted domain structures in the genome sequence data and to provide information on the likely functions of proteins mediated by their constituent domains. The establishment of comprehensive function annotation resources has also meant that domain families can be functionally annotated allowing insights into functional divergence and evolution within protein families. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.

  6. Exploring the atomic structure and conformational flexibility of a 320 Å long engineered viral fiber using X-ray crystallography.

    PubMed

    Bhardwaj, Anshul; Casjens, Sherwood R; Cingolani, Gino

    2014-02-01

    Protein fibers are widespread in nature, but only a limited number of high-resolution structures have been determined experimentally. Unlike globular proteins, fibers are usually recalcitrant to form three-dimensional crystals, preventing single-crystal X-ray diffraction analysis. In the absence of three-dimensional crystals, X-ray fiber diffraction is a powerful tool to determine the internal symmetry of a fiber, but it rarely yields atomic resolution structural information on complex protein fibers. An 85-residue-long minimal coiled-coil repeat unit (MiCRU) was previously identified in the trimeric helical core of tail needle gp26, a fibrous protein emanating from the tail apparatus of the bacteriophage P22 virion. Here, evidence is provided that an MiCRU can be inserted in frame inside the gp26 helical core to generate a rationally extended fiber (gp26-2M) which, like gp26, retains a trimeric quaternary structure in solution. The 2.7 Å resolution crystal structure of this engineered fiber, which measures ∼320 Å in length and is only 20-35 Å wide, was determined. This structure, the longest for a trimeric protein fiber to be determined to such a high resolution, reveals the architecture of 22 consecutive trimerization heptads and provides a framework to decipher the structural determinants for protein fiber assembly, stability and flexibility.

  7. A computational tool to predict the evolutionarily conserved protein-protein interaction hot-spot residues from the structure of the unbound protein.

    PubMed

    Agrawal, Neeraj J; Helk, Bernhard; Trout, Bernhardt L

    2014-01-21

    Identifying hot-spot residues - residues that are critical to protein-protein binding - can help to elucidate a protein's function and assist in designing therapeutic molecules to target those residues. We present a novel computational tool, termed spatial-interaction-map (SIM), to predict the hot-spot residues of an evolutionarily conserved protein-protein interaction from the structure of an unbound protein alone. SIM can predict the protein hot-spot residues with an accuracy of 36-57%. Thus, the SIM tool can be used to predict the yet unknown hot-spot residues for many proteins for which the structure of the protein-protein complexes are not available, thereby providing a clue to their functions and an opportunity to design therapeutic molecules to target these proteins. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  8. Rebelling for a Reason: Protein Structural “Outliers”

    PubMed Central

    Arumugam, Gandhimathi; Nair, Anu G.; Hariharaputran, Sridhar; Ramanathan, Sowdhamini

    2013-01-01

    Analysis of structural variation in domain superfamilies can reveal constraints in protein evolution, which aids protein structure prediction and classification. Structure-based sequence alignment of distantly related proteins, organized in the PASS2 database, provides clues about structurally conserved regions among different functional families. Some superfamily members show large structural differences which are functionally relevant. This paper analyses the impact of structural divergence on function for multi-member superfamilies, selected from the PASS2 superfamily alignment database. Functional annotations within superfamilies, with structural outliers or ‘rebels’, are discussed in the context of structural variations. Overall, these data reinforce the idea that functional similarities cannot be extrapolated from mere structural conservation. The implication for fold-function prediction is that the functional annotations can only be inherited with very careful consideration, especially at low sequence identities. PMID:24073209

  9. Prediction of physical protein-protein interactions

    NASA Astrophysics Data System (ADS)

    Szilágyi, András; Grimm, Vera; Arakaki, Adrián K.; Skolnick, Jeffrey

    2005-06-01

    Many essential cellular processes such as signal transduction, transport, cellular motion and most regulatory mechanisms are mediated by protein-protein interactions. In recent years, new experimental techniques have been developed to discover the protein-protein interaction networks of several organisms. However, the accuracy and coverage of these techniques have proven to be limited, and computational approaches remain essential both to assist in the design and validation of experimental studies and for the prediction of interaction partners and detailed structures of protein complexes. Here, we provide a critical overview of existing structure-independent and structure-based computational methods. Although these techniques have significantly advanced in the past few years, we find that most of them are still in their infancy. We also provide an overview of experimental techniques for the detection of protein-protein interactions. Although the developments are promising, false positive and false negative results are common, and reliable detection is possible only by taking a consensus of different experimental approaches. The shortcomings of experimental techniques affect both the further development and the fair evaluation of computational prediction methods. For an adequate comparative evaluation of prediction and high-throughput experimental methods, an appropriately large benchmark set of biophysically characterized protein complexes would be needed, but is sorely lacking.

  10. Structural anatomy of telomere OB proteins.

    PubMed

    Horvath, Martin P

    2011-10-01

    Telomere DNA-binding proteins protect the ends of chromosomes in eukaryotes. A subset of these proteins are constructed with one or more OB folds and bind with G+T-rich single-stranded DNA found at the extreme termini. The resulting DNA-OB protein complex interacts with other telomere components to coordinate critical telomere functions of DNA protection and DNA synthesis. While the first crystal and NMR structures readily explained protection of telomere ends, the picture of how single-stranded DNA becomes available to serve as primer and template for synthesis of new telomere DNA is only recently coming into focus. New structures of telomere OB fold proteins alongside insights from genetic and biochemical experiments have made significant contributions towards understanding how protein-binding OB proteins collaborate with DNA-binding OB proteins to recruit telomerase and DNA polymerase for telomere homeostasis. This review surveys telomere OB protein structures alongside highly comparable structures derived from replication protein A (RPA) components, with the goal of providing a molecular context for understanding telomere OB protein evolution and mechanism of action in protection and synthesis of telomere DNA.

  11. Structural anatomy of telomere OB proteins

    PubMed Central

    Horvath, Martin P.

    2015-01-01

    Telomere DNA-binding proteins protect the ends of chromosomes in eukaryotes. A subset of these proteins are constructed with one or more OB folds and bind with G+T-rich single-stranded DNA found at the extreme termini. The resulting DNA-OB protein complex interacts with other telomere components to coordinate critical telomere functions of DNA protection and DNA synthesis. While the first crystal and NMR structures readily explained protection of telomere ends, the picture of how single-stranded DNA becomes available to serve as primer and template for synthesis of new telomere DNA is only recently coming into focus. New structures of telomere OB fold proteins alongside insights from genetic and biochemical experiments have made significant contributions towards understanding how protein-binding OB proteins collaborate with DNA-binding OB proteins to recruit telomerase and DNA polymerase for telomere homeostasis. This review surveys telomere OB protein structures alongside highly comparable structures derived from replication protein A (RPA) components, with the goal of providing a molecular context for understanding telomere OB protein evolution and mechanism of action in protection and synthesis of telomere DNA. PMID:21950380

  12. Markov State Models Provide Insights into Dynamic Modulation of Protein Function

    PubMed Central

    2015-01-01

    Conspectus Protein function is inextricably linked to protein dynamics. As we move from a static structural picture to a dynamic ensemble view of protein structure and function, novel computational paradigms are required for observing and understanding conformational dynamics of proteins and its functional implications. In principle, molecular dynamics simulations can provide the time evolution of atomistic models of proteins, but the long time scales associated with functional dynamics make it difficult to observe rare dynamical transitions. The issue of extracting essential functional components of protein dynamics from noisy simulation data presents another set of challenges in obtaining an unbiased understanding of protein motions. Therefore, a methodology that provides a statistical framework for efficient sampling and a human-readable view of the key aspects of functional dynamics from data analysis is required. The Markov state model (MSM), which has recently become popular worldwide for studying protein dynamics, is an example of such a framework. In this Account, we review the use of Markov state models for efficient sampling of the hierarchy of time scales associated with protein dynamics, automatic identification of key conformational states, and the degrees of freedom associated with slow dynamical processes. Applications of MSMs for studying long time scale phenomena such as activation mechanisms of cellular signaling proteins has yielded novel insights into protein function. In particular, from MSMs built using large-scale simulations of GPCRs and kinases, we have shown that complex conformational changes in proteins can be described in terms of structural changes in key structural motifs or “molecular switches” within the protein, the transitions between functionally active and inactive states of proteins proceed via multiple pathways, and ligand or substrate binding modulates the flux through these pathways. 
Finally, MSMs also provide a theoretical toolbox for studying the effect of nonequilibrium perturbations on conformational dynamics. Considering that protein dynamics in vivo occur under nonequilibrium conditions, MSMs coupled with nonequilibrium statistical mechanics provide a way to connect cellular components to their functional environments. Nonequilibrium perturbations of protein folding MSMs reveal the presence of dynamically frozen glass-like states in their conformational landscape. These frozen states are also observed to be rich in β-sheets, which indicates their possible role in the nucleation of β-sheet rich aggregates such as those observed in amyloid-fibril formation. Finally, we describe how MSMs have been used to understand the dynamical behavior of intrinsically disordered proteins such as amyloid-β, human islet amyloid polypeptide, and p53. While certainly not a panacea for studying functional dynamics, MSMs provide a rigorous theoretical foundation for understanding complex entropically dominated processes and a convenient lens for viewing protein motions. PMID:25625937
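
    The core of an MSM, as described in the abstract above, is a matrix of transition probabilities between conformational states. The following is a minimal sketch, assuming the trajectory has already been assigned to discrete states; the function names and the toy trajectory are hypothetical and are not taken from the Account. Row-normalizing the transition counts is the simplest possible estimator and does not enforce detailed balance.

```python
from collections import Counter


def transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized counts of transitions between frames `lag` steps apart."""
    counts = Counter(zip(dtraj[:-lag], dtraj[lag:]))
    matrix = []
    for i in range(n_states):
        row = [counts[(i, j)] for j in range(n_states)]
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # Assumption: a state with no observed exits is treated as absorbing.
            matrix.append([1.0 if j == i else 0.0 for j in range(n_states)])
    return matrix


def stationary_distribution(matrix, n_iter=1000):
    """Power iteration: apply pi <- pi P repeatedly, starting from uniform."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical two-state trajectory (0 = inactive, 1 = active).
dtraj = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]
T = transition_matrix(dtraj, n_states=2)
pi = stationary_distribution(T)
```

    The stationary distribution gives the equilibrium population of each state. In practice the lag time would be chosen by checking that the slowest implied time scales no longer change with it.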

  13. PreSSAPro: a software for the prediction of secondary structure by amino acid properties.

    PubMed

    Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M

    2007-10-01

    PreSSAPro is software, available to the scientific community as a free web service, designed to predict secondary structure from the amino acid sequence of a given protein. Predictions are based on our recently published work on amino acid propensities for secondary structures, evaluated both in large but non-homogeneous protein data sets and in smaller but homogeneous data sets corresponding to protein structural classes, i.e. all-alpha, all-beta, or alpha-beta proteins. Predictions are improved by using propensities evaluated for the correct protein class. PreSSAPro predicts the secondary structure according to the correct protein class, if known, or gives multiple predictions with reference to the different structural classes. Comparing these predictions offers a novel way to evaluate which sequence regions can assume different secondary structures depending on the structural class assignment, with a view to identifying proteins able to fold into different conformations. The service is available at the URL http://bioinformatica.isa.cnr.it/PRESSAPRO/.
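
    An amino acid propensity for a secondary structure is commonly defined as the frequency of a residue in that structure relative to the overall frequency of the structure. The sketch below uses that textbook definition; whether PreSSAPro uses exactly this formula is an assumption, and the function name and toy data are hypothetical.

```python
from collections import Counter


def propensities(sequence, states):
    """Propensity of residue a for state s: (n_as / n_a) / (n_s / N)."""
    if len(sequence) != len(states):
        raise ValueError("sequence and state strings must have equal length")
    total = len(sequence)
    n_res = Counter(sequence)
    n_state = Counter(states)
    n_pair = Counter(zip(sequence, states))
    return {
        (a, s): (n_pair[(a, s)] / n_res[a]) / (n_state[s] / total)
        for a in n_res
        for s in n_state
    }


# Toy data: H = helix, E = strand, C = coil.
seq = "AAAAGGVVVV"
ss = "HHHHCCEEEE"
table = propensities(seq, ss)
```

    A value above 1 means the residue is over-represented in that structure. Computing the table separately for each structural class would give the class-specific propensities the abstract refers to.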

  14. Native aggregation as a cause of origin of temporary cellular structures needed for all forms of cellular activity, signaling and transformations.

    PubMed

    Matveev, Vladimir V

    2010-06-09

    According to the hypothesis explored in this paper, native aggregation is genetically controlled (programmed) reversible aggregation that occurs when interacting proteins form new temporary structures through highly specific interactions. It is assumed that Anfinsen's dogma may be extended to protein aggregation: composition and amino acid sequence determine not only the secondary and tertiary structure of a single protein, but also the structure of protein aggregates (associates). Cell function is considered as a transition between two states (two-state model), the resting state and the state of activity (this applies to the cell as a whole and to its individual structures). In the resting state, the key proteins are found in the following inactive forms: natively unfolded and globular. When the cell is activated, secondary structures appear in natively unfolded proteins (including unfolded regions in other proteins), and globular proteins begin to melt and their secondary structures become available for interaction with the secondary structures of other proteins. These temporary secondary structures provide a means for highly specific interactions between proteins. As a result, native aggregation creates temporary structures necessary for cell activity. "One of the principal objects of theoretical research in any department of knowledge is to find the point of view from which the subject appears in its greatest simplicity." Josiah Willard Gibbs (1839-1903).

  15. Comparative Protein Structure Modeling Using MODELLER.

    PubMed

    Webb, Benjamin; Sali, Andrej

    2014-09-08

    Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by an accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software are also described. Copyright © 2014 John Wiley & Sons, Inc.

  16. Self-assembly in the ferritin nano-cage protein superfamily.

    PubMed

    Zhang, Yu; Orner, Brendan P

    2011-01-01

    Protein self-assembly, through specific, high affinity, and geometrically constraining protein-protein interactions, can control and lead to complex cellular nano-structures. Establishing an understanding of the underlying principles that govern protein self-assembly is not only essential to appreciate the fundamental biological functions of these structures, but could also provide a basis for their enhancement for nano-material applications. The ferritins are a superfamily of well studied proteins that self-assemble into hollow cage-like structures which are ubiquitously found in both prokaryotes and eukaryotes. Structural studies have revealed that many members of the ferritin family can self-assemble into nano-cages of two types. Maxi-ferritins form hollow spheres with octahedral symmetry composed of twenty-four monomers. Mini-ferritins, on the other hand, are tetrahedrally symmetric, hollow assemblies composed of twelve monomers. This review will focus on the structure of members of the ferritin superfamily, the mechanism of ferritin self-assembly and the structure-function relations of these proteins.

  17. Automated structure determination of proteins with the SAIL-FLYA NMR method.

    PubMed

    Takeda, Mitsuhiro; Ikeya, Teppei; Güntert, Peter; Kainosho, Masatsune

    2007-01-01

    The labeling of proteins with stable isotopes enhances the NMR method for the determination of 3D protein structures in solution. Stereo-array isotope labeling (SAIL) provides an optimal stereospecific and regiospecific pattern of stable isotopes that yields sharpened lines, spectral simplification without loss of information, and the ability to collect rapidly and evaluate fully automatically the structural restraints required to solve a high-quality solution structure for proteins up to twice as large as those that can be analyzed using conventional methods. Here, we describe a protocol for the preparation of SAIL proteins by cell-free methods, including the preparation of S30 extract and their automated structure analysis using the FLYA algorithm and the program CYANA. Once efficient cell-free expression of the unlabeled or uniformly labeled target protein has been achieved, the NMR sample preparation of a SAIL protein can be accomplished in 3 d. A fully automated FLYA structure calculation can be completed in 1 d on a powerful computer system.

  18. In-situ and real-time growth observation of high-quality protein crystals under quasi-microgravity on earth.

    PubMed

    Nakamura, Akira; Ohtsuka, Jun; Kashiwagi, Tatsuki; Numoto, Nobutaka; Hirota, Noriyuki; Ode, Takahiro; Okada, Hidehiko; Nagata, Koji; Kiyohara, Motosuke; Suzuki, Ei-Ichiro; Kita, Akiko; Wada, Hitoshi; Tanokura, Masaru

    2016-02-26

    Precise protein structure determination provides significant information for life science research, although high-quality crystals are not easily obtained. We developed a system for producing high-quality protein crystals with high throughput. Using this system, gravity-controlled crystallization is made possible by a magnetic microgravity environment. In addition, in-situ and real-time observation and time-lapse imaging of crystal growth are feasible for over 200 solution samples independently. In this paper, we also report results of crystallization experiments for two protein samples. Crystals grown in the system exhibited magnetic orientation and showed higher and more homogeneous quality compared with the control crystals. The structural analysis reveals that making use of the magnetic microgravity during the crystallization process helps us to build a well-refined protein structure model, which has no significant structural differences from a control structure. Therefore, the system contributes to improvement in efficiency of structural analysis for "difficult" proteins, such as membrane proteins and supermolecular complexes.

  19. Exploring the atomic structure and conformational flexibility of a 320 Å long engineered viral fiber using X-ray crystallography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bhardwaj, Anshul; Casjens, Sherwood R.; Cingolani, Gino, E-mail: gino.cingolani@jefferson.edu

    2014-02-01

    This study presents the crystal structure of a ∼320 Å long protein fiber generated by in-frame extension of its repeated helical coiled-coil core. Protein fibers are widespread in nature, but only a limited number of high-resolution structures have been determined experimentally. Unlike globular proteins, fibers are usually recalcitrant to form three-dimensional crystals, preventing single-crystal X-ray diffraction analysis. In the absence of three-dimensional crystals, X-ray fiber diffraction is a powerful tool to determine the internal symmetry of a fiber, but it rarely yields atomic resolution structural information on complex protein fibers. An 85-residue-long minimal coiled-coil repeat unit (MiCRU) was previously identified in the trimeric helical core of tail needle gp26, a fibrous protein emanating from the tail apparatus of the bacteriophage P22 virion. Here, evidence is provided that an MiCRU can be inserted in frame inside the gp26 helical core to generate a rationally extended fiber (gp26-2M) which, like gp26, retains a trimeric quaternary structure in solution. The 2.7 Å resolution crystal structure of this engineered fiber, which measures ∼320 Å in length and is only 20–35 Å wide, was determined. This structure, the longest for a trimeric protein fiber to be determined to such a high resolution, reveals the architecture of 22 consecutive trimerization heptads and provides a framework to decipher the structural determinants for protein fiber assembly, stability and flexibility.

  20. Functional structural motifs for protein-ligand, protein-protein, and protein-nucleic acid interactions and their connection to supersecondary structures.

    PubMed

    Kinjo, Akira R; Nakamura, Haruki

    2013-01-01

    Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyze protein functions is to compare and classify the structures of interaction interfaces of proteins. Here, we describe the procedures for compiling a database of interface structures and efficiently comparing the interface structures. To do so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the PDB exchange dictionary necessary for extracting data that are relevant for analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) which represents a sequence of two or three secondary structure elements including their relative orientations as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has particularly high propensity to be found in interaction interfaces in general, indicating any SSS can be used as a binding interface. When individual structural motifs are examined, there are some SSS strings that have high propensity for particular groups of structural motifs. In addition, it is shown that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description to investigate the universality of ligand binding modes.

  1. HARMONY: a server for the assessment of protein structures

    PubMed Central

    Pugalenthi, G.; Shameer, K.; Srinivasan, N.; Sowdhamini, R.

    2006-01-01

    Protein structure validation is an important step in computational modeling and structure determination. Stereochemical assessment of protein structures examines internal parameters such as bond lengths and Ramachandran (φ,ψ) angles. Gross structure prediction methods such as inverse folding procedure and structure determination especially at low resolution can sometimes give rise to models that are incorrect due to assignment of misfolds or mistracing of electron density maps. Such errors are not reflected as strain in internal parameters. HARMONY is a procedure that examines the compatibility between the sequence and the structure of a protein by assigning scores to individual residues and their amino acid exchange patterns after considering their local environments. Local environments are described by the backbone conformation, solvent accessibility and hydrogen bonding patterns. We are now providing HARMONY through a web server such that users can submit their protein structure files and, if required, the alignment of homologous sequences. Scores are mapped on the structure for subsequent examination, which is also useful for recognizing regions of possible local errors in protein structures. HARMONY server is located at PMID:16844999

  2. Rapid search for tertiary fragments reveals protein sequence–structure relationships

    PubMed Central

    Zhou, Jianfu; Grigoryan, Gevorg

    2015-01-01

    Finding backbone substructures from the Protein Data Bank that match an arbitrary query structural motif, composed of multiple disjoint segments, is a problem of growing relevance in structure prediction and protein design. Although numerous protein structure search approaches have been proposed, methods that address this specific task without additional restrictions and on practical time scales are generally lacking. Here, we propose a solution, dubbed MASTER, that is both rapid, enabling searches over the Protein Data Bank in a matter of seconds, and provably correct, finding all matches below a user-specified root-mean-square deviation cutoff. We show that despite the potentially exponential time complexity of the problem, running times in practice are modest even for queries with many segments. The ability to explore naturally plausible structural and sequence variations around a given motif has the potential to synthesize its design principles in an automated manner; so we go on to illustrate the utility of MASTER to protein structural biology. We demonstrate its capacity to rapidly establish structure–sequence relationships, uncover the native designability landscapes of tertiary structural motifs, identify structural signatures of binding, and automatically rewire protein topologies. Given the broad utility of protein tertiary fragment searches, we hope that providing MASTER in an open-source format will enable novel advances in understanding, predicting, and designing protein structure. PMID:25420575

  3. Mass spectrometry for the biophysical characterization of therapeutic monoclonal antibodies.

    PubMed

    Zhang, Hao; Cui, Weidong; Gross, Michael L

    2014-01-21

    Monoclonal antibodies (mAbs) are powerful therapeutics, and their characterization has drawn considerable attention and urgency. Unlike small-molecule drugs (150-600 Da) that have rigid structures, mAbs (∼150 kDa) are engineered proteins that undergo complicated folding and can exist in a number of low-energy structures, posing a challenge for traditional methods in structural biology. Mass spectrometry (MS)-based biophysical characterization approaches can provide structural information, bringing high sensitivity, fast turnaround, and small sample consumption. This review outlines various MS-based strategies for protein biophysical characterization and then reviews how these strategies provide structural information of mAbs at the protein level (intact or top-down approaches), peptide, and residue level (bottom-up approaches), affording information on higher order structure, aggregation, and the nature of antibody complexes. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

  4. Geometry of proteins: hydrogen bonding, sterics, and marginally compact tubes.

    PubMed

    Banavar, Jayanth R; Cieplak, Marek; Flammini, Alessandro; Hoang, Trinh X; Kamien, Randall D; Lezon, Timothy; Marenduzzo, Davide; Maritan, Amos; Seno, Flavio; Snir, Yehuda; Trovato, Antonio

    2006-03-01

    The functionality of proteins is governed by their structure in the native state. Protein structures are made up of emergent building blocks of helices and almost planar sheets. A simple coarse-grained geometrical model of a flexible tube barely subject to compaction provides a unified framework for understanding the common character of globular proteins. We argue that a recent critique of the tube idea is not well founded.

  5. Geometry of proteins: Hydrogen bonding, sterics, and marginally compact tubes

    NASA Astrophysics Data System (ADS)

    Banavar, Jayanth R.; Cieplak, Marek; Flammini, Alessandro; Hoang, Trinh X.; Kamien, Randall D.; Lezon, Timothy; Marenduzzo, Davide; Maritan, Amos; Seno, Flavio; Snir, Yehuda; Trovato, Antonio

    2006-03-01

    The functionality of proteins is governed by their structure in the native state. Protein structures are made up of emergent building blocks of helices and almost planar sheets. A simple coarse-grained geometrical model of a flexible tube barely subject to compaction provides a unified framework for understanding the common character of globular proteins. We argue that a recent critique of the tube idea is not well founded.

  6. Impact of genetic variation on three dimensional structure and function of proteins

    PubMed Central

    Bhattacharya, Roshni; Rose, Peter W.; Burley, Stephen K.

    2017-01-01

    The Protein Data Bank (PDB; http://wwpdb.org) was established in 1971 as the first open access digital data resource in biology with seven protein structures as its initial holdings. The global PDB archive now contains more than 126,000 experimentally determined atomic level three-dimensional (3D) structures of biological macromolecules (proteins, DNA, RNA), all of which are freely accessible via the Internet. Knowledge of the 3D structure of the gene product can help in understanding its function and role in disease. Of particular interest in the PDB archive are proteins for which 3D structures of genetic variant proteins have been determined, thus revealing atomic-level structural differences caused by the variation at the DNA level. Herein, we present a systematic and qualitative analysis of such cases. We observe a wide range of structural and functional changes caused by single amino acid differences, including changes in enzyme activity, aggregation propensity, structural stability, binding, and dissociation, some in the context of large assemblies. Structural comparison of wild type and mutated proteins, when both are available, provides insights into atomic-level structural differences caused by the genetic variation. PMID:28296894

  7. Structural alphabets derived from attractors in conformational space

    PubMed Central

    2010-01-01

    Background: The hierarchical and partially redundant nature of protein structures justifies the definition of frequently occurring conformations of short fragments as 'states'. Collections of selected representatives for these states define Structural Alphabets, describing the most typical local conformations within protein structures. These alphabets form a bridge between the string-oriented methods of sequence analysis and the coordinate-oriented methods of protein structure analysis. Results: A Structural Alphabet has been derived by clustering all four-residue fragments of a high-resolution subset of the Protein Data Bank and extracting the high-density states as representative conformational states. Each fragment is uniquely defined by a set of three independent angles corresponding to its degrees of freedom, capturing in simple and intuitive terms the properties of the conformational space. The fragments of the Structural Alphabet are equivalent to the conformational attractors and therefore yield a most informative encoding of proteins. Proteins can be reconstructed within the experimental uncertainty in structure determination and ensembles of structures can be encoded with accuracy and robustness. Conclusions: The density-based Structural Alphabet provides a novel tool to describe local conformations and it is specifically suitable for application in studies of protein dynamics. PMID:20170534
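
    The abstract above says each four-residue fragment is defined by three independent angles but does not name them. A natural choice, assumed here, is the two pseudo-bond angles and the one pseudo-torsion angle formed by four consecutive Cα atoms. The sketch below computes them; the function names and the toy coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def bond_angle(a, b, c):
    """Angle at point b, in degrees."""
    u, v = _sub(a, b), _sub(c, b)
    cosine = _dot(u, v) / (_norm(u) * _norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def torsion(a, b, c, d):
    """Signed dihedral angle a-b-c-d, in degrees, in (-180, 180]."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    y = _norm(b2) * _dot(b1, _cross(b2, b3))
    x = _dot(_cross(b1, b2), _cross(b2, b3))
    return math.degrees(math.atan2(y, x))


def fragment_angles(ca):
    """(angle at residue 2, torsion, angle at residue 3) for four C-alpha points."""
    p0, p1, p2, p3 = ca
    return bond_angle(p0, p1, p2), torsion(p0, p1, p2, p3), bond_angle(p1, p2, p3)


# Toy coordinates, not a real fragment.
angles = fragment_angles([(1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)])
```

    Because consecutive Cα atoms sit at a nearly constant distance of about 3.8 Å, these three angles are enough to rebuild the shape of the fragment, which is why they can serve as its degrees of freedom.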

  8. Recognition of coarse-grained protein tertiary structure.

    PubMed

    Lezon, Timothy; Banavar, Jayanth R; Maritan, Amos

    2004-05-15

    A model of the protein backbone is considered in which each residue is characterized by the location of its C(alpha) atom and one of a discrete set of conformal (phi, psi) states. We investigate the key differences between a description that offers a locally precise fit to known backbone structures and one that provides a globally accurate fit to protein structures. Using a statistical scoring scheme and threading, a protein's local best-fit conformation is highly recognizable, but its global structure cannot be directly determined from an amino acid sequence. The incorporation of information about the conformal states of neighboring residues along the chain allows one to accurately translate the local structure into a global structure. We present a two-step algorithm, which recognizes up to 95% of the tested protein native-state structures to within a 2.5 Å root mean square deviation. Copyright 2004 Wiley-Liss, Inc.

  9. Architecture of the 99 bp DNA-six-protein regulatory complex of the lambda att site.

    PubMed

    Sun, Xingmin; Mierke, Dale F; Biswas, Tapan; Lee, Sang Yeol; Landy, Arthur; Radman-Livaja, Marta

    2006-11-17

    The highly directional and tightly regulated recombination reaction used to site-specifically excise the bacteriophage lambda chromosome out of its E. coli host chromosome requires the binding of six sequence-specific proteins to a 99 bp segment of the phage att site. To gain structural insights into this recombination pathway, we measured 27 FRET distances between eight points on the 99 bp regulatory DNA bound with all six proteins. Triangulation of these distances using a metric matrix distance-geometry algorithm provided coordinates for these eight points. The resulting path for the protein-bound regulatory DNA, which fits well with the genetics, biochemistry, and X-ray crystal structures describing the individual proteins and their interactions with DNA, provides a new structural perspective into the molecular mechanism and regulation of the recombination reaction and illustrates a design by which different families of higher-order complexes can be assembled from different numbers and combinations of the same few proteins.
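
    The triangulation step mentioned above can be summarized with the textbook metric-matrix construction. This is a sketch of the general technique, not necessarily the authors' exact procedure. Eight points have 28 pairwise distances, so the 27 measured distances leave one pair that would have to be estimated or bounded; the abstract does not say how this was handled. With $N$ points, pairwise distances $d_{ij}$, and the centroid (index 0) taken as the origin:

```latex
\begin{align}
  d_{i0}^{2} &= \frac{1}{N}\sum_{j=1}^{N} d_{ij}^{2}
               - \frac{1}{N^{2}}\sum_{j<k} d_{jk}^{2}, \\
  g_{ij}     &= \tfrac{1}{2}\left(d_{i0}^{2} + d_{j0}^{2} - d_{ij}^{2}\right), \\
  G          &= \sum_{k} \lambda_{k}\, \mathbf{w}_{k}\mathbf{w}_{k}^{\mathsf{T}},
  \qquad
  x_{i}^{(k)} = \sqrt{\lambda_{k}}\; w_{k,i}, \quad k = 1, 2, 3 .
\end{align}
```

    Here $G = (g_{ij})$ is the metric (Gram) matrix, $\lambda_{1} \ge \lambda_{2} \ge \lambda_{3}$ are its three largest eigenvalues with eigenvectors $\mathbf{w}_{k}$, and $x_{i}^{(k)}$ is the $k$-th Cartesian coordinate of point $i$. With noisy distances the remaining eigenvalues are not exactly zero, so the coordinates are a best three-dimensional fit rather than an exact solution.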

  10. RaptorX-Angle: real-value prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning.

    PubMed

    Gao, Yujuan; Wang, Sheng; Deng, Minghua; Xu, Jinbo

    2018-05-08

    Protein dihedral angles provide a detailed description of protein local conformation. Predicted dihedral angles can be used to narrow down the conformational space of the whole polypeptide chain significantly, thus aiding protein tertiary structure prediction. However, direct angle prediction from sequence alone is challenging. In this article, we present a novel method (named RaptorX-Angle) to predict real-valued angles by combining clustering and deep learning. Tested on a subset of PDB25 and the targets in the latest two Critical Assessment of protein Structure Prediction (CASP) experiments, our method outperforms the existing state-of-the-art method SPIDER2 in terms of Pearson Correlation Coefficient (PCC) and Mean Absolute Error (MAE). Our results also show an approximately linear relationship between the real prediction errors and our estimated bounds. That is, the real prediction error can be well approximated by our estimated bounds. Our study provides an alternative and more accurate prediction of dihedral angles, which may facilitate protein structure prediction and functional study.
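
    The two evaluation measures named above can be sketched in a few lines. Because dihedral angles are periodic, the absolute error is taken here on the shorter arc between prediction and observation; this wrap-around convention is an assumption, as the abstract does not give the exact definitions. The function names and sample values are hypothetical.

```python
import math


def angular_difference(a, b):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def angular_mae(predicted, observed):
    """Mean absolute error between two lists of angles, respecting periodicity."""
    errors = [abs(angular_difference(p, o)) for p, o in zip(predicted, observed)]
    return sum(errors) / len(errors)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# 179 and -179 degrees are only 2 degrees apart on the circle.
mae = angular_mae([179.0, -60.0], [-179.0, -50.0])
pcc = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

    A naive difference would report an error of 358 degrees for the first pair, which is why the wrap-around matters when comparing angle predictors.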

  11. Steady-state structural fluctuation is a predictor of the necessity of pausing-mediated co-translational folding for small proteins.

    PubMed

    Huang, Wenxi; Liu, Wanting; Jin, Jingjie; Xiao, Qilan; Lu, Ruibin; Chen, Wei; Xiong, Sheng; Zhang, Gong

    2018-03-25

    Translational pausing coordinates protein synthesis and co-translational folding. It is a common factor that facilitates the correct folding of large, multi-domain proteins. For small proteins, pausing sites rarely occur in the gene body, and the 3'-end pausing sites are only essential for the folding of a fraction of proteins. The determinant of the necessity of pausing remains obscure. In this study, we demonstrated that the steady-state structural fluctuation is a predictor of the necessity of pausing-mediated co-translational folding for small proteins. Validated by experiments with 5 model proteins, we found that rigid protein structures do not need 3'-end pausing to fold correctly, whereas flexible structures do. Therefore, rational optimization of translational pausing can improve soluble expression of small proteins with flexible structures, but not of rigid ones. The rigidity of the structure can be quantitatively estimated in silico using molecular dynamics simulation. Nevertheless, we also found that the translational pausing optimization increases the fitness of the expression host, and thus benefits the recombinant protein production, independent of the soluble expression. These results shed light on the structural basis of the translational pausing and provide a practical tool for industrial protein fermentation. Copyright © 2017. Published by Elsevier Inc.

  12. SAIL--stereo-array isotope labeling.

    PubMed

    Kainosho, Masatsune; Güntert, Peter

    2009-11-01

    Optimal stereospecific and regiospecific labeling of proteins with stable isotopes enhances the nuclear magnetic resonance (NMR) method for the determination of the three-dimensional protein structures in solution. Stereo-array isotope labeling (SAIL) offers sharpened lines, spectral simplification without loss of information and the ability to rapidly collect and automatically evaluate the structural restraints required to solve a high-quality solution structure for proteins up to twice as large as before. This review gives an overview of stable isotope labeling methods for NMR spectroscopy with proteins and provides an in-depth treatment of the SAIL technology.

  13. Voroprot: an interactive tool for the analysis and visualization of complex geometric features of protein structure.

    PubMed

    Olechnovic, Kliment; Margelevicius, Mindaugas; Venclovas, Ceslovas

    2011-03-01

    We present Voroprot, an interactive cross-platform software tool that provides a unique set of capabilities for exploring geometric features of protein structure. Voroprot allows the construction and visualization of the Apollonius diagram (also known as the additively weighted Voronoi diagram), the Apollonius graph, protein alpha shapes, interatomic contact surfaces, solvent accessible surfaces, pockets and cavities inside protein structure. Voroprot is available for Windows, Linux and Mac OS X operating systems and can be downloaded from http://www.ibt.lt/bioinformatics/voroprot/.

  14. Protein Interaction Profile Sequencing (PIP-seq).

    PubMed

    Foley, Shawn W; Gregory, Brian D

    2016-10-10

    Every eukaryotic RNA transcript undergoes extensive post-transcriptional processing from the moment of transcription up through degradation. This regulation is performed by a distinct cohort of RNA-binding proteins which recognize their target transcript by both its primary sequence and secondary structure. Here, we describe protein interaction profile sequencing (PIP-seq), a technique that uses ribonuclease-based footprinting followed by high-throughput sequencing to globally assess both protein-bound RNA sequences and RNA secondary structure. PIP-seq utilizes single- and double-stranded RNA-specific nucleases in the absence of proteins to infer RNA secondary structure. These libraries are also compared to samples that undergo nuclease digestion in the presence of proteins in order to find enriched protein-bound sequences. Combined, these four libraries provide a comprehensive, transcriptome-wide view of RNA secondary structure and RNA protein interaction sites from a single experimental technique. © 2016 by John Wiley & Sons, Inc.

  15. Crystal Structure of a Plant Multidrug and Toxic Compound Extrusion Family Protein.

    PubMed

    Tanaka, Yoshiki; Iwaki, Shigehiro; Tsukazaki, Tomoya

    2017-09-05

    The multidrug and toxic compound extrusion (MATE) family of proteins consists of transporters responsible for multidrug resistance in prokaryotes. In plants, a number of MATE proteins were identified by recent genomic and functional studies, which imply that the proteins have substrate-specific transport functions instead of multidrug extrusion. The three-dimensional structure of eukaryotic MATE proteins, including those of plants, has not been reported, preventing a better understanding of the molecular mechanism of these proteins. Here, we describe the crystal structure of a MATE protein from the plant Camelina sativa at 2.9 Å resolution. Two sets of six transmembrane α helices, assembled pseudo-symmetrically, possess a negatively charged internal pocket with an outward-facing shape. The crystal structure provides insight into the diversity of plant MATE proteins and their substrate recognition and transport through the membrane. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Exploring representations of protein structure for automated remote homology detection and mapping of protein structure space

    PubMed Central

    2014-01-01

    Background Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, as additionally supported from analysis in this paper that such maps preserve functional co-localization of the protein structure space. Methods Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. Results We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership. 
    Conclusions This work opens exciting avenues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools. PMID:25080993

  17. Structural disorder within sendai virus nucleoprotein and phosphoprotein: insight into the structural basis of molecular recognition.

    PubMed

    Jensen, Malene Ringkjøbing; Bernadó, Pau; Houben, Klaartje; Blanchard, Laurence; Marion, Dominque; Ruigrok, Rob W H; Blackledge, Martin

    2010-08-01

    Intrinsically disordered regions of significant length are present throughout eukaryotic genomes, and are particularly prevalent in viral proteins. Due to their inherent flexibility, these proteins inhabit a conformational landscape that is too complex to be described by classical structural biology. The elucidation of the role that conformational flexibility plays in molecular function will redefine our understanding of the molecular basis of biological function, and the development of appropriate technology to achieve this aim remains one of the major challenges for the future of structural biology. NMR is the technique of choice for studying intrinsically disordered proteins, providing information about structure, flexibility and interactions at atomic resolution even in completely disordered proteins. In particular residual dipolar couplings (RDCs) are sensitive and powerful tools for determining local and long-range structural behaviour in flexible proteins. Here we describe recent applications of the use of RDCs to quantitatively describe the level of local structure in intrinsically disordered proteins involved in replication and transcription in Sendai virus.

  18. Scop3D: three-dimensional visualization of sequence conservation.

    PubMed

    Vermeire, Tessa; Vermaere, Stijn; Schepens, Bert; Saelens, Xavier; Van Gucht, Steven; Martens, Lennart; Vandermarliere, Elien

    2015-04-01

    The integration of a protein's structure with its known sequence variation provides insight on how that protein evolves, for instance in terms of (changing) function or immunogenicity. Yet, collating the corresponding sequence variants into a multiple sequence alignment, calculating each position's conservation, and mapping this information back onto a relevant structure is not straightforward. We therefore built the Sequence Conservation on Protein 3D structure (scop3D) tool to perform these tasks automatically. The output consists of two modified PDB files in which the B-values for each position are replaced by the percentage sequence conservation, or the information entropy for each position, respectively. Furthermore, text files with absolute and relative amino acid occurrences for each position are also provided, along with snapshots of the protein from six distinct directions in space. The visualization provided by scop3D can for instance be used as an aid in vaccine development or to identify antigenic hotspots, which we here demonstrate based on an analysis of the fusion proteins of human respiratory syncytial virus and mumps virus. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
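
    The two per-position scores named in this abstract (percentage conservation and information entropy) can be illustrated with a minimal Python sketch. It is not scop3D's code: the function name, the toy alignment, and the choices to treat every character (including gaps) as a symbol and to measure conservation as the share of the most frequent symbol are assumptions made for illustration only.

```python
import math
from collections import Counter


def column_stats(alignment):
    """Return (percent_conservation, entropy_bits) for each alignment column.

    Assumption: conservation is the share of the most frequent symbol in the
    column, and entropy is Shannon entropy in bits over observed symbols.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    stats = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        conservation = 100.0 * counts.most_common(1)[0][1] / total
        # sum of p * log2(1/p); written this way so a fully conserved
        # column yields 0.0 rather than -0.0
        entropy = sum((n / total) * math.log2(total / n) for n in counts.values())
        stats.append((conservation, entropy))
    return stats


# Hypothetical four-sequence alignment, not real data.
toy_alignment = ["MKVL", "MKIL", "MRVL", "MKVL"]
for position, (cons, ent) in enumerate(column_stats(toy_alignment), start=1):
    print(position, round(cons, 1), round(ent, 3))
```

    In a workflow like the one described, either score would then be written into the B-value column of the structure file so that a molecular viewer can colour residues by it.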

  19. qPIPSA: Relating enzymatic kinetic parameters and interaction fields

    PubMed Central

    Gabdoulline, Razif R; Stein, Matthias; Wade, Rebecca C

    2007-01-01

    Background The simulation of metabolic networks in quantitative systems biology requires the assignment of enzymatic kinetic parameters. Experimentally determined values are often not available and therefore computational methods to estimate these parameters are needed. It is possible to use the three-dimensional structure of an enzyme to perform simulations of a reaction and derive kinetic parameters. However, this is computationally demanding and requires detailed knowledge of the enzyme mechanism. We have therefore sought to develop a general, simple and computationally efficient procedure to relate protein structural information to enzymatic kinetic parameters that allows consistency between the kinetic and structural information to be checked and estimation of kinetic constants for structurally and mechanistically similar enzymes. Results We describe qPIPSA: quantitative Protein Interaction Property Similarity Analysis. In this analysis, molecular interaction fields, for example, electrostatic potentials, are computed from the enzyme structures. Differences in molecular interaction fields between enzymes are then related to the ratios of their kinetic parameters. This procedure can be used to estimate unknown kinetic parameters when enzyme structural information is available and kinetic parameters have been measured for related enzymes or were obtained under different conditions. The detailed interaction of the enzyme with substrate or cofactors is not modeled and is assumed to be similar for all the proteins compared. The protein structure modeling protocol employed ensures that differences between models reflect genuine differences between the protein sequences, rather than random fluctuations in protein structure. 
    Conclusion Provided that the experimental conditions and the protein structural models refer to the same protein state or conformation, correlations between interaction fields and kinetic parameters can be established for sets of related enzymes. Outliers may arise due to variation in the importance of different contributions to the kinetic parameters, such as protein stability and conformational changes. The qPIPSA approach can assist in the validation as well as estimation of kinetic parameters, and provide insights into enzyme mechanism. PMID:17919319
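
    One way to quantify a "difference in molecular interaction fields" is sketched below in Python. It assumes the two fields have been sampled on the same grid points and uses the Hodgkin similarity index with a derived distance; whether qPIPSA uses exactly this form is an assumption here, and the function names and grid values are hypothetical.

```python
import math


def hodgkin_index(p1, p2):
    """Hodgkin similarity index between two fields sampled on the same grid.

    Ranges from -1 (anti-correlated) to 1 (identical fields).
    """
    if len(p1) != len(p2):
        raise ValueError("fields must be sampled on the same grid points")
    dot = sum(a * b for a, b in zip(p1, p2))
    norm = sum(a * a for a in p1) + sum(b * b for b in p2)
    return 2.0 * dot / norm


def field_distance(p1, p2):
    """Distance derived from the similarity index (0 for identical fields)."""
    return math.sqrt(max(0.0, 2.0 - 2.0 * hodgkin_index(p1, p2)))


# Made-up potential values at four grid points for two related enzymes.
enzyme_a = [0.5, -1.2, 0.3, 0.8]
enzyme_b = [0.4, -1.0, 0.1, 0.9]
print(hodgkin_index(enzyme_a, enzyme_b), field_distance(enzyme_a, enzyme_b))
```

    Distances of this kind could then be regressed against the logarithm of the ratio of measured kinetic parameters for the same enzyme pairs.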

  20. Systematic Validation of Protein Force Fields against Experimental Data

    PubMed Central

    Eastwood, Michael P.; Dror, Ron O.; Shaw, David E.

    2012-01-01

    Molecular dynamics simulations provide a vehicle for capturing the structures, motions, and interactions of biological macromolecules in full atomic detail. The accuracy of such simulations, however, is critically dependent on the force field—the mathematical model used to approximate the atomic-level forces acting on the simulated molecular system. Here we present a systematic and extensive evaluation of eight different protein force fields based on comparisons of experimental data with molecular dynamics simulations that reach a previously inaccessible timescale. First, through extensive comparisons with experimental NMR data, we examined the force fields' abilities to describe the structure and fluctuations of folded proteins. Second, we quantified potential biases towards different secondary structure types by comparing experimental and simulation data for small peptides that preferentially populate either helical or sheet-like structures. Third, we tested the force fields' abilities to fold two small proteins—one α-helical, the other with β-sheet structure. The results suggest that force fields have improved over time, and that the most recent versions, while not perfect, provide an accurate description of many structural and dynamical properties of proteins. PMID:22384157

  1. Structural analysis of a highly glycosylated and unliganded gp120-based antigen using mass spectrometry

    PubMed Central

    Wang, Liwen; Qin, Yali; Ilchenko, Serguei; Bohon, Jen; Shi, Wuxian; Cho, Michael W.; Takamoto, Keiji; Chance, Mark R.

    2010-01-01

    Structural characterization of the HIV envelope protein gp120 is very important to provide an understanding of the protein's immunogenicity and its binding to cell receptors. So far, crystallographic structure determination of gp120 with an intact V3 loop (in the absence of CD4 co-receptor or antibody) has not been achieved. The third variable region (V3) of the gp120 is immunodominant and contains glycosylation signatures that are essential for co-receptor binding and viral entry to T-cells. In this study, we characterized the structure of the outer domain of gp120 with an intact V3 loop (gp120-OD8) purified from Drosophila S2 cells utilizing mass spectrometry-based approaches. We mapped the glycosylation sites and calculated glycosylation occupancy of gp120-OD8; eleven sites from fifteen glycosylation motifs were determined as having high mannose or hybrid glycosylation structures. The specific glycan moieties of nine glycosylation sites from eight unique glycopeptides were determined by a combination of ECD and CID MS approaches. Hydroxyl radical-mediated protein footprinting coupled with mass spectrometry analysis was employed to provide detailed information on protein structure of gp120-OD8 by directly identifying accessible and hydroxyl radical-reactive side chain residues. Comparison of gp120-OD8 experimental footprinting data with a homology model derived from the ligated CD4/gp120-OD8 crystal structure revealed a flexible V3 loop structure where the V3 tip may provide contacts with the rest of the protein while residues in the V3 base remain solvent accessible. In addition, the data illustrate interactions between specific sugar moieties and amino acid side chains potentially important to the gp120-OD8 structure. PMID:20825246

  2. Homology modeling a fast tool for drug discovery: current perspectives.

    PubMed

    Vyas, V K; Ukawala, R D; Ghate, M; Chintha, C

    2012-01-01

    A major goal of structural biology involves the formation of protein-ligand complexes, in which the protein molecules act energetically in the course of binding. Therefore, an understanding of protein-ligand interactions is very important for structure-based drug design. Lack of knowledge of 3D structures has hindered efforts to understand the binding specificities of ligands with proteins. With the increase in modeling software and the growing number of known protein structures, homology modeling is rapidly becoming the method of choice for obtaining 3D coordinates of proteins. Homology modeling is a representation of the similarity of environmental residues at topologically corresponding positions in the reference proteins. In the absence of experimental data, model building on the basis of a known 3D structure of a homologous protein is at present the only reliable method to obtain structural information. Knowledge of the 3D structures of proteins provides invaluable insights into the molecular basis of their functions. Recent advances in homology modeling, particularly in detecting and aligning sequences with template structures, detecting distant homologues, modeling loops and side chains, and detecting errors in a model, have contributed to consistent prediction of protein structure, which was not possible even several years ago. This review focuses on the features and role of homology modeling in predicting protein structure and describes current developments in this field, with successful applications at different stages of drug design and discovery.

  3. Homology Modeling a Fast Tool for Drug Discovery: Current Perspectives

    PubMed Central

    Vyas, V. K.; Ukawala, R. D.; Ghate, M.; Chintha, C.

    2012-01-01

    A major goal of structural biology involves the formation of protein-ligand complexes, in which the protein molecules act energetically in the course of binding. Therefore, an understanding of protein-ligand interactions is very important for structure-based drug design. Lack of knowledge of 3D structures has hindered efforts to understand the binding specificities of ligands with proteins. With the increase in modeling software and the growing number of known protein structures, homology modeling is rapidly becoming the method of choice for obtaining 3D coordinates of proteins. Homology modeling is a representation of the similarity of environmental residues at topologically corresponding positions in the reference proteins. In the absence of experimental data, model building on the basis of a known 3D structure of a homologous protein is at present the only reliable method to obtain structural information. Knowledge of the 3D structures of proteins provides invaluable insights into the molecular basis of their functions. Recent advances in homology modeling, particularly in detecting and aligning sequences with template structures, detecting distant homologues, modeling loops and side chains, and detecting errors in a model, have contributed to consistent prediction of protein structure, which was not possible even several years ago. This review focuses on the features and role of homology modeling in predicting protein structure and describes current developments in this field, with successful applications at different stages of drug design and discovery. PMID:23204616

  4. Improved protein surface comparison and application to low-resolution protein structure data

    PubMed Central

    2010-01-01

    Background Recent advancements of experimental techniques for determining protein tertiary structures raise significant challenges for protein bioinformatics. With the number of known structures of unknown function expanding at a rapid pace, an urgent task is to provide reliable clues to their biological function on a large scale. Conventional approaches for structure comparison are not suitable for a real-time database search due to their slow speed. Moreover, a new challenge has arisen from recent techniques such as electron microscopy (EM), which provide low-resolution structure data. Previously, we have introduced a method for protein surface shape representation using the 3D Zernike descriptors (3DZDs). The 3DZD enables fast structure database searches, taking advantage of its rotation invariance and compact representation. The search results for protein surfaces represented with the 3DZD have shown good agreement with the existing structure classifications, but some discrepancies were also observed. Results Three surface representations are examined: a new representation of backbone atoms, the originally devised all-atom surface representation, and the combination of the all-atom surface with the backbone representation. All representations are encoded with the 3DZD. Also, we have investigated the applicability of the 3DZD for searching protein EM density maps of varying resolutions. The surface representations are evaluated on structure retrieval using two existing classifications, SCOP and the CE-based classification. Conclusions Overall, the 3DZDs representing backbone atoms show better retrieval performance than the original all-atom surface representation. The performance improved further when the two representations were combined. Moreover, we observed that the 3DZD is also powerful in comparing low-resolution structures obtained by electron microscopy. PMID:21172052

  5. Motivated Proteins: A web application for studying small three-dimensional protein motifs

    PubMed Central

    Leader, David P; Milner-White, E James

    2009-01-01

    Background Small loop-shaped motifs are common constituents of the three-dimensional structure of proteins. Typically they comprise between three and seven amino acid residues, and are defined by a combination of dihedral angles and hydrogen bonding partners. The most abundant of these are αβ-motifs, asx-motifs, asx-turns, β-bulges, β-bulge loops, β-turns, nests, niches, Schellmann loops, ST-motifs, ST-staples and ST-turns. We have constructed a database of such motifs from a range of high-quality protein structures and built a web application as a visual interface to this. Description The web application, Motivated Proteins, provides access to these 12 motifs (with 48 sub-categories) in a database of over 400 representative proteins. Queries can be made for specific categories or sub-categories of motif, motifs in the vicinity of ligands, motifs which include part of an enzyme active site, overlapping motifs, or motifs which include a particular amino acid sequence. Individual proteins can be specified, or, where appropriate, motifs for all proteins listed. The results of queries are presented in textual form as an (X)HTML table, and may be saved as parsable plain text or XML. Motifs can be viewed and manipulated either individually or in the context of the protein in the Jmol applet structural viewer. Cartoons of the motifs imposed on a linear representation of protein secondary structure are also provided. Summary information for the motifs is available, as are histograms of amino acid distribution, and graphs of dihedral angles at individual positions in the motifs. Conclusion Motivated Proteins is a publicly and freely accessible web application that enables protein scientists to study small three-dimensional motifs without requiring knowledge of either Structured Query Language or the underlying database schema. PMID:19210785

  6. Accelerated molecular dynamics simulations of protein folding.

    PubMed

    Miao, Yinglong; Feixas, Ferran; Eun, Changsun; McCammon, J Andrew

    2015-07-30

    Folding of four fast-folding proteins, including chignolin, Trp-cage, villin headpiece and WW domain, was simulated via accelerated molecular dynamics (aMD). In comparison with hundreds-of-microseconds timescale conventional molecular dynamics (cMD) simulations performed on the Anton supercomputer, aMD captured complete folding of the four proteins in significantly shorter simulation time. The folded protein conformations were found within 0.2-2.1 Å of the native NMR or X-ray crystal structures. Free energy profiles calculated through improved reweighting of the aMD simulations using cumulant expansion to the second-order are in good agreement with those obtained from cMD simulations. This allows us to identify distinct conformational states (e.g., unfolded and intermediate) other than the native structure and the protein folding energy barriers. Detailed analysis of protein secondary structures and local key residue interactions provided important insights into the protein folding pathways. Furthermore, the selections of force fields and aMD simulation parameters are discussed in detail. Our work shows the usefulness and accuracy of aMD in studying protein folding, providing basic references for using aMD in future protein-folding studies. © 2015 Wiley Periodicals, Inc.
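
    The boost and reweighting ideas named in this abstract can be illustrated with a minimal Python sketch. It assumes the commonly cited aMD boost form, dV = (E - V)^2 / (alpha + E - V) applied when the potential V lies below the threshold E, and approximates ln<exp(beta dV)> by its first two cumulants (mean and variance). The function names, threshold, alpha and energies below are made-up illustration values, not parameters from the paper.

```python
from statistics import fmean, pvariance


def boost(v, threshold, alpha):
    """Boost energy added when the potential v lies below the threshold."""
    if v >= threshold:
        return 0.0
    gap = threshold - v
    return gap * gap / (alpha + gap)


def log_reweight_factor(boosts, beta):
    """Second-order cumulant approximation of ln<exp(beta * dV)>.

    c1 is the mean boost and c2 its (population) variance.
    """
    c1 = fmean(boosts)
    c2 = pvariance(boosts)
    return beta * c1 + 0.5 * beta * beta * c2


# Hypothetical potential energies (kcal/mol) of frames falling in one bin.
potentials = [-105.0, -98.0, -110.0, -101.0]
boosts = [boost(v, threshold=-100.0, alpha=4.0) for v in potentials]
beta = 1.0 / (0.0019872 * 300.0)  # 1 / (k_B * T) with k_B in kcal/(mol K)
print(boosts, log_reweight_factor(boosts, beta))
```

    In a reweighted free energy profile, a factor of this kind would be evaluated per bin of the reaction coordinate and used to correct the biased population of that bin.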

  7. Structural insights into the inactivation of CRISPR-Cas systems by diverse anti-CRISPR proteins.

    PubMed

    Zhu, Yuwei; Zhang, Fan; Huang, Zhiwei

    2018-03-19

    A molecular arms race is progressively being unveiled between prokaryotes and viruses. Prokaryotes utilize CRISPR-mediated adaptive immune systems to kill the invading phages and mobile genetic elements, and in turn, the viruses evolve diverse anti-CRISPR proteins to fight back. The structures of several anti-CRISPR proteins have now been reported, and here we discuss their structural features, with a particular emphasis on topology, to discover their similarities and differences. We summarize the CRISPR-Cas inhibition mechanisms of these anti-CRISPR proteins in their structural context. Considering anti-CRISPRs in this way will provide important clues for studying their origin and evolution.

  8. Computational Simulation of the Activation Cycle of Gα Subunit in the G Protein Cycle Using an Elastic Network Model

    PubMed Central

    Kim, Min Hyeok; Kim, Young Jin; Kim, Hee Ryung; Jeon, Tae-Joon; Choi, Jae Boong; Chung, Ka Young; Kim, Moon Ki

    2016-01-01

    Agonist-activated G protein-coupled receptors (GPCRs) interact with GDP-bound G protein heterotrimers (Gαβγ) promoting GDP/GTP exchange, which results in dissociation of Gα from the receptor and Gβγ. The GTPase activity of Gα hydrolyzes GTP to GDP, and the GDP-bound Gα interacts with Gβγ, forming a GDP-bound G protein heterotrimer. The G protein cycle is allosterically modulated by conformational changes of the Gα subunit. Although biochemical and biophysical methods have elucidated the structure and dynamics of Gα, the precise conformational mechanisms underlying the G protein cycle are not fully understood yet. Simulation methods could help to provide additional details to gain further insight into G protein signal transduction mechanisms. In this study, using the available X-ray crystal structures of Gα, we simulated the entire G protein cycle and described not only the steric features of the Gα structure, but also conformational changes at each step. Each reference structure in the G protein cycle was modeled as an elastic network model and subjected to normal mode analysis. Our simulation data suggests that activated receptors trigger conformational changes of the Gα subunit that are thermodynamically favorable for opening of the nucleotide-binding pocket and GDP release. Furthermore, the effects of GTP binding and hydrolysis on mobility changes of the C and N termini and switch regions are elucidated. In summary, our simulation results enabled us to provide detailed descriptions of the structural and dynamic features of the G protein cycle. PMID:27483005
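
    The first step of an elastic network model, connecting residues that lie within a cutoff distance, can be sketched in plain Python as follows. This is a Gaussian-network-style connectivity (Kirchhoff) matrix built from C-alpha coordinates; the cutoff, the function name and the toy coordinates are assumptions, and the specific network model used in the study may differ. Normal mode analysis would additionally require diagonalising this matrix with a linear algebra library, which is omitted here.

```python
import math


def kirchhoff(coords, cutoff=7.0):
    """Connectivity matrix: -1 for residue pairs within cutoff, degree on the diagonal."""
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1.0
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


# Hypothetical straight chain of four C-alpha atoms spaced 3.8 angstroms apart.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
for row in kirchhoff(chain):
    print(row)
```

    Each row sums to zero by construction, which corresponds to the zero-frequency rigid-body mode that is discarded before the internal modes are interpreted.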

  9. Genetics Home Reference: early-onset myopathy with fatal cardiomyopathy

    MedlinePlus

    ... they are made of proteins that generate the mechanical force needed for muscles to contract. Titin has several functions within sarcomeres. One of this protein's most important jobs is to provide structure, flexibility, and stability to these cell structures. Titin ...

  10. Protein-based hydrogels for tissue engineering

    PubMed Central

    Schloss, Ashley C.; Williams, Danielle M.; Regan, Lynne J.

    2017-01-01

    The tunable mechanical and structural properties of protein-based hydrogels make them excellent scaffolds for tissue engineering and repair. Moreover, using protein-based components provides the option to insert sequences associated with promoting both cellular adhesion to the substrate and overall cell growth. Protein-based hydrogel components are appealing for their structural designability, specific biological functionality, and stimuli-responsiveness. Here we present highlights in the field of protein-based hydrogels for tissue engineering applications including design requirements, components, and gel types. PMID:27677513

  11. Biotechnology Protein Expression and Purification Facility

    NASA Technical Reports Server (NTRS)

    2003-01-01

    The purpose of the Project Scientist Core Facility is to provide purified proteins, both recombinant and natural, to the Biotechnology Science Team Project Scientists and the NRA-Structural Biology Test Investigators. Having a core facility for this purpose obviates the need for each scientist to develop the necessary expertise and equipment for molecular biology, protein expression, and protein purification. Because of this, they are able to focus their energies as well as their funding on the crystallization and structure determination of their target proteins.

  12. RStrucFam: a web server to associate structure and cognate RNA for RNA-binding proteins from sequence information.

    PubMed

    Ghosh, Pritha; Mathew, Oommen K; Sowdhamini, Ramanathan

    2016-10-07

    RNA-binding proteins (RBPs) interact with their cognate RNA(s) to form large biomolecular assemblies. They are versatile in their functionality and are involved in a myriad of processes inside the cell. RBPs with similar structural features and common biological functions are grouped together into families and superfamilies. It will be useful to obtain an early understanding and association of RNA-binding property of sequences of gene products. Here, we report a web server, RStrucFam, to predict the structure, type of cognate RNA(s) and function(s) of proteins, where possible, from mere sequence information. The web server employs Hidden Markov Model scan (hmmscan) to enable association to a back-end database of structural and sequence families. The database (HMMRBP) comprises 437 HMMs of RBP families of known structure that have been generated using structure-based sequence alignments and 746 sequence-centric RBP family HMMs. The input protein sequence is associated with structural or sequence domain families, if structure or sequence signatures exist. In case of association of the protein with a family of known structures, output features such as a multiple structure-based sequence alignment (MSSA) of the query with all other members of that family are provided. Further, cognate RNA partner(s) for that protein, Gene Ontology (GO) annotations, if any, and a homology model of the protein can be obtained. The users can also browse through the database for details pertaining to each family, protein or RNA and their related information based on keyword search or RNA motif search. RStrucFam is a web server that exploits structurally conserved features of RBPs, derived from known family members and imprinted in mathematical profiles, to predict putative RBPs from sequence information. Proteins that fail to associate with such structure-centric families are further queried against the sequence-centric RBP family HMMs in the HMMRBP database.
    Further, all other essential information pertaining to an RBP, such as overall function annotations, is provided. The web server can be accessed at the following link: http://caps.ncbs.res.in/rstrucfam .

  13. Systems biology of the structural proteome.

    PubMed

    Brunk, Elizabeth; Mih, Nathan; Monk, Jonathan; Zhang, Zhen; O'Brien, Edward J; Bliven, Spencer E; Chen, Ke; Chang, Roger L; Bourne, Philip E; Palsson, Bernhard O

    2016-03-11

    The success of genome-scale models (GEMs) can be attributed to the high-quality, bottom-up reconstructions of metabolic, protein synthesis, and transcriptional regulatory networks on an organism-specific basis. Such reconstructions are biochemically, genetically, and genomically structured knowledge bases that can be converted into a mathematical format to enable a myriad of computational biological studies. In recent years, genome-scale reconstructions have been extended to include protein structural information, which has opened up new vistas in systems biology research and empowered applications in structural systems biology and systems pharmacology. Here, we present the generation, application, and dissemination of genome-scale models with protein structures (GEM-PRO) for Escherichia coli and Thermotoga maritima. We show the utility of integrating molecular scale analyses with systems biology approaches by discussing several comparative analyses on the temperature dependence of growth, the distribution of protein fold families, substrate specificity, and characteristic features of whole cell proteomes. Finally, to aid in the grand challenge of big data to knowledge, we provide several explicit tutorials of how protein-related information can be linked to genome-scale models in a public GitHub repository ( https://github.com/SBRG/GEMPro/tree/master/GEMPro_recon/). Translating genome-scale, protein-related information to structured data in the format of a GEM provides a direct mapping of gene to gene-product to protein structure to biochemical reaction to network states to phenotypic function. Integration of molecular-level details of individual proteins, such as their physical, chemical, and structural properties, further expands the description of biochemical network-level properties, and can ultimately influence how to model and predict whole cell phenotypes as well as perform comparative systems biology approaches to study differences between organisms. 
    GEM-PRO offers insight into the physical embodiment of an organism's genotype, and its use in this comparative framework enables exploration of adaptive strategies for these organisms, opening the door to many new lines of research. With these provided tools, tutorials, and background, the reader will be in a position to run GEM-PRO for their own purposes.

  14. Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: I. Method development.

    PubMed

    Bandyopadhyay, Deepak; Huan, Jun; Prins, Jan; Snoeyink, Jack; Wang, Wei; Tropsha, Alexander

    2009-11-01

    Protein function prediction is one of the central problems in computational biology. We present a novel automated protein structure-based function prediction method using libraries of local residue packing patterns that are common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph where residue vertices (residue name used as a vertex label) are connected by geometrical proximity edges. The approach employs two steps. First, it uses a fast subgraph mining algorithm to find all occurrences of family-specific labeled subgraphs for all well characterized protein structural and functional families. Second, it queries a new structure for occurrences of a set of motifs characteristic of a known family, using a graph index to speed up Ullman's subgraph isomorphism algorithm. The confidence of function inference from structure depends on the number of family-specific motifs found in the query structure compared with their distribution in a large non-redundant database of proteins. This method can assign a new structure to a specific functional family in cases where sequence alignments, sequence patterns, structural superposition and active site templates fail to provide accurate annotation.
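
    The graph representation described in this abstract, with residue names as vertex labels and geometrical proximity as edges, can be sketched in Python as follows. A simple distance cutoff between one representative point per residue stands in for the proximity criterion; that criterion, the cutoff value, the function names and the toy coordinates are all assumptions for illustration and may differ from the authors' edge definition.

```python
import math


def proximity_graph(residues, cutoff=8.0):
    """Build a labelled graph from (residue_name, (x, y, z)) tuples.

    Returns a dict of vertex labels and a set of undirected edges (i, j), i < j.
    """
    labels = {i: name for i, (name, _) in enumerate(residues)}
    edges = set()
    for i in range(len(residues)):
        for j in range(i + 1, len(residues)):
            if math.dist(residues[i][1], residues[j][1]) <= cutoff:
                edges.add((i, j))
    return labels, edges


def labelled_edges(labels, edges):
    """Edges expressed as sorted pairs of residue names, for motif counting."""
    return sorted(tuple(sorted((labels[i], labels[j]))) for i, j in edges)


# Hypothetical coordinates: three nearby residues and one distant residue.
toy = [
    ("ASP", (0.0, 0.0, 0.0)),
    ("HIS", (5.0, 0.0, 0.0)),
    ("SER", (0.0, 6.0, 0.0)),
    ("GLY", (30.0, 0.0, 0.0)),
]
labels, edges = proximity_graph(toy)
print(labelled_edges(labels, edges))
```

    A subgraph mining step of the kind the abstract describes would then search many such graphs for labelled subgraphs that recur within a family.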

  15. Computer Simulations of Intrinsically Disordered Proteins

    NASA Astrophysics Data System (ADS)

    Chong, Song-Ho; Chatterjee, Prathit; Ham, Sihyun

    2017-05-01

    The investigation of intrinsically disordered proteins (IDPs) is a new frontier in structural and molecular biology that requires a new paradigm to connect structural disorder to function. Molecular dynamics simulations and statistical thermodynamics potentially offer ideal tools for atomic-level characterizations and thermodynamic descriptions of this fascinating class of proteins that will complement experimental studies. However, IDPs display sensitivity to inaccuracies in the underlying molecular mechanics force fields. Thus, achieving an accurate structural characterization of IDPs via simulations is a challenge. It is also daunting to perform a configuration-space integration over heterogeneous structural ensembles sampled by IDPs to extract, in particular, protein configurational entropy. In this review, we summarize recent efforts devoted to the development of force fields and the critical evaluations of their performance when applied to IDPs. We also survey recent advances in computational methods for protein configurational entropy that aim to provide a thermodynamic link between structural disorder and protein activity.

  16. Crystal structure of mitochondrial respiratory membrane protein complex II.

    PubMed

    Sun, Fei; Huo, Xia; Zhai, Yujia; Wang, Aojin; Xu, Jianxing; Su, Dan; Bartlam, Mark; Rao, Zihe

    2005-07-01

    The mitochondrial respiratory Complex II or succinate:ubiquinone oxidoreductase (SQR) is an integral membrane protein complex in both the tricarboxylic acid cycle and aerobic respiration. Here we report the first crystal structure of Complex II from porcine heart at 2.4 Å resolution and its complex structure with inhibitors 3-nitropropionate and 2-thenoyltrifluoroacetone (TTFA) at 3.5 Å resolution. Complex II is comprised of two hydrophilic proteins, flavoprotein (Fp) and iron-sulfur protein (Ip), and two transmembrane proteins (CybL and CybS), as well as prosthetic groups required for electron transfer from succinate to ubiquinone. The structure correlates the protein environments around prosthetic groups with their unique midpoint redox potentials. Two ubiquinone binding sites are discussed and elucidated by TTFA binding. The Complex II structure provides a bona fide model for study of the mitochondrial respiratory system and human mitochondrial diseases related to mutations in this complex.

  17. Protein purification and crystallization artifacts: The tale usually not told.

    PubMed

    Niedzialkowska, Ewa; Gasiorowska, Olga; Handing, Katarzyna B; Majorek, Karolina A; Porebski, Przemyslaw J; Shabalin, Ivan G; Zasadzinska, Ewelina; Cymborowski, Marcin; Minor, Wladek

    2016-03-01

    The misidentification of a protein sample, or contamination of a sample with the wrong protein, may be a potential reason for the non-reproducibility of experiments. This problem may occur in the process of heterologous overexpression and purification of recombinant proteins, as well as purification of proteins from natural sources. If the contaminated or misidentified sample is used for crystallization, in many cases the problem may not be detected until structures are determined. In the case of functional studies, the problem may not be detected for years. Here several procedures that can be successfully used for the identification of crystallized protein contaminants, including: (i) a lattice parameter search against known structures, (ii) sequence or fold identification from partially built models, and (iii) molecular replacement with common contaminants as search templates have been presented. A list of common contaminant structures to be used as alternative search models was provided. These methods were used to identify four cases of purification and crystallization artifacts. This report provides troubleshooting pointers for researchers facing difficulties in phasing or model building. © 2016 The Protein Society.

  18. Restricted mobility of side chains on concave surfaces of solenoid proteins may impart heightened potential for intermolecular interactions.

    PubMed

    Ramya, L; Gautham, N; Chaloin, Laurent; Kajava, Andrey V

    2015-09-01

    Significant progress has been made in the determination of protein structures, with their number today surpassing one hundred thousand. The next challenge is the understanding and prediction of protein-protein and protein-ligand interactions. In this work we address this problem by analyzing curved solenoid proteins. Many of these proteins are considered "hub molecules" for their high potential to interact with many different molecules and to be a scaffold for multisubunit protein machineries. Our analysis of these structures through molecular dynamics simulations reveals that the mobility of the side chains on the concave surfaces of the solenoids is lower than on the convex ones. This result provides an explanation for the observed preferential binding of ligands, including small and flexible ligands, to the concave surface of the curved solenoid proteins. The relationship between the landscapes and dynamic properties of the protein surfaces can be further generalized to other types of protein structures and eventually used in computer algorithms, allowing prediction of protein-ligand interactions by analysis of protein surfaces. © 2015 Wiley Periodicals, Inc.

  19. Investigating the effect of an arterial hypertension drug on the structural properties of plasma protein.

    PubMed

    Hassan, Natalia; Maldonado-Valderrama, Julia; Gunning, A Patrick; Morris, V J; Ruso, Juan M

    2011-10-15

    Propranolol is a beta-blocker drug used in the treatment of diseases related to arterial hypertension. In order to achieve an optimal performance of this drug it is important to consider the possible interactions of propranolol with plasma proteins. In this work, we have used several experimental techniques to characterise the effect of addition of the beta-blocker propranolol on the properties of bovine plasma fibrinogen (FB). Differential scanning calorimetry (DSC), circular dichroism (CD), dynamic light scattering (DLS), surface tension techniques and atomic force microscopy (AFM) measurements have been combined to carry out a detailed physicochemical and surface characterization of the mixed system. DSC measurements show that propranolol can play two opposite roles, either acting as a structure stabilizer at low molar concentrations or as a structure destabilizer at higher concentrations, in different domains of fibrinogen. CD measurements have revealed that the effect of propranolol on the secondary structure of fibrinogen depends on the temperature and the drug concentration, and the DLS analysis showed evidence for protein aggregation. Interestingly, surface tension measurements provided further evidence of the conformational change induced by propranolol on the secondary structure of FB by markedly increasing the surface tension of the system. Finally, AFM imaging of the fibrinogen system provided direct visualization of the protein structure in the presence of propranolol. Combination of these techniques has produced complementary information on the behavior of the mixed system, providing new insights into the structural properties of proteins with potential medical interest. Copyright © 2011 Elsevier B.V. All rights reserved.

  20. Duck hepatitis A virus structural proteins expressed in insect cells self-assemble into virus-like particles with strong immunogenicity in ducklings.

    PubMed

    Wang, Anping; Gu, Lingling; Wu, Shuang; Zhu, Shanyuan

    2018-02-01

    Duck hepatitis A virus (DHAV), a non-enveloped ssRNA virus, can cause a highly contagious disease in young ducklings. The three capsid proteins VP0, VP1 and VP3 are translated within a single large open reading frame (ORF) and hydrolyzed by protease 3CD. However, little is known about whether the recombinant viral structural proteins (VPs) expressed in insect cells could spontaneously assemble into virus-like particles (VLPs) and whether these VLPs could induce protective immunity in young ducklings. To address these issues, the structural polyprotein precursor gene P1 and the protease gene 3CD were amplified by PCR, and the recombinant proteins were expressed in insect cells using a baculovirus expression system for the characterization of their structures and immunogenicity. The recombinant proteins expressed in Sf9 cells were detected by indirect immunofluorescence assay and Western blot analysis. Electron microscopy showed that the recombinant proteins spontaneously assembled into VLPs in insect cells. Western blot analysis of the purified VLPs revealed that the VLPs were composed of the three structural proteins. In addition, vaccination with the VLPs induced a high humoral immune response and provided strong protection. Therefore, our findings may provide a framework for development of new vaccines for the prevention of duck viral hepatitis. Copyright © 2018 Elsevier B.V. All rights reserved.

  1. A constraint logic programming approach to associate 1D and 3D structural components for large protein complexes.

    PubMed

    Dal Palù, Alessandro; Pontelli, Enrico; He, Jing; Lu, Yonggang

    2007-01-01

    The paper describes a novel framework, constructed using Constraint Logic Programming (CLP) and parallelism, to determine the association between parts of the primary sequence of a protein and alpha-helices extracted from 3D low-resolution descriptions of large protein complexes. The association is determined by extracting constraints from the 3D information, regarding length, relative position and connectivity of helices, and solving these constraints with the guidance of a secondary structure prediction algorithm. Parallelism is employed to enhance performance on large proteins. The framework provides a fast, inexpensive alternative to determine the exact tertiary structure of unknown proteins.

  2. Molecular details of secretory phospholipase A2 from flax (Linum usitatissimum L.) provide insight into its structure and function.

    PubMed

    Gupta, Payal; Dash, Prasanta K

    2017-09-11

    Secretory phospholipases A2 (sPLA2) are low molecular weight proteins (12-18 kDa) involved in a suite of plant cellular processes imparting growth and development. Despite their myriad roles in physiological and biochemical processes in plants, detailed analysis of sPLA2 in flax/linseed is meagre. The present work, the first in flax, embodies cloning, expression, purification and molecular characterisation of two distinct sPLA2s (I and II) from flax. PLA2 activity of the cloned sPLA2s was biochemically assayed, authenticating them as bona fide phospholipases A2. Physicochemical properties of both sPLA2s revealed that they are thermostable proteins requiring divalent cations for optimum activity. While structural analysis of both proteins revealed deviations in the amino acid sequence at the C- and N-terminal regions, hydropathic study revealed LusPLA2I as a hydrophobic protein and LusPLA2II as a hydrophilic protein. Structural analysis of flax sPLA2s revealed that the secondary structure of both proteins is dominated by α-helix followed by random coils. Modular superimposition of LusPLA2 isoforms with rice sPLA2 confirmed monomeric structural preservation among plant phospholipases A2 and provided insight into the structure of folded flax sPLA2s.

  3. Lipidic cubic phase injector facilitates membrane protein serial femtosecond crystallography.

    PubMed

    Weierstall, Uwe; James, Daniel; Wang, Chong; White, Thomas A; Wang, Dingjie; Liu, Wei; Spence, John C H; Bruce Doak, R; Nelson, Garrett; Fromme, Petra; Fromme, Raimund; Grotjohann, Ingo; Kupitz, Christopher; Zatsepin, Nadia A; Liu, Haiguang; Basu, Shibom; Wacker, Daniel; Han, Gye Won; Katritch, Vsevolod; Boutet, Sébastien; Messerschmidt, Marc; Williams, Garth J; Koglin, Jason E; Marvin Seibert, M; Klinker, Markus; Gati, Cornelius; Shoeman, Robert L; Barty, Anton; Chapman, Henry N; Kirian, Richard A; Beyerlein, Kenneth R; Stevens, Raymond C; Li, Dianfan; Shah, Syed T A; Howe, Nicole; Caffrey, Martin; Cherezov, Vadim

    2014-01-01

    Lipidic cubic phase (LCP) crystallization has proven successful for high-resolution structure determination of challenging membrane proteins. Here we present a technique for extruding gel-like LCP with embedded membrane protein microcrystals, providing a continuously renewed source of material for serial femtosecond crystallography. Data collected from sub-10-μm-sized crystals produced with less than 0.5 mg of purified protein yield structural insights regarding cyclopamine binding to the Smoothened receptor.

  4. Natural antigenic differences in the functionally equivalent extracellular DNABII proteins of bacterial biofilms provide a means for targeted biofilm therapeutics.

    PubMed

    Rocco, C J; Davey, M E; Bakaletz, L O; Goodman, S D

    2017-04-01

    Bacteria that persist in the oral cavity exist within complex biofilm communities. A hallmark of biofilms is the presence of an extracellular polymeric substance (EPS), which consists of polysaccharides, extracellular DNA (eDNA), and proteins, including the DNABII family of proteins. The removal of DNABII proteins from a biofilm results in the loss of structural integrity of the eDNA and the collapse of the biofilm structure. We examined the role of DNABII proteins in the biofilm structure of the periodontal pathogen Porphyromonas gingivalis and the oral commensal Streptococcus gordonii. Co-aggregation with oral streptococci is thought to facilitate the establishment of P. gingivalis within the biofilm community. We demonstrate that DNABII proteins are present in the EPS of both S. gordonii and P. gingivalis biofilms, and that these biofilms can be disrupted through the addition of antisera derived against their respective DNABII proteins. We provide evidence that both eDNA and DNABII proteins are limiting in S. gordonii but not in P. gingivalis biofilms. In addition, these proteins are capable of complementing one another functionally. We also found that whereas antisera derived against most DNABII proteins are capable of binding a wide variety of DNABII proteins, the P. gingivalis DNABII proteins are antigenically distinct. The presence of DNABII proteins in the EPS of these biofilms and the antigenic uniqueness of the P. gingivalis proteins provide an opportunity to develop therapies that are targeted to remove P. gingivalis and biofilms that contain P. gingivalis from the oral cavity. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  5. Computational investigation of the human SOD1 mutant, Cys146Arg, that directs familial amyotrophic lateral sclerosis.

    PubMed

    Srinivasan, E; Rajasekaran, R

    2017-07-25

    The genetic substitution mutation of Cys146Arg in the SOD1 protein is predominantly found in the Japanese population suffering from familial amyotrophic lateral sclerosis (FALS). A complete study of the biophysical aspects of this particular missense mutation through conformational analysis and producing free energy landscapes could provide an insight into the pathogenic mechanism of ALS disease. In this study, we utilized general molecular dynamics simulations along with computational predictions to assess the structural characterization of the protein as well as the conformational preferences of monomeric wild type and mutant SOD1. Our static analysis, accomplished through multiple programs, predicted the deleterious and destabilizing effect of mutant SOD1. Subsequently, comparative molecular dynamics studies performed on the wild type and mutant SOD1 indicated a loss in the protein conformational stability and flexibility. We observed the mutational consequences not only in local but also in long-range variations in the structural properties of the SOD1 protein. Long-range intramolecular protein interactions decrease upon mutation, resulting in less compact structures in the mutant protein than in the wild type, suggesting that the mutant structures are less stable than the wild type SOD1. We also presented the free energy landscape to study the collective motion of protein conformations through principal component analysis for the wild type and mutant SOD1. Overall, the study assisted in revealing the cause of the structural destabilization and protein misfolding via structural characterization, secondary structure composition and free energy landscapes. Hence, the computational framework in our study provides a valuable direction for the search for a cure for fatal FALS.
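
    The abstract above mentions free energy landscapes obtained through principal component analysis but gives no formula. The following is a minimal sketch of one common construction, assuming Boltzmann inversion F = -k_B T ln(P / P_max) of a 2D histogram over the first two principal components. The function name, the per-frame feature matrix, the bin count and the temperature are illustrative assumptions, not the authors' protocol.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_landscape(features, temperature=300.0, bins=30):
    """Hypothetical helper: project frames onto PC1/PC2 and Boltzmann-invert.

    features: array of shape (n_frames, n_features), e.g. flattened
    coordinates of aligned trajectory frames (an assumption for this sketch).
    Returns the free energy grid (kcal/mol) and the histogram bin edges.
    """
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)                      # center each feature

    # PCA via eigendecomposition of the covariance matrix.
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    top2 = np.argsort(evals)[::-1][:2]          # indices of the two largest
    pcs = X @ evecs[:, top2]                    # projections onto PC1 and PC2

    # Populations of the (PC1, PC2) bins.
    hist, xedges, yedges = np.histogram2d(pcs[:, 0], pcs[:, 1], bins=bins)
    prob = hist / hist.sum()

    # Boltzmann inversion; unvisited bins get +inf free energy.
    with np.errstate(divide="ignore"):
        F = -K_B * temperature * np.log(prob / prob.max())
    return F, xedges, yedges
```

    The most populated bin is assigned zero free energy by construction, so values on the grid read as free energy relative to the dominant conformational basin.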

  6. Surface layer protein characterization by small angle x-ray scattering and a fractal mean force concept: from protein structure to nanodisk assemblies.

    PubMed

    Horejs, Christine; Pum, Dietmar; Sleytr, Uwe B; Peterlik, Herwig; Jungbauer, Alois; Tscheliessnig, Rupert

    2010-11-07

    Surface layers (S-layers) are the most commonly observed cell surface structure of prokaryotic organisms. They are made up of proteins that spontaneously self-assemble into functional crystalline lattices in solution, on various solid surfaces, and interfaces. While classical experimental techniques failed to recover a complete structural model of an unmodified S-layer protein, small angle x-ray scattering (SAXS) provides an opportunity to study the structure of S-layer monomers in solution and of self-assembled two-dimensional sheets. For the protein under investigation we recently suggested an atomistic structural model by the use of molecular dynamics simulations. This structural model is now refined on the basis of SAXS data together with a fractal assembly approach. Here we show that a nondiluted critical system of proteins, which crystallize into monomolecular structures, might be analyzed by SAXS if protein-protein interactions are taken into account by relating a fractal local density distribution to a fractal local mean potential, which has to fulfill the Poisson equation. The present work demonstrates an important step into the elucidation of the structure of S-layers and offers a tool to analyze the structure of self-assembling systems in solution by means of SAXS and computer simulations.

  7. Surface layer protein characterization by small angle x-ray scattering and a fractal mean force concept: From protein structure to nanodisk assemblies

    NASA Astrophysics Data System (ADS)

    Horejs, Christine; Pum, Dietmar; Sleytr, Uwe B.; Peterlik, Herwig; Jungbauer, Alois; Tscheliessnig, Rupert

    2010-11-01

    Surface layers (S-layers) are the most commonly observed cell surface structure of prokaryotic organisms. They are made up of proteins that spontaneously self-assemble into functional crystalline lattices in solution, on various solid surfaces, and interfaces. While classical experimental techniques failed to recover a complete structural model of an unmodified S-layer protein, small angle x-ray scattering (SAXS) provides an opportunity to study the structure of S-layer monomers in solution and of self-assembled two-dimensional sheets. For the protein under investigation we recently suggested an atomistic structural model by the use of molecular dynamics simulations. This structural model is now refined on the basis of SAXS data together with a fractal assembly approach. Here we show that a nondiluted critical system of proteins, which crystallize into monomolecular structures, might be analyzed by SAXS if protein-protein interactions are taken into account by relating a fractal local density distribution to a fractal local mean potential, which has to fulfill the Poisson equation. The present work demonstrates an important step into the elucidation of the structure of S-layers and offers a tool to analyze the structure of self-assembling systems in solution by means of SAXS and computer simulations.

  8. BAYESIAN PROTEIN STRUCTURE ALIGNMENT.

    PubMed

    Rodriguez, Abel; Schmidler, Scott C

    The analysis of the three-dimensional structure of proteins is an important topic in molecular biochemistry. Structure plays a critical role in defining the function of proteins and is more strongly conserved than amino acid sequence over evolutionary timescales. A key challenge is the identification and evaluation of structural similarity between proteins; such analysis can aid in understanding the role of newly discovered proteins and help elucidate evolutionary relationships between organisms. Computational biologists have developed many clever algorithmic techniques for comparing protein structures, however, all are based on heuristic optimization criteria, making statistical interpretation somewhat difficult. Here we present a fully probabilistic framework for pairwise structural alignment of proteins. Our approach has several advantages, including the ability to capture alignment uncertainty and to estimate key "gap" parameters which critically affect the quality of the alignment. We show that several existing alignment methods arise as maximum a posteriori estimates under specific choices of prior distributions and error models. Our probabilistic framework is also easily extended to incorporate additional information, which we demonstrate by including primary sequence information to generate simultaneous sequence-structure alignments that can resolve ambiguities obtained using structure alone. This combined model also provides a natural approach for the difficult task of estimating evolutionary distance based on structural alignments. The model is illustrated by comparison with well-established methods on several challenging protein alignment examples.

  9. Surface layer protein characterization by small angle x-ray scattering and a fractal mean force concept: From protein structure to nanodisk assemblies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Horejs, Christine; Pum, Dietmar; Sleytr, Uwe B.

    2010-11-07

    Surface layers (S-layers) are the most commonly observed cell surface structure of prokaryotic organisms. They are made up of proteins that spontaneously self-assemble into functional crystalline lattices in solution, on various solid surfaces, and interfaces. While classical experimental techniques failed to recover a complete structural model of an unmodified S-layer protein, small angle x-ray scattering (SAXS) provides an opportunity to study the structure of S-layer monomers in solution and of self-assembled two-dimensional sheets. For the protein under investigation we recently suggested an atomistic structural model by the use of molecular dynamics simulations. This structural model is now refined on the basis of SAXS data together with a fractal assembly approach. Here we show that a nondiluted critical system of proteins, which crystallize into monomolecular structures, might be analyzed by SAXS if protein-protein interactions are taken into account by relating a fractal local density distribution to a fractal local mean potential, which has to fulfill the Poisson equation. The present work demonstrates an important step into the elucidation of the structure of S-layers and offers a tool to analyze the structure of self-assembling systems in solution by means of SAXS and computer simulations.

  10. Challenging the state-of-the-art in protein structure prediction: Highlights of experimental target structures for the 10th Critical Assessment of Techniques for Protein Structure Prediction Experiment CASP10

    PubMed Central

    Kryshtafovych, Andriy; Moult, John; Bales, Patrick; Bazan, J. Fernando; Biasini, Marco; Burgin, Alex; Chen, Chen; Cochran, Frank V.; Craig, Timothy K.; Das, Rhiju; Fass, Deborah; Garcia-Doval, Carmela; Herzberg, Osnat; Lorimer, Donald; Luecke, Hartmut; Ma, Xiaolei; Nelson, Daniel C.; van Raaij, Mark J.; Rohwer, Forest; Segall, Anca; Seguritan, Victor; Zeth, Kornelius; Schwede, Torsten

    2014-01-01

    For the last two decades, CASP has assessed the state of the art in techniques for protein structure prediction and identified areas which required further development. CASP would not have been possible without the prediction targets provided by the experimental structural biology community. In the latest experiment, CASP10, over 100 structures were suggested as prediction targets, some of which appeared to be extraordinarily difficult for modeling. In this paper, authors of some of the most challenging targets discuss which specific scientific question motivated the experimental structure determination of the target protein, which structural features were especially interesting from a structural or functional perspective, and to what extent these features were correctly reproduced in the predictions submitted to CASP10. Specifically, the following targets will be presented: the acid-gated urea channel, a difficult to predict trans-membrane protein from the important human pathogen Helicobacter pylori; the structure of human interleukin IL-34, a recently discovered helical cytokine; the structure of a functionally uncharacterized enzyme OrfY from Thermoproteus tenax formed by a gene duplication and a novel fold; an ORFan domain of mimivirus sulfhydryl oxidase R596; the fibre protein gp17 from bacteriophage T7; the Bacteriophage CBA-120 tailspike protein; a virus coat protein from metagenomic samples of the marine environment; and finally an unprecedented class of structure prediction targets based on engineered disulfide-rich small proteins. PMID:24318984

  11. The fuzzy oil drop model, based on hydrophobicity density distribution, generalizes the influence of water environment on protein structure and function.

    PubMed

    Banach, Mateusz; Konieczny, Leszek; Roterman, Irena

    2014-10-21

    In this paper we show that the fuzzy oil drop model represents a general framework for describing the generation of hydrophobic cores in proteins and thus provides insight into the influence of the water environment upon protein structure and stability. The model has been successfully applied in the study of a wide range of proteins; however, this paper focuses specifically on domains representing immunoglobulin-like folds. Here we provide evidence that immunoglobulin-like domains, despite being structurally similar, differ with respect to their participation in the generation of the hydrophobic core. It is shown that β-structural fragments in β-barrels participate in hydrophobic core formation in a highly differentiated manner. Quantitatively measured participation in core formation helps explain the variable stability of proteins and is shown to be related to their biological properties. This also includes the known tendency of immunoglobulin domains to form amyloids, as shown using transthyretin to reveal the clear relation between amyloidogenic properties and structural characteristics based on the fuzzy oil drop model. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.

  12. Protein Data Bank Japan (PDBj): maintaining a structural data archive and resource description framework format

    PubMed Central

    Kinjo, Akira R.; Suzuki, Hirofumi; Yamashita, Reiko; Ikegawa, Yasuyo; Kudou, Takahiro; Igarashi, Reiko; Kengaku, Yumiko; Cho, Hasumi; Standley, Daron M.; Nakagawa, Atsushi; Nakamura, Haruki

    2012-01-01

    The Protein Data Bank Japan (PDBj, http://pdbj.org) is a member of the worldwide Protein Data Bank (wwPDB) and accepts and processes the deposited data of experimentally determined macromolecular structures. While maintaining the archive in collaboration with other wwPDB partners, PDBj also provides a wide range of services and tools for analyzing structures and functions of proteins, which are summarized in this article. To enhance the interoperability of the PDB data, we have recently developed PDB/RDF, PDB data in the Resource Description Framework (RDF) format, along with its ontology in the Web Ontology Language (OWL) based on the PDB mmCIF Exchange Dictionary. Being in the standard format for the Semantic Web, the PDB/RDF data provide a means to integrate the PDB with other biological information resources. PMID:21976737

  13. Teaching Foundational Topics and Scientific Skills in Biochemistry within the Conceptual Framework of HIV Protease

    ERIC Educational Resources Information Center

    Johnson, R. Jeremy

    2014-01-01

    HIV protease has served as a model protein for understanding protein structure, enzyme kinetics, structure-based drug design, and protein evolution. Inhibitors of HIV protease are also an essential part of effective HIV/AIDS treatment and have provided great societal benefits. The broad applications for HIV protease and its inhibitors make it a…

  14. Structure Prediction and Analysis of DNA Transposon and LINE Retrotransposon Proteins

    PubMed Central

    Abrusán, György; Zhang, Yang; Szilágyi, András

    2013-01-01

    Despite the considerable amount of research on transposable elements, no large-scale structural analyses of the TE proteome have been performed so far. We predicted the structures of hundreds of proteins from a representative set of DNA and LINE transposable elements and used the obtained structural data to provide the first general structural characterization of TE proteins and to estimate the frequency of TE domestication and horizontal transfer events. We show that 1) ORF1 and Gag proteins of retrotransposons contain high amounts of structural disorder; thus, despite their very low conservation, the presence of disordered regions and probably their chaperone function is conserved. 2) The distribution of SCOP classes in DNA transposons and LINEs indicates that the proteins of DNA transposons are more ancient, containing folds that already existed when the first cellular organisms appeared. 3) DNA transposon proteins have lower contact order than randomly selected reference proteins, indicating rapid folding, most likely to avoid protein aggregation. 4) Structure-based searches for TE homologs indicate that the overall frequency of TE domestication events is low, whereas we found a relatively high number of cases where horizontal transfer, frequently involving parasites, is the most likely explanation for the observed homology. PMID:23530042
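
    Finding 3) above rests on contact order, which the abstract does not define. The following is a minimal sketch of a commonly used definition, relative contact order: the mean sequence separation of contacting residue pairs divided by chain length. The Cα-only contact criterion, the 8 Å cutoff and the function name are simplifying assumptions made here for illustration and may differ from the definition used in the study.

```python
import math


def relative_contact_order(ca_coords, cutoff=8.0, min_sep=1):
    """Hypothetical helper: relative contact order from C-alpha coordinates.

    ca_coords: list of (x, y, z) tuples, one per residue, in sequence order.
    cutoff:    distance (in angstroms) below which two residues are treated
               as being in contact (assumed value for this sketch).
    min_sep:   minimum sequence separation |i - j| for a pair to be counted.
    """
    n = len(ca_coords)
    total_separation = 0
    n_contacts = 0
    for i in range(n):
        for j in range(i + min_sep, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                total_separation += j - i
                n_contacts += 1
    if n_contacts == 0:
        return 0.0
    # Mean sequence separation of contacts, normalized by chain length.
    return total_separation / (n * n_contacts)
```

    Low values indicate that contacts are mostly local in sequence, which is the property the abstract associates with rapid folding.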

  15. High-Resolution NMR Reveals Secondary Structure and Folding of Amino Acid Transporter from Outer Chloroplast Membrane

    PubMed Central

    Zook, James D.; Molugu, Trivikram R.; Jacobsen, Neil E.; Lin, Guangxin; Soll, Jürgen; Cherry, Brian R.; Brown, Michael F.; Fromme, Petra

    2013-01-01

    Solving high-resolution structures for membrane proteins continues to be a daunting challenge in the structural biology community. In this study we report our high-resolution NMR results for a transmembrane protein, outer envelope protein of molar mass 16 kDa (OEP16), an amino acid transporter from the outer membrane of chloroplasts. Three-dimensional, high-resolution NMR experiments on the 13C, 15N, 2H-triply-labeled protein were used to assign protein backbone resonances and to obtain secondary structure information. The results yield over 95% assignment of N, HN, CO, Cα, and Cβ chemical shifts, which is essential for obtaining a high resolution structure from NMR data. Chemical shift analysis from the assignment data reveals experimental evidence for the first time on the location of the secondary structure elements on a per residue basis. In addition, T1Z and T2 relaxation experiments were performed in order to better understand the protein dynamics. Arginine titration experiments yield an insight into the amino acid residues responsible for protein transporter function. The results provide the necessary basis for high-resolution structural determination of this important plant membrane protein. PMID:24205117

  16. Using NMR chemical shifts to calculate the propensity for structural order and disorder in proteins.

    PubMed

    Tamiola, Kamil; Mulder, Frans A A

    2012-10-01

    NMR spectroscopy offers the unique possibility to relate the structural propensities of disordered proteins and loop segments of folded peptides to biological function and aggregation behaviour. Backbone chemical shifts are ideally suited for this task, provided that appropriate reference data are available and idiosyncratic sensitivity of backbone chemical shifts to structural information is treated in a sensible manner. In the present paper, we describe methods to detect structural protein changes from chemical shifts, and present an online tool [ncSPC (neighbour-corrected Structural Propensity Calculator)], which unites aspects of several current approaches. Examples of structural propensity calculations are given for two well-characterized systems, namely the binding of α-synuclein to micelles and light activation of photoactive yellow protein. These examples spotlight the great power of NMR chemical shift analysis for the quantitative assessment of protein disorder at the atomic level, and further our understanding of biologically important problems.

  17. Protein Crystallography in Vaccine Research and Development.

    PubMed

    Malito, Enrico; Carfi, Andrea; Bottomley, Matthew J

    2015-06-09

    The use of protein X-ray crystallography for structure-based design of small-molecule drugs is well-documented and includes several notable success stories. However, it is less well-known that structural biology has emerged as a major tool for the design of novel vaccine antigens. Here, we review the important contributions that protein crystallography has made so far to vaccine research and development. We discuss several examples of the crystallographic characterization of vaccine antigen structures, alone or in complexes with ligands or receptors. We cover the critical role of high-resolution epitope mapping by reviewing structures of complexes between antigens and their cognate neutralizing, or protective, antibody fragments. Most importantly, we provide recent examples where structural insights obtained via protein crystallography have been used to design novel optimized vaccine antigens. This review aims to illustrate the value of protein crystallography in the emerging discipline of structural vaccinology and its impact on the rational design of vaccines.

  18. Protein Crystallography in Vaccine Research and Development

    PubMed Central

    Malito, Enrico; Carfi, Andrea; Bottomley, Matthew J.

    2015-01-01

    The use of protein X-ray crystallography for structure-based design of small-molecule drugs is well-documented and includes several notable success stories. However, it is less well-known that structural biology has emerged as a major tool for the design of novel vaccine antigens. Here, we review the important contributions that protein crystallography has made so far to vaccine research and development. We discuss several examples of the crystallographic characterization of vaccine antigen structures, alone or in complexes with ligands or receptors. We cover the critical role of high-resolution epitope mapping by reviewing structures of complexes between antigens and their cognate neutralizing, or protective, antibody fragments. Most importantly, we provide recent examples where structural insights obtained via protein crystallography have been used to design novel optimized vaccine antigens. This review aims to illustrate the value of protein crystallography in the emerging discipline of structural vaccinology and its impact on the rational design of vaccines. PMID:26068237

  19. Bayesian comparison of protein structures using partial Procrustes distance.

    PubMed

    Ejlali, Nasim; Faghihi, Mohammad Reza; Sadeghi, Mehdi

    2017-09-26

    An important topic in bioinformatics is protein structure alignment. Some statistical methods have been proposed for this problem, but most of them align two protein structures based on the global geometric information without considering the effect of neighbourhood in the structures. In this paper, we provide a Bayesian model to align protein structures, by considering the effect of both local and global geometric information of protein structures. Local geometric information is incorporated into the model through the partial Procrustes distance of small substructures. These substructures are composed of β-carbon atoms from the side chains. Parameters are estimated using a Markov chain Monte Carlo (MCMC) approach. We evaluate the performance of our model through some simulation studies. Furthermore, we apply our model to a real dataset and assess the accuracy and convergence rate. Results show that our model is much more efficient than previous approaches.
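
    The distance named in this abstract can be illustrated with a minimal sketch. It assumes that "partial Procrustes distance" means the smallest Frobenius distance between two matched coordinate sets after optimal translation and rotation, with no rescaling; the paper's exact definition may differ (some texts first normalise both sets to unit size). The function name and the use of the Kabsch/SVD solution are illustrative choices, not taken from the record.

```python
import numpy as np

def partial_procrustes_distance(X, Y):
    """Distance between two (n_points, k) coordinate sets after removing
    translation and rotation (no scaling). Hypothetical helper."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Remove translation by centring both sets on their centroids.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Kabsch solution: the rotation R minimising ||Xc - Yc @ R||_F.
    U, _, Vt = np.linalg.svd(Yc.T @ Xc)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(U @ Vt))
    D = np.diag([1.0] * (X.shape[1] - 1) + [d])
    R = U @ D @ Vt
    return np.linalg.norm(Xc - Yc @ R)
```

    In the setting the abstract describes, X and Y would hold the β-carbon coordinates of two small matched substructures; how substructures are chosen and matched is part of the Bayesian model and is not sketched here.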

  20. A general method for targeted quantitative cross-linking mass spectrometry

    USDA-ARS's Scientific Manuscript database

    Chemical cross-linking mass spectrometry (XL-MS) provides protein structural information by identifying covalently linked proximal amino acid residues on protein surfaces. The information gained by this technique is complementary to other structural biology methods such as x-ray crystallography, NM...

  1. Predicting the tolerated sequences for proteins and protein interfaces using RosettaBackrub flexible backbone design.

    PubMed

    Smith, Colin A; Kortemme, Tanja

    2011-01-01

    Predicting the set of sequences that are tolerated by a protein or protein interface, while maintaining a desired function, is useful for characterizing protein interaction specificity and for computationally designing sequence libraries to engineer proteins with new functions. Here we provide a general method, a detailed set of protocols, and several benchmarks and analyses for estimating tolerated sequences using flexible backbone protein design implemented in the Rosetta molecular modeling software suite. The input to the method is at least one experimentally determined three-dimensional protein structure or high-quality model. The starting structure(s) are expanded or refined into a conformational ensemble using Monte Carlo simulations consisting of backrub backbone and side chain moves in Rosetta. The method then uses a combination of simulated annealing and genetic algorithm optimization methods to enrich for low-energy sequences for the individual members of the ensemble. To emphasize certain functional requirements (e.g. forming a binding interface), interactions between and within parts of the structure (e.g. domains) can be reweighted in the scoring function. Results from each backbone structure are merged together to create a single estimate for the tolerated sequence space. We provide an extensive description of the protocol and its parameters, all source code, example analysis scripts and three tests applying this method to finding sequences predicted to stabilize proteins or protein interfaces. The generality of this method makes many other applications possible, for example stabilizing interactions with small molecules, DNA, or RNA. Through the use of within-domain reweighting and/or multistate design, it may also be possible to use this method to find sequences that stabilize particular protein conformations or binding interactions over others.

  2. A combination of feature extraction methods with an ensemble of different classifiers for protein structural class prediction problem.

    PubMed

    Dehzangi, Abdollah; Paliwal, Kuldip; Sharma, Alok; Dehzangi, Omid; Sattar, Abdul

    2013-01-01

    A better understanding of the structural class of a given protein reveals important information about its overall folding type and its domain. It can also be directly used to provide critical information on the general tertiary structure of a protein, which has a profound impact on protein function determination and drug design. Despite tremendous enhancements made by pattern recognition-based approaches to solve this problem, it remains an unsolved issue for bioinformatics that demands more attention and exploration. In this study, we propose a novel feature extraction model that incorporates physicochemical and evolutionary-based information simultaneously. We also propose overlapped segmented distribution and autocorrelation-based feature extraction methods to provide more local and global discriminatory information. The proposed feature extraction methods are explored for the 15 most promising attributes that are selected from a wide range of physicochemical-based attributes. Finally, by applying an ensemble of different classifiers, namely Adaboost.M1, LogitBoost, naive Bayes, multilayer perceptron (MLP), and support vector machine (SVM), we show enhancement of the protein structural class prediction accuracy for four popular benchmarks.

  3. Structural domains and main-chain flexibility in prion proteins.

    PubMed

    Blinov, N; Berjanskii, M; Wishart, D S; Stepanova, M

    2009-02-24

    In this study we describe a novel approach to define structural domains and to characterize the local flexibility in both human and chicken prion proteins. The approach we use is based on a comprehensive theory of collective dynamics in proteins that was recently developed. This method determines the essential collective coordinates, which can be found from molecular dynamics trajectories via principal component analysis. Under this particular framework, we are able to identify the domains where atoms move coherently while at the same time determining the local main-chain flexibility for each residue. We have verified this approach by comparing our results for the predicted dynamic domain systems with the computed main-chain flexibility profiles and the NMR-derived random coil indexes for human and chicken prion proteins. The three sets of data show excellent agreement. Additionally, we demonstrate that the dynamic domains calculated in this fashion provide a highly sensitive measure of protein collective structure and dynamics. Furthermore, such an analysis is capable of revealing structural and dynamic properties of proteins that are inaccessible to the conventional assessment of secondary structure. Using the collective dynamic simulation approach described here along with high-temperature simulations of unfolding of human prion protein, we have explored whether locations of relatively low stability could be identified where the unfolding process could potentially be facilitated. According to our analysis, the locations of relatively low stability may be associated with the beta-sheet formed by strands S1 and S2 and the adjacent loops, whereas helix HC appears to be a relatively stable part of the protein. We suggest that this kind of structural analysis may provide a useful background for a more quantitative assessment of potential routes of spontaneous misfolding in prion proteins.
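
    The principal component step mentioned in this abstract can be sketched as follows. This is only a generic PCA of trajectory coordinates, not the authors' theory of collective dynamics or their domain-identification procedure. It assumes the frames have already been superposed on a common reference, and the function name and array layout are hypothetical.

```python
import numpy as np

def essential_modes(traj, n_modes=3):
    """PCA of a trajectory array with shape (n_frames, n_atoms, 3).
    Assumes frames were superposed beforehand. Hypothetical helper."""
    T = np.asarray(traj, dtype=float)
    n_frames = T.shape[0]
    # Flatten each frame to a 3N-vector and subtract the mean structure.
    X = T.reshape(n_frames, -1)
    X = X - X.mean(axis=0)
    # Covariance matrix of the atomic displacements.
    cov = X.T @ X / (n_frames - 1)
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1][:n_modes]
    return evals[order], evecs[:, order]
```

    Each returned column is one collective coordinate (a direction in 3N-dimensional space), and its eigenvalue is the variance of the motion along that direction.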

  4. Single-Molecule FRET Spectroscopy and the Polymer Physics of Unfolded and Intrinsically Disordered Proteins.

    PubMed

    Schuler, Benjamin; Soranno, Andrea; Hofmann, Hagen; Nettels, Daniel

    2016-07-05

    The properties of unfolded proteins have long been of interest because of their importance to the protein folding process. Recently, the surprising prevalence of unstructured regions or entirely disordered proteins under physiological conditions has led to the realization that such intrinsically disordered proteins can be functional even in the absence of a folded structure. However, owing to their broad conformational distributions, many of the properties of unstructured proteins are difficult to describe with the established concepts of structural biology. We have thus seen a reemergence of polymer physics as a versatile framework for understanding their structure and dynamics. An important driving force for these developments has been single-molecule spectroscopy, as it allows structural heterogeneity, intramolecular distance distributions, and dynamics to be quantified over a wide range of timescales and solution conditions. Polymer concepts provide an important basis for relating the physical properties of unstructured proteins to folding and function.

  5. In Search of Functional Advantages of Knots in Proteins.

    PubMed

    Dabrowski-Tumanski, Pawel; Stasiak, Andrzej; Sulkowska, Joanna I

    2016-01-01

    We analysed the structure of deeply knotted proteins representing three unrelated families of knotted proteins. We looked at the correlation between positions of knotted cores in these proteins and such local structural characteristics as the number of intra-chain contacts, structural stability and solvent accessibility. We observed that the knotted cores and especially their borders showed strong enrichment in the number of contacts. These regions also showed increased thermal stability, whereas their solvent accessibility was decreased. Interestingly, the active sites within these knotted proteins are preferentially located in regions with an increased number of contacts that also have increased thermal stability and decreased solvent accessibility. Our results suggest that knotting of polypeptide chains provides a favourable environment for the active sites observed in knotted proteins. Some knotted proteins have homologues without a knot. Interestingly, these unknotted homologues form local entanglements that retain structural characteristics of the knotted cores.

  6. Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone.

    PubMed

    Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

    2016-12-27

    Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.

  7. Statistical approaches to maximize recombinant protein expression in Escherichia coli: a general review.

    PubMed

    Papaneophytou, Christos P; Kontopidis, George

    2014-02-01

    The supply of many valuable proteins that have potential clinical or industrial use is often limited by their low natural availability. With the modern advances in genomics, proteomics and bioinformatics, the number of proteins being produced using recombinant techniques is exponentially increasing and seems to guarantee an unlimited supply of recombinant proteins. The demand for recombinant proteins has increased as more applications in several fields become a commercial reality. Escherichia coli (E. coli) is the most widely used expression system for the production of recombinant proteins for structural and functional studies. However, producing soluble proteins in E. coli is still a major bottleneck for structural biology projects. One of the most challenging steps in any structural biology project is predicting which protein or protein fragment will express solubly and purify for crystallographic studies. The production of soluble and active proteins is influenced by several factors including expression host, fusion tag, induction temperature and time. Statistically designed experiments are gaining success in the production of recombinant protein because they provide information on variable interactions that escape the "one-factor-at-a-time" method. Here, we review the most important factors affecting the production of recombinant proteins in a soluble form. Moreover, we provide information about how statistically designed experiments can increase protein yield and purity as well as find conditions for crystal growth. Copyright © 2013 Elsevier Inc. All rights reserved.
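
    To make the contrast with the "one-factor-at-a-time" method concrete, the sketch below enumerates a two-level full factorial design over the four factors the abstract names. A full factorial layout is only one kind of statistically designed experiment and is not necessarily the design the review recommends. All level values (host_A, tag_A, 18 and 37 °C, 4 and 16 h) are hypothetical placeholders.

```python
from itertools import product

# Factors named in the abstract; every level below is a hypothetical placeholder.
factors = {
    "expression_host": ["host_A", "host_B"],
    "fusion_tag": ["tag_A", "tag_B"],
    "induction_temp_C": [18, 37],
    "induction_time_h": [4, 16],
}

def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    levels = (factors[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*levels)]

design = full_factorial(factors)
print(len(design))  # 2**4 = 16 runs
```

    Because every combination of levels is run, the effect of changing one factor can be compared across the levels of the others, which is how such designs expose interactions between variables.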

  8. Fast large-scale clustering of protein structures using Gauss integrals.

    PubMed

    Harder, Tim; Borg, Mikael; Boomsma, Wouter; Røgen, Peter; Hamelryck, Thomas

    2012-02-15

    Clustering protein structures is an important task in structural bioinformatics. De novo structure prediction, for example, often involves a clustering step for finding the best prediction. Other applications include assigning proteins to fold families and analyzing molecular dynamics trajectories. We present Pleiades, a novel approach to clustering protein structures with a rigorous mathematical underpinning. The method approximates clustering based on the root mean square deviation by first mapping structures to Gauss integral vectors--which were introduced by Røgen and co-workers--and subsequently performing K-means clustering. Compared to current methods, Pleiades dramatically improves on the time needed to perform clustering, and can cluster a significantly larger number of structures, while providing state-of-the-art results. The number of low energy structures generated in a typical folding study, which is in the order of 50,000 structures, can be clustered within seconds to minutes.
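
    The second stage of the pipeline described here, K-means on fixed-length vectors, can be sketched as below. This is a plain Lloyd's-algorithm implementation written for illustration, not the Pleiades code. Computing the Gauss integral vectors themselves is not shown, so the input is assumed to be an array with one precomputed descriptor vector per structure.

```python
import numpy as np

def kmeans(vectors, k, n_iter=100, seed=0):
    """Lloyd's algorithm on an (n_structures, n_features) array.
    Illustrative helper; descriptor vectors are assumed precomputed."""
    rng = np.random.default_rng(seed)
    V = np.asarray(vectors, dtype=float)
    # Initialise the centres from k distinct input vectors.
    centers = V[rng.choice(len(V), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every vector to every centre.
        d2 = ((V[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centre to the mean of its members; keep empty clusters in place.
        new_centers = np.array([
            V[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

    Working with one fixed-length vector per structure means each iteration costs a distance to k centres per structure, with no pairwise superposition of structures.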

  9. Insights from molecular dynamics simulations for computational protein design.

    PubMed

    Childers, Matthew Carter; Daggett, Valerie

    2017-02-01

    A grand challenge in the field of structural biology is to design and engineer proteins that exhibit targeted functions. Although much success on this front has been achieved, design success rates remain low, an ever-present reminder of our limited understanding of the relationship between amino acid sequences and the structures they adopt. In addition to experimental techniques and rational design strategies, computational methods have been employed to aid in the design and engineering of proteins. Molecular dynamics (MD) is one such method that simulates the motions of proteins according to classical dynamics. Here, we review how insights into protein dynamics derived from MD simulations have influenced the design of proteins. One of the greatest strengths of MD is its capacity to reveal information beyond what is available in the static structures deposited in the Protein Data Bank. In this regard simulations can be used to directly guide protein design by providing atomistic details of the dynamic molecular interactions contributing to protein stability and function. MD simulations can also be used as a virtual screening tool to rank, select, identify, and assess potential designs. MD is uniquely poised to inform protein design efforts where the application requires realistic models of protein dynamics and atomic level descriptions of the relationship between dynamics and function. Here, we review cases where MD simulations were used to modulate protein stability and protein function by providing information regarding the conformation(s), conformational transitions, interactions, and dynamics that govern stability and function. In addition, we discuss cases where conformations from protein folding/unfolding simulations have been exploited for protein design, yielding novel outcomes that could not be obtained from static structures.

  10. Insights from molecular dynamics simulations for computational protein design

    PubMed Central

    Childers, Matthew Carter; Daggett, Valerie

    2017-01-01

    A grand challenge in the field of structural biology is to design and engineer proteins that exhibit targeted functions. Although much success on this front has been achieved, design success rates remain low, an ever-present reminder of our limited understanding of the relationship between amino acid sequences and the structures they adopt. In addition to experimental techniques and rational design strategies, computational methods have been employed to aid in the design and engineering of proteins. Molecular dynamics (MD) is one such method that simulates the motions of proteins according to classical dynamics. Here, we review how insights into protein dynamics derived from MD simulations have influenced the design of proteins. One of the greatest strengths of MD is its capacity to reveal information beyond what is available in the static structures deposited in the Protein Data Bank. In this regard simulations can be used to directly guide protein design by providing atomistic details of the dynamic molecular interactions contributing to protein stability and function. MD simulations can also be used as a virtual screening tool to rank, select, identify, and assess potential designs. MD is uniquely poised to inform protein design efforts where the application requires realistic models of protein dynamics and atomic level descriptions of the relationship between dynamics and function. Here, we review cases where MD simulations were used to modulate protein stability and protein function by providing information regarding the conformation(s), conformational transitions, interactions, and dynamics that govern stability and function. In addition, we discuss cases where conformations from protein folding/unfolding simulations have been exploited for protein design, yielding novel outcomes that could not be obtained from static structures. PMID:28239489

  11. Dynamic protein interaction networks and new structural paradigms in signaling

    PubMed Central

    Csizmok, Veronika; Follis, Ariele Viacava; Kriwacki, Richard W.; Forman-Kay, Julie D.

    2017-01-01

    Understanding signaling and other complex biological processes requires elucidating the critical roles of intrinsically disordered proteins and regions (IDPs/IDRs), which represent ~30% of the proteome and enable unique regulatory mechanisms. In this review we describe the structural heterogeneity of disordered proteins that underpins these mechanisms and the latest progress in obtaining structural descriptions of ensembles of disordered proteins that are needed for linking structure and dynamics to function. We describe the diverse interactions of IDPs that can have unusual characteristics such as “ultrasensitivity” and “regulated folding and unfolding”. We also summarize the mounting data showing that large-scale assembly and protein phase separation occurs within a variety of signaling complexes and cellular structures. In addition, we discuss efforts to therapeutically target disordered proteins with small molecules. Overall, we interpret the remodeling of disordered state ensembles due to binding and post-translational modifications within an expanded framework for allostery that provides significant insights into how disordered proteins transmit biological information. PMID:26922996

  12. Secondary Structure Prediction of Protein Constructs Using Random Incremental Truncation and Vacuum-Ultraviolet CD Spectroscopy

    PubMed Central

    Pukáncsik, Mária; Orbán, Ágnes; Nagy, Kinga; Matsuo, Koichi; Gekko, Kunihiko; Maurin, Damien; Hart, Darren; Kézsmárki, István; Vertessy, Beata G.

    2016-01-01

    A novel uracil-DNA degrading protein factor (termed UDE) was identified in Drosophila melanogaster with no significant structural and functional homology to other uracil-DNA binding or processing factors. Determination of the 3D structure of UDE is expected to provide key information on the description of the molecular mechanism of action of UDE catalysis, as well as in general uracil-recognition and nuclease action. Towards this long-term aim, the random library ESPRIT technology was applied to the novel protein UDE to overcome problems in identifying soluble expressing constructs given the absence of precise information on domain content and arrangement. Nine constructs of UDE were chosen to decipher structural and functional relationships. Vacuum ultraviolet circular dichroism (VUVCD) spectroscopy was performed to define the secondary structure content and location within UDE and its truncated variants. The quantitative analysis demonstrated exclusive α-helical content for the full-length protein, which is preserved in the truncated constructs. Arrangement of α-helical bundles within the truncated protein segments suggested new domain boundaries which differ from the conserved motifs determined by sequence-based alignment of UDE homologues. Here we demonstrate that the combination of ESPRIT and VUVCD spectroscopy provides a new structural description of UDE and confirms that the truncated constructs are useful for further detailed functional studies. PMID:27273007

  13. Structural Basis for Antagonism by Suramin of Heparin Binding to Vaccinia Complement Protein

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ganesh, Vannakambadi K.; Muthuvel, Suresh Kumar; Smith, Scott A.

    2010-07-19

    Suramin is a competitive inhibitor of heparin binding to many proteins, including viral envelope proteins, protein tyrosine phosphatases, and fibroblast growth factors (FGFs). It has been clinically evaluated as a potential therapeutic in treatment of cancers caused by unregulated angiogenesis, triggered by FGFs. Although it has shown clinical promise in treatment of several cancers, suramin has many undesirable side effects. There is currently no experimental structure that reveals the molecular interactions responsible for suramin inhibition of heparin binding, which could be of potential use in structure-assisted design of improved analogues of suramin. We report the structure of suramin, in complex with the heparin-binding site of vaccinia virus complement control protein (VCP), which interacts with heparin in a geometrically similar manner to many FGFs. The larger than anticipated flexibility of suramin manifested in this structure, and other details of VCP-suramin interactions, might provide useful structural information for interpreting interactions of suramin with many proteins.

  14. Structural Determination of Functional Domains in Early B-cell Factor (EBF) Family of Transcription Factors Reveals Similarities to Rel DNA-binding Proteins and a Novel Dimerization Motif*

    PubMed Central

    Siponen, Marina I.; Wisniewska, Magdalena; Lehtiö, Lari; Johansson, Ida; Svensson, Linda; Raszewski, Grzegorz; Nilsson, Lennart; Sigvardsson, Mikael; Berglund, Helena

    2010-01-01

    The early B-cell factor (EBF) transcription factors are central regulators of development in several organs and tissues. This protein family shows low sequence similarity to other protein families, which is why structural information for the functional domains of these proteins is crucial to understand their biochemical features. We have used a modular approach to determine the crystal structures of the structured domains in the EBF family. The DNA binding domain reveals a striking resemblance to the DNA binding domains of the Rel homology superfamily of transcription factors but contains a unique zinc binding structure, termed zinc knuckle. Further the EBF proteins contain an IPT/TIG domain and an atypical helix-loop-helix domain with a novel type of dimerization motif. The data presented here provide insights into unique structural features of the EBF proteins and open possibilities for detailed molecular investigations of this important transcription factor family. PMID:20592035

  15. Structure of the virulence-associated protein VapD from the intracellular pathogen Rhodococcus equi

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Whittingham, Jean L.; Blagova, Elena V.; Finn, Ciaran E.

    2014-08-01

    VapD is one of a set of highly homologous virulence-associated proteins from the multi-host pathogen Rhodococcus equi. The crystal structure reveals an eight-stranded β-barrel with a novel fold and a glycine rich ‘bald’ surface. Rhodococcus equi is a multi-host pathogen that infects a range of animals as well as immune-compromised humans. Equine and porcine isolates harbour a virulence plasmid encoding a homologous family of virulence-associated proteins associated with the capacity of R. equi to divert the normal processes of endosomal maturation, enabling bacterial survival and proliferation in alveolar macrophages. To provide a basis for probing the function of the Vap proteins in virulence, the crystal structure of VapD was determined. VapD is a monomer as determined by multi-angle laser light scattering. The structure reveals an elliptical, compact eight-stranded β-barrel with a novel strand topology and pseudo-twofold symmetry, suggesting evolution from an ancestral dimer. Surface-associated octyl-β-d-glucoside molecules may provide clues to function. Circular-dichroism spectroscopic analysis suggests that the β-barrel structure is preceded by a natively disordered region at the N-terminus. Sequence comparisons indicate that the core folds of the other plasmid-encoded virulence-associated proteins from R. equi strains are similar to that of VapD. It is further shown that sequences encoding putative R. equi Vap-like proteins occur in diverse bacterial species. Finally, the functional implications of the structure are discussed in the light of the unique structural features of VapD and its partial structural similarity to other β-barrel proteins.

  16. SITEX 2.0: Projections of protein functional sites on eukaryotic genes. Extension with orthologous genes.

    PubMed

    Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A

    2017-04-01

    Functional sites define the diversity of protein functions and are the central object of research into the structural and functional organization of proteins. The mechanisms underlying the emergence of protein functional sites and their variability during evolution include duplication, shuffling, insertion and deletion of exons in genes. The study of the correlation between site structure and exon structure serves as the basis for an in-depth understanding of site organization. In this regard, the development of software resources that allow the mutual projection of the exon structure of genes and the primary and tertiary structures of the encoded proteins remains a pressing problem. Previously, we developed the SitEx system, which provides information about protein and gene sequences with mapped exon borders and the amino acid positions of protein functional sites. The database included information on proteins with known 3D structure. However, data on orthologs were not available. Therefore, we added the projection of site positions onto the exon structures of orthologs in SitEx 2.0. We implemented a search through the database using site conservation variability and site discontinuity through exon structure. Including information on orthologs expands the possibilities of using SitEx to solve problems in the analysis of the structural and functional organization of proteins. Database URL: http://www-bionet.sscc.ru/sitex/ .

  17. Mode localization in the cooperative dynamics of protein recognition

    NASA Astrophysics Data System (ADS)

    Copperman, J.; Guenza, M. G.

    2016-07-01

    The biological function of proteins is encoded in their structure and expressed through the mediation of their dynamics. This paper presents a study on the correlation between local fluctuations, binding, and biological function for two sample proteins, starting from the Langevin Equation for Protein Dynamics (LE4PD). The LE4PD is a microscopic and residue-specific coarse-grained approach to protein dynamics, which starts from the static structural ensemble of a protein and predicts the dynamics analytically. It has been shown to be accurate in its prediction of NMR relaxation experiments and Debye-Waller factors. The LE4PD is solved in a set of diffusive modes which span a vast range of time scales of the protein dynamics, and provides a detailed picture of the mode-dependent localization of the fluctuation as a function of the primary structure of the protein. To investigate the dynamics of protein complexes, the theory is implemented here to treat the coarse-grained dynamics of interacting macromolecules. As an example, calculations of the dynamics of monomeric and dimerized HIV protease and the free Insulin Growth Factor II Receptor (IGF2R) domain 11 and its IGF2R:IGF2 complex are presented. Either simulation-derived or experimentally measured NMR conformers are used as input structural ensembles to the theory. The picture that emerges suggests a dynamically heterogeneous protein where biologically active regions provide energetically comparable conformational states that are trapped by a reacting partner in agreement with the conformation-selection mechanism of binding.

  18. Robust enzyme design: bioinformatic tools for improved protein stability.

    PubMed

    Suplatov, Dmitry; Voevodin, Vladimir; Švedas, Vytas

    2015-03-01

    The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Bioinformatics and variability in drug response: a protein structural perspective

    PubMed Central

    Lahti, Jennifer L.; Tang, Grace W.; Capriotti, Emidio; Liu, Tianyun; Altman, Russ B.

    2012-01-01

    Marketed drugs frequently perform worse in clinical practice than in the clinical trials on which their approval is based. Many therapeutic compounds are ineffective for a large subpopulation of patients to whom they are prescribed; worse, a significant fraction of patients experience adverse effects more severe than anticipated. The unacceptable risk–benefit profile for many drugs mandates a paradigm shift towards personalized medicine. However, prior to adoption of patient-specific approaches, it is useful to understand the molecular details underlying variable drug response among diverse patient populations. Over the past decade, progress in structural genomics led to an explosion of available three-dimensional structures of drug target proteins while efforts in pharmacogenetics offered insights into polymorphisms correlated with differential therapeutic outcomes. Together these advances provide the opportunity to examine how altered protein structures arising from genetic differences affect protein–drug interactions and, ultimately, drug response. In this review, we first summarize structural characteristics of protein targets and common mechanisms of drug interactions. Next, we describe the impact of coding mutations on protein structures and drug response. Finally, we highlight tools for analysing protein structures and protein–drug interactions and discuss their application for understanding altered drug responses associated with protein structural variants. PMID:22552919

  20. The 3D structures of VDAC represent a native conformation

    PubMed Central

    Hiller, Sebastian; Abramson, Jeff; Mannella, Carmen; Wagner, Gerhard; Zeth, Kornelius

    2010-01-01

    The most abundant protein of the mitochondrial outer membrane is the voltage-dependent anion channel (VDAC), which facilitates the exchange of ions and molecules between mitochondria and cytosol and is regulated by interactions with other proteins and small molecules. VDAC has been extensively studied for more than three decades, and last year three independent investigations revealed a structure of VDAC-1 exhibiting 19 transmembrane β-strands, constituting a unique structural class of β-barrel membrane proteins. Here, we provide a historical perspective on VDAC research and give an overview of the experimental design used to obtain these structures. Furthermore, we validate the protein refolding approach and summarize biochemical and biophysical evidence that links the 19-stranded structure to the native form of VDAC. PMID:20708406

  1. Structure and expression of a novel compact myelin protein – Small VCP-interacting protein (SVIP)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Jiawen; Peng, Dungeng; Voehler, Markus

    2013-10-11

    Highlights: • SVIP (small p97/VCP-interacting protein) co-localizes with myelin basic protein (MBP) in compact myelin. • We determined that SVIP is an intrinsically disordered protein (IDP). • The helical content of SVIP increases dramatically during its interaction with negatively charged lipid membrane. • This study provides structural insight into interactions between SVIP and myelin membranes. -- Abstract: SVIP (small p97/VCP-interacting protein) was initially identified as one of many cofactors regulating the valosin containing protein (VCP), an AAA+ ATPase involved in endoplasmic-reticulum-associated protein degradation (ERAD). Our previous study showed that SVIP is expressed exclusively in the nervous system. In the present study, SVIP and VCP were seen to be co-localized in neuronal cell bodies. Interestingly, we also observed that SVIP co-localizes with myelin basic protein (MBP) in compact myelin, where VCP was absent. Furthermore, using nuclear magnetic resonance (NMR) and circular dichroism (CD) spectroscopic measurements, we determined that SVIP is an intrinsically disordered protein (IDP). However, upon binding to the surface of membranes containing a net negative charge, the helical content of SVIP increases dramatically. These findings provide structural insight into interactions between SVIP and myelin membranes.

  2. Matricellular proteins in drug delivery: Therapeutic targets, active agents, and therapeutic localization.

    PubMed

    Sawyer, Andrew J; Kyriakides, Themis R

    2016-02-01

    Extracellular matrix is composed of a complex array of molecules that together provide structural and functional support to cells. These properties are mainly mediated by the activity of collagenous and elastic fibers, proteoglycans, and proteins such as fibronectin and laminin. ECM composition is tissue-specific and could include matricellular proteins whose primary role is to modulate cell-matrix interactions. In adults, matricellular proteins are primarily expressed during injury, inflammation and disease. Particularly, they are closely associated with the progression and prognosis of cardiovascular and fibrotic diseases, and cancer. This review aims to provide an overview of the potential use of matricellular proteins in drug delivery including the generation of therapeutic agents based on the properties and structures of these proteins as well as their utility as biomarkers for specific diseases. Copyright © 2016 Elsevier B.V. All rights reserved.

  3. Rational Protein Engineering Guided by Deep Mutational Scanning

    PubMed Central

    Shin, HyeonSeok; Cho, Byung-Kwan

    2015-01-01

    Sequence–function relationship in a protein is commonly determined by the three-dimensional protein structure followed by various biochemical experiments. However, with the explosive increase in the number of genome sequences, facilitated by recent advances in sequencing technology, the gap between protein sequences available and three-dimensional structures is rapidly widening. A recently developed method termed deep mutational scanning explores the functional phenotype of thousands of mutants via massive sequencing. Coupled with a highly efficient screening system, this approach assesses the phenotypic changes made by the substitution of each amino acid sequence that constitutes a protein. Such an informational resource provides the functional role of each amino acid sequence, thereby providing sufficient rationale for selecting target residues for protein engineering. Here, we discuss the current applications of deep mutational scanning and consider experimental design. PMID:26404267

  4. Towards a true protein movie: a perspective on the potential impact of the ensemble-based structure determination using exact NOEs.

    PubMed

    Vögeli, Beat; Orts, Julien; Strotz, Dean; Chi, Celestine; Minges, Martina; Wälti, Marielle Aulikki; Güntert, Peter; Riek, Roland

    2014-04-01

    Confined by the Boltzmann distribution of the energies of the states, a multitude of structural states are inherent to biomolecules. For a detailed understanding of a protein's function, its entire structural landscape at atomic resolution and insight into the interconversion between all the structural states (i.e. dynamics) are required. Whereas dedicated trickery with NMR relaxation provides aspects of local dynamics, and 3D structure determination by NMR is well established, only recently have several attempts been made to formulate a more comprehensive description of the dynamics and the structural landscape of a protein. Here, a perspective is given on the use of exact NOEs (eNOEs) for the elucidation of structural ensembles of a protein describing the covered conformational space. Copyright © 2013 Elsevier Inc. All rights reserved.

  5. Analysis of self-assembly of S-layer protein slp-B53 from Lysinibacillus sphaericus.

    PubMed

    Liu, Jun; Falke, Sven; Drobot, Bjoern; Oberthuer, Dominik; Kikhney, Alexey; Guenther, Tobias; Fahmy, Karim; Svergun, Dmitri; Betzel, Christian; Raff, Johannes

    2017-01-01

    The formation of stable and functional surface layers (S-layers) via self-assembly of surface-layer proteins on the cell surface is a dynamic and complex process. S-layers facilitate a number of important biological functions, e.g., providing protection and mediating selective exchange of molecules and thereby functioning as molecular sieves. Furthermore, S-layers selectively bind several metal ions including uranium, palladium, gold, and europium, some of them with high affinity. Most current research on surface layers focuses on investigating crystalline arrays of protein subunits in Archaea and bacteria. In this work, several complementary analytical techniques and methods have been applied to examine structure-function relationships and dynamics for assembly of S-layer protein slp-B53 from Lysinibacillus sphaericus: (1) The secondary structure of the S-layer protein was analyzed by circular dichroism spectroscopy; (2) Small-angle X-ray scattering was applied to gain insights into the three-dimensional structure in solution; (3) The interaction with bivalent cations was followed by differential scanning calorimetry; (4) The dynamics and time-dependent assembly of S-layers were followed by applying dynamic light scattering; (5) The two-dimensional structure of the paracrystalline S-layer lattice was examined by atomic force microscopy. The data obtained provide essential structural insights into the mechanism of S-layer self-assembly, particularly with respect to binding of bivalent cations, i.e., Mg2+ and Ca2+. Furthermore, the results obtained highlight potential applications of S-layers in the fields of micromaterials and nanobiotechnology by providing engineered or individual symmetric thin protein layers, e.g., for protective, antimicrobial, or otherwise functionalized surfaces.

  6. The Protein Structure Initiative Structural Biology Knowledgebase Technology Portal: a structural biology web resource.

    PubMed

    Gifford, Lida K; Carter, Lester G; Gabanyi, Margaret J; Berman, Helen M; Adams, Paul D

    2012-06-01

    The Technology Portal of the Protein Structure Initiative Structural Biology Knowledgebase (PSI SBKB; http://technology.sbkb.org/portal/ ) is a web resource providing information about methods and tools that can be used to relieve bottlenecks in many areas of protein production and structural biology research. Several useful features are available on the web site, including multiple ways to search the database of over 250 technological advances, a link to videos of methods on YouTube, and access to a technology forum where scientists can connect, ask questions, get news, and develop collaborations. The Technology Portal is a component of the PSI SBKB ( http://sbkb.org ), which presents integrated genomic, structural, and functional information for all protein sequence targets selected by the Protein Structure Initiative. Created in collaboration with the Nature Publishing Group, the SBKB offers an array of resources for structural biologists, such as a research library, editorials about new research advances, a featured biological system each month, and a functional sleuth for searching protein structures of unknown function. An overview of the various features and examples of user searches highlight the information, tools, and avenues for scientific interaction available through the Technology Portal.

  7. FINDSITE-metal: Integrating evolutionary information and machine learning for structure-based metal binding site prediction at the proteome level

    PubMed Central

    Brylinski, Michal; Skolnick, Jeffrey

    2010-01-01

    The rapid accumulation of gene sequences, many of which are hypothetical proteins with unknown function, has stimulated the development of accurate computational tools for protein function prediction with evolution/structure-based approaches showing considerable promise. In this paper, we present FINDSITE-metal, a new threading-based method designed specifically to detect metal binding sites in modeled protein structures. Comprehensive benchmarks using different quality protein structures show that weakly homologous protein models provide sufficient structural information for quite accurate annotation by FINDSITE-metal. Combining structure/evolutionary information with machine learning results in highly accurate metal binding annotations; for protein models constructed by TASSER, whose average Cα RMSD from the native structure is 8.9 Å, 59.5% (71.9%) of the best of top five predicted metal locations are within 4 Å (8 Å) from a bound metal in the crystal structure. For most of the targets, multiple metal binding sites are detected with the best predicted binding site at rank 1 and within the top 2 ranks in 65.6% and 83.1% of the cases, respectively. Furthermore, for iron, copper, zinc, calcium and magnesium ions, the binding metal can be predicted with high, typically 70-90%, accuracy. FINDSITE-metal also provides a set of confidence indexes that help assess the reliability of predictions. Finally, we describe the proteome-wide application of FINDSITE-metal that quantifies the metal binding complement of the human proteome. FINDSITE-metal is freely available to the academic community at http://cssb.biology.gatech.edu/findsite-metal/. PMID:21287609

  8. Structure solution of DNA-binding proteins and complexes with ARCIMBOLDO libraries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pröpper, Kevin; Instituto de Biologia Molecular de Barcelona; Meindl, Kathrin

    2014-06-01

    The structure solution of DNA-binding protein structures and complexes based on the combination of location of DNA-binding protein motif fragments with density modification in a multi-solution frame is described. Protein–DNA interactions play a major role in all aspects of genetic activity within an organism, such as transcription, packaging, rearrangement, replication and repair. The molecular detail of protein–DNA interactions can be best visualized through crystallography, and structures emphasizing insight into the principles of binding and base-sequence recognition are essential to understanding the subtleties of the underlying mechanisms. An increasing number of high-quality DNA-binding protein structure determinations have been witnessed despite the fact that the crystallographic particularities of nucleic acids tend to pose specific challenges to methods primarily developed for proteins. Crystallographic structure solution of protein–DNA complexes therefore remains a challenging area that is in need of optimized experimental and computational methods. The potential of the structure-solution program ARCIMBOLDO for the solution of protein–DNA complexes has therefore been assessed. The method is based on the combination of locating small, very accurate fragments using the program Phaser and density modification with the program SHELXE. Whereas for typical proteins main-chain α-helices provide the ideal, almost ubiquitous, small fragments to start searches, in the case of DNA complexes the binding motifs and DNA double helix constitute suitable search fragments. The aim of this work is to provide an effective library of search fragments as well as to determine the optimal ARCIMBOLDO strategy for the solution of this class of structures.

  9. Structural Biology of Pectin Degradation by Enterobacteriaceae

    PubMed Central

    Abbott, D. Wade; Boraston, Alisdair B.

    2008-01-01

    Pectin is a structural polysaccharide that is integral for the stability of plant cell walls. During soft rot infection, secreted virulence factors from pectinolytic bacteria such as Erwinia spp. degrade pectin, resulting in characteristic plant cell necrosis and tissue maceration. Catabolism of pectin and its breakdown products by pectinolytic bacteria occurs within distinct cellular environments. This process initiates outside the cell, continues within the periplasmic space, and culminates in the cytoplasm. Although pectin utilization is well understood at the genetic and biochemical levels, an inclusive structural description of pectinases and pectin binding proteins by both extracellular and periplasmic enzymes has been lacking, especially following the recent characterization of several periplasmic components and protein-oligogalacturonide complexes. Here we provide a comprehensive analysis of the protein folds and mechanisms of pectate lyases, polygalacturonases, and carbohydrate esterases and the binding specificities of two periplasmic pectic binding proteins from Enterobacteriaceae. This review provides a structural understanding of the molecular determinants of pectin utilization and the mechanisms driving catabolite selectivity and flow through the pathway. PMID:18535148

  10. Protein modeling and molecular dynamics simulation of SlWRKY4 protein cloned from drought tolerant tomato (Solanum habrochaites) line EC520061.

    PubMed

    Karkute, Suhas G; Easwaran, Murugesh; Gujjar, Ranjit Singh; Piramanayagam, Shanmughavel; Singh, Major

    2015-10-01

    WRKY genes are members of one of the largest families of plant transcription factors and play an important role in response to biotic and abiotic stresses, and in overall growth and development. Understanding the interaction of WRKY proteins with other proteins/ligands in plant cells is of utmost importance to develop plants having tolerance to biotic and abiotic stresses. The SlWRKY4 gene was cloned from a drought tolerant wild species of tomato (Solanum habrochaites) and the secondary structure and 3D model of this protein were predicted using Schrödinger Suite-Prime. Predicted structures were also checked against a Ramachandran plot, and the modeled structure was minimized using Macromodel. Finally, the minimized structure was simulated in a water environment to check the protein stability. The behavior of the modeled structure was well-simulated and analyzed through RMSD and RMSF of the protein. The present work provides the modeled 3D structure of SlWRKY4 that will help in understanding the mechanism of gene regulation by further in silico interaction studies.
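
    The RMSD and RMSF analyses named in this abstract can be illustrated with a minimal sketch. It is not the authors' workflow: it assumes the trajectory frames are already superposed on a common reference, uses a hypothetical two-atom toy trajectory, and the function names rmsd and rmsf are chosen here for illustration only.

```python
import math


def rmsd(frame, reference):
    """Root-mean-square deviation between two coordinate sets.

    Assumes both sets list the same atoms in the same order and that
    the frame has already been superposed on the reference.
    """
    if len(frame) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(frame, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(frame))


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation about the mean position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(f[i][k] for f in trajectory) / n_frames for k in range(3)]
        # Mean squared displacement from that average.
        msf = sum(
            (f[i][k] - mean[k]) ** 2 for f in trajectory for k in range(3)
        ) / n_frames
        fluctuations.append(math.sqrt(msf))
    return fluctuations


# Hypothetical toy trajectory: 2 frames, 2 atoms, coordinates in angstroms.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
]

print(round(rmsd(trajectory[1], trajectory[0]), 3))  # 1.414
print(rmsf(trajectory))  # [0.0, 1.0]
```

    In such an analysis RMSD is typically tracked per frame to judge whether the simulated model has stabilized, while RMSF is reported per residue to locate flexible regions.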

  11. G-LoSA: An efficient computational tool for local structure-centric biological studies and drug design.

    PubMed

    Lee, Hui Sun; Im, Wonpil

    2016-04-01

    Molecular recognition by protein mostly occurs in a local region on the protein surface. Thus, an efficient computational method for accurate characterization of protein local structural conservation is necessary to better understand biology and drug design. We present a novel local structure alignment tool, G-LoSA. G-LoSA aligns protein local structures in a sequence order independent way and provides a GA-score, a chemical feature-based and size-independent structure similarity score. Our benchmark validation shows the robust performance of G-LoSA to the local structures of diverse sizes and characteristics, demonstrating its universal applicability to local structure-centric comparative biology studies. In particular, G-LoSA is highly effective in detecting conserved local regions on the entire surface of a given protein. In addition, the applications of G-LoSA to identifying template ligands and predicting ligand and protein binding sites illustrate its strong potential for computer-aided drug design. We hope that G-LoSA can be a useful computational method for exploring interesting biological problems through large-scale comparison of protein local structures and facilitating drug discovery research and development. G-LoSA is freely available to academic users at http://im.compbio.ku.edu/GLoSA/. © 2016 The Protein Society.

  12. LenVarDB: database of length-variant protein domains.

    PubMed

    Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan

    2014-01-01

    Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.

  13. KoBaMIN: a knowledge-based minimization web server for protein structure refinement.

    PubMed

    Rodrigues, João P G L M; Levitt, Michael; Chopra, Gaurav

    2012-07-01

    The KoBaMIN web server provides an online interface to a simple, consistent and computationally efficient protein structure refinement protocol based on minimization of a knowledge-based potential of mean force. The server can be used to refine either a single protein structure or an ensemble of proteins starting from their unrefined coordinates in PDB format. The refinement method is particularly fast and accurate due to the underlying knowledge-based potential derived from structures deposited in the PDB; as such, the energy function implicitly includes the effects of solvent and the crystal environment. Our server allows for an optional but recommended step that optimizes stereochemistry using the MESHI software. The KoBaMIN server also allows comparison of the refined structures with a provided reference structure to assess the changes brought about by the refinement protocol. The performance of KoBaMIN has been benchmarked widely on a large set of decoys, all models generated at the seventh worldwide experiments on critical assessment of techniques for protein structure prediction (CASP7) and it was also shown to produce top-ranking predictions in the refinement category at both CASP8 and CASP9, yielding consistently good results across a broad range of model quality values. The web server is fully functional and freely available at http://csb.stanford.edu/kobamin.

  14. From Sequence and Forces to Structure, Function and Evolution of Intrinsically Disordered Proteins

    PubMed Central

    Forman-Kay, Julie D.; Mittag, Tanja

    2015-01-01

    Intrinsically disordered proteins (IDPs), which lack persistent structure, are a challenge to structural biology due to the inapplicability of standard methods for characterization of folded proteins as well as their deviation from the dominant structure/function paradigm. Their widespread presence and involvement in biological function, however, has spurred the growing acceptance of the importance of IDPs and the development of new tools for studying their structure, dynamics and function. The interplay of folded and disordered domains or regions for function and the existence of a continuum of protein states with respect to conformational energetics, motional timescales and compactness is shaping a unified understanding of structure-dynamics-disorder/function relationships. On the 20th anniversary of this journal, Structure, we provide a historical perspective on the investigation of IDPs and summarize the sequence features and physical forces that underlie their unique structural, functional and evolutionary properties. PMID:24010708

  15. From sequence and forces to structure, function, and evolution of intrinsically disordered proteins.

    PubMed

    Forman-Kay, Julie D; Mittag, Tanja

    2013-09-03

    Intrinsically disordered proteins (IDPs), which lack persistent structure, are a challenge to structural biology due to the inapplicability of standard methods for characterization of folded proteins as well as their deviation from the dominant structure/function paradigm. Their widespread presence and involvement in biological function, however, has spurred the growing acceptance of the importance of IDPs and the development of new tools for studying their structure, dynamics, and function. The interplay of folded and disordered domains or regions for function and the existence of a continuum of protein states with respect to conformational energetics, motional timescales, and compactness are shaping a unified understanding of structure-dynamics-disorder/function relationships. On the 20th anniversary of Structure, we provide a historical perspective on the investigation of IDPs and summarize the sequence features and physical forces that underlie their unique structural, functional, and evolutionary properties. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. Computational methods for constructing protein structure models from 3D electron microscopy maps.

    PubMed

    Esquivel-Rodríguez, Juan; Kihara, Daisuke

    2013-10-01

    Protein structure determination by cryo-electron microscopy (EM) has made significant progress in the past decades. Resolutions of EM maps have been improving as evidenced by recently reported structures that are solved at high resolutions close to 3Å. Computational methods play a key role in interpreting EM data. Among many computational procedures applied to an EM map to obtain protein structure information, in this article we focus on reviewing computational methods that model protein three-dimensional (3D) structures from a 3D EM density map that is constructed from two-dimensional (2D) maps. The computational methods we discuss range from de novo methods, which identify structural elements in an EM map, to structure fitting methods, where known high resolution structures are fit into a low-resolution EM map. A list of available computational tools is also provided. Copyright © 2013 Elsevier Inc. All rights reserved.

  17. CCBuilder 2.0: Powerful and accessible coiled-coil modeling.

    PubMed

    Wood, Christopher W; Woolfson, Derek N

    2018-01-01

    The increased availability of user-friendly and accessible computational tools for biomolecular modeling would expand the reach and application of biomolecular engineering and design. For protein modeling, one key challenge is to reduce the complexities of 3D protein folds to sets of parametric equations that nonetheless capture the salient features of these structures accurately. At present, this is possible for a subset of proteins, namely, repeat proteins. The α-helical coiled coil provides one such example, which represents ≈ 3-5% of all known protein-encoding regions of DNA. Coiled coils are bundles of α helices that can be described by a small set of structural parameters. Here we describe how this parametric description can be implemented in an easy-to-use web application, called CCBuilder 2.0, for modeling and optimizing both α-helical coiled coils and polyproline-based collagen triple helices. This has many applications from providing models to aid molecular replacement for X-ray crystallography, in silico model building and engineering of natural and designed protein assemblies, and through to the creation of completely de novo "dark matter" protein structures. CCBuilder 2.0 is available as a web-based application, the code for which is open-source and can be downloaded freely. http://coiledcoils.chm.bris.ac.uk/ccbuilder2. We have created CCBuilder 2.0, an easy to use web-based application that can model structures for a whole class of proteins, the α-helical coiled coil, which is estimated to account for 3-5% of all proteins in nature. CCBuilder 2.0 will be of use to a large number of protein scientists engaged in fundamental studies, such as protein structure determination, through to more-applied research including designing and engineering novel proteins that have potential applications in biotechnology. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  18. Order within disorder: Aggrecan chondroitin sulphate-attachment region provides new structural insights into protein sequences classified as disordered

    PubMed Central

    Jowitt, Thomas A; Murdoch, Alan D; Baldock, Clair; Berry, Richard; Day, Joanna M; Hardingham, Timothy E

    2010-01-01

    Structural investigation of proteins containing large stretches of sequences without predicted secondary structure is the focus of much increased attention. Here, we have produced an unglycosylated 30 kDa peptide from the chondroitin sulphate (CS)-attachment region of human aggrecan (CS-peptide), which was predicted to be intrinsically disordered and compared its structure with the adjacent aggrecan G3 domain. Biophysical analyses, including analytical ultracentrifugation, light scattering, and circular dichroism showed that the CS-peptide had an elongated and stiffened conformation in contrast to the globular G3 domain. The results suggested that it contained significant secondary structure, which was sensitive to urea, and we propose that the CS-peptide forms an elongated wormlike molecule based on a dynamic range of energetically equivalent secondary structures stabilized by hydrogen bonds. The dimensions of the structure predicted from small-angle X-ray scattering analysis were compatible with EM images of fully glycosylated aggrecan and a partly glycosylated aggrecan CS2-G3 construct. The semiordered structure identified in CS-peptide was not predicted by common structural algorithms and identified a potentially distinct class of semiordered structure within sequences currently identified as disordered. Sequence comparisons suggested some evidence for comparable structures in proteins encoded by other genes (PRG4, MUC5B, and CBP). The function of these semiordered sequences may serve to spatially position attached folded modules and/or to present polypeptides for modification, such as glycosylation, and to provide templates for the multiple pleiotropic interactions proposed for disordered proteins. Proteins 2010. © 2010 Wiley-Liss, Inc. PMID:20806220

  19. Looking at the Disordered Proteins through the Computational Microscope

    PubMed Central

    2018-01-01

    Intrinsically disordered proteins (IDPs) have attracted wide interest over the past decade due to their surprising prevalence in the proteome and versatile roles in cell physiology and pathology. A large selection of IDPs has been identified as potential targets for therapeutic intervention. Characterizing the structure–function relationship of disordered proteins is therefore an essential but daunting task, as these proteins can adopt transient structure, necessitating a new paradigm for connecting structural disorder to function. Molecular simulation has emerged as a natural complement to experiments for atomic-level characterizations and mechanistic investigations of this intriguing class of proteins. The diverse range of length and time scales involved in IDP function requires performing simulations at multiple levels of resolution. In this Outlook, we focus on summarizing available simulation methods, along with a few interesting example applications. We also provide an outlook on how these simulation methods can be further improved in order to provide a more accurate description of IDP structure, binding, and assembly.

  20. Hekate: Software Suite for the Mass Spectrometric Analysis and Three-Dimensional Visualization of Cross-Linked Protein Samples

    PubMed Central

    2013-01-01

    Chemical cross-linking of proteins combined with mass spectrometry provides an attractive and novel method for the analysis of native protein structures and protein complexes. Analysis of the data however is complex. Only a small number of cross-linked peptides are produced during sample preparation and must be identified against a background of more abundant native peptides. To facilitate the search and identification of cross-linked peptides, we have developed a novel software suite, named Hekate. Hekate is a suite of tools that address the challenges involved in analyzing protein cross-linking experiments when combined with mass spectrometry. The software is an integrated pipeline for the automation of the data analysis workflow and provides a novel scoring system based on principles of linear peptide analysis. In addition, it provides a tool for the visualization of identified cross-links using three-dimensional models, which is particularly useful when combining chemical cross-linking with other structural techniques. Hekate was validated by the comparative analysis of cytochrome c (bovine heart) against previously reported data.1 Further validation was carried out on known structural elements of DNA polymerase III, the catalytic α-subunit of the Escherichia coli DNA replisome along with new insight into the previously uncharacterized C-terminal domain of the protein. PMID:24010795

  1. Folding a Protein with Equal Probability of Being Helix or Hairpin

    PubMed Central

    Lin, Chun-Yu; Chen, Nan-Yow; Mou, Chung Yu

    2012-01-01

    We explore the possibility that the native structure of a protein is inherently multiconformational in an ab initio coarse-grained model. Based on the Wang-Landau algorithm, the complete free energy landscape for the designed sequence 2DX4: INYWLAHAKAGYIVHWTA is constructed. It is shown that 2DX4 possesses two nearly degenerate native structures: one is a helix structure and the other a hairpinlike structure, and their free energy difference is <2% of that of local minima. Two degenerate native structures are stabilized by an energy barrier of ∼10 kcal/mol. Furthermore, the hydrogen-bond and dipole-dipole interactions are found to be two major competing interactions in transforming one conformation into the other. Our results indicate that two degenerate native structures are stabilized by subtle balance between different interactions in proteins. In particular, for small proteins, balance between the hydrogen-bond and dipole-dipole interactions happens for proteins of ∼18 amino acids and is shown to be the main driving mechanism for the occurrence of degeneracy. These results provide important clues to the study of native structures of proteins. PMID:22828336

  2. UbSRD: The Ubiquitin Structural Relational Database.

    PubMed

    Harrison, Joseph S; Jacobs, Tim M; Houlihan, Kevin; Van Doorslaer, Koenraad; Kuhlman, Brian

    2016-02-22

    The structurally defined ubiquitin-like homology fold (UBL) can engage in several unique protein-protein interactions and many of these complexes have been characterized with high-resolution techniques. Using Rosetta's structural classification tools, we have created the Ubiquitin Structural Relational Database (UbSRD), an SQL database of features for all 509 UBL-containing structures in the PDB, allowing users to browse these structures by protein-protein interaction and providing a platform for quantitative analysis of structural features. We used UbSRD to define the recognition features of ubiquitin (UBQ) and SUMO observed in the PDB and the orientation of the UBQ tail while interacting with certain types of proteins. While some of the interaction surfaces on UBQ and SUMO overlap, each molecule has distinct features that aid in molecular discrimination. Additionally, we find that the UBQ tail is malleable and can adopt a variety of conformations upon binding. UbSRD is accessible as an online resource at rosettadesign.med.unc.edu/ubsrd. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Watching proteins function with picosecond X-ray crystallography and molecular dynamics simulations.

    NASA Astrophysics Data System (ADS)

    Anfinrud, Philip

    2006-03-01

    Time-resolved electron density maps of myoglobin, a ligand-binding heme protein, have been stitched together into movies that unveil with < 2-Å spatial resolution and 150-ps time-resolution the correlated protein motions that accompany and/or mediate ligand migration within the hydrophobic interior of a protein. A joint analysis of all-atom molecular dynamics (MD) calculations and picosecond time-resolved X-ray structures provides single-molecule insights into mechanisms of protein function. Ensemble-averaged MD simulations of the L29F mutant of myoglobin following ligand dissociation reproduce the direction, amplitude, and timescales of crystallographically-determined structural changes. This close agreement with experiments at comparable resolution in space and time validates the individual MD trajectories, which identify and structurally characterize a conformational switch that directs dissociated ligands to one of two nearby protein cavities. This unique combination of simulation and experiment unveils functional protein motions and illustrates at an atomic level relationships among protein structure, dynamics, and function. In collaboration with Friedrich Schotte and Gerhard Hummer, NIH.

  4. Evolution driven structural changes in CENP-E motor domain.

    PubMed

    Kumar, Ambuj; Kamaraj, Balu; Sethumadhavan, Rao; Purohit, Rituraj

    2013-06-01

    Genetic evolution corresponds to various biochemical changes that are vital for the development of new functional traits. Phylogenetic analysis has provided important insight into the genetic closeness among species and their evolutionary relationships. Centromere-associated protein-E (CENP-E) is vital for maintaining the cell cycle, and checkpoint signal mechanisms are vital for the recruitment of other essential kinetochore proteins. In this study we have focussed on the evolution-driven structural changes in the CENP-E motor domain across the primate lineage. Through molecular dynamics simulation and computational chemistry approaches we examined the changes in ATP binding affinity and conformational deviations in the human CENP-E motor domain as compared to the other primates. Root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg) and principal component analysis (PCA) results together suggested a gain in stability as we move from tarsier towards human. This study provides a significant insight into how the cell cycle proteins and their corresponding biochemical activities are evolving and illustrates the potency of a theoretical approach for assessing, in a single study, the structural, functional, and dynamical aspects of protein evolution.
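
    The abstract above names RMSD, RMSF and Rg without defining them. The following is a minimal sketch of the textbook definitions, not the authors' analysis pipeline. It assumes frames that are already superimposed and treats all atoms as having equal mass; the function names are illustrative.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two pre-aligned coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equally sized")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))

def radius_of_gyration(coords):
    """Unweighted radius of gyration (assumption: equal atomic masses)."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    total = sum(sum((p[k] - center[k]) ** 2 for k in range(3)) for p in coords)
    return math.sqrt(total / n)

def rmsf(frames):
    """Per-atom fluctuation about the mean position across trajectory frames."""
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        msf = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        result.append(math.sqrt(msf))
    return result

if __name__ == "__main__":
    # Toy two-atom "structure" shifted by 1 unit along z.
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(reference, shifted))
    print(radius_of_gyration(reference))
    print(rmsf([reference, shifted]))
```

    In practice the frames would come from a trajectory after least-squares superposition, which this sketch deliberately leaves out.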

  5. A 'periodic table' for protein structures.

    PubMed

    Taylor, William R

    2002-04-11

    Current structural genomics programs aim systematically to determine the structures of all proteins coded in both human and other genomes, providing a complete picture of the number and variety of protein structures that exist. In the past, estimates have been made on the basis of the incomplete sample of structures currently known. These estimates have varied greatly (between 1,000 and 10,000; see for example refs 1 and 2), partly because of limited sample size but also owing to the difficulties of distinguishing one structure from another. This distinction is usually topological, based on the fold of the protein; however, in strict topological terms (neglecting to consider intra-chain cross-links), protein chains are open strings and hence are all identical. To avoid this trivial result, topologies are determined by considering secondary links in the form of intra-chain hydrogen bonds (secondary structure) and tertiary links formed by the packing of secondary structures. However, small additions to or loss of structure can make large changes to these perceived topologies and such subjective solutions are neither robust nor amenable to automation. Here I formalize both secondary and tertiary links to allow the rigorous and automatic definition of protein topology.

  6. A generative, probabilistic model of local protein structure.

    PubMed

    Boomsma, Wouter; Mardia, Kanti V; Taylor, Charles C; Ferkinghoff-Borg, Jesper; Krogh, Anders; Hamelryck, Thomas

    2008-07-01

    Despite significant progress in recent years, protein structure prediction maintains its status as one of the prime unsolved problems in computational biology. One of the key remaining challenges is an efficient probabilistic exploration of the structural space that correctly reflects the relative conformational stabilities. Here, we present a fully probabilistic, continuous model of local protein structure in atomic detail. The generative model makes efficient conformational sampling possible and provides a framework for the rigorous analysis of local sequence-structure correlations in the native state. Our method represents a significant theoretical and practical improvement over the widely used fragment assembly technique by avoiding the drawbacks associated with a discrete and nonprobabilistic approach.

  7. Modeling the Structure of Helical Assemblies with Experimental Constraints in Rosetta.

    PubMed

    André, Ingemar

    2018-01-01

    Determining high-resolution structures of proteins with helical symmetry can be challenging due to limitations in experimental data. In such instances, structure-based protein simulations driven by experimental data can provide a valuable approach for building models of helical assemblies. This chapter describes how the Rosetta macromolecular package can be used to model homomeric protein assemblies with helical symmetry in a range of modeling scenarios including energy refinement, symmetrical docking, comparative modeling, and de novo structure prediction. Data-guided structure modeling of helical assemblies with experimental information from electron density, X-ray fiber diffraction, solid-state NMR, and chemical cross-linking mass spectrometry is also described.

  8. Structure of a nanobody-stabilized active state of the β(2) adrenoceptor.

    PubMed

    Rasmussen, Søren G F; Choi, Hee-Jung; Fung, Juan Jose; Pardon, Els; Casarosa, Paola; Chae, Pil Seok; Devree, Brian T; Rosenbaum, Daniel M; Thian, Foon Sun; Kobilka, Tong Sun; Schnapp, Andreas; Konetzki, Ingo; Sunahara, Roger K; Gellman, Samuel H; Pautsch, Alexander; Steyaert, Jan; Weis, William I; Kobilka, Brian K

    2011-01-13

    G protein coupled receptors (GPCRs) exhibit a spectrum of functional behaviours in response to natural and synthetic ligands. Recent crystal structures provide insights into inactive states of several GPCRs. Efforts to obtain an agonist-bound active-state GPCR structure have proven difficult due to the inherent instability of this state in the absence of a G protein. We generated a camelid antibody fragment (nanobody) to the human β(2) adrenergic receptor (β(2)AR) that exhibits G protein-like behaviour, and obtained an agonist-bound, active-state crystal structure of the receptor-nanobody complex. Comparison with the inactive β(2)AR structure reveals subtle changes in the binding pocket; however, these small changes are associated with an 11 Å outward movement of the cytoplasmic end of transmembrane segment 6, and rearrangements of transmembrane segments 5 and 7 that are remarkably similar to those observed in opsin, an active form of rhodopsin. This structure provides insights into the process of agonist binding and activation.

  9. A Generative Angular Model of Protein Structure Evolution

    PubMed Central

    Golden, Michael; García-Portugués, Eduardo; Sørensen, Michael; Mardia, Kanti V.; Hamelryck, Thomas; Hein, Jotun

    2017-01-01

    Recently described stochastic models of protein evolution have demonstrated that the inclusion of structural information in addition to amino acid sequences leads to a more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modeled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modeling both “smooth” conformational changes and “catastrophic” conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is comparatively more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence–structure evolutionary motif present in a large number of homologous protein pairs. The generative nature of our model enables us to evaluate its validity and its ability to simulate aspects of protein evolution conditioned on an amino acid sequence, a related amino acid sequence, a related structure or any combination thereof. PMID:28453724
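
    To illustrate what a random walk in dihedral angle space on the two-dimensional torus looks like, here is a minimal sketch. It is a plain wrapped Gaussian random walk, which is a stand-in and not the angular diffusion process of the paper; the step size `sigma`, the starting angles and the function names are assumptions made for illustration.

```python
import math
import random

def wrap(angle):
    """Wrap an angle in radians to [-pi, pi), up to floating-point rounding."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def torus_walk(phi, psi, steps, sigma, rng):
    """Simulate a (phi, psi) path: Gaussian steps, wrapped onto the torus."""
    path = [(phi, psi)]
    for _ in range(steps):
        phi = wrap(phi + rng.gauss(0.0, sigma))
        psi = wrap(psi + rng.gauss(0.0, sigma))
        path.append((phi, psi))
    return path

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the walk is reproducible
    # Hypothetical start near a helical region of the Ramachandran plot.
    start_phi, start_psi = math.radians(-60.0), math.radians(-45.0)
    for phi, psi in torus_walk(start_phi, start_psi, 5, 0.3, rng):
        print(f"{math.degrees(phi):8.2f} {math.degrees(psi):8.2f}")
```

    The wrapping step is what makes the state space a torus: an angle that drifts past 180 degrees re-enters at -180 degrees instead of growing without bound.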

  10. Isolation, folding and structural investigations of the amino acid transporter OEP16

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ni, Da Qun; Zook, James; Klewer, Douglas A.

    2011-12-01

    Membrane proteins compose more than 30% of all proteins in the living cell. However, many membrane proteins have low abundance in the cell and cannot be isolated from natural sources in concentrations suitable for structure analysis. The overexpression, reconstitution, and stabilization of membrane proteins are complex and remain a formidable challenge in membrane protein characterization. Here we describe a novel, in vitro folding procedure for a cation-selective channel protein, the outer envelope membrane protein 16 (OEP16) of pea chloroplast, overexpressed in Escherichia coli in the form of inclusion bodies. The protein is purified and then folded with detergent on a Ni-NTA affinity column. Final concentrations of reconstituted OEP16 of up to 24 mg/ml have been achieved, which provides samples that are sufficient for structural studies by NMR and crystallography. Reconstitution of OEP16 in detergent micelles was monitored by circular dichroism, fluorescence, and NMR spectroscopy. Tryptophan fluorescence spectra of heterologous expressed OEP16 in micelles are similar to spectra of functionally active OEP16 in liposomes, which indicates folding of the membrane protein in detergent micelles. CD spectroscopy studies demonstrate a folded protein consisting primarily of α-helices. ¹⁵N-HSQC NMR spectra also provide evidence for a folded protein. We present here a convenient, effective and quantitative method to screen large numbers of conditions for optimal protein stability by using microdialysis chambers in combination with fluorescence spectroscopy. Recent collection of multidimensional NMR data at 500, 600 and 800 MHz demonstrated that the protein is suitable for structure determination by NMR and stable for weeks during data collection.

  11. Isolation, folding and structural investigations of the amino acid transporter OEP16.

    PubMed

    Ni, Da Qun; Zook, James; Klewer, Douglas A; Nieman, Ronald A; Soll, J; Fromme, Petra

    2011-12-01

    Membrane proteins compose more than 30% of all proteins in the living cell. However, many membrane proteins have low abundance in the cell and cannot be isolated from natural sources in concentrations suitable for structure analysis. The overexpression, reconstitution, and stabilization of membrane proteins are complex and remain a formidable challenge in membrane protein characterization. Here we describe a novel, in vitro folding procedure for a cation-selective channel protein, the outer envelope membrane protein 16 (OEP16) of pea chloroplast, overexpressed in Escherichia coli in the form of inclusion bodies. The protein is purified and then folded with detergent on a Ni-NTA affinity column. Final concentrations of reconstituted OEP16 of up to 24 mg/ml have been achieved, which provides samples that are sufficient for structural studies by NMR and crystallography. Reconstitution of OEP16 in detergent micelles was monitored by circular dichroism, fluorescence, and NMR spectroscopy. Tryptophan fluorescence spectra of heterologous expressed OEP16 in micelles are similar to spectra of functionally active OEP16 in liposomes, which indicates folding of the membrane protein in detergent micelles. CD spectroscopy studies demonstrate a folded protein consisting primarily of α-helices. ¹⁵N-HSQC NMR spectra also provide evidence for a folded protein. We present here a convenient, effective and quantitative method to screen large numbers of conditions for optimal protein stability by using microdialysis chambers in combination with fluorescence spectroscopy. Recent collection of multidimensional NMR data at 500, 600 and 800 MHz demonstrated that the protein is suitable for structure determination by NMR and stable for weeks during data collection. Copyright © 2011. Published by Elsevier Inc.

  12. Antibody Epitope Analysis to Investigate Folded Structure, Allosteric Conformation, and Evolutionary Lineage of Proteins.

    PubMed

    Wong, Sienna; Jin, J-P

    2017-01-01

    Study of the folded structure of proteins provides insights into their biological functions, conformational dynamics and molecular evolution. Current methods of elucidating the folded structure of proteins are laborious, low-throughput, and constrained by various limitations. Arising from these methods is the need for a sensitive, quantitative, rapid and high-throughput method not only to analyse the folded structure of proteins, but also to monitor dynamic changes under physiological or experimental conditions. In this focused review, we outline the foundation and limitations of current protein structure-determination methods prior to discussing the advantages of an emerging antibody epitope analysis for applications in structural, conformational and evolutionary studies of proteins. We discuss the application of this method using representative examples in monitoring allosteric conformation of regulatory proteins and the determination of the evolutionary lineage of related proteins and protein isoforms. The versatility of the method described herein is validated by the ability to modulate a variety of assay parameters to meet the needs of the user in order to monitor protein conformation. Furthermore, the assay has been used to clarify the lineage of troponin isoforms beyond what has been depicted by sequence homology alone, demonstrating the nonlinear evolutionary relationship between primary structure and tertiary structure of proteins. The antibody epitope analysis method is a highly adaptable technique of protein conformation elucidation, which can be easily applied without the need for specialized equipment or technical expertise. When applied in a systematic and strategic manner, this method has the potential to reveal novel and biomedically meaningful information for structure-function relationship and evolutionary lineage of proteins. Copyright © Bentham Science Publishers.

  13. Knowledge-based prediction of protein backbone conformation using a structural alphabet.

    PubMed

    Vetrivel, Iyanar; Mahajan, Swapnil; Tyagi, Manoj; Hoffmann, Lionel; Sanejouand, Yves-Henri; Srinivasan, Narayanaswamy; de Brevern, Alexandre G; Cadet, Frédéric; Offmann, Bernard

    2017-01-01

    Libraries of structural prototypes that abstract protein local structures are known as structural alphabets and have proven to be very useful in various aspects of protein structure analyses and predictions. One such library, Protein Blocks, is composed of 16 standard 5-residue-long structural prototypes. This form of analyzing a protein involves drafting its structure as a string of Protein Blocks. Predicting the local structure of a protein in terms of protein blocks is the general objective of this work. A new approach, PB-kPRED, is proposed towards this aim. It involves (i) organizing the structural knowledge in the form of a database of pentapeptide fragments extracted from all protein structures in the PDB and (ii) applying a knowledge-based algorithm that does not rely on any secondary structure predictions and/or sequence alignment profiles, to scan this database and predict the most probable backbone conformations for the protein local structures. PB-kPRED nevertheless uses structural information from homologues in preference, if available. The predictions were evaluated rigorously on 15,544 query proteins representing a non-redundant subset of the PDB filtered at 30% sequence identity cut-off. We have shown that the kPRED method was able to achieve mean accuracies ranging from 40.8% to 66.3% depending on the availability of homologues. The impact of the different strategies for scanning the database on the prediction was evaluated and is discussed. Our results highlight the usefulness of the method in the context of proteins without any known structural homologues. A scoring function that gives a good estimate of the accuracy of prediction was further developed. This score estimates very well the accuracy of the algorithm (R² of 0.82). An online version of the tool is provided freely for non-commercial usage at http://www.bo-protscience.fr/kpred/.

  14. An alternative view of protein fold space.

    PubMed

    Shindyalov, I N; Bourne, P E

    2000-02-15

    Comparing and subsequently classifying protein structure information has received significant attention concurrent with the increase in the number of experimentally derived 3-dimensional structures. Classification schemes have focused on biological function found within protein domains and on structure classification based on topology. Here an alternative view is presented that groups substructures. Substructures are long (50-150 residue) highly repetitive near-contiguous pieces of polypeptide chain that occur frequently in a set of proteins from the PDB defined as structurally non-redundant over the complete polypeptide chain. The substructure classification is based on a previously reported Combinatorial Extension (CE) algorithm that provides a significantly different set of structure alignments than those previously described, having, for example, only a 40% overlap with FSSP. Qualitatively the algorithm provides longer contiguous aligned segments at the price of a slightly higher root-mean-square deviation (rmsd). Clustering these alignments gives a discrete and highly repetitive set of substructures not detectable by sequence similarity alone. In some cases different substructures represent all or different parts of well known folds indicative of the Russian doll effect--the continuity of protein fold space. In other cases they fall into different structure and functional classifications. It is too early to determine whether these newly classified substructures represent new insights into the evolution of a structural framework important to many proteins. What is apparent from on-going work is that these substructures have the potential to be useful probes in finding remote sequence homology and in structure prediction studies. The characteristics of the complete all-by-all comparison of the polypeptide chains present in the PDB and details of the filtering procedure by pair-wise structure alignment that led to the emergent substructure gallery are discussed. Substructure classification, alignments, and tools to analyze them are available at http://cl.sdsc.edu/ce.html.

  15. Challenging the state of the art in protein structure prediction: Highlights of experimental target structures for the 10th Critical Assessment of Techniques for Protein Structure Prediction Experiment CASP10.

    PubMed

    Kryshtafovych, Andriy; Moult, John; Bales, Patrick; Bazan, J Fernando; Biasini, Marco; Burgin, Alex; Chen, Chen; Cochran, Frank V; Craig, Timothy K; Das, Rhiju; Fass, Deborah; Garcia-Doval, Carmela; Herzberg, Osnat; Lorimer, Donald; Luecke, Hartmut; Ma, Xiaolei; Nelson, Daniel C; van Raaij, Mark J; Rohwer, Forest; Segall, Anca; Seguritan, Victor; Zeth, Kornelius; Schwede, Torsten

    2014-02-01

    For the last two decades, CASP has assessed the state of the art in techniques for protein structure prediction and identified areas which required further development. CASP would not have been possible without the prediction targets provided by the experimental structural biology community. In the latest experiment, CASP10, more than 100 structures were suggested as prediction targets, some of which appeared to be extraordinarily difficult for modeling. In this article, authors of some of the most challenging targets discuss which specific scientific question motivated the experimental structure determination of the target protein, which structural features were especially interesting from a structural or functional perspective, and to what extent these features were correctly reproduced in the predictions submitted to CASP10. Specifically, the following targets will be presented: the acid-gated urea channel, a difficult to predict transmembrane protein from the important human pathogen Helicobacter pylori; the structure of human interleukin (IL)-34, a recently discovered helical cytokine; the structure of a functionally uncharacterized enzyme OrfY from Thermoproteus tenax formed by a gene duplication and a novel fold; an ORFan domain of mimivirus sulfhydryl oxidase R596; the fiber protein gene product 17 from bacteriophage T7; the bacteriophage CBA-120 tailspike protein; a virus coat protein from metagenomic samples of the marine environment; and finally, an unprecedented class of structure prediction targets based on engineered disulfide-rich small proteins. Copyright © 2013 The Authors. Wiley Periodicals, Inc.

  16. Accurate secondary structure prediction and fold recognition for circular dichroism spectroscopy

    PubMed Central

    Micsonai, András; Wien, Frank; Kernya, Linda; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2015-01-01

    Circular dichroism (CD) spectroscopy is a widely used technique for the study of protein structure. Numerous algorithms have been developed for the estimation of the secondary structure composition from the CD spectra. These methods often fail to provide acceptable results on α/β-mixed or β-structure–rich proteins. The problem arises from the spectral diversity of β-structures, which has hitherto been considered as an intrinsic limitation of the technique. The predictions are less reliable for proteins of unusual β-structures such as membrane proteins, protein aggregates, and amyloid fibrils. Here, we show that the parallel/antiparallel orientation and the twisting of the β-sheets account for the observed spectral diversity. We have developed a method called β-structure selection (BeStSel) for the secondary structure estimation that takes into account the twist of β-structures. This method can reliably distinguish parallel and antiparallel β-sheets and accurately estimates the secondary structure for a broad range of proteins. Moreover, the secondary structure components applied by the method are characteristic to the protein fold, and thus the fold can be predicted to the level of topology in the CATH classification from a single CD spectrum. By constructing a web server, we offer a general tool for a quick and reliable structure analysis using conventional CD or synchrotron radiation CD (SRCD) spectroscopy for the protein science research community. The method is especially useful when X-ray or NMR techniques fail. Using BeStSel on data collected by SRCD spectroscopy, we investigated the structure of amyloid fibrils of various disease-related proteins and peptides. PMID:26038575

  17. High-Resolution Mapping of a Repeat Protein Folding Free Energy Landscape.

    PubMed

    Fossat, Martin J; Dao, Thuy P; Jenkins, Kelly; Dellarole, Mariano; Yang, Yinshan; McCallum, Scott A; Garcia, Angel E; Barrick, Doug; Roumestand, Christian; Royer, Catherine A

    2016-12-06

    A complete description of the pathways and mechanisms of protein folding requires a detailed structural and energetic characterization of the conformational ensemble along the entire folding reaction coordinate. Simulations can provide this level of insight for small proteins. In contrast, with the exception of hydrogen exchange, which does not monitor folding directly, experimental studies of protein folding have not yielded such structural and energetic detail. NMR can provide residue specific atomic level structural information, but its implementation in protein folding studies using chemical or temperature perturbation is problematic. Here we present a highly detailed structural and energetic map of the entire folding landscape of the leucine-rich repeat protein, pp32 (Anp32), obtained by combining pressure-dependent site-specific ¹H-¹⁵N HSQC data with coarse-grained molecular dynamics simulations. The results obtained using this equilibrium approach demonstrate that the main barrier to folding of pp32 is quite broad and lies near the unfolded state, with structure apparent only in the C-terminal region. Significant deviation from two-state unfolding under pressure reveals an intermediate on the folded side of the main barrier in which the N-terminal region is disordered. A nonlinear temperature dependence of the population of this intermediate suggests a large heat capacity change associated with its formation. The combination of pressure, which favors the population of folding intermediates relative to chemical denaturants; NMR, which allows their observation; and constrained structure-based simulations yields unparalleled insight into protein folding mechanisms. Copyright © 2016 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  18. Engineered control of enzyme structural dynamics and function.

    PubMed

    Boehr, David D; D'Amico, Rebecca N; O'Rourke, Kathleen F

    2018-04-01

    Enzymes undergo a range of internal motions from local, active site fluctuations to large-scale, global conformational changes. These motions are often important for enzyme function, including in ligand binding and dissociation and even preparing the active site for chemical catalysis. Protein engineering efforts have been directed towards manipulating enzyme structural dynamics and conformational changes, including targeting specific amino acid interactions and creation of chimeric enzymes with new regulatory functions. Post-translational covalent modification can provide an additional level of enzyme control. These studies have not only provided insights into the functional role of protein motions, but they offer opportunities to create stimulus-responsive enzymes. These enzymes can be engineered to respond to a number of external stimuli, including light, pH, and the presence of novel allosteric modulators. Altogether, the ability to engineer and control enzyme structural dynamics can provide new tools for biotechnology and medicine. © 2018 The Protein Society.

  19. De Novo Proteins with Life-Sustaining Functions Are Structurally Dynamic.

    PubMed

    Murphy, Grant S; Greisman, Jack B; Hecht, Michael H

    2016-01-29

    Designing and producing novel proteins that fold into stable structures and provide essential biological functions are key goals in synthetic biology. In initial steps toward achieving these goals, we constructed a combinatorial library of de novo proteins designed to fold into 4-helix bundles. As described previously, screening this library for sequences that function in vivo to rescue conditionally lethal mutants of Escherichia coli (auxotrophs) yielded several de novo sequences, termed SynRescue proteins, which rescued four different E. coli auxotrophs. In an effort to understand the structural requirements necessary for auxotroph rescue, we investigated the biophysical properties of the SynRescue proteins, using both computational and experimental approaches. Results from circular dichroism, size-exclusion chromatography, and NMR demonstrate that the SynRescue proteins are α-helical and relatively stable. Surprisingly, however, they do not form well-ordered structures. Instead, they form dynamic structures that fluctuate between monomeric and dimeric states. These findings show that a well-ordered structure is not a prerequisite for life-sustaining functions, and suggest that dynamic structures may have been important in the early evolution of protein function. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Structural features that predict real-value fluctuations of globular proteins.

    PubMed

    Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

    2012-05-01

    It is crucial to consider dynamics for understanding the biological function of proteins. We used a large number of molecular dynamics (MD) trajectories of nonhomologous proteins as references and examined static structural features of proteins that are most relevant to fluctuations. We examined correlation of individual structural features with fluctuations and further investigated effective combinations of features for predicting the real value of residue fluctuations using support vector regression (SVR). It was found that some structural features have higher correlation than crystallographic B-factors with fluctuations observed in MD trajectories. Moreover, SVR that uses combinations of static structural features showed accurate prediction of fluctuations with an average Pearson's correlation coefficient of 0.669 and a root mean square error of 1.04 Å. This correlation coefficient is higher than the one observed in predictions by the Gaussian network model (GNM). An advantage of the developed method over the GNMs is that the former predicts the real value of fluctuation. The results help improve our understanding of relationships between protein structure and fluctuation. Furthermore, the developed method provides a convenient, practical way to predict fluctuations of proteins using easily computed static structural features of proteins. Copyright © 2012 Wiley Periodicals, Inc.
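    The two accuracy measures quoted in the record above, Pearson's correlation coefficient and root mean square error, can be illustrated with a minimal sketch. The function names and the toy per-residue values below are assumptions made for illustration only; they are not taken from the paper and do not reproduce its SVR pipeline.

```python
import math


def pearson(predicted, reference):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(predicted)
    if n != len(reference) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_r)


def rmse(predicted, reference):
    """Root mean square error, in the same unit as the inputs (e.g. angstroms)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    total = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(total / len(predicted))


# Hypothetical per-residue fluctuations (angstroms), for illustration only.
md_fluctuations = [0.8, 1.1, 2.4, 1.9, 0.7]
predicted_fluctuations = [1.0, 1.0, 2.0, 2.2, 0.9]
print(pearson(predicted_fluctuations, md_fluctuations))
print(rmse(predicted_fluctuations, md_fluctuations))
```

    A correlation coefficient alone only says that predicted and reference values rise and fall together; the error term is what reflects agreement in absolute magnitude, which is why both are reported when real values are predicted.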

  1. Structural features that predict real-value fluctuations of globular proteins

    PubMed Central

    Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke

    2012-01-01

    It is crucial to consider dynamics for understanding the biological function of proteins. We used a large number of molecular dynamics trajectories of non-homologous proteins as references and examined static structural features of proteins that are most relevant to fluctuations. We examined correlation of individual structural features with fluctuations and further investigated effective combinations of features for predicting the real-value of residue fluctuations using support vector regression. It was found that some structural features have higher correlation than crystallographic B-factors with fluctuations observed in molecular dynamics trajectories. Moreover, support vector regression that uses combinations of static structural features showed accurate prediction of fluctuations with an average Pearson’s correlation coefficient of 0.669 and a root mean square error of 1.04 Å. This correlation coefficient is higher than the one observed for the prediction by the Gaussian network model. An advantage of the developed method over the Gaussian network models is that the former predicts the real-value of fluctuation. The results help improve our understanding of relationships between protein structure and fluctuation. Furthermore, the developed method provides a convenient, practical way to predict fluctuations of proteins using easily computed static structural features of proteins. PMID:22328193

  2. A Practical Approach to Protein Crystallography.

    PubMed

    Ilari, Andrea; Savino, Carmelinda

    2017-01-01

    Macromolecular crystallography is a powerful tool for structural biology. The resolution of a protein crystal structure is becoming much easier than in the past, thanks to developments in computing, automation of crystallization techniques and high-flux synchrotron sources to collect diffraction datasets. The aim of this chapter is to provide practical procedures to determine a protein crystal structure, illustrating the new techniques, experimental methods, and software that have made protein crystallography a tool accessible to a larger scientific community. It is impossible to give more than a taste of what the X-ray crystallographic technique entails in one brief chapter and there are different ways to solve a protein structure. Since the number of structures available in the Protein Data Bank (PDB) is becoming ever larger (the protein data bank now contains more than 100,000 entries) and therefore the probability to find a good model to solve the structure is ever increasing, we focus our attention on the Molecular Replacement method. Indeed, whenever applicable, this method allows the resolution of macromolecular structures starting from a single data set and a search model downloaded from the PDB, with the aid only of computer work.

  3. Target Highlights in CASP9: Experimental Target Structures for the Critical Assessment of Techniques for Protein Structure Prediction

    PubMed Central

    Kryshtafovych, Andriy; Moult, John; Bartual, Sergio G.; Bazan, J. Fernando; Berman, Helen; Casteel, Darren E.; Christodoulou, Evangelos; Everett, John K.; Hausmann, Jens; Heidebrecht, Tatjana; Hills, Tanya; Hui, Raymond; Hunt, John F.; Jayaraman, Seetharaman; Joachimiak, Andrzej; Kennedy, Michael A.; Kim, Choel; Lingel, Andreas; Michalska, Karolina; Montelione, Gaetano T.; Otero, José M.; Perrakis, Anastassis; Pizarro, Juan C.; van Raaij, Mark J.; Ramelot, Theresa A.; Rousseau, Francois; Tong, Liang; Wernimont, Amy K.; Young, Jasmine; Schwede, Torsten

    2011-01-01

    One goal of the CASP Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction is to identify the current state of the art in protein structure prediction and modeling. A fundamental principle of CASP is blind prediction on a set of relevant protein targets, i.e. the participating computational methods are tested on a common set of experimental target proteins, for which the experimental structures are not known at the time of modeling. Therefore, the CASP experiment would not have been possible without broad support of the experimental protein structural biology community. In this manuscript, several experimental groups discuss the structures of the proteins which they provided as prediction targets for CASP9, highlighting structural and functional peculiarities of these structures: the long tail fibre protein gp37 from bacteriophage T4, the cyclic GMP-dependent protein kinase Iβ (PKGIβ) dimerization/docking domain, the ectodomain of the JTB (Jumping Translocation Breakpoint) transmembrane receptor, Autotaxin (ATX) in complex with an inhibitor, the DNA-Binding J-Binding Protein 1 (JBP1) domain essential for biosynthesis and maintenance of DNA base-J (β-D-glucosyl-hydroxymethyluracil) in Trypanosoma and Leishmania, a so far uncharacterized 73 residue domain from Ruminococcus gnavus with a fold typical for PDZ-like domains, a domain from the Phycobilisome (PBS) core-membrane linker (LCM) phycobiliprotein ApcE from Synechocystis, the Heat shock protein 90 (Hsp90) activators PFC0360w and PFC0270w from Plasmodium falciparum, and 2-oxo-3-deoxygalactonate kinase from Klebsiella pneumoniae. PMID:22020785

  4. The intrinsic flexibility of the aptamer targeting the ribosomal protein S8 is a key factor for the molecular recognition.

    PubMed

    Autiero, Ida; Ruvo, Menotti; Improta, Roberto; Vitagliano, Luigi

    2018-04-01

    Aptamers are RNA/DNA biomolecules representing an emerging class of protein interactors and regulators. Despite the growing interest in these molecules, current understanding of the chemical-physical basis of their target recognition is limited. Recently, the characterization of the aptamer targeting the protein S8 has suggested that flexibility plays important functional roles. We investigated the structural versatility of the S8-aptamer by molecular dynamics simulations. Five different simulations have been conducted by varying starting structures and temperatures. The simulation of the S8-aptamer complex provides a dynamic view of the contacts occurring at the complex interface. The simulation of the aptamer in the ligand-free state indicates that its central region is intrinsically endowed with a remarkable flexibility. Nevertheless, none of the trajectory structures adopts the structure observed in the S8-aptamer complex. The ligand-bound aptamer is very rigid in the simulation carried out at 300 K. A structural transition of this state, providing insights into the aptamer-protein recognition process, is observed in a simulation carried out at 400 K. These data indicate that a key event in the binding is linked to the widening of the central region of the aptamer. Particularly relevant is the switch of the A26 base from its ligand-free state to a location that allows the G13-C28 base-pairing. Intrinsic flexibility of the aptamer is essential for partner recognition. Present data indicate that S8 recognizes the aptamer through an induced-fit rather than a population-shift mechanism. The present study provides deeper understanding of the structural basis of the structural versatility of aptamers. Copyright © 2018 Elsevier B.V. All rights reserved.

  5. Online interactive analysis of protein structure ensembles with Bio3D-web.

    PubMed

    Skjærven, Lars; Jariwala, Shashank; Yao, Xin-Qiu; Grant, Barry J

    2016-11-15

    Bio3D-web is an online application for analyzing the sequence, structure and conformational heterogeneity of protein families. Major functionality is provided for identifying protein structure sets for analysis, their alignment and refined structure superposition, sequence and structure conservation analysis, mapping and clustering of conformations and the quantitative comparison of their predicted structural dynamics. Bio3D-web is based on the Bio3D and Shiny R packages. All major browsers are supported and full source code is available under a GPL2 license from http://thegrantlab.org/bio3d-web. Contact: bjgrant@umich.edu or lars.skjarven@uib.no. © The Author 2016. Published by Oxford University Press.

  6. Insights on the structural perturbations in human MTHFR Ala222Val mutant by protein modeling and molecular dynamics.

    PubMed

    Abhinand, P A; Shaikh, Faraz; Bhakat, Soumendranath; Radadiya, Ashish; Bhaskar, L V K S; Shah, Anamik; Ragunath, P K

    2016-01-01

    Methylenetetrahydrofolate reductase (MTHFR) protein catalyzes the only biochemical reaction which produces methyltetrahydrofolate, the active form of folic acid essential for several molecular functions. The Ala222Val polymorphism of human MTHFR encodes a thermolabile protein associated with increased risk of neural tube defects and cardiovascular disease. Experimental studies have shown that the mutation does not affect the kinetic properties of MTHFR, but inactivates the protein by increasing flavin adenine dinucleotide (FAD) loss. The lack of a completely solved crystal structure of MTHFR is an impediment in understanding the structural perturbations caused by the Ala222Val mutation; computational modeling provides a suitable alternative. The three-dimensional structure of human MTHFR protein was obtained through homology modeling, by taking the MTHFR structures from Escherichia coli and Thermus thermophilus as templates. Subsequently, the modeled structure was docked with FAD using Glide, which revealed a very good binding affinity, authenticated by a Glide XP score of -10.3983 kcal mol⁻¹. The MTHFR was mutated by changing Alanine 222 to Valine. The wild-type MTHFR-FAD complex and the Ala222Val mutant MTHFR-FAD complex were subjected to molecular dynamics simulation over a 50 ns period. The average difference in backbone root mean square deviation (RMSD) between the wild-type and mutant variants was found to be ~0.11 Å. The greater degree of fluctuations in the mutant protein translates to decreased conformational stability as a result of mutation. The FAD-binding ability of the mutant MTHFR was also found to be significantly lowered as a result of decreased protein grip caused by increased conformational flexibility. The study provides insights into the Ala222Val mutation of human MTHFR that induces major conformational changes in the tertiary structure, causing a significant reduction in the FAD-binding affinity.
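    The backbone RMSD mentioned in the record above is, in its simplest form, the square root of the mean squared distance between matched atoms. The sketch below assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the record does not say how the authors computed RMSD, and the function name and toy coordinates are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two matched lists of (x, y, z) tuples, in input units.

    Assumes both structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical backbone atom positions (angstroms), for illustration only.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
snapshot = [(0.1, 0.0, 0.0), (1.5, 0.2, 0.0), (3.0, 0.0, 0.1)]
print(rmsd(reference, snapshot))
```

    In trajectory analysis this value is usually computed per frame against a reference structure after least-squares fitting, and the per-frame values are then averaged for each system before the systems are compared.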

  7. RosettaHoles: rapid assessment of protein core packing for structure prediction, refinement, design, and validation.

    PubMed

    Sheffler, Will; Baker, David

    2009-01-01

    We present a novel method called RosettaHoles for visual and quantitative assessment of underpacking in the protein core. RosettaHoles generates a set of spherical cavity balls that fill the empty volume between atoms in the protein interior. For visualization, the cavity balls are aggregated into contiguous overlapping clusters and small cavities are discarded, leaving an uncluttered representation of the unfilled regions of space in a structure. For quantitative analysis, the cavity ball data are used to estimate the probability of observing a given cavity in a high-resolution crystal structure. RosettaHoles provides excellent discrimination between real and computationally generated structures, is predictive of incorrect regions in models, identifies problematic structures in the Protein Data Bank, and promises to be a useful validation tool for newly solved experimental structures.

  8. RosettaHoles: Rapid assessment of protein core packing for structure prediction, refinement, design, and validation

    PubMed Central

    Sheffler, Will; Baker, David

    2009-01-01

    We present a novel method called RosettaHoles for visual and quantitative assessment of underpacking in the protein core. RosettaHoles generates a set of spherical cavity balls that fill the empty volume between atoms in the protein interior. For visualization, the cavity balls are aggregated into contiguous overlapping clusters and small cavities are discarded, leaving an uncluttered representation of the unfilled regions of space in a structure. For quantitative analysis, the cavity ball data are used to estimate the probability of observing a given cavity in a high-resolution crystal structure. RosettaHoles provides excellent discrimination between real and computationally generated structures, is predictive of incorrect regions in models, identifies problematic structures in the Protein Data Bank, and promises to be a useful validation tool for newly solved experimental structures. PMID:19177366

  9. Protein based Block Copolymers

    PubMed Central

    Rabotyagova, Olena S.; Cebe, Peggy; Kaplan, David L.

    2011-01-01

    Advances in genetic engineering have led to the synthesis of protein-based block copolymers with control of chemistry and molecular weight, resulting in unique physical and biological properties. The benefits from incorporating peptide blocks into copolymer designs arise from the fundamental properties of proteins to adopt ordered conformations and to undergo self-assembly, providing control over structure formation at various length scales when compared to conventional block copolymers. This review covers the synthesis, structure, assembly, properties, and applications of protein-based block copolymers. PMID:21235251

  10. The evolution of function within the Nudix homology clan

    PubMed Central

    Srouji, John R.; Xu, Anting; Park, Annsea; Kirsch, Jack F.

    2017-01-01

    The Nudix homology clan encompasses over 80,000 protein domains from all three domains of life, defined by homology to each other. Proteins with a domain from this clan fall into four general functional classes: pyrophosphohydrolases, isopentenyl diphosphate isomerases (IDIs), adenine/guanine mismatch‐specific adenine glycosylases (A/G‐specific adenine glycosylases), and nonenzymatic activities such as protein/protein interaction and transcriptional regulation. The largest group, pyrophosphohydrolases, encompasses more than 100 distinct hydrolase specificities. To understand the evolution of this vast number of activities, we assembled and analyzed experimental and structural data for 205 Nudix proteins collected from the literature. We corrected erroneous functions or provided more appropriate descriptions for 53 annotations described in the Gene Ontology Annotation database in this family, and propose 275 new experimentally‐based annotations. We manually constructed a structure‐guided sequence alignment of 78 Nudix proteins. Using the structural alignment as a seed, we then made an alignment of 347 “select” Nudix homology domains, curated from structurally determined, functionally characterized, or phylogenetically important Nudix domains. Based on our review of Nudix pyrophosphohydrolase structures and specificities, we further analyzed a loop region downstream of the Nudix hydrolase motif previously shown to contact the substrate molecule and possess known functional motifs. This loop region provides a potential structural basis for the functional radiation and evolution of substrate specificity within the hydrolase family. Finally, phylogenetic analyses of the 347 select protein domains and of the complete Nudix homology clan revealed general monophyly with regard to function and a few instances of probable homoplasy. Proteins 2017; 85:775–811. © 2016 Wiley Periodicals, Inc. PMID:27936487

  11. Prediction of protein secondary structure content for the twilight zone sequences.

    PubMed

    Homaeian, Leila; Kurgan, Lukasz A; Ruan, Jishou; Cios, Krzysztof J; Chen, Ke

    2007-11-15

    Secondary protein structure carries information about local structural arrangements, which include three major conformations: alpha-helices, beta-strands, and coils. The significant majority of successful methods for predicting secondary structure are based on multiple sequence alignment. However, multiple alignment fails to provide accurate results when a sequence comes from the twilight zone, that is, it is characterized by low (<30%) homology. To this end, we propose a novel method for prediction of secondary structure content through comprehensive sequence representation, called PSSC-core. The method uses a multiple linear regression model and introduces a comprehensive feature-based sequence representation to predict the amount of helices and strands for sequences from the twilight zone. The PSSC-core method was tested and compared with two other state-of-the-art prediction methods on a set of 2187 twilight zone sequences. The results indicate that our method provides better predictions for both helix and strand content. The PSSC-core is shown to provide statistically significantly better results when compared with the competing methods, reducing the prediction error by 5-7% for helix and 7-9% for strand content predictions. The proposed feature-based sequence representation uses a comprehensive set of physicochemical properties that are custom-designed for each of the helix and strand content predictions. It includes composition and composition moment vectors, frequency of tetra-peptides associated with helical and strand conformations, various property-based groups like exchange groups, chemical groups of the side chains and hydrophobic group, auto-correlations based on hydrophobicity, side-chain masses, hydropathy, and conformational patterns for beta-sheets. The PSSC-core method provides an alternative for predicting the secondary structure content that can be used to validate and constrain results of other structure prediction methods. At the same time, it also provides useful insight into the design of successful protein sequence representations that can be used in developing new methods related to prediction of different aspects of the secondary protein structure. (c) 2007 Wiley-Liss, Inc.
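    The quantity predicted in the record above, secondary structure content, is the fraction of residues in each conformation. A minimal sketch of how that target value could be computed from a known per-residue assignment follows; the three-letter code (H for helix, E for strand, anything else counted as coil) and the function name are assumptions for illustration, and this is not the PSSC-core regression model itself.

```python
def secondary_structure_content(assignment):
    """Return helix, strand and coil fractions for a per-residue string.

    'H' is counted as helix and 'E' as strand; every other symbol is
    treated as coil (an assumed three-state convention).
    """
    if not assignment:
        raise ValueError("assignment string must not be empty")
    length = len(assignment)
    helix = assignment.count("H") / length
    strand = assignment.count("E") / length
    return {"helix": helix, "strand": strand, "coil": 1.0 - helix - strand}


# Hypothetical eight-residue assignment, for illustration only.
print(secondary_structure_content("HHHHEECC"))
```

    A content predictor is scored by comparing its estimated helix and strand fractions with values obtained this way from experimentally determined structures.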

  12. Buried and accessible surface area control intrinsic protein flexibility.

    PubMed

    Marsh, Joseph A

    2013-09-09

    Proteins experience a wide variety of conformational dynamics that can be crucial for facilitating their diverse functions. How is the intrinsic flexibility required for these motions encoded in their three-dimensional structures? Here, the overall flexibility of a protein is demonstrated to be tightly coupled to the total amount of surface area buried within its fold. A simple proxy for this, the relative solvent-accessible surface area (Arel), therefore shows excellent agreement with independent measures of global protein flexibility derived from various experimental and computational methods. Application of Arel on a large scale demonstrates its utility by revealing unique sequence and structural properties associated with intrinsic flexibility. In particular, flexibility as measured by Arel shows little correspondence with intrinsic disorder, but instead tends to be associated with multiple domains and increased α-helical structure. Furthermore, the apparent flexibility of monomeric proteins is found to be useful for identifying quaternary-structure errors in published crystal structures. There is also a strong tendency for the crystal structures of more flexible proteins to be solved to lower resolutions. Finally, local solvent accessibility is shown to be a primary determinant of local residue flexibility. Overall, this work provides both fundamental mechanistic insight into the origin of protein flexibility and a simple, practical method for predicting flexibility from protein structures. © 2013 Elsevier Ltd. All rights reserved.

  13. Camps 2.0: exploring the sequence and structure space of prokaryotic, eukaryotic, and viral membrane proteins.

    PubMed

    Neumann, Sindy; Hartmann, Holger; Martin-Galiano, Antonio J; Fuchs, Angelika; Frishman, Dmitrij

    2012-03-01

    Structural bioinformatics of membrane proteins is still in its infancy, and the picture of their fold space is only beginning to emerge. Because only a handful of three-dimensional structures are available, sequence comparison and structure prediction remain the main tools for investigating sequence-structure relationships in membrane protein families. Here we present a comprehensive analysis of the structural families corresponding to α-helical membrane proteins with at least three transmembrane helices. The new version of our CAMPS database (CAMPS 2.0) covers nearly 1300 eukaryotic, prokaryotic, and viral genomes. Using an advanced classification procedure, which is based on high-order hidden Markov models and considers both sequence similarity as well as the number of transmembrane helices and loop lengths, we identified 1353 structurally homogeneous clusters roughly corresponding to membrane protein folds. Only 53 clusters are associated with experimentally determined three-dimensional structures, and for these clusters CAMPS is in reasonable agreement with structure-based classification approaches such as SCOP and CATH. We therefore estimate that ∼1300 structures would need to be determined to provide a sufficient structural coverage of polytopic membrane proteins. CAMPS 2.0 is available at http://webclu.bio.wzw.tum.de/CAMPS2.0/. Copyright © 2011 Wiley Periodicals, Inc.

  14. Accounting for observed small angle X-ray scattering profile in the protein-protein docking server ClusPro.

    PubMed

    Xia, Bing; Mamonov, Artem; Leysen, Seppe; Allen, Karen N; Strelkov, Sergei V; Paschalidis, Ioannis Ch; Vajda, Sandor; Kozakov, Dima

    2015-07-30

    The protein-protein docking server ClusPro is used by thousands of laboratories, and models built by the server have been reported in over 300 publications. Although the structures generated by the docking include near-native ones for many proteins, selecting the best model is difficult due to the uncertainty in scoring. Small angle X-ray scattering (SAXS) is an experimental technique for obtaining low resolution structural information in solution. While not sufficient on its own to uniquely predict complex structures, accounting for SAXS data improves the ranking of models and facilitates the identification of the most accurate structure. Although SAXS profiles are currently available only for a small number of complexes, due to its simplicity the method is becoming increasingly popular. Since combining docking with SAXS experiments will provide a viable strategy for fairly high-throughput determination of protein complex structures, the option of using SAXS restraints is added to the ClusPro server. © 2015 Wiley Periodicals, Inc.

  15. Protein Structure Determination by Assembling Super-Secondary Structure Motifs Using Pseudocontact Shifts.

    PubMed

    Pilla, Kala Bharath; Otting, Gottfried; Huber, Thomas

    2017-03-07

    Computational and nuclear magnetic resonance hybrid approaches provide efficient tools for 3D structure determination of small proteins, but currently available algorithms struggle to perform with larger proteins. Here we demonstrate a new computational algorithm that assembles the 3D structure of a protein from its constituent super-secondary structural motifs (Smotifs) with the help of pseudocontact shift (PCS) restraints for backbone amide protons, where the PCSs are produced from different metal centers. The algorithm, DINGO-PCS (3D assembly of Individual Smotifs to Near-native Geometry as Orchestrated by PCSs), employs the PCSs to recognize, orient, and assemble the constituent Smotifs of the target protein without any other experimental data or computational force fields. Using a universal Smotif database, the DINGO-PCS algorithm exhaustively enumerates any given Smotif. We benchmarked the program against ten different protein targets ranging from 100 to 220 residues with different topologies. For nine of these targets, the method was able to identify near-native Smotifs. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Protein Structural Analysis via Mass Spectrometry-Based Proteomics

    PubMed Central

    Artigues, Antonio; Nadeau, Owen W.; Rimmer, Mary Ashley; Villar, Maria T.; Du, Xiuxia; Fenton, Aron W.; Carlson, Gerald M.

    2017-01-01

    Modern mass spectrometry (MS) technologies have provided a versatile platform that can be combined with a large number of techniques to analyze protein structure and dynamics. These techniques include the three detailed in this chapter: 1) hydrogen/deuterium exchange (HDX), 2) limited proteolysis, and 3) chemical crosslinking (CX). HDX relies on the change in mass of a protein upon its dilution into deuterated buffer, which results in varied deuterium content within its backbone amides. Structural information on surface exposed, flexible or disordered linker regions of proteins can be achieved through limited proteolysis, using a variety of proteases and only small extents of digestion. CX refers to the covalent coupling of distinct chemical species and has been used to analyze the structure, function and interactions of proteins by identifying crosslinking sites that are formed by small multi-functional reagents, termed crosslinkers. Each of these MS applications is capable of revealing structural information for proteins when used either with or without other typical high resolution techniques, including NMR and X-ray crystallography. PMID:27975228

  17. Protein structure database search and evolutionary classification.

    PubMed

    Yang, Jinn-Moon; Tung, Chi-Hua

    2006-01-01

    As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at http://3d-blast.life.nctu.edu.tw].

  18. Observation of Water-Protein Interaction Dynamics with Broadband Two-Dimensional Infrared Spectroscopy

    NASA Astrophysics Data System (ADS)

    De Marco, Luigi; Haky, Andrew; Tokmakoff, Andrei

    Two-dimensional infrared (2D IR) spectroscopy has proven itself an indispensable tool for studying molecular dynamics and intermolecular interactions on ultrafast timescales. Using a novel source of broadband mid-IR pulses, we have collected 2D IR spectra of protein films at varying levels of hydration. With 2D IR, we can directly observe coupling between water's motions and the protein's. Protein films provide us with the ability to discriminate hydration waters from bulk water and thus give us access to studying water dynamics along the protein backbone, fluctuations in the protein structure, and the interplay between the molecular dynamics of the two. We present two representative protein films: poly-L-proline (PLP) and hen egg-white lysozyme (HEWL). Having no N-H groups, PLP allows us to look at water dynamics without interference from resonant energy transfer between the protein N-H stretch and the water O-H stretch. We conclude that at low hydration levels water-protein interactions dominate, and the water's dynamics are tied to those of the protein. In HEWL films, we take advantage of the robust secondary structure to partially deuterate the film, allowing us to spectrally distinguish the protein core from the exterior. From this, we show that resonant energy transfer to water provides an effective means of dissipating excess energy within the protein, while maintaining the structure. These methods are general and can easily be extended to studying specific protein-water interactions.

  19. FNL Scientists and Collaborators Solve 3-D Structure of Key Protein in Alzheimer’s Disease

    Cancer.gov

    Researchers at the Frederick National Lab (FNL) have collaborated in solving the three-dimensional structure of a key protein in Alzheimer’s disease, providing new insight into the basic mechanisms that give rise to the devastating illness. The pro

  20. Designing and benchmarking the MULTICOM protein structure prediction system

    PubMed Central

    2013-01-01

    Background Predicting protein structure from sequence is one of the most significant and challenging problems in bioinformatics. Numerous bioinformatics techniques and tools have been developed to tackle almost every aspect of protein structure prediction ranging from structural feature prediction, template identification and query-template alignment to structure sampling, model quality assessment, and model refinement. How to synergistically select, integrate and improve the strengths of the complementary techniques at each prediction stage and build a high-performance system is becoming a critical issue for constructing a successful, competitive protein structure predictor. Results Over the past several years, we have constructed a standalone protein structure prediction system MULTICOM that combines multiple sources of information and complementary methods at all five stages of the protein structure prediction process including template identification, template combination, model generation, model assessment, and model refinement. The system was blindly tested during the ninth Critical Assessment of Techniques for Protein Structure Prediction (CASP9) in 2010 and yielded very good performance. In addition to studying the overall performance on the CASP9 benchmark, we thoroughly investigated the performance and contributions of each component at each stage of prediction. Conclusions Our comprehensive and comparative study not only provides useful and practical insights about how to select, improve, and integrate complementary methods to build a cutting-edge protein structure prediction system but also identifies a few new sources of information that may help improve the design of a protein structure prediction system. Several components used in the MULTICOM system are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/. PMID:23442819

  1. Single-Molecule Microscopy and Force Spectroscopy of Membrane Proteins

    NASA Astrophysics Data System (ADS)

    Engel, Andreas; Janovjak, Harald; Fotiadis, Dimitrios; Kedrov, Alexej; Cisneros, David; Müller, Daniel J.

    Single-molecule atomic force microscopy (AFM) provides novel ways to characterize the structure-function relationship of native membrane proteins. High-resolution AFM topographs reveal the structure of single proteins at sub-nanometer resolution, as well as their conformational changes, oligomeric state, molecular dynamics and assembly. We review these capabilities, illustrating them with examples of membrane proteins in native and reconstituted membranes. Classifying individual topographs of single proteins makes it possible to understand the principles governing the motions of their extrinsic domains, to learn about their local structural flexibilities and to find the entropy minima of certain conformations. Combined with the visualization of functionally related conformational changes, these insights help explain why certain flexibilities are required for the protein to function and how structurally flexible regions allow certain conformational changes. Complementary to AFM imaging, single-molecule force spectroscopy (SMFS) experiments detect molecular interactions established within and between membrane proteins. The sensitivity of this method makes it possible to measure interactions that stabilize secondary structures such as transmembrane α-helices, polypeptide loops and segments within. Changes in temperature or protein-protein assembly do not change the locations of stable structural segments, but influence their stability established by collective molecular interactions. Such changes alter the probability that a protein follows a certain unfolding pathway. Recent examples have elucidated unfolding and refolding pathways of membrane proteins as well as their energy landscapes.

  2. Structural and Functional Studies of H. seropedicae RecA Protein - Insights into the Polymerization of RecA Protein as Nucleoprotein Filament.

    PubMed

    Leite, Wellington C; Galvão, Carolina W; Saab, Sérgio C; Iulek, Jorge; Etto, Rafael M; Steffens, Maria B R; Chitteni-Pattu, Sindhu; Stanage, Tyler; Keck, James L; Cox, Michael M

    2016-01-01

    The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein (EcRecA). HsRecA also promotes DNA strand exchange more efficiently. The three-dimensional structure of the HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, which are similar to those of homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins is also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. Our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament.

  3. Protein fiber linear dichroism for structure determination and kinetics in a low-volume, low-wavelength couette flow cell.

    PubMed

    Dafforn, Timothy R; Rajendra, Jacindra; Halsall, David J; Serpell, Louise C; Rodger, Alison

    2004-01-01

    High-resolution structure determination of soluble globular proteins relies heavily on x-ray crystallography techniques. Such an approach is often ineffective for investigations into the structure of fibrous proteins as these proteins generally do not crystallize. Thus investigations into fibrous protein structure have relied on less direct methods such as x-ray fiber diffraction and circular dichroism. Ultraviolet linear dichroism has the potential to provide additional information on the structure of such biomolecular systems. However, existing systems are not optimized for the requirements of fibrous proteins. We have designed and built a low-volume (200 µL), low-wavelength (down to 180 nm), low-pathlength (100 µm), high-alignment flow-alignment system (couette) to perform ultraviolet linear dichroism studies on the fibers formed by a range of biomolecules. The apparatus has been tested using a number of proteins for which longer wavelength linear dichroism spectra had already been measured. The new couette cell has also been used to obtain data on two medically important protein fibers, the all-beta-sheet amyloid fibers of the Alzheimer's derived protein Abeta and the long-chain assemblies of alpha1-antitrypsin polymers.

  4. Synchrotron IR microspectroscopy for protein structure analysis: Potential and questions

    DOE PAGES

    Yu, Peiqiang

    2006-01-01

    Synchrotron radiation-based Fourier transform infrared microspectroscopy (S-FTIR) has been developed as a rapid, direct, non-destructive, bioanalytical technique. This technique takes advantage of synchrotron light brightness and small effective source size and is capable of exploring the molecular chemical make-up within microstructures of a biological tissue without destruction of inherent structures at ultra-spatial resolutions within cellular dimension. To date there has been very little application of this advanced technique to the study of pure protein inherent structure at a cellular level in biological tissues. In this review, a novel approach was introduced to show the potential of the newly developed, advanced synchrotron-based analytical technology, which can be used to localize relatively “pure” protein in the plant tissues and relatively reveal protein inherent structure and protein molecular chemical make-up within intact tissue at cellular and subcellular levels. Several complex protein IR spectra data analytical techniques (Gaussian and Lorentzian multi-component peak modeling, univariate and multivariate analysis, principal component analysis (PCA), and hierarchical cluster analysis (CLA)) are employed to relatively reveal features of protein inherent structure and distinguish protein inherent structure differences between varieties/species and treatments in plant tissues. By using a multi-peak modeling procedure, RELATIVE estimates (but not EXACT determinations) for protein secondary structure analysis can be made for comparison purposes. The arguments for and against the multi-peak modeling/fitting procedure for relative estimation of protein structure were discussed. By using the PCA and CLA analyses, the plant molecular structures can be qualitatively separated into statistically distinct groups, even though the spectral assignments are not known. The synchrotron-based technology provides a new approach for protein structure research in biological tissues at ultraspatial resolutions.

  5. System and methods for predicting transmembrane domains in membrane proteins and mining the genome for recognizing G-protein coupled receptors

    DOEpatents

    Trabanino, Rene J; Vaidehi, Nagarajan; Hall, Spencer E; Goddard, William A; Floriano, Wely

    2013-02-05

    The invention provides computer-implemented methods and apparatus implementing a hierarchical protocol using multiscale molecular dynamics and molecular modeling methods to predict the presence of transmembrane regions in proteins, such as G-Protein Coupled Receptors (GPCR), and protein structural models generated according to the protocol. The protocol features a coarse grain sampling method, such as hydrophobicity analysis, to provide a fast and accurate procedure for predicting transmembrane regions. Methods and apparatus of the invention are useful to screen protein or polynucleotide databases for encoded proteins with transmembrane regions, such as GPCRs.
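
    The coarse-grain hydrophobicity analysis mentioned above can be illustrated with a sliding-window hydropathy scan. The following is only a minimal sketch under stated assumptions, not the protocol claimed in the patent: it assumes the Kyte-Doolittle hydropathy scale, a 19-residue window and a cutoff of 1.6, and the function names `window_hydropathy` and `candidate_tm_segments` are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the patent abstract
# does not say which hydrophobicity scale its protocol uses).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq, window=19):
    """Mean hydropathy of every full window along the sequence.

    Element i of the result is the mean over residues i .. i+window-1.
    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KD[aa] for aa in seq.upper()]
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    means = [total / window]
    for i in range(window, len(scores)):
        # Slide the window one residue to the right.
        total += scores[i] - scores[i - window]
        means.append(total / window)
    return means


def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Return (start, end) spans, end exclusive, covered by consecutive
    windows whose mean hydropathy is at or above the threshold."""
    means = window_hydropathy(seq, window)
    segments = []
    start = None
    for i, mean in enumerate(means):
        if mean >= threshold:
            if start is None:
                start = i
        elif start is not None:
            # The last passing window started at i - 1.
            segments.append((start, i - 1 + window))
            start = None
    if start is not None:
        segments.append((start, len(means) - 1 + window))
    return segments


if __name__ == "__main__":
    # Synthetic sequence: a leucine stretch flanked by charged residues.
    toy = "D" * 10 + "L" * 25 + "K" * 10
    print(candidate_tm_segments(toy))
```

    A scan of this kind only flags hydrophobic stretches as candidates. The hierarchical protocol described in the abstract follows the coarse step with molecular dynamics and modeling, which this sketch does not attempt.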

  6. Decomposition of Proteins into Dynamic Units from Atomic Cross-Correlation Functions.

    PubMed

    Calligari, Paolo; Gerolin, Marco; Abergel, Daniel; Polimeno, Antonino

    2017-01-10

    In this article, we present a clustering method of atoms in proteins based on the analysis of the correlation times of interatomic distance correlation functions computed from MD simulations. The goal is to provide a coarse-grained description of the protein in terms of fewer elements that can be treated as dynamically independent subunits. Importantly, this domain decomposition method does not take into account structural properties of the protein. Instead, the clustering of protein residues in terms of networks of dynamically correlated domains is defined on the basis of the effective correlation times of the pair distance correlation functions. For these properties, our method stands as a complementary analysis to the customary protein decomposition in terms of quasi-rigid, structure-based domains. Results obtained for a prototypal protein structure illustrate the approach proposed.

  7. Structural insights into a secretory abundant heat-soluble protein from an anhydrobiotic tardigrade, Ramazzottius varieornatus.

    PubMed

    Fukuda, Yohta; Miura, Yoshimasa; Mizohata, Eiichi; Inoue, Tsuyoshi

    2017-08-01

    Upon stopping metabolic processes, some tardigrades can undergo anhydrobiosis. Secretory abundant heat-soluble (SAHS) proteins have been reported as candidates for anhydrobiosis-related proteins in tardigrades, which seem to protect extracellular components and/or secretory organelles. We determined structures of a SAHS protein from Ramazzottius varieornatus (RvSAHS1), which is one of the toughest tardigrades. RvSAHS1 shows a β-barrel structure similar to fatty acid-binding proteins (FABPs), in which hydrophilic residues form peculiar hydrogen bond networks, which would provide RvSAHS1 with better tolerance against dehydration. We identified two putative ligand-binding sites: one that superimposes on those of some FABPs and the other, unique to and conserved in SAHS proteins. These results indicate that SAHS proteins constitute a new FABP family. © 2017 Federation of European Biochemical Societies.

  8. Atomic Structures of Minor Proteins VI and VII in the Human Adenovirus.

    PubMed

    Dai, Xinghong; Wu, Lily; Sun, Ren; Zhou, Z Hong

    2017-10-04

    Human adenoviruses (Ad) are dsDNA viruses associated with infectious diseases, yet better known as tools for gene delivery and oncolytic anti-cancer therapy. Atomic structures of Ad provide the basis for the development of antivirals and for engineering efforts towards more effective applications. Since 2010, atomic models of human Ad5 have been independently derived from photographic film cryoEM and X-ray crystallography, but discrepancies exist concerning the assignment of cement proteins IIIa, VIII and IX. To clarify these discrepancies, here we have employed the technology of direct electron-counting to obtain a cryoEM structure of human Ad5 at 3.2 Å resolution. Our improved structure unambiguously confirmed our previous cryoEM models of proteins IIIa, VIII and IX and explained the likely cause of conflict in the crystallography models. The improved structure also allows the identification of three new components in the cavities of hexons - the cleaved N-terminus of precursor protein VI (pVIn), the cleaved N-terminus of precursor protein VII (pVIIn2), and mature protein VI. The binding of pVIIn2--by extension that of genome-condensing pVII--to hexons is consistent with the previously proposed dsDNA genome-capsid co-assembly for adenoviruses, which resembles that of ssRNA viruses but differs from the well-established mechanism of pumping dsDNA into a preformed protein capsid, as exemplified by tailed bacteriophages and herpesviruses. IMPORTANCE Adenovirus is a double-edged sword to humans - as a widespread pathogen and a bioengineering tool for anti-cancer and gene therapy. Atomic structure of the virus provides the basis for antiviral and application developments, but conflicting atomic models from conventional/film cryoEM and X-ray crystallography for important cement proteins IIIa, VIII, and IX have caused confusion. Using the cutting-edge cryoEM technology with electron counting, we improved the structure of human adenovirus type 5 and confirmed our previous models of cement proteins IIIa, VIII, and IX, thus clarifying the inconsistent structures. The improved structure also reveals atomic details of membrane-lytic protein VI and genome-condensing protein VII and supports the previously proposed genome-capsid co-assembly mechanism for adenoviruses. Copyright © 2017 American Society for Microbiology.

  9. RNA chaperoning and intrinsic disorder in the core proteins of Flaviviridae.

    PubMed

    Ivanyi-Nagy, Roland; Lavergne, Jean-Pierre; Gabus, Caroline; Ficheux, Damien; Darlix, Jean-Luc

    2008-02-01

    RNA chaperone proteins are essential partners of RNA in living organisms and viruses. They are thought to assist in the correct folding and structural rearrangements of RNA molecules by resolving misfolded RNA species in an ATP-independent manner. RNA chaperoning is probably an entropy-driven process, mediated by the coupled binding and folding of intrinsically disordered protein regions and the kinetically trapped RNA. Previously, we have shown that the core protein of hepatitis C virus (HCV) is a potent RNA chaperone that can drive profound structural modifications of HCV RNA in vitro. We now examined the RNA chaperone activity and the disordered nature of core proteins from different Flaviviridae genera, namely that of HCV, GBV-B (GB virus B), WNV (West Nile virus) and BVDV (bovine viral diarrhoea virus). Despite low-sequence similarities, all four proteins demonstrated general nucleic acid annealing and RNA chaperone activities. Furthermore, heat resistance of core proteins, as well as far-UV circular dichroism spectroscopy suggested that a well-defined 3D protein structure is not necessary for core-induced RNA structural rearrangements. These data provide evidence that RNA chaperoning-possibly mediated by intrinsically disordered protein segments-is conserved in Flaviviridae core proteins. Thus, besides nucleocapsid formation, core proteins may function in RNA structural rearrangements taking place during virus replication.

  10. RNA chaperoning and intrinsic disorder in the core proteins of Flaviviridae

    PubMed Central

    Ivanyi-Nagy, Roland; Lavergne, Jean-Pierre; Gabus, Caroline; Ficheux, Damien; Darlix, Jean-Luc

    2008-01-01

    RNA chaperone proteins are essential partners of RNA in living organisms and viruses. They are thought to assist in the correct folding and structural rearrangements of RNA molecules by resolving misfolded RNA species in an ATP-independent manner. RNA chaperoning is probably an entropy-driven process, mediated by the coupled binding and folding of intrinsically disordered protein regions and the kinetically trapped RNA. Previously, we have shown that the core protein of hepatitis C virus (HCV) is a potent RNA chaperone that can drive profound structural modifications of HCV RNA in vitro. We now examined the RNA chaperone activity and the disordered nature of core proteins from different Flaviviridae genera, namely that of HCV, GBV-B (GB virus B), WNV (West Nile virus) and BVDV (bovine viral diarrhoea virus). Despite low-sequence similarities, all four proteins demonstrated general nucleic acid annealing and RNA chaperone activities. Furthermore, heat resistance of core proteins, as well as far-UV circular dichroism spectroscopy suggested that a well-defined 3D protein structure is not necessary for core-induced RNA structural rearrangements. These data provide evidence that RNA chaperoning—possibly mediated by intrinsically disordered protein segments—is conserved in Flaviviridae core proteins. Thus, besides nucleocapsid formation, core proteins may function in RNA structural rearrangements taking place during virus replication. PMID:18033802

  11. Expanding the proteome: disordered and alternatively-folded proteins

    PubMed Central

    Dyson, H. Jane

    2011-01-01

    Proteins provide much of the scaffolding for life, as well as undertaking a variety of essential catalytic reactions. These characteristic functions have led us to presuppose that proteins are in general functional only when well-structured and correctly folded. As we begin to explore the repertoire of possible protein sequences inherent in the human and other genomes, two stark facts that belie this supposition become clear: firstly, the number of apparent open reading frames in the human genome is significantly smaller than appears to be necessary to code for all of the diverse proteins in higher organisms, and secondly that a significant proportion of the protein sequences that would be coded by the genome would not be expected to form stable three-dimensional structures. Clearly the genome must include coding for a multitude of alternative forms of proteins, some of which may be partly or fully disordered or incompletely structured in their functional states. At the same time as this likelihood was recognized, experimental studies also began to uncover examples of important protein molecules and domains that were incompletely structured or completely disordered in solution, yet remained perfectly functional. In the ensuing years, we have seen an explosion of experimental and genome-annotation studies that have mapped the extent of the intrinsic disorder phenomenon and explored the possible biological rationales for its widespread occurrence. Answers to the question “why would a particular domain need to be unstructured?” are as varied as the systems where such domains are found. This review provides a survey of recent new directions in this field, and includes an evaluation of the role not only of intrinsically disordered proteins but of partially structured and highly dynamic members of the disorder-order continuum. PMID:21729349

  12. Using Atomic Force Microscopy to Characterize the Conformational Properties of Proteins and Protein-DNA Complexes That Carry Out DNA Repair.

    PubMed

    LeBlanc, Sharonda; Wilkins, Hunter; Li, Zimeng; Kaur, Parminder; Wang, Hong; Erie, Dorothy A

    2017-01-01

    Atomic force microscopy (AFM) is a scanning probe technique that allows visualization of single biomolecules and complexes deposited on a surface with nanometer resolution. AFM is a powerful tool for characterizing protein-protein and protein-DNA interactions. It can be used to capture snapshots of protein-DNA solution dynamics, which in turn, enables the characterization of the conformational properties of transient protein-protein and protein-DNA interactions. With AFM, it is possible to determine the stoichiometries and binding affinities of protein-protein and protein-DNA associations, the specificity of proteins binding to specific sites on DNA, and the conformations of the complexes. We describe methods to prepare and deposit samples, including surface treatments for optimal depositions, and how to quantitatively analyze images. We also discuss a new electrostatic force imaging technique called DREEM, which allows the visualization of the path of DNA within proteins in protein-DNA complexes. Collectively, these methods facilitate the development of comprehensive models of DNA repair and provide a broader understanding of all protein-protein and protein-nucleic acid interactions. The structural details gleaned from analysis of AFM images coupled with biochemistry provide vital information toward establishing the structure-function relationships that govern DNA repair processes. © 2017 Elsevier Inc. All rights reserved.

  13. Thermodynamic prediction of protein neutrality.

    PubMed

    Bloom, Jesse D; Silberg, Jonathan J; Wilke, Claus O; Drummond, D Allan; Adami, Christoph; Arnold, Frances H

    2005-01-18

    We present a simple theory that uses thermodynamic parameters to predict the probability that a protein retains the wild-type structure after one or more random amino acid substitutions. Our theory predicts that for large numbers of substitutions the probability that a protein retains its structure will decline exponentially with the number of substitutions, with the severity of this decline determined by properties of the structure. Our theory also predicts that a protein can gain extra robustness to the first few substitutions by increasing its thermodynamic stability. We validate our theory with simulations on lattice protein models and by showing that it quantitatively predicts previously published experimental measurements on subtilisin and our own measurements on variants of TEM1 beta-lactamase. Our work unifies observations about the clustering of functional proteins in sequence space, and provides a basis for interpreting the response of proteins to substitutions in protein engineering applications.
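
    The exponential decline described above implies that, for large numbers of substitutions, the logarithm of the fraction of proteins retaining their structure is roughly linear in the number of substitutions. The sketch below is an illustration under assumptions, not the authors' method: it assumes the asymptotic form P(m) ≈ c · ν^m, where ν (a per-substitution neutrality) and c (a prefactor absorbing the extra robustness to the first few substitutions) are symbols introduced here, and the function name `fit_exponential_decline` is hypothetical.

```python
import math


def fit_exponential_decline(ms, fractions):
    """Least-squares fit of ln P(m) = ln c + m * ln nu.

    ms        -- numbers of substitutions (at least two distinct values)
    fractions -- fraction of variants retaining structure at each m
                 (each must be > 0 so that the logarithm is defined)
    Returns (nu, c) for the assumed form P(m) ~= c * nu**m.
    """
    if len(ms) != len(fractions) or len(ms) < 2:
        raise ValueError("need at least two (m, fraction) pairs")
    if any(f <= 0 for f in fractions):
        raise ValueError("fractions must be positive")

    ys = [math.log(f) for f in fractions]
    n = len(ms)
    mean_m = sum(ms) / n
    mean_y = sum(ys) / n
    sxx = sum((m - mean_m) ** 2 for m in ms)
    if sxx == 0:
        raise ValueError("ms must contain at least two distinct values")
    sxy = sum((m - mean_m) * (y - mean_y) for m, y in zip(ms, ys))

    slope = sxy / sxx                    # estimate of ln nu
    intercept = mean_y - slope * mean_m  # estimate of ln c
    return math.exp(slope), math.exp(intercept)


if __name__ == "__main__":
    # Synthetic, noise-free values generated from the assumed form itself
    # (not experimental data): nu = 0.6, c = 1.5.
    ms = [4, 5, 6, 7, 8]
    fractions = [1.5 * 0.6 ** m for m in ms]
    nu, c = fit_exponential_decline(ms, fractions)
    print(f"nu = {nu:.3f}, c = {c:.3f}")
```

    Fitting an intercept, rather than forcing the line through P(0) = 1, leaves room for the extra robustness to the first few substitutions that the abstract attributes to increased thermodynamic stability.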

  14. Thermodynamic prediction of protein neutrality

    PubMed Central

    Bloom, Jesse D.; Silberg, Jonathan J.; Wilke, Claus O.; Drummond, D. Allan; Adami, Christoph; Arnold, Frances H.

    2005-01-01

    We present a simple theory that uses thermodynamic parameters to predict the probability that a protein retains the wild-type structure after one or more random amino acid substitutions. Our theory predicts that for large numbers of substitutions the probability that a protein retains its structure will decline exponentially with the number of substitutions, with the severity of this decline determined by properties of the structure. Our theory also predicts that a protein can gain extra robustness to the first few substitutions by increasing its thermodynamic stability. We validate our theory with simulations on lattice protein models and by showing that it quantitatively predicts previously published experimental measurements on subtilisin and our own measurements on variants of TEM1 β-lactamase. Our work unifies observations about the clustering of functional proteins in sequence space, and provides a basis for interpreting the response of proteins to substitutions in protein engineering applications. PMID:15644440

  15. Correlation between protein sequence similarity and x-ray diffraction quality in the protein data bank.

    PubMed

    Lu, Hui-Meng; Yin, Da-Chuan; Ye, Ya-Jing; Luo, Hui-Min; Geng, Li-Qiang; Li, Hai-Sheng; Guo, Wei-Hong; Shang, Peng

    2009-01-01

    As the most widely utilized technique for determining the 3-dimensional structure of protein molecules, X-ray crystallography can provide structures of the highest resolution among the techniques developed so far. The resolution obtained via X-ray crystallography is known to be influenced by many factors, such as crystal quality, diffraction techniques, and X-ray sources. In this paper, we found that the protein sequence could also be one of these factors. We extracted information on the resolution and the sequence of proteins from the Protein Data Bank (PDB), classified the proteins into different clusters according to sequence similarity, and statistically analyzed the relationship between the sequence similarity and the best resolution obtained. The results showed that there was a pronounced correlation between the sequence similarity and the obtained resolution. These results indicate that protein structure itself is one variable that may affect resolution when X-ray crystallography is used.

  16. Characterization of the motion of membrane proteins using high-speed atomic force microscopy

    NASA Astrophysics Data System (ADS)

    Casuso, Ignacio; Khao, Jonathan; Chami, Mohamed; Paul-Gilloteaux, Perrine; Husain, Mohamed; Duneau, Jean-Pierre; Stahlberg, Henning; Sturgis, James N.; Scheuring, Simon

    2012-08-01

    For cells to function properly, membrane proteins must be able to diffuse within biological membranes. The functions of these membrane proteins depend on their position and also on protein-protein and protein-lipid interactions. However, so far, it has not been possible to study simultaneously the structure and dynamics of biological membranes. Here, we show that the motion of unlabelled membrane proteins can be characterized using high-speed atomic force microscopy. We find that the molecules of outer membrane protein F (OmpF) are widely distributed in the membrane as a result of diffusion-limited aggregation, and while the overall protein motion scales roughly with the local density of proteins in the membrane, individual protein molecules can also diffuse freely or become trapped by protein-protein interactions. Using these measurements, and the results of molecular dynamics simulations, we determine an interaction potential map and an interaction pathway for a membrane protein, which should provide new insights into the connection between the structures of individual proteins and the structures and dynamics of supramolecular membranes.

  17. Restricted N-glycan conformational space in the PDB and its implication in glycan structure modeling.

    PubMed

    Jo, Sunhwan; Lee, Hui Sun; Skolnick, Jeffrey; Im, Wonpil

    2013-01-01

    Understanding glycan structure and dynamics is central to understanding protein-carbohydrate recognition and its role in protein-protein interactions. Given the difficulties in obtaining the glycan's crystal structure in glycoconjugates due to its flexibility and heterogeneity, computational modeling could play an important role in providing glycosylated protein structure models. To address if glycan structures available in the PDB can be used as templates or fragments for glycan modeling, we present a survey of the N-glycan structures of 35 different sequences in the PDB. Our statistical analysis shows that the N-glycan structures found on homologous glycoproteins are significantly conserved compared to the random background, suggesting that N-glycan chains can be confidently modeled with template glycan structures whose parent glycoproteins share sequence similarity. On the other hand, N-glycan structures found on non-homologous glycoproteins do not show significant global structural similarity. Nonetheless, the internal substructures of these N-glycans, particularly, the substructures that are closer to the protein, show significantly similar structures, suggesting that such substructures can be used as fragments in glycan modeling. Increased interactions with protein might be responsible for the restricted conformational space of N-glycan chains. Our results suggest that structure prediction/modeling of N-glycans of glycoconjugates using structure database could be effective and different modeling approaches would be needed depending on the availability of template structures.

  18. Restricted N-glycan Conformational Space in the PDB and Its Implication in Glycan Structure Modeling

    PubMed Central

    Jo, Sunhwan; Lee, Hui Sun; Skolnick, Jeffrey; Im, Wonpil

    2013-01-01

    Understanding glycan structure and dynamics is central to understanding protein-carbohydrate recognition and its role in protein-protein interactions. Given the difficulties in obtaining the glycan's crystal structure in glycoconjugates due to its flexibility and heterogeneity, computational modeling could play an important role in providing glycosylated protein structure models. To address if glycan structures available in the PDB can be used as templates or fragments for glycan modeling, we present a survey of the N-glycan structures of 35 different sequences in the PDB. Our statistical analysis shows that the N-glycan structures found on homologous glycoproteins are significantly conserved compared to the random background, suggesting that N-glycan chains can be confidently modeled with template glycan structures whose parent glycoproteins share sequence similarity. On the other hand, N-glycan structures found on non-homologous glycoproteins do not show significant global structural similarity. Nonetheless, the internal substructures of these N-glycans, particularly, the substructures that are closer to the protein, show significantly similar structures, suggesting that such substructures can be used as fragments in glycan modeling. Increased interactions with protein might be responsible for the restricted conformational space of N-glycan chains. Our results suggest that structure prediction/modeling of N-glycans of glycoconjugates using structure database could be effective and different modeling approaches would be needed depending on the availability of template structures. PMID:23516343

  19. SNAPPI-DB: a database and API of Structures, iNterfaces and Alignments for Protein–Protein Interactions

    PubMed Central

    Jefferson, Emily R.; Walsh, Thomas P.; Roberts, Timothy J.; Barton, Geoffrey J.

    2007-01-01

    SNAPPI-DB, a high performance database of Structures, iNterfaces and Alignments of Protein–Protein Interactions, and its associated Java Application Programming Interface (API) is described. SNAPPI-DB contains structural data, down to the level of atom co-ordinates, for each structure in the Protein Data Bank (PDB) together with associated data including SCOP, CATH, Pfam, SWISSPROT, InterPro, GO terms, Protein Quaternary Structures (PQS) and secondary structure information. Domain–domain interactions are stored for multiple domain definitions and are classified by their Superfamily/Family pair and interaction interface. Each set of classified domain–domain interactions has an associated multiple structure alignment for each partner. The API facilitates data access via PDB entries, domains and domain–domain interactions. Rapid development, fast database access and the ability to perform advanced queries without the requirement for complex SQL statements are provided via an object oriented database and the Java Data Objects (JDO) API. SNAPPI-DB contains many features which are not available in other databases of structural protein–protein interactions. It has been applied in three studies on the properties of protein–protein interactions and is currently being employed to train a protein–protein interaction predictor and a functional residue predictor. The database, API and manual are available for download at: . PMID:17202171

  20. Structure prediction of polyglutamine disease proteins: comparison of methods

    PubMed Central

    2014-01-01

    Background: The expansion of polyglutamine (poly-Q) repeats in several unrelated proteins is associated with at least ten neurodegenerative diseases. The length of the poly-Q regions plays an important role in the progression of the diseases. The number of glutamines (Q) is inversely related to the onset age of these polyglutamine diseases, and the expansion of poly-Q repeats has been associated with protein misfolding. However, very little is known about the structural changes induced by the expansion of the repeats. Computational methods can provide an alternative to determine the structure of these poly-Q proteins, but it is important to evaluate their performance before large scale prediction work is done. Results: In this paper, two popular protein structure prediction programs, I-TASSER and Rosetta, have been used to predict the structure of the N-terminal fragment of a protein associated with Huntington's disease with 17 glutamines. Results show that both programs have the ability to find the native structures, but I-TASSER performs better for the overall task. Conclusions: Both I-TASSER and Rosetta can be used for structure prediction of proteins with poly-Q repeats. Knowledge of poly-Q structure may significantly contribute to development of therapeutic strategies for poly-Q diseases. PMID:25080018

  1. POLYVIEW-MM: web-based platform for animation and analysis of molecular simulations

    PubMed Central

    Porollo, Aleksey; Meller, Jaroslaw

    2010-01-01

    Molecular simulations offer important mechanistic and functional clues in studies of proteins and other macromolecules. However, interpreting the results of such simulations increasingly requires tools that can combine information from multiple structural databases and other web resources, and provide highly integrated and versatile analysis tools. Here, we present a new web server that integrates high-quality animation of molecular motion (MM) with structural and functional analysis of macromolecules. The new tool, dubbed POLYVIEW-MM, enables animation of trajectories generated by molecular dynamics and related simulation techniques, as well as visualization of alternative conformers, e.g. obtained as a result of protein structure prediction methods or small molecule docking. To facilitate structural analysis, POLYVIEW-MM combines interactive view and analysis of conformational changes using Jmol and its tailored extensions, publication quality animation using PyMol, and customizable 2D summary plots that provide an overview of MM, e.g. in terms of changes in secondary structure states and relative solvent accessibility of individual residues in proteins. Furthermore, POLYVIEW-MM integrates visualization with various structural annotations, including automated mapping of known interaction sites from structural homologs, mapping of cavities and ligand binding sites, transmembrane regions and protein domains. URL: http://polyview.cchmc.org/conform.html. PMID:20504857

  2. pE-DB: a database of structural ensembles of intrinsically disordered and of unfolded proteins.

    PubMed

    Varadi, Mihaly; Kosol, Simone; Lebrun, Pierre; Valentini, Erica; Blackledge, Martin; Dunker, A Keith; Felli, Isabella C; Forman-Kay, Julie D; Kriwacki, Richard W; Pierattelli, Roberta; Sussman, Joel; Svergun, Dmitri I; Uversky, Vladimir N; Vendruscolo, Michele; Wishart, David; Wright, Peter E; Tompa, Peter

    2014-01-01

    The goal of pE-DB (http://pedb.vib.be) is to serve as an openly accessible database for the deposition of structural ensembles of intrinsically disordered proteins (IDPs) and of denatured proteins based on nuclear magnetic resonance spectroscopy, small-angle X-ray scattering and other data measured in solution. Owing to the inherent flexibility of IDPs, solution techniques are particularly appropriate for characterizing their biophysical properties, and structural ensembles in agreement with these data provide a convenient tool for describing the underlying conformational sampling. Database entries consist of (i) primary experimental data with descriptions of the acquisition methods and algorithms used for the ensemble calculations, and (ii) the structural ensembles consistent with these data, provided as a set of models in a Protein Data Bank format. pE-DB is open for submissions from the community, and is intended as a forum for disseminating the structural ensembles and the methodologies used to generate them. While the need to represent the IDP structures is clear, methods for determining and evaluating the structural ensembles are still evolving. The availability of the pE-DB database is expected to promote the development of new modeling methods and lead to a better understanding of how function arises from disordered states.

  3. Alteration of fluorescent protein spectroscopic properties upon cryoprotection.

    PubMed

    von Stetten, David; Batot, Gaëlle O; Noirclerc-Savoye, Marjolaine; Royant, Antoine

    2012-11-01

    Cryoprotection of a protein crystal by addition of small-molecule compounds may sometimes affect the structure of its active site. The spectroscopic and structural effects of the two cryoprotectants glycerol and ethylene glycol on the cyan fluorescent protein Cerulean were investigated. While glycerol had almost no noticeable effect, ethylene glycol was shown to induce a systematic red shift of the UV-vis absorption and fluorescence emission spectra. Additionally, ethylene glycol molecules were shown to enter the core of the protein, with one of them binding in close vicinity to the chromophore, which provides a sound explanation for the observed spectroscopic changes. These results highlight the need to systematically record spectroscopic data on crystals of light-absorbing proteins and reinforce the notion that fluorescent proteins must not be seen as rigid structures.

  4. Mechanistic aspects of protein corona formation: insulin adsorption onto gold nanoparticle surfaces

    NASA Astrophysics Data System (ADS)

    Grass, Stefan; Treuel, Lennart

    2014-02-01

    In biological fluids, an adsorption layer of proteins, a "protein corona", forms around nanoparticles (NPs), largely determining their biological identity. In many interactions with NPs, proteins can undergo structural changes. Here, we study the adsorption of insulin onto gold NPs (mean hydrodynamic particle diameter 80 ± 18 nm), focusing on the structural consequences of the adsorption process for the protein. We use surface enhanced Raman scattering (SERS) spectroscopy to study changes in the protein's secondary structure as well as the impact on integrity and conformations of disulfide bonds immediately on the NP surface. A detailed comparison to SERS spectra of cysteine and cystine provides first mechanistic insights into the causes for these conformational changes. Potential biological and toxicological implications of these findings are also discussed.

  5. The Significance of G Protein-Coupled Receptor Crystallography for Drug Discovery

    PubMed Central

    Salon, John A.; Lodowski, David T.

    2011-01-01

    Crucial as molecular sensors for many vital physiological processes, seven-transmembrane domain G protein-coupled receptors (GPCRs) comprise the largest family of proteins targeted by drug discovery. Together with structures of the prototypical GPCR rhodopsin, solved structures of other liganded GPCRs promise to provide insights into the structural basis of the superfamily's biochemical functions and assist in the development of new therapeutic modalities and drugs. One of the greatest technical and theoretical challenges to elucidating and exploiting structure-function relationships in these systems is the emerging concept of GPCR conformational flexibility and its cause-effect relationship for receptor-receptor and receptor-effector interactions. Such conformational changes can be subtle and triggered by relatively small binding energy effects, leading to full or partial efficacy in the activation or inactivation of the receptor system at large. Pharmacological dogma generally dictates that these changes manifest themselves through kinetic modulation of the receptor's G protein partners. Atomic resolution information derived from increasingly available receptor structures provides an entrée to the understanding of these events and practically applying it to drug design. Supported by structure-activity relationship information arising from empirical screening, a unified structural model of GPCR activation/inactivation promises to both accelerate drug discovery in this field and improve our fundamental understanding of structure-based drug design in general. This review discusses fundamental problems that persist in drug design and GPCR structural determination. PMID:21969326

  6. Protein structure determination by exhaustive search of Protein Data Bank derived databases.

    PubMed

    Stokes-Rees, Ian; Sliz, Piotr

    2010-12-14

    Parallel sequence and structure alignment tools have become ubiquitous and invaluable at all levels in the study of biological systems. We demonstrate the application and utility of this same parallel search paradigm to the process of protein structure determination, benefitting from the large and growing corpus of known structures. Such searches were previously computationally intractable. Through the method of Wide Search Molecular Replacement, developed here, they can be completed in a few hours with the aid of national-scale federated cyberinfrastructure. By dramatically expanding the range of models considered for structure determination, we show that small (less than 12% structural coverage) and low sequence identity (less than 20% identity) template structures can be identified through multidimensional template scoring metrics and used for structure determination. Many new macromolecular complexes can benefit significantly from such a technique due to the lack of known homologous protein folds or sequences. We demonstrate the effectiveness of the method by determining the structure of a full-length p97 homologue from Trichoplusia ni. Example cases with the MHC/T-cell receptor complex and the EmoB protein provide systematic estimates of minimum sequence identity, structure coverage, and structural similarity required for this method to succeed. We describe how this structure-search approach and other novel computationally intensive workflows are made tractable through integration with the US national computational cyberinfrastructure, allowing, for example, rapid processing of the entire Structural Classification of Proteins protein fragment database.

  7. Towards control of aggregational behaviour of alpha-lactalbumin at acidic pH.

    PubMed

    Pedersen, Jane B; Fojan, Peter; Sorensen, John; Petersen, Steffen B

    2006-07-01

    alpha-Lactalbumin (alpha-La) undergoes considerable structural changes upon loss of bound Ca2+ at acidic pH, leaving alpha-La in a molten globule structure. Using fluorescence, the present work provides more insight into the structural transition of alpha-La at acidic pH leading to protein aggregation, most likely caused by a combination of hydrophobic and electrostatic interactions. The rate of aggregation is determined by the protein concentration and temperature applied. Availability of Ca2+ stabilises the protein, and thus prevents aggregation at pH values as low as pH 2.9. In contrast, the presence of Cu2+ induces a destabilisation of the protein, which can be explained by binding to the Zn2+ binding site in alpha-La, possibly resulting in structural alterations of the protein. In general, the presence of anions destabilizes alpha-La at pH values below pI, with SO4(2-) exhibiting the strongest effect on the protein stability, thus correlating well with the Hofmeister series. At more acidic pH values far from pI, alpha-La becomes more stable towards ion-induced aggregation, since higher ion activity is required to efficiently screen the charges on the protein surface. The results presented in this paper provide detailed knowledge on the external parameters leading to aggregation of alpha-La at acidic pH, thus permitting rational design of the aggregation process.

  8. A method for partitioning the information contained in a protein sequence between its structure and function.

    PubMed

    Possenti, Andrea; Vendruscolo, Michele; Camilloni, Carlo; Tiana, Guido

    2018-05-23

    Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility to encode for such functions is controlled by the balance between the amount of information supplied by the sequence and that left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding for its structure (the 'information gap') is very close to what is needed to encode for its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially-designed protein sequences. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.

  9. Visualizing water molecules in transmembrane proteins using radiolytic labeling methods

    PubMed Central

    Orban, Tivadar; Gupta, Sayan; Palczewski, Krzysztof; Chance, Mark R.

    2010-01-01

    Essential to cells and their organelles, water is both shuttled to where it is needed and trapped within cellular compartments and structures. Moreover, ordered waters within protein structures often co-localize with strategically placed polar or charged groups critical for protein function. Yet it is unclear if these ordered water molecules provide structural stabilization, mediate conformational changes in signaling, neutralize charged residues, or carry out a combination of all these functions. Structures of many integral membrane proteins, including G protein-coupled receptors (GPCRs), reveal the presence of ordered water molecules that may act like prosthetic groups in a manner quite unlike bulk water. Identification of ‘ordered’ waters within a crystalline protein structure requires sufficient occupancy of water to enable its detection in the protein's X-ray diffraction pattern and thus the observed waters likely represent a subset of tightly-bound functional waters. In this review, we highlight recent studies that suggest the structures of ordered waters within GPCRs are as conserved (and thus as important) as conserved side chains. In addition, methods of radiolysis, coupled to structural mass spectrometry (protein footprinting), reveal dynamic changes in water structure that mediate transmembrane signaling. The idea of water as a prosthetic group mediating chemical reaction dynamics is not new in fields such as catalysis. However, the concept of water as a mediator of conformational dynamics in signaling is just emerging, owing to advances in both crystallographic structure determination and new methods of protein footprinting. Although oil and water do not mix, understanding the roles of water is essential to understanding the function of membrane proteins. PMID:20047303

  10. 3D Complex: A Structural Classification of Protein Complexes

    PubMed Central

    Levy, Emmanuel D; Pereira-Leal, Jose B; Chothia, Cyrus; Teichmann, Sarah A

    2006-01-01

    Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physicochemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here we propose the first hierarchical classification of whole protein complexes of known 3-D structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows nonredundant sets to be derived at different levels of detail. This reveals that between one-half and two-thirds of known structures are multimeric, depending on the level of redundancy accepted. We also analyse the structures in terms of the topological arrangement of their subunits and find that they form a small number of arrangements compared with all theoretically possible ones. This is because most complexes contain four subunits or less, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, we identified many possible errors in quaternary structure assignments. Our classification, available as a database and Web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes. PMID:17112313
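
The graph representation described in this record can be pictured by treating each subunit as a node and each subunit-subunit interface as an edge, then asking whether two complexes share the same arrangement. The Python sketch below is illustrative only and is not the 3D Complex implementation: the function name `same_topology` and the toy tetramers are hypothetical, and the brute-force relabeling check is practical only because most complexes are small.

```python
from itertools import permutations


def same_topology(n_subunits, interfaces_a, interfaces_b):
    """Return True if two complexes have the same subunit contact graph.

    Subunits are numbered 0..n_subunits-1 and each interface is a pair
    (u, v) of distinct subunits. Hypothetical helper: it tries every
    relabeling of subunits, so it is only suitable for small complexes.
    """
    edges_b = {frozenset(pair) for pair in interfaces_b}
    # Graphs with different numbers of interfaces cannot match.
    if len({frozenset(pair) for pair in interfaces_a}) != len(edges_b):
        return False
    for perm in permutations(range(n_subunits)):
        relabeled = {frozenset((perm[u], perm[v])) for u, v in interfaces_a}
        if relabeled == edges_b:
            return True
    return False


# Toy tetramers (assumed examples, not taken from the database).
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
ring_relabeled = [(0, 2), (2, 1), (1, 3), (3, 0)]
triangle_with_tail = [(0, 1), (1, 2), (2, 0), (2, 3)]

print(same_topology(4, ring, ring_relabeled))
print(same_topology(4, ring, triangle_with_tail))
```

The two ring tetramers compare as equal despite different chain labels, while the triangle-with-tail arrangement does not match the ring. A real classification would also need node labels (for example, which chains are identical) and symmetry information, which this sketch leaves out.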

  11. Struct2Net: a web service to predict protein–protein interactions using a structure-based approach

    PubMed Central

    Singh, Rohit; Park, Daniel; Xu, Jinbo; Hosur, Raghavendra; Berger, Bonnie

    2010-01-01

    Struct2Net is a web server for predicting interactions between arbitrary protein pairs using a structure-based approach. Prediction of protein–protein interactions (PPIs) is a central area of interest and successful prediction would provide leads for experiments and drug design; however, the experimental coverage of the PPI interactome remains inadequate. We believe that Struct2Net is the first community-wide resource to provide structure-based PPI predictions that go beyond homology modeling. Also, most web-resources for predicting PPIs currently rely on functional genomic data (e.g. GO annotation, gene expression, cellular localization, etc.). Our structure-based approach is independent of such methods and only requires the sequence information of the proteins being queried. The web service allows multiple querying options, aimed at maximizing flexibility. For the most commonly studied organisms (fly, human and yeast), predictions have been pre-computed and can be retrieved almost instantaneously. For proteins from other species, users have the option of getting a quick-but-approximate result (using orthology over pre-computed results) or having a full-blown computation performed. The web service is freely available at http://struct2net.csail.mit.edu. PMID:20513650

  12. A Circular Dichroism Reference Database for Membrane Proteins

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wallace, B.; Wien, F.; Stone, T.

    2006-01-01

    Membrane proteins are a major product of most genomes and the target of a large number of current pharmaceuticals, yet little information exists on their structures because of the difficulty of crystallising them; hence for the most part they have been excluded from structural genomics programme targets. Furthermore, even methods such as circular dichroism (CD) spectroscopy which seek to define secondary structure have not been fully exploited because of technical limitations to their interpretation for membrane embedded proteins. Empirical analyses of circular dichroism (CD) spectra are valuable for providing information on secondary structures of proteins. However, the accuracy of the results depends on the appropriateness of the reference databases used in the analyses. Membrane proteins have different spectral characteristics than do soluble proteins as a result of the low dielectric constants of membrane bilayers relative to those of aqueous solutions (Chen & Wallace (1997) Biophys. Chem. 65:65-74). To date, no CD reference database exists exclusively for the analysis of membrane proteins, and hence empirical analyses based on current reference databases derived from soluble proteins are not adequate for accurate analyses of membrane protein secondary structures (Wallace et al (2003) Prot. Sci. 12:875-884). We have therefore created a new reference database of CD spectra of integral membrane proteins whose crystal structures have been determined. To date it contains more than 20 proteins, and spans the range of secondary structures from mostly helical to mostly sheet proteins. This reference database should enable more accurate secondary structure determinations of membrane embedded proteins and will become one of the reference database options in the CD calculation server DICHROWEB (Whitmore & Wallace (2004) NAR 32:W668-673).

  13. Functional classification of protein structures by local structure matching in graph representation.

    PubMed

    Mills, Caitlyn L; Garg, Rohan; Lee, Joslynn S; Tian, Liang; Suciu, Alexandru; Cooperman, Gene; Beuning, Penny J; Ondrechen, Mary Jo

    2018-03-31

    As a result of high-throughput protein structure initiatives, over 14,400 protein structures have been solved by structural genomics (SG) centers and participating research groups. While the totality of SG data represents a tremendous contribution to genomics and structural biology, reliable functional information for these proteins is generally lacking. Better functional predictions for SG proteins will add substantial value to the structural information already obtained. Our method described herein, Graph Representation of Active Sites for Prediction of Function (GRASP-Func), predicts quickly and accurately the biochemical function of proteins by representing residues at the predicted local active site as graphs rather than in Cartesian coordinates. We compare the GRASP-Func method to our previously reported method, structurally aligned local sites of activity (SALSA), using the ribulose phosphate binding barrel (RPBB), 6-hairpin glycosidase (6-HG), and Concanavalin A-like Lectins/Glucanase (CAL/G) superfamilies as test cases. In each of the superfamilies, SALSA and the much faster method GRASP-Func yield similar correct classification of previously characterized proteins, providing a validated benchmark for the new method. In addition, we analyzed SG proteins using our SALSA and GRASP-Func methods to predict function. Forty-one SG proteins in the RPBB superfamily, nine SG proteins in the 6-HG superfamily, and one SG protein in the CAL/G superfamily were successfully classified into one of the functional families in their respective superfamily by both methods. This improved, faster, validated computational method can yield more reliable predictions of function that can be used for a wide variety of applications by the community. © 2018 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  14. iATTRACT: simultaneous global and local interface optimization for protein-protein docking refinement.

    PubMed

    Schindler, Christina E M; de Vries, Sjoerd J; Zacharias, Martin

    2015-02-01

    Protein-protein interactions are abundant in the cell but to date structural data for a large number of complexes is lacking. Computational docking methods can complement experiments by providing structural models of complexes based on structures of the individual partners. A major caveat for docking success is accounting for protein flexibility. Especially, interface residues undergo significant conformational changes upon binding. This limits the performance of docking methods that keep partner structures rigid or allow limited flexibility. A new docking refinement approach, iATTRACT, has been developed which combines simultaneous full interface flexibility and rigid body optimizations during docking energy minimization. It employs an atomistic molecular mechanics force field for intermolecular interface interactions and a structure-based force field for intramolecular contributions. The approach was systematically evaluated on a large protein-protein docking benchmark, starting from an enriched decoy set of rigidly docked protein-protein complexes deviating by up to 15 Å from the native structure at the interface. Large improvements in sampling and slight but significant improvements in scoring/discrimination of near native docking solutions were observed. Complexes with initial deviations at the interface of up to 5.5 Å were refined to significantly better agreement with the native structure. Improvements in the fraction of native contacts were especially favorable, yielding increases of up to 70%. © 2014 Wiley Periodicals, Inc.
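
The "fraction of native contacts" reported in this record can be illustrated with a short Python sketch. The abstract does not state how a contact is defined, so everything specific here is an assumption: one representative point per residue, a 5.0 Å distance cutoff, and the hypothetical function names `interface_contacts` and `fraction_native_contacts`.

```python
import math


def interface_contacts(receptor, ligand, cutoff=5.0):
    """Return the set of (i, j) residue index pairs within `cutoff`.

    `receptor` and `ligand` are lists of (x, y, z) points, one per
    residue. The one-point-per-residue model and the 5.0 cutoff are
    assumptions made for this sketch.
    """
    return {
        (i, j)
        for i, a in enumerate(receptor)
        for j, b in enumerate(ligand)
        if math.dist(a, b) <= cutoff
    }


def fraction_native_contacts(native, model):
    """Fraction of native contacts that are reproduced in the model."""
    if not native:
        raise ValueError("native complex has no interface contacts")
    return len(native & model) / len(native)


# Toy coordinates (made up for illustration only).
receptor = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand_native = [(3.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
ligand_docked = [(3.0, 0.0, 0.0), (20.0, 0.0, 0.0)]

native = interface_contacts(receptor, ligand_native)
docked = interface_contacts(receptor, ligand_docked)
print(fraction_native_contacts(native, docked))
```

In this toy case the docked pose keeps one of the two native contacts, so the fraction is 0.5. A refinement step that moves the second ligand residue back toward the receptor would raise the value toward 1.0.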

  15. GIRAF: a method for fast search and flexible alignment of ligand binding interfaces in proteins at atomic resolution

    PubMed Central

    Kinjo, Akira R.; Nakamura, Haruki

    2012-01-01

    Comparison and classification of protein structures are fundamental means to understand protein functions. Due to the computational difficulty and the ever-increasing amount of structural data, however, it is in general not feasible to perform exhaustive all-against-all structure comparisons necessary for comprehensive classifications. To efficiently handle such situations, we have previously proposed a method, now called GIRAF. We herein describe further improvements in the GIRAF protein structure search and alignment method. The GIRAF method achieves extremely efficient search of similar structures of ligand binding sites of proteins by exploiting database indexing of structural features of local coordinate frames. In addition, it produces refined atom-wise alignments by iterative applications of the Hungarian method to the bipartite graph defined for a pair of superimposed structures. By combining the refined alignments based on different local coordinate frames, it is made possible to align structures involving domain movements. We provide detailed accounts for the database design, the search and alignment algorithms as well as some benchmark results. PMID:27493524
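
The atom-wise alignment step in this record amounts to an assignment problem on a bipartite graph: each atom of one superimposed structure is matched to one atom of the other so that the total cost is minimal. The Python sketch below shows that objective on toy data. It is not GIRAF's code: it uses exhaustive search over permutations in place of the Hungarian method (which reaches the same optimum in polynomial time), it assumes Euclidean distance as the edge weight, and the name `best_atom_assignment` is hypothetical.

```python
import math
from itertools import permutations


def best_atom_assignment(atoms_a, atoms_b):
    """Match each atom in atoms_a to a distinct atom in atoms_b.

    Returns (mapping, cost), where mapping[i] is the index in atoms_b
    assigned to atom i of atoms_a and cost is the summed distance.
    Exhaustive search stands in for the Hungarian method here, so this
    is only usable for a handful of atoms.
    """
    if len(atoms_a) != len(atoms_b):
        raise ValueError("this sketch assumes equally sized atom sets")
    best_cost = math.inf
    best_perm = None
    for perm in permutations(range(len(atoms_b))):
        cost = sum(math.dist(atoms_a[i], atoms_b[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(best_perm), best_cost


# Toy superimposed binding-site atoms (made up for illustration).
site_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
site_b = [(0.1, 2.0, 0.0), (0.0, 0.1, 0.0), (1.0, 0.0, 0.1)]

mapping, cost = best_atom_assignment(site_a, site_b)
print(mapping, round(cost, 3))
```

Here the atoms of `site_b` are a shuffled, slightly displaced copy of `site_a`, and the optimal assignment recovers the shuffle. Iterating superposition and reassignment, as the abstract describes, would refine the alignment further.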

  16. @TOME-2: a new pipeline for comparative modeling of protein-ligand complexes.

    PubMed

    Pons, Jean-Luc; Labesse, Gilles

    2009-07-01

    @TOME 2.0 is a new web pipeline dedicated to protein structure modeling and small ligand docking based on comparative analyses. @TOME 2.0 allows fold recognition, template selection, structural alignment editing, structure comparisons, 3D-model building and evaluation. These tasks are routinely used in sequence analyses for structure prediction. In our pipeline the necessary software is efficiently interconnected in an original manner to accelerate all the processes. Furthermore, we have also connected comparative docking of small ligands that is performed using protein-protein superposition. The input is a simple protein sequence in one-letter code with no comment. The resulting 3D model, protein-ligand complexes and structural alignments can be visualized through dedicated Web interfaces or can be downloaded for further studies. These original features will aid in the functional annotation of proteins and the selection of templates for molecular modeling and virtual screening. Several examples are described to highlight some of the new functionalities provided by this pipeline. The server and its documentation are freely available at http://abcis.cbs.cnrs.fr/AT2/

  17. Tracing Primordial Protein Evolution through Structurally Guided Stepwise Segment Elongation

    PubMed Central

    Watanabe, Hideki; Yamasaki, Kazuhiko; Honda, Shinya

    2014-01-01

    The understanding of how primordial proteins emerged has been a fundamental and longstanding issue in biology and biochemistry. For a better understanding of primordial protein evolution, we synthesized an artificial protein on the basis of an evolutionary hypothesis, segment-based elongation starting from an autonomously foldable short peptide. A 10-residue protein, chignolin, the smallest foldable polypeptide ever reported, was used as a structural support to facilitate higher structural organization and gain-of-function in the development of an artificial protein. Repetitive cycles of segment elongation and subsequent phage display selection successfully produced a 25-residue protein, termed AF.2A1, with nanomolar affinity against the Fc region of immunoglobulin G. AF.2A1 shows exquisite molecular recognition ability such that it can distinguish conformational differences of the same molecule. The structure determined by NMR measurements demonstrated that AF.2A1 forms a globular protein-like conformation with the chignolin-derived β-hairpin and a tryptophan-mediated hydrophobic core. Using sequence analysis and a mutation study, we discovered that the structural organization and gain-of-function emerged from the vicinity of the chignolin segment, revealing that the structural support served as the core in both structural and functional development. Here, we propose an evolutionary model for primordial proteins in which a foldable segment serves as the evolving core to facilitate structural and functional evolution. This study provides insights into primordial protein evolution and also presents a novel methodology for designing small sized proteins useful for industrial and pharmaceutical applications. PMID:24356963

  18. Permeating disciplines: Overcoming barriers between molecular simulations and classical structure-function approaches in biological ion transport.

    PubMed

    Howard, Rebecca J; Carnevale, Vincenzo; Delemotte, Lucie; Hellmich, Ute A; Rothberg, Brad S

    2018-04-01

    Ion translocation across biological barriers is a fundamental requirement for life. In many cases, controlling this process, for example with neuroactive drugs, demands an understanding of rapid and reversible structural changes in membrane-embedded proteins, including ion channels and transporters. Classical approaches to electrophysiology and structural biology have provided valuable insights into several such proteins over macroscopic, often discontinuous scales of space and time. Integrating these observations into meaningful mechanistic models now relies increasingly on computational methods, particularly molecular dynamics simulations, while surfacing important challenges in data management and conceptual alignment. Here, we seek to provide contemporary context, concrete examples, and a look to the future for bridging disciplinary gaps in biological ion transport. This article is part of a Special Issue entitled: Beyond the Structure-Function Horizon of Membrane Proteins edited by Ute Hellmich, Rupak Doshi and Benjamin McIlwain. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. ZifBASE: a database of zinc finger proteins and associated resources.

    PubMed

    Jayakanthan, Mannu; Muthukumaran, Jayaraman; Chandrasekar, Sanniyasi; Chawla, Konika; Punetha, Ankita; Sundar, Durai

    2009-09-09

    Information on the occurrence of zinc finger protein motifs in genomes is crucial to the developing field of molecular genome engineering. Knowledge of their target DNA-binding sequences is vital for developing chimeric proteins for targeted genome engineering and site-specific gene correction. There is a need for a computational resource of zinc finger proteins (ZFP) to identify potential binding sites and their locations, which would reduce the time needed for in vivo tasks and overcome the difficulties in selecting the specific type of zinc finger protein and the target site in the DNA sequence. ZifBASE provides an extensive collection of various natural and engineered ZFP. It uses standard names and a genetic and structural classification scheme to present data retrieved from UniProtKB, GenBank, Protein Data Bank, ModBase, Protein Model Portal and the literature. It also incorporates specialized features of ZFP including finger sequences and positions, number of fingers, physicochemical properties, classes, framework, PubMed citations with links to experimental structures (PDB, if available) and modeled structures of natural zinc finger proteins. ZifBASE provides information on zinc finger proteins (both natural and engineered ones), the number of finger units in each of the zinc finger proteins (with multiple fingers), the synergy between the adjacent fingers and their positions. Additionally, it gives the individual finger sequences and the target DNA sites to which they bind, for a better and clearer understanding of the interactions of adjacent fingers. The current version of ZifBASE contains 139 entries, of which 89 are engineered ZFPs containing 3-7F, totaling 296 fingers. There are 50 natural zinc finger protein entries ranging from 2-13F, totaling 307 fingers. It has sequences and structures from the literature, Protein Data Bank, ModBase and Protein Model Portal. The interface is cross-linked to other public databases such as UniProtKB, PDB, ModBase, Protein Model Portal and PubMed to make it more informative. A database is established to maintain the information on the sequence features, including the class, framework, number of fingers, residues, position, recognition site and physicochemical properties (molecular weight, isoelectric point) of both natural and engineered zinc finger proteins, and the dissociation constants of a few. ZifBASE can provide a more effective and efficient way of accessing the zinc finger protein sequences and their target binding sites, with links to their three-dimensional structures. All the data and functions are available at the advanced web-based search interface http://web.iitd.ac.in/~sundar/zifbase.

  20. Protein nanocrystallography: growth mechanism and atomic structure of crystals induced by nanotemplates.

    PubMed

    Pechkova, E; Vasile, F; Spera, R; Fiordoro, S; Nicolini, C

    2005-11-01

    Protein nanocrystallography, a new technology for crystal growth based on protein nanotemplates, has recently been shown to produce diffracting, stable and radiation-resistant lysozyme crystals. This article, by computing these lysozyme crystals' atomic structures, obtained by the diffraction patterns of microfocused synchrotron radiation, provides a possible mechanism for this increased stability, namely a significant decrease in water content accompanied by a minor but significant alpha-helix increase. These data are shown to be compatible with the circular dichroism and two-dimensional Fourier transform spectra of high-resolution H NMR of proteins dissolved from the same nanotemplate-based crystal versus those from a classical crystal. Finally, evidence for protein direct transfer from the nanotemplate to the drop and the participation of the template proteins in crystal nucleation and growth is provided by high-resolution NMR spectrometry and mass spectrometry. Furthermore, the lysozyme nanotemplate appears stable up to 523 K, as confirmed by a thermal denaturation study using spectropolarimetry. The overall data suggest that heat-proof lysozyme presence in the crystal provides a possible explanation of the crystal's resistance to synchrotron radiation.

  1. Reaction trajectory revealed by a joint analysis of protein data bank.

    PubMed

    Ren, Zhong

    2013-01-01

    Structural motions along a reaction pathway hold the secret about how a biological macromolecule functions. If each static structure were considered as a snapshot of the protein molecule in action, a large collection of structures would constitute a multidimensional conformational space of an enormous size. Here I present a joint analysis of hundreds of known structures of human hemoglobin in the Protein Data Bank. By applying singular value decomposition to distance matrices of these structures, I demonstrate that this large collection of structural snapshots, derived under a wide range of experimental conditions, arrange orderly along a reaction pathway. The structural motions along this extensive trajectory, including several helical transformations, arrive at a reverse engineered mechanism of the cooperative machinery (Ren, companion article), and shed light on pathological properties of the abnormal homotetrameric hemoglobins from α-thalassemia. This method of meta-analysis provides a general approach to structural dynamics based on static protein structures in this post genomics era.
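
    The idea of applying singular value decomposition to distance matrices can be illustrated with a small sketch. It is not the author's pipeline: the three-atom toy "structures", the function names and the use of power iteration (a dependency-free stand-in for a full SVD routine) are all assumptions made for illustration. Each structure is reduced to a vector of pairwise distances, the vectors are centered, and the leading left singular vector gives one score per structure.

```python
import math

def distance_vector(coords):
    """Flatten the upper triangle of the pairwise distance matrix."""
    n = len(coords)
    return [math.dist(coords[i], coords[j])
            for i in range(n) for j in range(i + 1, n)]

def leading_component(rows, iterations=200):
    """Leading left singular vector and singular value of the centered rows.

    Power iteration on the Gram matrix is used here as a stand-in for a
    full SVD; a numerical library would normally do this step.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(m)]
    centered = [[r[k] - means[k] for k in range(m)] for r in rows]
    gram = [[sum(a * b for a, b in zip(ri, rj)) for rj in centered]
            for ri in centered]
    u = [float(i + 1) for i in range(n)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        v = [sum(g * x for g, x in zip(row, u)) for row in gram]
        norm = math.sqrt(sum(x * x for x in v))
        u = [x / norm for x in v]
    eigenvalue = sum(ui * sum(g * x for g, x in zip(row, u))
                     for ui, row in zip(u, gram))
    return u, math.sqrt(eigenvalue)

def toy_structure(t):
    """Hypothetical 3-atom 'structure'; the third atom moves with t."""
    return [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, float(t), 0.0)]

# Snapshots supplied in shuffled order, as if collected from unrelated experiments.
shuffled_t = [2, 0, 4, 1, 3]
rows = [distance_vector(toy_structure(t)) for t in shuffled_t]
scores, singular_value = leading_component(rows)

# Sorting by the leading score recovers the order along the toy pathway
# (ascending or descending, since the sign of a singular vector is arbitrary).
ranking = [t for _, t in sorted(zip(scores, shuffled_t))]
print(ranking)
```

    In this toy case a single coordinate drives every distance, so one component is enough; real collections of structures would need several components and a proper SVD implementation.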

  2. Reaction Trajectory Revealed by a Joint Analysis of Protein Data Bank

    PubMed Central

    Ren, Zhong

    2013-01-01

    Structural motions along a reaction pathway hold the secret about how a biological macromolecule functions. If each static structure were considered as a snapshot of the protein molecule in action, a large collection of structures would constitute a multidimensional conformational space of an enormous size. Here I present a joint analysis of hundreds of known structures of human hemoglobin in the Protein Data Bank. By applying singular value decomposition to distance matrices of these structures, I demonstrate that this large collection of structural snapshots, derived under a wide range of experimental conditions, arrange orderly along a reaction pathway. The structural motions along this extensive trajectory, including several helical transformations, arrive at a reverse engineered mechanism of the cooperative machinery (Ren, companion article), and shed light on pathological properties of the abnormal homotetrameric hemoglobins from α-thalassemia. This method of meta-analysis provides a general approach to structural dynamics based on static protein structures in this post genomics era. PMID:24244274

  3. Crystal Structure of the Human, FIC-Domain Containing Protein HYPE and Implications for Its Functions

    PubMed Central

    Bunney, Tom D.; Cole, Ambrose R.; Broncel, Malgorzata; Esposito, Diego; Tate, Edward W.; Katan, Matilda

    2014-01-01

    Summary Protein AMPylation, the transfer of AMP from ATP to protein targets, has been recognized as a new mechanism of host-cell disruption by some bacterial effectors that typically contain a FIC-domain. Eukaryotic genomes also encode one FIC-domain protein, HYPE, which has remained poorly characterized. Here we describe the structure of human HYPE, solved by X-ray crystallography, representing the first structure of a eukaryotic FIC-domain protein. We demonstrate that HYPE forms stable dimers with structurally and functionally integrated FIC-domains and with TPR-motifs exposed for protein-protein interactions. As HYPE also uniquely possesses a transmembrane helix, dimerization is likely to affect its positioning and function in the membrane vicinity. The low rate of autoAMPylation of the wild-type HYPE could be due to autoinhibition, consistent with the mechanism proposed for a number of putative FIC AMPylators. Our findings also provide a basis to further consider possible alternative cofactors of HYPE and distinct modes of target-recognition. PMID:25435325

  4. Crystal structure of the human, FIC-domain containing protein HYPE and implications for its functions.

    PubMed

    Bunney, Tom D; Cole, Ambrose R; Broncel, Malgorzata; Esposito, Diego; Tate, Edward W; Katan, Matilda

    2014-12-02

    Protein AMPylation, the transfer of AMP from ATP to protein targets, has been recognized as a new mechanism of host-cell disruption by some bacterial effectors that typically contain a FIC-domain. Eukaryotic genomes also encode one FIC-domain protein, HYPE, which has remained poorly characterized. Here we describe the structure of human HYPE, solved by X-ray crystallography, representing the first structure of a eukaryotic FIC-domain protein. We demonstrate that HYPE forms stable dimers with structurally and functionally integrated FIC-domains and with TPR-motifs exposed for protein-protein interactions. As HYPE also uniquely possesses a transmembrane helix, dimerization is likely to affect its positioning and function in the membrane vicinity. The low rate of autoAMPylation of the wild-type HYPE could be due to autoinhibition, consistent with the mechanism proposed for a number of putative FIC AMPylators. Our findings also provide a basis to further consider possible alternative cofactors of HYPE and distinct modes of target-recognition.

  5. Accurate prediction of interfacial residues in two-domain proteins using evolutionary information: implications for three-dimensional modeling.

    PubMed

    Bhaskara, Ramachandra M; Padhi, Amrita; Srinivasan, Narayanaswamy

    2014-07-01

    With the preponderance of multidomain proteins in eukaryotic genomes, it is essential to recognize the constituent domains and their functions. Often function involves communications across the domain interfaces, and the knowledge of the interacting sites is essential to our understanding of the structure-function relationship. Using evolutionary information extracted from homologous domains in at least two diverse domain architectures (single and multidomain), we predict the interface residues corresponding to domains from the two-domain proteins. We also use information from the three-dimensional structures of individual domains of two-domain proteins to train naïve Bayes classifier model to predict the interfacial residues. Our predictions are highly accurate (∼85%) and specific (∼95%) to the domain-domain interfaces. This method is specific to multidomain proteins which contain domains in at least more than one protein architectural context. Using predicted residues to constrain domain-domain interaction, rigid-body docking was able to provide us with accurate full-length protein structures with correct orientation of domains. We believe that these results can be of considerable interest toward rational protein and interaction design, apart from providing us with valuable information on the nature of interactions. © 2013 Wiley Periodicals, Inc.

  6. Rotavirus architecture at subnanometer resolution.

    PubMed

    Li, Zongli; Baker, Matthew L; Jiang, Wen; Estes, Mary K; Prasad, B V Venkataram

    2009-02-01

    Rotavirus, a nonturreted member of the Reoviridae, is the causative agent of severe infantile diarrhea. The double-stranded RNA genome encodes six structural proteins that make up the triple-layer particle. X-ray crystallography has elucidated the structure of one of these capsid proteins, VP6, and two domains from VP4, the spike protein. Complementing this work, electron cryomicroscopy (cryoEM) has provided relatively low-resolution structures for the triple-layer capsid in several biochemical states. However, a complete, high-resolution structural model of rotavirus remains unresolved. Combining new structural analysis techniques with the subnanometer-resolution cryoEM structure of rotavirus, we now provide a more detailed structural model for the major capsid proteins and their interactions within the triple-layer particle. Through a series of intersubunit interactions, the spike protein (VP4) adopts a dimeric appearance above the capsid surface, while forming a trimeric base anchored inside one of the three types of aqueous channels between VP7 and VP6 capsid layers. While the trimeric base suggests the presence of three VP4 molecules in one spike, only hints of the third molecule are observed above the capsid surface. Beyond their interactions with VP4, the interactions between VP6 and VP7 subunits could also be readily identified. In the innermost T=1 layer composed of VP2, visualization of the secondary structure elements allowed us to identify the polypeptide fold for VP2 and examine the complex network of interactions between this layer and the T=13 VP6 layer. This integrated structural approach has resulted in a relatively high-resolution structural model for the complete, infectious structure of rotavirus, as well as revealing the subtle nuances required for maintaining interactions in such a large macromolecular assembly.

  7. Alanine and proline content modulate global sensitivity to discrete perturbations in disordered proteins

    PubMed Central

    Perez, Romel B.; Tischer, Alexander; Auton, Matthew; Whitten, Steven T.

    2014-01-01

    Molecular transduction of biological signals is understood primarily in terms of the cooperative structural transitions of protein macromolecules, providing a mechanism through which discrete local structure perturbations affect global macromolecular properties. The recognition that proteins lacking tertiary stability, commonly referred to as intrinsically disordered proteins, mediate key signaling pathways suggests that protein structures without cooperative intramolecular interactions may also have the ability to couple local and global structure changes. Presented here are results from experiments that measured and tested the ability of disordered proteins to couple local changes in structure to global changes in structure. Using the intrinsically disordered N-terminal region of the p53 protein as an experimental model, a set of proline and alanine to glycine substitution variants were designed to modulate backbone conformational propensities without introducing non-native intramolecular interactions. The hydrodynamic radius (Rh) was used to monitor changes in global structure. Circular dichroism spectroscopy showed that the glycine substitutions decreased polyproline II (PPII) propensities relative to the wild type, as expected, and fluorescence methods indicated that substitution-induced changes in Rh were not associated with folding. The experiments showed that changes in local PPII structure cause changes in Rh that are variable and that depend on the intrinsic chain propensities of proline and alanine residues, demonstrating a mechanism for coupling local and global structure changes. Molecular simulations that model our results were used to extend the analysis to other proteins and illustrate the generality of the observed proline and alanine effects on the structures of intrinsically disordered proteins. PMID:25244701

  8. Multifunctional recombinant phycobiliprotein-based fluorescent constructs and phycobilisome display

    DOEpatents

    Glazer, Alexander N.; Cai, Yuping

    2007-01-30

    The invention provides multifunctional fusion constructs which are rapidly incorporated into a macromolecular structure such as a phycobilisome such that the fusion proteins are separated from one another and unable to self-associate. The invention provides methods and compositions for displaying a functional polypeptide domain on an oligomeric phycobiliprotein, including fusion proteins comprising a functional displayed domain and a functional phycobiliprotein domain incorporated in a functional oligomeric phycobiliprotein. The fusion proteins provide novel specific labeling reagents.

  9. Multifunctional recombinant phycobiliprotein-based fluorescent constructs and phycobilisome display

    DOEpatents

    Glazer, Alexander N.; Cai, Yuping

    2007-02-13

    The invention provides multifunctional fusion constructs which are rapidly incorporated into a macromolecular structure such as a phycobilisome such that the fusion proteins are separated from one another and unable to self-associate. The invention provides methods and compositions for displaying a functional polypeptide domain on an oligomeric phycobiliprotein, including fusion proteins comprising a functional displayed domain and a functional phycobiliprotein domain incorporated in a functional oligomeric phycobiliprotein. The fusion proteins provide novel specific labeling reagents.

  10. Multifunctional recombinant phycobiliprotein-based fluorescent constructs and phycobilisome display

    DOEpatents

    Glazer, Alexander N.; Cai, Yuping

    2003-11-18

    The invention provides multifunctional fusion constructs which are rapidly incorporated into a macromolecular structure such as a phycobilisome such that the fusion proteins are separated from one another and unable to self-associate. The invention provides methods and compositions for displaying a functional polypeptide domain on an oligomeric phycobiliprotein, including fusion proteins comprising a functional displayed domain and a functional phycobiliprotein domain incorporated in a functional oligomeric phycobiliprotein. The fusion proteins provide novel specific labeling reagents.

  11. Models of protein-ligand crystal structures: trust, but verify.

    PubMed

    Deller, Marc C; Rupp, Bernhard

    2015-09-01

    X-ray crystallography provides the most accurate models of protein-ligand structures. These models serve as the foundation of many computational methods including structure prediction, molecular modelling, and structure-based drug design. The success of these computational methods ultimately depends on the quality of the underlying protein-ligand models. X-ray crystallography offers the unparalleled advantage of a clear mathematical formalism relating the experimental data to the protein-ligand model. In the case of X-ray crystallography, the primary experimental evidence is the electron density of the molecules forming the crystal. The first step in the generation of an accurate and precise crystallographic model is the interpretation of the electron density of the crystal, typically carried out by construction of an atomic model. The atomic model must then be validated for fit to the experimental electron density and also for agreement with prior expectations of stereochemistry. Stringent validation of protein-ligand models has become possible as a result of the mandatory deposition of primary diffraction data, and many computational tools are now available to aid in the validation process. Validation of protein-ligand complexes has revealed some instances of overenthusiastic interpretation of ligand density. Fundamental concepts and metrics of protein-ligand quality validation are discussed and we highlight software tools to assist in this process. It is essential that end users select high quality protein-ligand models for their computational and biological studies, and we provide an overview of how this can be achieved.

  12. Atomic Force Microscopy of virus capsids uncover the interplay between mechanics, structure and function

    NASA Astrophysics Data System (ADS)

    de Pablo, Pedro J.

    The basic architecture of a virus consists of the capsid, a shell made up of repeating protein subunits, which packs, shuttles and delivers the viral genome at the right place and moment. Viral particles are endowed with specific physicochemical properties which confer on their structures a certain meta-stability whose modulation permits fulfilling each task of the viral cycle. These naturally designed capabilities have impelled the use of viral capsids as protein containers of artificial cargoes (drugs, polymers, enzymes, minerals) with applications in biomedical and materials sciences. Both natural and artificial protein cages have to protect their cargo against a variety of physicochemically aggressive environments, including molecular impacts in highly crowded media, thermal and chemical stresses, and osmotic shocks. The stability of viral cages under these conditions depends not only on the ultimate structure of the external capsid, which relies on the interactions between protein subunits, but also on the nature of the cargo. During the last decade our lab has focused on the study of protein cages with Atomic Force Microscopy (AFM) (figure 1). We are interested in establishing links between their mechanical properties and their structure and function. In particular, mechanics provide information about the cargo storage strategies of both natural and virus-derived protein cages. Mechanical fatigue has proven to be a nanosurgery tool for unveiling the strength of the capsid subunit bonds. We also interrogated the electrostatics of individual protein shells. Our AFM-fluorescence combination provided information about DNA diffusing out of cracked-open protein cages in real time.

  13. Prediction of Carbohydrate Binding Sites on Protein Surfaces with 3-Dimensional Probability Density Distributions of Interacting Atoms

    PubMed Central

    Tsai, Keng-Chang; Jian, Jhih-Wei; Yang, Ei-Wen; Hsu, Po-Chiang; Peng, Hung-Pin; Chen, Ching-Tai; Chen, Jun-Bo; Chang, Jeng-Yih; Hsu, Wen-Lian; Yang, An-Suei

    2012-01-01

    Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Prediction of non-covalent carbohydrate binding sites on protein surfaces not only provides insights into the functions of the query proteins; information on key carbohydrate-binding residues could suggest site-directed mutagenesis experiments, design therapeutics targeting carbohydrate-binding proteins, and provide guidance in engineering protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The prediction capabilities were based on a novel encoding scheme of the three-dimensional probability density maps describing the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the characteristic interacting atom distribution patterns specific for carbohydrate binding sites from known protein structures. The prediction results for all protein atom types were integrated into surface patches as tentative carbohydrate binding sites based on normalized prediction confidence level. The prediction capabilities of the predictors were benchmarked by a 10-fold cross validation on 497 non-redundant proteins with known carbohydrate binding sites. The predictors were further tested on an independent test set with 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (or recall) of 0.45 and 0.49 respectively. In addition, 111 unbound carbohydrate-binding protein structures for which the structures were determined in the absence of the carbohydrate ligands were predicted with the trained predictors. The overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best in carbohydrate binding site predictions to date. PMID:22848404
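
    For readers unfamiliar with the residue-based metrics quoted above, the following sketch shows how precision, recall (sensitivity) and the Matthews correlation coefficient are computed from a confusion matrix. The function name and the counts in the usage lines are hypothetical and are not taken from the study.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return (precision, recall, MCC) from confusion-matrix counts.

    tp/fp/tn/fn are counts of residues: true/false positives and negatives.
    Undefined ratios (zero denominators) are reported as 0.0 by convention.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, recall, mcc

# Hypothetical counts for one protein: 10 true binding residues out of 100.
precision, recall, mcc = binary_metrics(tp=5, fp=5, tn=85, fn=5)
print(f"precision={precision:.2f} recall={recall:.2f} MCC={mcc:.2f}")
```

    Because binding residues are usually a small minority of surface residues, MCC is more informative than plain accuracy: it stays near zero for a predictor that labels everything as non-binding.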

  14. Protein engineering and the use of molecular modeling and simulation: the case of heterodimeric Fc engineering.

    PubMed

    Spreter Von Kreudenstein, Thomas; Lario, Paula I; Dixit, Surjit B

    2014-01-01

    Computational and structure guided methods can make significant contributions to the development of solutions for difficult protein engineering problems, including the optimization of next generation of engineered antibodies. In this paper, we describe a contemporary industrial antibody engineering program, based on hypothesis-driven in silico protein optimization method. The foundational concepts and methods of computational protein engineering are discussed, and an example of a computational modeling and structure-guided protein engineering workflow is provided for the design of best-in-class heterodimeric Fc with high purity and favorable biophysical properties. We present the engineering rationale as well as structural and functional characterization data on these engineered designs. Copyright © 2013 Elsevier Inc. All rights reserved.

  15. Structures of the major capsid proteins of the human Karolinska Institutet and Washington University polyomaviruses.

    PubMed

    Neu, Ursula; Wang, Jianbo; Macejak, Dennis; Garcea, Robert L; Stehle, Thilo

    2011-07-01

    The Karolinska Institutet and Washington University polyomaviruses (KIPyV and WUPyV, respectively) are recently discovered human viruses that infect the respiratory tract. Although they have not yet been linked to disease, they are prevalent in populations worldwide, with initial infection occurring in early childhood. Polyomavirus capsids consist of 72 pentamers of the major capsid protein viral protein 1 (VP1), which determines antigenicity and receptor specificity. The WUPyV and KIPyV VP1 proteins are distant in evolution from VP1 proteins of known structure such as simian virus 40 or murine polyomavirus. We present here the crystal structures of unassembled recombinant WUPyV and KIPyV VP1 pentamers at resolutions of 2.9 and 2.55 Å, respectively. The WUPyV and KIPyV VP1 core structures fold into the same β-sandwich that is a hallmark of all polyomavirus VP1 proteins crystallized to date. However, differences in sequence translate into profoundly different surface loop structures in KIPyV and WUPyV VP1 proteins. Such loop structures have not been observed for other polyomaviruses, and they provide initial clues about the possible interactions of these viruses with cell surface receptors.

  16. Prediction of protein tertiary structure from sequences using a very large back-propagation neural network

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, X.; Wilcox, G.L.

    1993-12-31

    We have implemented large scale back-propagation neural networks on a 544 node Connection Machine, CM-5, using the C language in MIMD mode. The program running on 512 processors performs backpropagation learning at 0.53 Gflops, which provides 76 million connection updates per second. We have applied the network to the prediction of protein tertiary structure from sequence information alone. A neural network with one hidden layer and 40 million connections is trained to learn the relationship between sequence and tertiary structure. The trained network yields predicted structures of some proteins on which it has not been trained given only their sequences. Presentation of the Fourier transform of the sequences accentuates periodicity in the sequence and yields good generalization with greatly increased training efficiency. Training simulations with a large, heterologous set of protein structures (111 proteins from CM-5 time) to solutions with under 2% RMS residual error within the training set (random responses give an RMS error of about 20%). Presentation of 15 sequences of related proteins in a testing set of 24 proteins yields predicted structures with less than 8% RMS residual error, indicating good apparent generalization.
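
    The throughput figures in this abstract can be related with a little arithmetic. The sketch below assumes that 1 Gflops means 10^9 floating-point operations per second and that both quoted figures describe the same 512-processor run; the variable names are illustrative only.

```python
GFLOPS = 1e9  # assumption: 1 Gflops = 1e9 floating-point operations per second

sustained_flops = 0.53 * GFLOPS      # quoted learning rate of the program
connection_updates_per_s = 76e6      # quoted connection updates per second
processors = 512                     # processors used for the run

# Floating-point operations spent on each connection update.
flops_per_update = sustained_flops / connection_updates_per_s

# Average sustained rate per processor, in Mflops.
mflops_per_processor = sustained_flops / processors / 1e6

print(f"{flops_per_update:.1f} flops per connection update")
print(f"{mflops_per_processor:.2f} Mflops per processor")
```

    Under these assumptions each connection update costs roughly seven floating-point operations, and each processor sustains on the order of one Mflops.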

  17. Developing a Multiplexed Quantitative Cross-Linking Mass Spectrometry Platform for Comparative Structural Analysis of Protein Complexes.

    PubMed

    Yu, Clinton; Huszagh, Alexander; Viner, Rosa; Novitsky, Eric J; Rychnovsky, Scott D; Huang, Lan

    2016-10-18

    Cross-linking mass spectrometry (XL-MS) represents a recently popularized hybrid methodology for defining protein-protein interactions (PPIs) and analyzing structures of large protein assemblies. In particular, XL-MS strategies have been demonstrated to be effective in elucidating molecular details of PPIs at the peptide resolution, providing a complementary set of structural data that can be utilized to refine existing complex structures or direct de novo modeling of unknown protein structures. To study structural and interaction dynamics of protein complexes, quantitative cross-linking mass spectrometry (QXL-MS) strategies based on isotope-labeled cross-linkers have been developed. Although successful, these approaches are mostly limited to pairwise comparisons. In order to establish a robust workflow enabling comparative analysis of multiple cross-linked samples simultaneously, we have developed a multiplexed QXL-MS strategy, namely, QMIX (Quantitation of Multiplexed, Isobaric-labeled cross (X)-linked peptides) by integrating MS-cleavable cross-linkers with isobaric labeling reagents. This study has established a new analytical platform for quantitative analysis of cross-linked peptides, which can be directly applied for multiplexed comparisons of the conformational dynamics of protein complexes and PPIs at the proteome scale in future studies.

  18. Structural Genomics and Drug Discovery for Infectious Diseases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson, W.F.

    The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high throughput structural biology technologies to the characterization of proteins from the National Institute for Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging, or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.

  19. CNA web server: rigidity theory-based thermal unfolding simulations of proteins for linking structure, (thermo-)stability, and function.

    PubMed

    Krüger, Dennis M; Rathi, Prakash Chandra; Pfleger, Christopher; Gohlke, Holger

    2013-07-01

    The Constraint Network Analysis (CNA) web server provides a user-friendly interface to the CNA approach developed in our laboratory for linking results from rigidity analyses to biologically relevant characteristics of a biomolecular structure. The CNA web server provides a refined modeling of thermal unfolding simulations that considers the temperature dependence of hydrophobic tethers and computes a set of global and local indices for quantifying biomacromolecular stability. From the global indices, phase transition points are identified where the structure switches from a rigid to a floppy state; these phase transition points can be related to a protein's (thermo-)stability. Structural weak spots (unfolding nuclei) are automatically identified, too; this knowledge can be exploited in data-driven protein engineering. The local indices are useful in linking flexibility and function and to understand the impact of ligand binding on protein flexibility. The CNA web server robustly handles small-molecule ligands in general. To overcome issues of sensitivity with respect to the input structure, the CNA web server allows performing two ensemble-based variants of thermal unfolding simulations. The web server output is provided as raw data, plots and/or Jmol representations. The CNA web server, accessible at http://cpclab.uni-duesseldorf.de/cna or http://www.cnanalysis.de, is free and open to all users with no login requirement.

  20. X-ray scattering data and structural genomics

    NASA Astrophysics Data System (ADS)

    Doniach, Sebastian

    2003-03-01

    High throughput structural genomics has the ambitious goal of determining the structure of all, or a very large number of protein folds using the high-resolution techniques of protein crystallography and NMR. However, the program is facing significant bottlenecks in reaching this goal, which include problems of protein expression and crystallization. In this talk, some preliminary results on how the low-resolution technique of small-angle X-ray solution scattering (SAXS) can help ameliorate some of these bottlenecks will be presented. One of the most significant bottlenecks arises from the difficulty of crystallizing integral membrane proteins, where only a handful of structures are available compared to thousands of structures for soluble proteins. By 3-dimensional reconstruction from SAXS data, the size and shape of detergent-solubilized integral membrane proteins can be characterized. This information can then be used to classify membrane proteins which constitute some 25% of all genomes. SAXS may also be used to study the dependence of interparticle interference scattering on solvent conditions so that regions of the protein solution phase diagram which favor crystallization can be elucidated. As a further application, SAXS may be used to provide physical constraints on computational methods for protein structure prediction based on primary sequence information. This in turn can help in identifying structural homologs of a given protein, which can then give clues to its function. D. Walther, F. Cohen and S. Doniach. "Reconstruction of low resolution three-dimensional density maps from one-dimensional small angle x-ray scattering data for biomolecules." J. Appl. Cryst. 33(2):350-363 (2000). W. J. Zheng and S. Doniach. "Protein structure prediction constrained by solution X-ray scattering data and structural homology identification." J. Mol. Biol. 316(1):173-187 (2002).

  1. Protein-Protein Interface and Disease: Perspective from Biomolecular Networks.

    PubMed

    Hu, Guang; Xiao, Fei; Li, Yuqian; Li, Yuan; Vongsangnak, Wanwipa

    Protein-protein interactions are involved in many important biological processes and molecular mechanisms of disease association. Structural studies of interfacial residues in protein complexes provide information on protein-protein interactions. Characterizing protein-protein interfaces, including binding sites and allosteric changes, thus poses a pressing challenge. With special focus on protein complexes, approaches based on network theory are proposed to meet this challenge. In this review we examine protein-protein interfaces from the perspective of biomolecular networks and their roles in disease. We first describe the different roles of protein complexes in disease through several structural aspects of interfaces. We then discuss some recent advances in predicting hot spots and in communication pathway analysis in terms of amino acid networks. Finally, we highlight possible future aspects of this area with respect to both methodology development and applications for disease treatment.

  2. A Stochastic Evolutionary Model for Protein Structure Alignment and Phylogeny

    PubMed Central

    Challis, Christopher J.; Schmidler, Scott C.

    2012-01-01

    We present a stochastic process model for the joint evolution of protein primary and tertiary structure, suitable for use in alignment and estimation of phylogeny. Indels arise from a classic Links model, and mutations follow a standard substitution matrix, whereas backbone atoms diffuse in three-dimensional space according to an Ornstein–Uhlenbeck process. The model allows for simultaneous estimation of evolutionary distances, indel rates, structural drift rates, and alignments, while fully accounting for uncertainty. The inclusion of structural information enables phylogenetic inference on time scales not previously attainable with sequence evolution models. The model also provides a tool for testing evolutionary hypotheses and improving our understanding of protein structural evolution. PMID:22723302
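As a rough illustration of the Ornstein–Uhlenbeck diffusion named in the abstract above, the sketch below simulates a single 3D coordinate relaxing toward a fixed mean, using the exact transition of the OU process. The function name, the parameter names (`theta`, `sigma`, `mu`) and all numeric values are assumptions chosen for illustration; they are not taken from the paper, and this is not the authors' alignment model.

```python
import numpy as np

def simulate_ou(x0, mu, theta, sigma, dt, n_steps, rng):
    """Simulate dX = theta * (mu - X) dt + sigma dW for each coordinate.

    The exact Gaussian transition is used, so the step size dt does not
    introduce discretization bias.
    """
    x0 = np.asarray(x0, dtype=float)
    mu = np.asarray(mu, dtype=float)
    decay = np.exp(-theta * dt)
    # Standard deviation of the random increment over one step of length dt
    step_sd = sigma * np.sqrt((1.0 - decay**2) / (2.0 * theta))
    path = np.empty((n_steps + 1, x0.size))
    path[0] = x0
    for i in range(n_steps):
        noise = rng.standard_normal(x0.size)
        path[i + 1] = mu + (path[i] - mu) * decay + step_sd * noise
    return path

# Hypothetical values: one atom starting away from its equilibrium position
rng = np.random.default_rng(0)
path = simulate_ou(x0=[5.0, 0.0, -5.0], mu=[0.0, 0.0, 0.0],
                   theta=1.0, sigma=0.5, dt=0.01, n_steps=5000, rng=rng)
```

With `sigma` set to zero the path reduces to a deterministic exponential relaxation toward `mu`, which is a convenient sanity check on the mean-reverting term.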

  3. Small-angle X-Ray analysis of macromolecular structure: the structure of protein NS2 (NEP) in solution

    NASA Astrophysics Data System (ADS)

    Shtykova, E. V.; Bogacheva, E. N.; Dadinova, L. A.; Jeffries, C. M.; Fedorova, N. V.; Golovko, A. O.; Baratova, L. A.; Batishchev, O. V.

    2017-11-01

    A complex structural analysis of nuclear export protein NS2 (NEP) of influenza virus A has been performed using bioinformatics predictive methods and small-angle X-ray scattering data. The behavior of NEP molecules in a solution (their aggregation, oligomerization, and dissociation, depending on the buffer composition) has been investigated. It was shown that stable associates are formed even in a conventional aqueous salt solution at physiological pH value. For the first time we have managed to get NEP dimers in solution, to analyze their structure, and to compare the models obtained using the method of the molecular tectonics with the spatial protein structure predicted by us using the bioinformatics methods. The results of the study provide a new insight into the structural features of nuclear export protein NS2 (NEP) of the influenza virus A, which is very important for viral infection development.

  4. Conformational Sampling in Template-Free Protein Loop Structure Modeling: An Overview

    PubMed Central

    Li, Yaohang

    2013-01-01

    Accurately modeling protein loops is an important step in predicting three-dimensional structures as well as in understanding the functions of many proteins. Because of their high flexibility, modeling the three-dimensional structures of loops is difficult and is usually treated as a “mini protein folding problem” under geometric constraints. In the past decade, there has been remarkable progress in template-free loop structure modeling due to advances in computational methods as well as the steadily increasing number of known structures available in the PDB. This mini review provides an overview of recent computational approaches for loop structure modeling. In particular, we focus on approaches for sampling loop conformation space, which is a critical step in obtaining high resolution models in template-free methods. We review the potential energy functions for loop modeling, loop buildup mechanisms to satisfy geometric constraints, and loop conformation sampling algorithms. The recent loop modeling results are also summarized. PMID:24688696

  5. Conformational sampling in template-free protein loop structure modeling: an overview.

    PubMed

    Li, Yaohang

    2013-01-01

    Accurately modeling protein loops is an important step in predicting three-dimensional structures as well as in understanding the functions of many proteins. Because of their high flexibility, modeling the three-dimensional structures of loops is difficult and is usually treated as a "mini protein folding problem" under geometric constraints. In the past decade, there has been remarkable progress in template-free loop structure modeling due to advances in computational methods as well as the steadily increasing number of known structures available in the PDB. This mini review provides an overview of recent computational approaches for loop structure modeling. In particular, we focus on approaches for sampling loop conformation space, which is a critical step in obtaining high resolution models in template-free methods. We review the potential energy functions for loop modeling, loop buildup mechanisms to satisfy geometric constraints, and loop conformation sampling algorithms. The recent loop modeling results are also summarized.

  6. Origins of the protein synthesis cycle

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1981-01-01

    Largely derived from experiments in molecular evolution, a theory of protein synthesis cycles has been constructed. The sequence begins with ordered thermal proteins resulting from the self-sequencing of mixed amino acids. Ordered thermal proteins then aggregate to cell-like structures. When they contained proteinoids sufficiently rich in lysine, the structures were able to synthesize offspring peptides. Since lysine-rich proteinoid (LRP) also catalyzes the polymerization of nucleoside triphosphate to polynucleotides, the same microspheres containing LRP could have synthesized both original cellular proteins and cellular nucleic acids. The LRP within protocells would have provided proximity advantageous for the origin and evolution of the genetic code.

  7. mTM-align: a server for fast protein structure database search and multiple protein structure alignment.

    PubMed

    Dong, Runze; Pan, Shuo; Peng, Zhenling; Zhang, Yang; Yang, Jianyi

    2018-05-21

    With the rapid increase in the number of protein structures in the Protein Data Bank, it has become urgent to develop algorithms for efficient protein structure comparisons. In this article, we present the mTM-align server, which consists of two closely related modules: one for structure database search and the other for multiple structure alignment. The database search is sped up using a heuristic algorithm and a hierarchical organization of the structures in the database. The multiple structure alignment is performed using the recently developed algorithm mTM-align. Benchmark tests demonstrate that our algorithms outperform other peer methods for both modules, in terms of speed and accuracy. One of the unique features of the server is the interplay between database search and multiple structure alignment. The server provides service not only for performing fast database searches, but also for making accurate multiple structure alignments with the structures found by the search. For the database search, it takes about 2-5 min for a structure of medium size (∼300 residues). For the multiple structure alignment, it takes a few seconds for ∼10 structures of medium size. The server is freely available at: http://yanglab.nankai.edu.cn/mTM-align/.

  8. A new twist in the coil: functions of the coiled-coil domain of structural maintenance of chromosome (SMC) proteins.

    PubMed

    Matityahu, Avi; Onn, Itay

    2018-02-01

    The higher-order organization of chromosomes ensures their stability and functionality. However, the molecular mechanism by which higher-order structure is established is poorly understood. Dissecting the activity of the relevant proteins provides information essential for achieving a comprehensive understanding of chromosome structure. Proteins of the structural maintenance of chromosome (SMC) family of ATPases are the core of evolutionarily conserved complexes. SMC complexes are involved in regulating genome dynamics and in maintaining genome stability. The structure of all SMC proteins resembles an elongated rod that contains a central coiled-coil domain, a common protein structural motif in which two α-helices twist together. In recent years, the essential role of the coiled-coil domain in SMC protein activity and regulation has become evident. Here, we discuss recent advances in the function of the SMC coiled coils. We describe the structure of the coiled-coil domain of SMC proteins, and the modifications and interactions that are mediated by it. Furthermore, we assess the role of the coiled-coil domain in conformational switches of SMC proteins, and in determining the architecture of the SMC dimer. Finally, we review the interplay between mutations in the coiled-coil domain and human disorders. We suggest that distinctive properties of the coiled coils of different SMC proteins contribute to their distinct functions. The discussion clarifies the mechanisms underlying the activity of SMC proteins, and advocates future studies to elucidate the function of the SMC coiled-coil domain.

  9. FTIR study of secondary structure of bovine serum albumin and ovalbumin

    NASA Astrophysics Data System (ADS)

    Abrosimova, K. V.; Shulenina, O. V.; Paston, S. V.

    2016-11-01

    Protein structure is a critical factor in protein function. Fourier transform infrared spectroscopy makes it possible to obtain information about the secondary structure of proteins in different states and also in whole biological samples. Infrared spectra of egg white from untreated and hard-boiled hen's eggs, and also of chicken ovalbumin (OVA) and bovine serum albumin in lyophilized form and in aqueous solution, were studied. Lyophilization of the investigated globular proteins is accompanied by a decrease in α-helix structures and an increase in the amount of intermolecular β-sheets. Analysis of the infrared spectrum of egg white allowed us to estimate the OVA secondary structure and to observe the α-to-β structural transformation that results from heat denaturation.

  10. Crystal structures of ASK1-inhibitor complexes provide a platform for structure-based drug design

    PubMed Central

    Singh, Onkar; Shillings, Anthony; Craggs, Peter; Wall, Ian; Rowland, Paul; Skarzynski, Tadeusz; Hobbs, Clare I; Hardwick, Phil; Tanner, Rob; Blunt, Michelle; Witty, David R; Smith, Kathrine J

    2013-01-01

    ASK1, a member of the MAPK Kinase Kinase family of proteins has been shown to play a key role in cancer, neurodegeneration and cardiovascular diseases and is emerging as a possible drug target. Here we describe a ‘replacement-soaking’ method that has enabled the high-throughput X-ray structure determination of ASK1/ligand complexes. Comparison of the X-ray structures of five ASK1/ligand complexes from 3 different chemotypes illustrates that the ASK1 ATP binding site is able to accommodate a range of chemical diversity and different binding modes. The replacement-soaking system is also able to tolerate some protein flexibility. This crystal system provides a robust platform for ASK1/ligand structure determination and future structure based drug design. PMID:23776076

  11. Structure of human Fe-S assembly subcomplex reveals unexpected cysteine desulfurase architecture and acyl-ACP-ISD11 interactions.

    PubMed

    Cory, Seth A; Van Vranken, Jonathan G; Brignole, Edward J; Patra, Shachin; Winge, Dennis R; Drennan, Catherine L; Rutter, Jared; Barondeau, David P

    2017-07-03

    In eukaryotes, sulfur is mobilized for incorporation into multiple biosynthetic pathways by a cysteine desulfurase complex that consists of a catalytic subunit (NFS1), LYR protein (ISD11), and acyl carrier protein (ACP). This NFS1-ISD11-ACP (SDA) complex forms the core of the iron-sulfur (Fe-S) assembly complex and associates with assembly proteins ISCU2, frataxin (FXN), and ferredoxin to synthesize Fe-S clusters. Here we present crystallographic and electron microscopic structures of the SDA complex coupled to enzyme kinetic and cell-based studies to provide structure-function properties of a mitochondrial cysteine desulfurase. Unlike prokaryotic cysteine desulfurases, the SDA structure adopts an unexpected architecture in which a pair of ISD11 subunits form the dimeric core of the SDA complex, which clarifies the critical role of ISD11 in eukaryotic assemblies. The different quaternary structure results in an incompletely formed substrate channel and solvent-exposed pyridoxal 5'-phosphate cofactor and provides a rationale for the allosteric activator function of FXN in eukaryotic systems. The structure also reveals the 4'-phosphopantetheine-conjugated acyl-group of ACP occupies the hydrophobic core of ISD11, explaining the basis of ACP stabilization. The unexpected architecture for the SDA complex provides a framework for understanding interactions with acceptor proteins for sulfur-containing biosynthetic pathways, elucidating mechanistic details of eukaryotic Fe-S cluster biosynthesis, and clarifying how defects in Fe-S cluster assembly lead to diseases such as Friedreich's ataxia. Moreover, our results support a lock-and-key model in which LYR proteins associate with acyl-ACP as a mechanism for fatty acid biosynthesis to coordinate the expression, Fe-S cofactor maturation, and activity of the respiratory complexes.

  12. Distance matrix-based approach to protein structure prediction.

    PubMed

    Kloczkowski, Andrzej; Jernigan, Robert L; Wu, Zhijun; Song, Guang; Yang, Lei; Kolinski, Andrzej; Pokarowski, Piotr

    2009-03-01

    Much structural information is encoded in the internal distances; a distance matrix-based approach can be used to predict protein structure and dynamics, and for structural refinement. Our approach is based on the square distance matrix $D = [r_{ij}^2]$ containing all square distances between residues in proteins. This distance matrix contains more information than the contact matrix $C$, which has elements of either 0 or 1 depending on whether the distance $r_{ij}$ is greater or less than a cutoff value $r_{\mathrm{cutoff}}$. We have performed spectral decomposition of the distance matrices, $D = \sum_k \lambda_k v_k v_k^T$, in terms of eigenvalues $\lambda_k$ and the corresponding eigenvectors $v_k$, and found that it contains at most five nonzero terms. A dominant eigenvector is proportional to $r^2$, the square distance of points from the center of mass, with the next three being the principal components of the system of points. By predicting $r^2$ from the sequence we can approximate a distance matrix of a protein with an expected RMSD value of about 7.3 Å, and by combining it with the prediction of the first principal component we can improve this approximation to 4.0 Å. We can also explain the role of hydrophobic interactions for the protein structure, because $r$ is highly correlated with the hydrophobic profile of the sequence. Moreover, $r$ is highly correlated with several sequence profiles which are useful in protein structure prediction, such as contact number, the residue-wise contact order (RWCO) or mean square fluctuations (i.e. crystallographic temperature factors). We have also shown that the next three components are related to spatial directionality of the secondary structure elements, and they may be also predicted from the sequence, improving overall structure prediction. 
We have also shown that the large number of available HIV-1 protease structures provides a remarkable sampling of conformations, which can be viewed as direct structural information about the dynamics. After structure matching, we apply principal component analysis (PCA) to obtain the important apparent motions for both bound and unbound structures. There are significant similarities between the first few key motions and the first few low-frequency normal modes calculated from a static representative structure with an elastic network model (ENM) that is based on the contact matrix C (related to D), strongly suggesting that the variations among the observed structures and the corresponding conformational changes are facilitated by the low-frequency, global motions intrinsic to the structure. Similarities are also found when the approach is applied to an NMR ensemble, as well as to atomic molecular dynamics (MD) trajectories. Thus, a sufficiently large number of experimental structures can directly provide important information about protein dynamics, but ENM can also provide a similar sampling of conformations. Finally, we use distance constraints from databases of known protein structures for structure refinement. We use the distributions of distances of various types in known protein structures to obtain the most probable ranges or the mean-force potentials for the distances. We then impose these constraints on structures to be refined or include the mean-force potentials directly in the energy minimization so that more plausible structural models can be built. This approach has been successfully used by us in 2006 in the CASPR structure refinement (http://predictioncenter.org/caspR).
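The statement that the square distance matrix has at most five nonzero spectral terms can be checked numerically. The sketch below is a minimal illustration under stated assumptions, not the authors' code: it uses random 3D points as a stand-in for residue positions, and the function names, the 8.0 cutoff and the relative tolerance are all hypothetical choices. It builds the square distance matrix D, derives a 0/1 contact matrix C from a cutoff, and counts the eigenvalues of D that are numerically nonzero.

```python
import numpy as np

def squared_distance_matrix(coords):
    """Return D with D[i, j] = |x_i - x_j|^2 for an (n, 3) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sum(diff**2, axis=-1)

def contact_matrix(sq_dist, r_cutoff):
    """Return C with C[i, j] = 1 where the distance is below r_cutoff, else 0."""
    return (sq_dist < r_cutoff**2).astype(int)

def count_nonzero_eigenvalues(sq_dist, rel_tol=1e-9):
    """Count eigenvalues whose magnitude exceeds rel_tol times the largest one."""
    eigvals = np.linalg.eigvalsh(sq_dist)  # D is symmetric, so eigvalsh applies
    return int(np.sum(np.abs(eigvals) > rel_tol * np.max(np.abs(eigvals))))

# Random points standing in for residue positions (illustrative only)
rng = np.random.default_rng(0)
coords = rng.normal(scale=10.0, size=(50, 3))

D = squared_distance_matrix(coords)
C = contact_matrix(D, r_cutoff=8.0)
n_terms = count_nonzero_eigenvalues(D)
print("nonzero spectral terms:", n_terms)
```

The bound follows from writing D[i, j] = r_i^2 + r_j^2 - 2 x_i·x_j: the first two terms contribute rank at most two and the inner-product term contributes rank at most three for points in three dimensions.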

  13. PDB2CD: a web-based application for the generation of circular dichroism spectra from protein atomic coordinates.

    PubMed

    Mavridis, Lazaros; Janes, Robert W

    2017-01-01

    Circular dichroism (CD) spectroscopy is extensively utilized for determining the percentages of secondary structure content present in proteins. However, although a large contributor, secondary structure is not the only factor that influences the shape and magnitude of the CD spectrum produced. Other structural features can make contributions so an entire protein structural conformation can give rise to a CD spectrum. There is a need for an application capable of generating protein CD spectra from atomic coordinates. However, no empirically derived method to do this currently exists. PDB2CD has been created as an empirical-based approach to the generation of protein CD spectra from atomic coordinates. The method utilizes a combination of structural features within the conformation of a protein; not only its percentage secondary structure content, but also the juxtaposition of these structural components relative to one another, and the overall structure similarity of the query protein to proteins in our dataset, the SP175 dataset, the 'gold standard' set obtained from the Protein Circular Dichroism Data Bank (PCDDB). A significant number of the CD spectra associated with the 71 proteins in this dataset have been produced with excellent accuracy using a leave-one-out cross-validation process. The method also creates spectra in good agreement with those of a test set of 14 proteins from the PCDDB. The PDB2CD package provides a web-based, user friendly approach to enable researchers to produce CD spectra from protein atomic coordinates. Availability: http://pdb2cd.cryst.bbk.ac.uk. Contact: r.w.janes@qmul.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  14. Protein collapse is encoded in the folded state architecture.

    PubMed

    Samanta, Himadri S; Zhuravlev, Pavel I; Hinczewski, Michael; Hori, Naoto; Chakrabarti, Shaon; Thirumalai, D

    2017-05-21

    Folded states of single domain globular proteins are compact with high packing density. The radius of gyration, $R_g$, of both the folded and unfolded states increases as $N^{\nu}$, where $N$ is the number of amino acids in the protein. The values of the Flory exponent ν are, respectively, ≈⅓ and ≈0.6 in the folded and unfolded states, coinciding with those for homopolymers. However, the extent of compaction of the unfolded state of a protein under low denaturant concentration (collapsibility), conditions favoring the formation of the folded state, is unknown. We develop a theory that uses the contact map of proteins as input to quantitatively assess collapsibility of proteins. Although collapsibility is universal, the propensity to be compact depends on the protein architecture. Application of the theory to over two thousand proteins shows that collapsibility depends not only on $N$ but also on the contact map reflecting the native structure. A major prediction of the theory is that β-sheet proteins are far more collapsible than structures dominated by α-helices. The theory and the accompanying simulations, validating the theoretical predictions, provide insights into the differing conclusions reached using different experimental probes assessing the extent of compaction of proteins. By calculating the criterion for collapsibility as a function of protein length we provide quantitative insights into the reasons why single domain proteins are small and the physical reasons for the origin of multi-domain proteins. Collapsibility of non-coding RNA molecules is similar to that of β-sheet protein structures, adding support to the "Compactness Selection Hypothesis".
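The power-law scaling of the radius of gyration with chain length quoted above can be made concrete with a log-log fit. The sketch below is only an illustration: the data are synthetic and noise-free, the prefactors 3.0 and 2.0 are arbitrary assumptions, and the helper name `fit_flory_exponent` is hypothetical. It recovers the exponent ν as the slope of log R_g against log N.

```python
import numpy as np

def fit_flory_exponent(n_residues, rg):
    """Fit R_g = a * N**nu by linear regression in log-log space; return (nu, a)."""
    slope, intercept = np.polyfit(np.log(n_residues), np.log(rg), 1)
    return slope, np.exp(intercept)

# Synthetic, noise-free data; prefactors are illustrative assumptions
n_residues = np.array([50, 100, 200, 400, 800])
rg_folded = 3.0 * n_residues ** (1.0 / 3.0)   # compact, folded-like scaling
rg_unfolded = 2.0 * n_residues ** 0.6         # expanded, unfolded-like scaling

nu_folded, a_folded = fit_flory_exponent(n_residues, rg_folded)
nu_unfolded, a_unfolded = fit_flory_exponent(n_residues, rg_unfolded)
print(f"folded nu = {nu_folded:.3f}, unfolded nu = {nu_unfolded:.3f}")
```

With real measurements the fitted slope would carry an uncertainty that depends on the range of N covered, so a fit over a narrow range of chain lengths should be read with caution.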

  15. Random close packing in protein cores

    NASA Astrophysics Data System (ADS)

    Gaines, Jennifer C.; Smith, W. Wendell; Regan, Lynne; O'Hern, Corey S.

    2016-03-01

    Shortly after the determination of the first protein x-ray crystal structures, researchers analyzed their cores and reported packing fractions ϕ ≈ 0.75, a value that is similar to close packing of equal-sized spheres. A limitation of these analyses was the use of extended atom models, rather than the more physically accurate explicit hydrogen model. The validity of the explicit hydrogen model was proved in our previous studies by its ability to predict the side chain dihedral angle distributions observed in proteins. In contrast, the extended atom model is not able to recapitulate the side chain dihedral angle distributions, and gives rise to large atomic clashes at side chain dihedral angle combinations that are highly probable in protein crystal structures. Here, we employ the explicit hydrogen model to calculate the packing fraction of the cores of over 200 high-resolution protein structures. We find that these protein cores have ϕ ≈ 0.56, which is similar to results obtained from simulations of random packings of individual amino acids. This result provides a deeper understanding of the physical basis of protein structure that will enable predictions of the effects of amino acid mutations to protein cores and interfaces of known structure.

  16. Random close packing in protein cores.

    PubMed

    Gaines, Jennifer C; Smith, W Wendell; Regan, Lynne; O'Hern, Corey S

    2016-03-01

    Shortly after the determination of the first protein x-ray crystal structures, researchers analyzed their cores and reported packing fractions ϕ ≈ 0.75, a value that is similar to close packing of equal-sized spheres. A limitation of these analyses was the use of extended atom models, rather than the more physically accurate explicit hydrogen model. The validity of the explicit hydrogen model was proved in our previous studies by its ability to predict the side chain dihedral angle distributions observed in proteins. In contrast, the extended atom model is not able to recapitulate the side chain dihedral angle distributions, and gives rise to large atomic clashes at side chain dihedral angle combinations that are highly probable in protein crystal structures. Here, we employ the explicit hydrogen model to calculate the packing fraction of the cores of over 200 high-resolution protein structures. We find that these protein cores have ϕ ≈ 0.56, which is similar to results obtained from simulations of random packings of individual amino acids. This result provides a deeper understanding of the physical basis of protein structure that will enable predictions of the effects of amino acid mutations to protein cores and interfaces of known structure.

  17. CCBuilder 2.0: Powerful and accessible coiled‐coil modeling

    PubMed Central

    Wood, Christopher W.

    2017-01-01

    The increased availability of user‐friendly and accessible computational tools for biomolecular modeling would expand the reach and application of biomolecular engineering and design. For protein modeling, one key challenge is to reduce the complexities of 3D protein folds to sets of parametric equations that nonetheless capture the salient features of these structures accurately. At present, this is possible for a subset of proteins, namely, repeat proteins. The α‐helical coiled coil provides one such example, which represents ≈ 3–5% of all known protein‐encoding regions of DNA. Coiled coils are bundles of α helices that can be described by a small set of structural parameters. Here we describe how this parametric description can be implemented in an easy‐to‐use web application, called CCBuilder 2.0, for modeling and optimizing both α‐helical coiled coils and polyproline‐based collagen triple helices. This has many applications from providing models to aid molecular replacement for X‐ray crystallography, in silico model building and engineering of natural and designed protein assemblies, and through to the creation of completely de novo “dark matter” protein structures. CCBuilder 2.0 is available as a web‐based application, the code for which is open‐source and can be downloaded freely. http://coiledcoils.chm.bris.ac.uk/ccbuilder2. Lay Summary: We have created CCBuilder 2.0, an easy to use web‐based application that can model structures for a whole class of proteins, the α‐helical coiled coil, which is estimated to account for 3–5% of all proteins in nature. CCBuilder 2.0 will be of use to a large number of protein scientists engaged in fundamental studies, such as protein structure determination, through to more‐applied research including designing and engineering novel proteins that have potential applications in biotechnology. PMID:28836317

  18. The first mammalian aldehyde oxidase crystal structure: insights into substrate specificity.

    PubMed

    Coelho, Catarina; Mahro, Martin; Trincão, José; Carvalho, Alexandra T P; Ramos, Maria João; Terao, Mineko; Garattini, Enrico; Leimkühler, Silke; Romão, Maria João

    2012-11-23

    Aldehyde oxidases have pharmacological relevance, and AOX3 is the major drug-metabolizing enzyme in rodents. The crystal structure of mouse AOX3 with kinetics and molecular docking studies provides insights into its enzymatic characteristics. Differences in substrate and inhibitor specificities can be rationalized by comparing the AOX3 and xanthine oxidase structures. The first aldehyde oxidase structure represents a major advance for drug design and mechanistic studies. Aldehyde oxidases (AOXs) are homodimeric proteins belonging to the xanthine oxidase family of molybdenum-containing enzymes. Each 150-kDa monomer contains a FAD redox cofactor, two spectroscopically distinct [2Fe-2S] clusters, and a molybdenum cofactor located within the protein active site. AOXs are characterized by broad range substrate specificity, oxidizing different aldehydes and aromatic N-heterocycles. Despite increasing recognition of its role in the metabolism of drugs and xenobiotics, the physiological function of the protein is still largely unknown. We have crystallized and solved the crystal structure of mouse liver aldehyde oxidase 3 to 2.9 Å. This is the first mammalian AOX whose structure has been solved. The structure provides important insights into the protein active center and further evidence on the catalytic differences characterizing AOX and xanthine oxidoreductase. The mouse liver aldehyde oxidase 3 three-dimensional structure combined with kinetic, mutagenesis data, molecular docking, and molecular dynamics studies make a decisive contribution to understand the molecular basis of its rather broad substrate specificity.

  19. Investigating the structural impact of S311C mutation in DRD2 receptor by molecular dynamics & docking studies.

    PubMed

    Podder, Avijit; Pandey, Deeksha; Latha, N

    2016-04-01

    Dopamine receptors (DR) are neuronal cell surface proteins that mediate the action of the neurotransmitter dopamine in the brain. Dopamine receptor D2 (DRD2), which belongs to the G-protein coupled receptor (GPCR) family, is a major therapeutic target for various neurological and psychiatric disorders in humans. The third intracellular loop (ICL3) in DRD2 is essential for coupling G proteins and several signaling scaffold proteins. A mutation in ICL3 can interfere with this binding interface, thereby altering DRD2 signaling. In this study we have examined the deleterious effect of the serine to cysteine mutation at position 311 (S311C) in the ICL3 region, which is implicated in diseases like schizophrenia and alcoholism. An in silico structure modeling approach was employed to determine the wild type (WT) and mutant S311C structures of DRD2 and of the scaffold proteins Gαi/o and NEB2. A protein-ligand docking protocol was used to predict the interactions of the natural agonist dopamine with both the WT and mutant structures of DRD2. In addition, atomistic molecular dynamics (MD) simulations were performed to provide insights into the essential dynamics of the systems: unbound and dopamine bound DRD2 (WT and mutant), and three independent simulations for the Gαi, Gαo and NEB2 systems. To provide information on the intra-molecular arrangement of the structures, a comprehensive residue interaction network of both dopamine bound WT and mutant DRD2 protein was studied. We also employed a protein-protein docking strategy to find the interactions of the scaffold proteins Gαi/o and NEB2 with both dopamine bound WT and mutant structures of DRD2. We observed a marginal effect of the mutation on the dopamine binding mechanism in the trajectories analyzed. 
However, we noticed a significant structural alteration of the mutant receptor that affects Gαi/o and NEB2 binding, which may cause malfunctioning of cAMP-dependent signaling and Ca2+ homeostasis in the brain dopaminergic system, leading to neuropsychiatric disorders. Copyright © 2016 Elsevier B.V. and Société Française de Biochimie et Biologie Moléculaire (SFBBM). All rights reserved.

  20. Complementary MS Methods Assist Conformational Characterization of Antibodies with Altered S-S Bonding Networks

    NASA Astrophysics Data System (ADS)

    Jones, Lisa M.; Zhang, Hao; Cui, Weidong; Kumar, Sandeep; Sperry, Justin B.; Carroll, James A.; Gross, Michael L.

    2013-06-01

    As therapeutic monoclonal antibodies (mAbs) become a major focus in biotechnology and a source of the next-generation drugs, new analytical methods or combination methods are needed for monitoring changes in higher order structure and effects of post-translational modifications. The complexity of these molecules and their vulnerability to structural change provide a serious challenge. We describe here the use of complementary mass spectrometry methods that not only characterize mutant mAbs but also may provide a general framework for characterizing higher order structure of other protein therapeutics and biosimilars. To frame the challenge, we selected members of the IgG2 subclass that have distinct disulfide isomeric structures as a model to evaluate an overall approach that uses ion mobility, top-down MS sequencing, and protein footprinting in the form of fast photochemical oxidation of proteins (FPOP). These three methods are rapid, sensitive, respond to subtle changes in conformation of Cys → Ser mutants of an IgG2, each representing a single disulfide isoform, and may be used in series to probe higher order structure. The outcome suggests that this approach of using various methods in combination can assist the development and quality control of protein therapeutics.

  1. The E. coli thioredoxin folding mechanism: the key role of the C-terminal helix.

    PubMed

    Vazquez, Diego S; Sánchez, Ignacio E; Garrote, Ana; Sica, Mauricio P; Santos, Javier

    2015-02-01

    In this work, the unfolding mechanism of oxidized Escherichia coli thioredoxin (EcTRX) was investigated experimentally and computationally. We characterized seven point mutants distributed along the C-terminal α-helix (CTH) and the preceding loop. The mutations destabilized the protein against global unfolding while leaving the native structure unchanged. Global analysis of the unfolding kinetics of all variants revealed a linear unfolding route with a high-energy on-pathway intermediate state flanked by two transition state ensembles TSE1 and TSE2. The experiments show that CTH is mainly unfolded in TSE1 and the intermediate and becomes structured in TSE2. Structure-based molecular dynamics are in agreement with these experiments and provide protein-wide structural information on transient states. In our model, EcTRX folding starts with structure formation in the β-sheet, while the protein helices coalesce later. As a whole, our results indicate that the CTH is a critical module in the folding process, restraining a heterogeneous intermediate ensemble into a biologically active native state and providing the native protein with thermodynamic and kinetic stability. Copyright © 2014 Elsevier B.V. All rights reserved.

  2. Observing a late folding intermediate of Ubiquitin at atomic resolution by NMR

    PubMed Central

    Surana, Parag

    2016-01-01

    The study of intermediates in the protein folding pathway provides a wealth of information about the energy landscape. The intermediates also frequently initiate pathogenic fibril formations. While observing the intermediates is difficult due to their transient nature, extreme conditions can partially unfold the proteins and provide a glimpse of the intermediate states. Here, we observe the high resolution structure of a hydrophobic core mutant of Ubiquitin at an extreme acidic pH by nuclear magnetic resonance (NMR) spectroscopy. In the structure, the native secondary and tertiary structure is conserved for a major part of the protein. However, a long loop between the beta strands β3 and β5 is partially unfolded. The altered structure is supported by fluorescence data and the difference in free energies between the native state and the intermediate is reflected in the denaturant induced melting curves. The unfolded region includes amino acids that are critical for interaction with cofactors as well as for assembly of poly‐Ubiquitin chains. The structure at acidic pH resembles a late folding intermediate of Ubiquitin and indicates that upon stabilization of the protein's core, the long loop converges on the core in the final step of the folding process. PMID:27111887

  3. De novo protein structure prediction by dynamic fragment assembly and conformational space annealing.

    PubMed

    Lee, Juyong; Lee, Jinhyuk; Sasaki, Takeshi N; Sasai, Masaki; Seok, Chaok; Lee, Jooyoung

    2011-08-01

    Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested novel way of fragment assembly, dynamic fragment assembly (DFA) and conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the full-atom model by CHARMM with the addition of the empirical potential of DFIRE. The relative contributions between various energy terms are optimized using linear programming. The conformational sampling was carried out with CSA algorithm, which can find low energy conformations more efficiently than simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented into CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods. Copyright © 2011 Wiley-Liss, Inc.

  4. Membrane association of the PTEN tumor suppressor: Neutron scattering and MD simulations reveal the structure of protein-membranes complexes

    PubMed Central

    Nanda, Hirsh; Heinrich, Frank; Lösche, Mathias

    2014-01-01

    Neutron reflection (NR) from planar interfaces is an emerging technology that provides unique and otherwise inaccessible structural information on disordered molecular systems such as membrane proteins associated with fluid bilayers, thus addressing one of the remaining challenges of structural biology. Although intrinsically a low-resolution technique, using structural information from crystallography or NMR allows the construction of NR models that describe the architecture of protein-membrane complexes at high resolution. In addition, a combination of these methods with molecular dynamics (MD) simulations has the potential to reveal the dynamics of protein interactions with the bilayer in atomistic detail. We review recent advances in this area by discussing the application of these techniques to the complex formed by the PTEN phosphatase with the plasma membrane. These studies provide insights in the cellular regulation of PTEN, its interaction with PI(4,5)P2 in the inner plasma membrane and the pathway by which its substrate, PI(3,4,5)P3, accesses the PTEN catalytic site. PMID:25461777

  5. Carbohydrate-protein interactions: molecular modeling insights.

    PubMed

    Pérez, Serge; Tvaroška, Igor

    2014-01-01

    The article reviews the significant contributions to, and the present status of, applications of computational methods for the characterization and prediction of protein-carbohydrate interactions. After a presentation of the specific features of carbohydrate modeling, along with a brief description of the experimental data and general features of carbohydrate-protein interactions, the survey provides a thorough coverage of the available computational methods and tools. At the quantum-mechanical level, the use of both molecular orbitals and density-functional theory is critically assessed. These are followed by a presentation and critical evaluation of the applications of semiempirical and empirical methods: QM/MM, molecular dynamics, free-energy calculations, metadynamics, molecular robotics, and others. The usefulness of molecular docking in structural glycobiology is evaluated by considering recent docking- validation studies on a range of protein targets. The range of applications of these theoretical methods provides insights into the structural, energetic, and mechanistic facets that occur in the course of the recognition processes. Selected examples are provided to exemplify the usefulness and the present limitations of these computational methods in their ability to assist in elucidation of the structural basis underlying the diverse function and biological roles of carbohydrates in their dialogue with proteins. These test cases cover the field of both carbohydrate biosynthesis and glycosyltransferases, as well as glycoside hydrolases. The phenomenon of (macro)molecular recognition is illustrated for the interactions of carbohydrates with such proteins as lectins, monoclonal antibodies, GAG-binding proteins, porins, and viruses. © 2014 Elsevier Inc. All rights reserved.

  6. ExDom: an integrated database for comparative analysis of the exon–intron structures of protein domains in eukaryotes

    PubMed Central

    Bhasi, Ashwini; Philip, Philge; Manikandan, Vinu; Senapathy, Periannan

    2009-01-01

    We have developed ExDom, a unique database for the comparative analysis of the exon–intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of exon–intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon-intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon–intron structures of alternative transcripts of a gene correlated to variations in the domain architecture of corresponding protein isoforms. These novel analytical features are highly suited for detailed investigations on the exon–intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. ExDom database is freely accessible at: http://66.170.16.154/ExDom/. PMID:18984624

  7. Dynamical Coupling of Intrinsically Disordered Proteins and Their Hydration Water: Comparison with Folded Soluble and Membrane Proteins

    PubMed Central

    Gallat, F.-X.; Laganowsky, A.; Wood, K.; Gabel, F.; van Eijck, L.; Wuttke, J.; Moulin, M.; Härtlein, M.; Eisenberg, D.; Colletier, J.-P.; Zaccai, G.; Weik, M.

    2012-01-01

    Hydration water is vital for various macromolecular biological activities, such as specific ligand recognition, enzyme activity, response to receptor binding, and energy transduction. Without hydration water, proteins would not fold correctly and would lack the conformational flexibility that animates their three-dimensional structures. Motions in globular, soluble proteins are thought to be governed to a certain extent by hydration-water dynamics, yet it is not known whether this relationship holds true for other protein classes in general and whether, in turn, the structural nature of a protein also influences water motions. Here, we provide insight into the coupling between hydration-water dynamics and atomic motions in intrinsically disordered proteins (IDP), a largely unexplored class of proteins that, in contrast to folded proteins, lack a well-defined three-dimensional structure. We investigated the human IDP tau, which is involved in the pathogenic processes accompanying Alzheimer disease. Combining neutron scattering and protein perdeuteration, we found similar atomic mean-square displacements over a large temperature range for the tau protein and its hydration water, indicating intimate coupling between them. This is in contrast to the behavior of folded proteins of similar molecular weight, such as the globular, soluble maltose-binding protein and the membrane protein bacteriorhodopsin, which display moderate to weak coupling, respectively. The extracted mean square displacements also reveal a greater motional flexibility of IDP compared with globular, folded proteins and more restricted water motions on the IDP surface. The results provide evidence that protein and hydration-water motions mutually affect and shape each other, and that there is a gradient of coupling across different protein classes that may play a functional role in macromolecular activity in a cellular context. PMID:22828339

  8. Conservation of coevolving protein interfaces bridges prokaryote–eukaryote homologies in the twilight zone

    PubMed Central

    Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

    2016-01-01

    Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach. PMID:27965389
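
    As a rough illustration of what "residue coevolution" means at the level of alignment columns, the sketch below computes the mutual information between two columns of a multiple sequence alignment. Mutual information is a deliberately simpler stand-in and is not claimed to be the scoring method used in the study above; the function name and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `msa` is a list of equal-length aligned sequences. High values mean
    that the residue at column i is informative about the residue at
    column j, which is the basic signal coevolution methods build on.
    """
    if not msa:
        raise ValueError("alignment is empty")
    n = len(msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    counts_ij = Counter((seq[i], seq[j]) for seq in msa)

    mi = 0.0
    for (a, b), n_ab in counts_ij.items():
        p_ab = n_ab / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignments (hypothetical): columns 0 and 1 covary perfectly in the
# first set and vary independently in the second.
covarying = ["AK", "AK", "DE", "DE"]
independent = ["AK", "AE", "DK", "DE"]
mi_covarying = column_mutual_information(covarying, 0, 1)
mi_independent = column_mutual_information(independent, 0, 1)
```

    In practice, raw mutual information is confounded by phylogeny and indirect couplings, which is one reason more elaborate statistical models are generally preferred for contact prediction.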

  9. The TIM Barrel Architecture Facilitated the Early Evolution of Protein-Mediated Metabolism.

    PubMed

    Goldman, Aaron David; Beatty, Joshua T; Landweber, Laura F

    2016-01-01

    The triosephosphate isomerase (TIM) barrel protein fold is a structurally repetitive architecture that is present in approximately 10% of all enzymes. It is generally assumed that this ubiquity in modern proteomes reflects an essential historical role in early protein-mediated metabolism. Here, we provide quantitative and comparative analyses to support several hypotheses about the early importance of the TIM barrel architecture. An information theoretical analysis of protein structures supports the hypothesis that the TIM barrel architecture could arise more easily by duplication and recombination compared to other mixed α/β structures. We show that TIM barrel enzymes corresponding to the most taxonomically broad superfamilies also have the broadest range of functions, often aided by metal and nucleotide-derived cofactors that are thought to reflect an earlier stage of metabolic evolution. By comparison to other putatively ancient protein architectures, we find that the functional diversity of TIM barrel proteins cannot be explained simply by their antiquity. Instead, the breadth of TIM barrel functions can be explained, in part, by the incorporation of a broad range of cofactors, a trend that does not appear to be shared by proteins in general. These results support the hypothesis that the simple and functionally general TIM barrel architecture may have arisen early in the evolution of protein biosynthesis and provided an ideal scaffold to facilitate the metabolic transition from ribozymes, peptides, and geochemical catalysts to modern protein enzymes.

  10. From protein sequence to dynamics and disorder with DynaMine.

    PubMed

    Cilia, Elisa; Pancsa, Rita; Tompa, Peter; Lenaerts, Tom; Vranken, Wim F

    2013-01-01

    Protein function and dynamics are closely related; however, accurate dynamics information is difficult to obtain. Here based on a carefully assembled data set derived from experimental data for proteins in solution, we quantify backbone dynamics properties on the amino-acid level and develop DynaMine--a fast, high-quality predictor of protein backbone dynamics. DynaMine uses only protein sequence information as input and shows great potential in distinguishing regions of different structural organization, such as folded domains, disordered linkers, molten globules and pre-structured binding motifs of different sizes. It also identifies disordered regions within proteins with an accuracy comparable to the most sophisticated existing predictors, without depending on prior disorder knowledge or three-dimensional structural information. DynaMine provides molecular biologists with an important new method that grasps the dynamical characteristics of any protein of interest, as we show here for human p53 and E1A from human adenovirus 5.

  11. Overview of electron crystallography of membrane proteins: crystallization and screening strategies using negative stain electron microscopy.

    PubMed

    Nannenga, Brent L; Iadanza, Matthew G; Vollmar, Breanna S; Gonen, Tamir

    2013-01-01

    Electron cryomicroscopy, or cryoEM, is an emerging technique for studying the three-dimensional structures of proteins and large macromolecular machines. Electron crystallography is a branch of cryoEM in which structures of proteins can be studied at resolutions that rival those achieved by X-ray crystallography. Electron crystallography employs two-dimensional crystals of a membrane protein embedded within a lipid bilayer. The key to a successful electron crystallographic experiment is the crystallization, or reconstitution, of the protein of interest. This unit describes ways in which protein can be expressed, purified, and reconstituted into well-ordered two-dimensional crystals. A protocol is also provided for negative stain electron microscopy as a tool for screening crystallization trials. When large and well-ordered crystals are obtained, the structures of both protein and its surrounding membrane can be determined to atomic resolution.

  12. Identification of Extracellular Segments by Mass Spectrometry Improves Topology Prediction of Transmembrane Proteins.

    PubMed

    Langó, Tamás; Róna, Gergely; Hunyadi-Gulyás, Éva; Turiák, Lilla; Varga, Julia; Dobson, László; Várady, György; Drahos, László; Vértessy, Beáta G; Medzihradszky, Katalin F; Szakács, Gergely; Tusnády, Gábor E

    2017-02-13

    Transmembrane proteins play a crucial role in signaling, ion transport, nutrient uptake, as well as in maintaining the dynamic equilibrium between the internal and external environment of cells. Despite their important biological functions and abundance, less than 2% of all determined structures are transmembrane proteins. Given the persisting technical difficulties associated with high resolution structure determination of transmembrane proteins, additional methods, including computational and experimental techniques, remain vital in promoting our understanding of their topologies, 3D structures, functions and interactions. Here we report a method for the high-throughput determination of extracellular segments of transmembrane proteins based on the identification of surface labeled and biotin captured peptide fragments by LC/MS/MS. We show that reliable identification of extracellular protein segments increases the accuracy and reliability of existing topology prediction algorithms. Using the experimental topology data as constraints, our improved prediction tool provides accurate and reliable topology models for hundreds of human transmembrane proteins.

  13. Impact of Protein-Metal Ion Interactions on the Crystallization of Silk Fibroin Protein

    NASA Astrophysics Data System (ADS)

    Hu, Xiao; Lu, Qiang; Kaplan, David; Cebe, Peggy

    2009-03-01

    Proteins can easily form bonds with a variety of metal ions, which provides many unique biological functions for the protein structures, and therefore controls the overall structural transformation of proteins. We use advanced thermal analysis methods such as temperature modulated differential scanning calorimetry (TMDSC) and quasi-isothermal TMDSC, combined with Fourier transform infrared spectroscopy and scanning electron microscopy, to investigate the protein-metallic ion interactions in Bombyx mori silk fibroin proteins. Silk samples were mixed with different metal ions (Ca^2+, K^+, Mg^2+, Na^+, Cu^2+, Mn^2+) at different mass ratios, and compared with the physical conditions in the silkworm gland. Results show that all metallic ions can directly affect the crystallization behavior and glass transition of silk fibroin. However, different ions tend to have different structural impact, including their role as plasticizer or anti-plasticizer. Detailed studies reveal important information allowing us to better understand the natural silk spinning and crystallization process.

  14. Functional Advantages of Conserved Intrinsic Disorder in RNA-Binding Proteins.

    PubMed

    Varadi, Mihaly; Zsolyomi, Fruzsina; Guharoy, Mainak; Tompa, Peter

    2015-01-01

    Proteins form large macromolecular assemblies with RNA that govern essential molecular processes. RNA-binding proteins have often been associated with conformational flexibility, yet the extent and functional implications of their intrinsic disorder have never been fully assessed. Here, through large-scale analysis of comprehensive protein sequence and structure datasets we demonstrate the prevalence of intrinsic structural disorder in RNA-binding proteins and domains. We addressed their functionality through a quantitative description of the evolutionary conservation of disordered segments involved in binding, and investigated the structural implications of flexibility in terms of conformational stability and interface formation. We conclude that the functional role of intrinsically disordered protein segments in RNA-binding is two-fold: first, these regions establish extended, conserved electrostatic interfaces with RNAs via induced fit. Second, conformational flexibility enables them to target different RNA partners, providing multi-functionality, while also ensuring specificity. These findings emphasize the functional importance of intrinsically disordered regions in RNA-binding proteins.

  15. A complementation assay for in vivo protein structure/function analysis in Physcomitrella patens (Funariaceae)

    DOE PAGES

    Scavuzzo-Duggan, Tess R.; Chaves, Arielle M.; Roberts, Alison W.

    2015-07-14

    Here, a method for rapid in vivo functional analysis of engineered proteins was developed using Physcomitrella patens. A complementation assay was designed for testing structure/function relationships in cellulose synthase (CESA) proteins. The components of the assay include (1) construction of test vectors that drive expression of epitope-tagged PpCESA5 carrying engineered mutations, (2) transformation of a ppcesa5 knockout line that fails to produce gametophores with test and control vectors, (3) scoring the stable transformants for gametophore production, (4) statistical analysis comparing complementation rates for test vectors to positive and negative control vectors, and (5) analysis of transgenic protein expression by Western blotting. The assay distinguished mutations that generate fully functional, nonfunctional, and partially functional proteins. In conclusion, compared with existing methods for in vivo testing of protein function, this complementation assay provides a rapid method for investigating protein structure/function relationships in plants.

  16. Mapping of ligand-binding cavities in proteins.

    PubMed

    Andersson, C David; Chen, Brian Y; Linusson, Anna

    2010-05-01

    The complex interactions between proteins and small organic molecules (ligands) are intensively studied because they play key roles in biological processes and drug activities. Here, we present a novel approach to characterize and map the ligand-binding cavities of proteins without direct geometric comparison of structures, based on Principal Component Analysis of cavity properties (related mainly to size, polarity, and charge). This approach can provide valuable information on the similarities and dissimilarities of binding cavities due to mutations, between-species differences, and flexibility upon ligand binding. The presented results show that information on ligand-binding cavity variations can complement information on protein similarity obtained from sequence comparisons. The predictive aspect of the method is exemplified by successful predictions of serine proteases that were not included in the model construction. The presented strategy to compare ligand-binding cavities of related and unrelated proteins has many potential applications within protein and medicinal chemistry, for example in the characterization and mapping of "orphan structures", selection of protein structures for docking studies in structure-based design, and identification of proteins for selectivity screens in drug design programs. 2009 Wiley-Liss, Inc.
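
    The idea of mapping cavities by Principal Component Analysis of their properties can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the three descriptors (volume, polar fraction, net charge), the numbers, and the function name are all hypothetical, and autoscaling before the decomposition is an assumption.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Autoscale the columns of X and project the rows onto the top
    principal components, computed via singular value decomposition.

    Returns (scores, loadings, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant descriptors unscaled
    Z = (X - mean) / std

    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, Vt[:n_components], explained[:n_components]


# Hypothetical cavity descriptors: volume (cubic angstroms),
# polar surface fraction, net charge. One row per cavity.
cavities = np.array([
    [350.0, 0.40, -1.0],
    [365.0, 0.42, -1.0],
    [900.0, 0.15, 2.0],
    [880.0, 0.18, 1.0],
])
scores, loadings, explained = pca_scores(cavities)
```

    Cavities with similar properties land close together in the score plot, so the two small, polar, negatively charged cavities in this toy table fall on one side of the first component and the two large, apolar, positively charged ones on the other.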

  17. Predicting protein crystallization propensity from protein sequence

    PubMed Central

    2011-01-01

    The high-throughput structure determination pipelines developed by structural genomics programs offer a unique opportunity for data mining. One important question is how protein properties derived from a primary sequence correlate with the protein’s propensity to yield X-ray quality crystals (crystallizability) and 3D X-ray structures. A set of protein properties were computed for over 1,300 proteins that expressed well but were insoluble, and for ~720 unique proteins that resulted in X-ray structures. The correlation of the protein’s iso-electric point and grand average hydropathy (GRAVY) with crystallizability was analyzed for full length and domain constructs of protein targets. In a second step, several additional properties that can be calculated from the protein sequence were added and evaluated. Using statistical analyses we have identified a set of the attributes correlating with a protein’s propensity to crystallize and implemented a Support Vector Machine (SVM) classifier based on these. We have created applications to analyze and provide optimal boundary information for query sequences and to visualize the data. These tools are available via the web site http://bioinformatics.anl.gov/cgi-bin/tools/pdpredictor. PMID:20177794
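
    One of the sequence-derived properties named above, the grand average of hydropathy (GRAVY), is simply the mean Kyte-Doolittle hydropathy value over all residues. The sketch below shows that calculation; the function name and the example peptide are illustrative, and this is not the code behind the web tool mentioned in the abstract.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: the sum of per-residue
    Kyte-Doolittle values divided by the sequence length.

    Positive values indicate an overall hydrophobic sequence,
    negative values an overall hydrophilic one.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Hypothetical peptide, used only to exercise the function.
example_gravy = gravy("MKTLLILAVV")
```

    A classifier such as the SVM described above would take values like this, alongside the isoelectric point and other attributes, as input features.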

  18. GalaxyRefineComplex: Refinement of protein-protein complex model structures driven by interface repacking.

    PubMed

    Heo, Lim; Lee, Hasup; Seok, Chaok

    2016-08-18

    Protein-protein docking methods have been widely used to gain an atomic-level understanding of protein interactions. However, docking methods that employ low-resolution energy functions are popular because of computational efficiency. Low-resolution docking tends to generate protein complex structures that are not fully optimized. GalaxyRefineComplex takes such low-resolution docking structures and refines them to improve model accuracy in terms of both interface contact and inter-protein orientation. This refinement method allows flexibility at the protein interface and in the overall docking structure to capture conformational changes that occur upon binding. Symmetric refinement is also provided for symmetric homo-complexes. This method was validated by refining models produced by available docking programs, including ZDOCK and M-ZDOCK, and was successfully applied to CAPRI targets in a blind fashion. An example of using the refinement method with an existing docking method for ligand binding mode prediction of a drug target is also presented. A web server that implements the method is freely available at http://galaxy.seoklab.org/refinecomplex.

  19. Protein Models Docking Benchmark 2

    PubMed Central

    Anishchenko, Ivan; Kundrotas, Petras J.; Tuzikov, Alexander V.; Vakser, Ilya A.

    2015-01-01

    Structural characterization of protein-protein interactions is essential for our ability to understand life processes. However, only a fraction of known proteins have experimentally determined structures. Such structures provide templates for modeling of a large part of the proteome, where individual proteins can be docked by template-free or template-based techniques. Still, the sensitivity of the docking methods to the inherent inaccuracies of protein models, as opposed to the experimentally determined high-resolution structures, remains largely untested, primarily due to the absence of appropriate benchmark set(s). Structures in such a set should have pre-defined inaccuracy levels and, at the same time, resemble actual protein models in terms of structural motifs/packing. The set should also be large enough to ensure statistical reliability of the benchmarking results. We present a major update of the previously developed benchmark set of protein models. For each interactor, six models were generated with the model-to-native Cα RMSD in the 1 to 6 Å range. The models in the set were generated by a new approach, which corresponds to the actual modeling of new protein structures in the “real case scenario,” as opposed to the previous set, where a significant number of structures were model-like only. In addition, the larger number of complexes (165 vs. 63 in the previous set) increases the statistical reliability of the benchmarking. We estimated the highest accuracy of the predicted complexes (according to CAPRI criteria), which can be attained using the benchmark structures. The set is available at http://dockground.bioinformatics.ku.edu. PMID:25712716
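
    The model-to-native Cα RMSD used to grade the models can be computed as below. This is a minimal sketch that assumes the two coordinate sets are already optimally superimposed and list the same residues in the same order; the superposition step itself is not shown, and the function name and coordinates are hypothetical.

```python
import numpy as np


def ca_rmsd(model_xyz, native_xyz):
    """Root-mean-square deviation between matched C-alpha coordinates.

    Both inputs are (N, 3) arrays in angstroms, assumed to be already
    superimposed and in the same residue order.
    """
    model = np.asarray(model_xyz, dtype=float)
    native = np.asarray(native_xyz, dtype=float)
    if model.shape != native.shape or model.ndim != 2 or model.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (N, 3)")
    squared_dist = np.sum((model - native) ** 2, axis=1)
    return float(np.sqrt(squared_dist.mean()))


# Hypothetical coordinates: the "model" is the native trace displaced
# rigidly by 3 angstroms in x and 4 angstroms in y.
native = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
model = native + np.array([3.0, 4.0, 0.0])
rmsd_value = ca_rmsd(model, native)
```

    Because no superposition is performed here, a rigid displacement shows up as a nonzero RMSD; with an optimal fit applied first, the same pair would score zero.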

  20. The unfoldomics decade: an update on intrinsically disordered proteins.

    PubMed

    Dunker, A Keith; Oldfield, Christopher J; Meng, Jingwei; Romero, Pedro; Yang, Jack Y; Chen, Jessica Walton; Vacic, Vladimir; Obradovic, Zoran; Uversky, Vladimir N

    2008-09-16

    Our first predictor of protein disorder was published just over a decade ago in the Proceedings of the IEEE International Conference on Neural Networks (Romero P, Obradovic Z, Kissinger C, Villafranca JE, Dunker AK (1997) Identifying disordered regions in proteins from amino acid sequence. Proceedings of the IEEE International Conference on Neural Networks, 1: 90-95). By now more than twenty other laboratory groups have joined the efforts to improve the prediction of protein disorder. While the various prediction methodologies used for protein intrinsic disorder resemble those methodologies used for secondary structure prediction, the two types of structures are entirely different. For example, the two structural classes have very different dynamic properties, with the irregular secondary structure class being much less mobile than the disorder class. The prediction of secondary structure has been useful. On the other hand, the prediction of intrinsic disorder has been revolutionary, leading to major modifications of the more than 100 year-old views relating protein structure and function. Experimentalists have been providing evidence over many decades that some proteins lack fixed structure or are disordered (or unfolded) under physiological conditions. In addition, experimentalists are also showing that, for many proteins, their functions depend on the unstructured rather than structured state; such results are in marked contrast to the greater than hundred year old views such as the lock and key hypothesis. Despite extensive data on many important examples, including disease-associated proteins, the importance of disorder for protein function has been largely ignored. Indeed, to our knowledge, current biochemistry books don't present even one acknowledged example of a disorder-dependent function, even though some reports of disorder-dependent functions are more than 50 years old. 
The results from genome-wide predictions of intrinsic disorder and the results from other bioinformatics studies of intrinsic disorder are demanding attention for these proteins. Disorder prediction has been important for showing that the relatively few experimentally characterized examples are members of a very large collection of related disordered proteins that are wide-spread over all three domains of life. Many significant biological functions are now known to depend directly on, or are importantly associated with, the unfolded or partially folded state. Here our goal is to review the key discoveries and to weave these discoveries together to support novel approaches for understanding sequence-function relationships. Intrinsically disordered protein is common across the three domains of life, but especially common among the eukaryotic proteomes. Signaling sequences and sites of posttranslational modifications are frequently, or very likely most often, located within regions of intrinsic disorder. Disorder-to-order transitions are coupled with the adoption of different structures with different partners. Also, the flexibility of intrinsic disorder helps different disordered regions to bind to a common binding site on a common partner. Such capacity for binding diversity plays important roles in both protein-protein interaction networks and likely also in gene regulation networks. Such disorder-based signaling is further modulated in multicellular eukaryotes by alternative splicing, for which such splicing events map to regions of disorder much more often than to regions of structure. Associating alternative splicing with disorder rather than structure alleviates theoretical and experimentally observed problems associated with the folding of different length, isomeric amino acid sequences. 
    The combination of disorder and alternative splicing is proposed to provide a mechanism for easily "trying out" different signaling pathways, thereby providing the mechanism for generating signaling diversity and enabling the evolution of cell differentiation and multicellularity. Finally, several recent small molecules of interest as potential drugs have been shown to act by blocking protein-protein interactions based on intrinsic disorder of one of the partners. Study of these examples has led to a new approach for drug discovery, and bioinformatics analysis of the human proteome suggests that various disease-associated proteins are very rich in such disorder-based drug discovery targets.

  1. Isolation and in silico analysis of a novel H+-pyrophosphatase gene orthologue from the halophytic grass Leptochloa fusca

    NASA Astrophysics Data System (ADS)

    Rauf, Muhammad; Saeed, Nasir A.; Habib, Imran; Ahmed, Moddassir; Shahzad, Khurram; Mansoor, Shahid; Ali, Rashid

    2017-02-01

    Structure prediction can provide information about the function and active sites of a protein, which helps in designing new functional proteins. H+-pyrophosphatase is a transmembrane protein involved in establishing the proton motive force for active transport of Na+ across the membrane by Na+/H+ antiporters. A full-length novel H+-pyrophosphatase gene was isolated from the halophytic grass Leptochloa fusca using RT-PCR and RACE methods. The full-length LfVP1 gene sequence of 2292 nucleotides encodes a protein of 764 amino acids. The DNA and protein sequences were characterized using bioinformatics tools. Various important potential sites were predicted by the PROSITE web server. Primary structural analysis showed LfVP1 to be a stable protein, and the grand average of hydropathy (GRAVY) indicated that the LfVP1 protein has good hydrosolubility. Secondary structure analysis showed that the LfVP1 protein sequence contains a significant proportion of alpha helix and random coil. Protein membrane topology suggested the presence of 14 transmembrane domains and of a catalytic domain in TM3. The three-dimensional structure derived from the LfVP1 protein sequence also indicated the presence of 14 transmembrane domains, and the hydrophobicity surface model showed amino acid hydrophobicity. The Ramachandran plot showed that 98% of amino acid residues were predicted in the favored region.
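
    The GRAVY value mentioned in this abstract is conventionally the mean Kyte-Doolittle hydropathy over all residues, with negative values read as hydrophilic. The abstract does not say which tool computed it, so the following is only a minimal sketch under that assumption; the function name and the toy peptides are illustrative and are not taken from LfVP1.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the grand average of hydropathy for a protein sequence.

    Characters that are not one of the 20 standard amino acids are ignored.
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


if __name__ == "__main__":
    # Hypothetical toy peptides, chosen only to show the sign convention.
    print(gravy("AILV"))  # hydrophobic residues give a positive value
    print(gravy("DEKR"))  # charged residues give a negative value
```

    A score computed this way is a whole-sequence average, so a protein with many transmembrane helices can still have hydrophilic loops that pull the value down.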

  2. SA-Mot: a web server for the identification of motifs of interest extracted from protein loops

    PubMed Central

    Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude

    2011-01-01

    The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr. PMID:21665924

  3. SA-Mot: a web server for the identification of motifs of interest extracted from protein loops.

    PubMed

    Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude

    2011-07-01

    The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr.

  4. Applications of NMR and computational methodologies to study protein dynamics.

    PubMed

    Narayanan, Chitra; Bafna, Khushboo; Roux, Louise D; Agarwal, Pratul K; Doucet, Nicolas

    2017-08-15

    Overwhelming evidence now illustrates the defining role of atomic-scale protein flexibility in biological events such as allostery, cell signaling, and enzyme catalysis. Over the years, spin relaxation nuclear magnetic resonance (NMR) has provided significant insights on the structural motions occurring on multiple time frames over the course of a protein life span. The present review article aims to illustrate to the broader community how this technique continues to shape many areas of protein science and engineering, in addition to being an indispensable tool for studying atomic-scale motions and functional characterization. Continuing developments in underlying NMR technology alongside software and hardware developments for complementary computational approaches now enable methodologies to routinely provide spatial directionality and structural representations traditionally harder to achieve solely using NMR spectroscopy. In addition to its well-established role in structural elucidation, we present recent examples that illustrate the combined power of selective isotope labeling, relaxation dispersion experiments, chemical shift analyses, and computational approaches for the characterization of conformational sub-states in proteins and enzymes. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Visualizing ligand molecules in Twilight electron density.

    PubMed

    Weichenberger, Christian X; Pozharski, Edwin; Rupp, Bernhard

    2013-02-01

    Three-dimensional models of protein structures determined by X-ray crystallography are based on the interpretation of experimentally derived electron-density maps. The real-space correlation coefficient (RSCC) provides an easily comprehensible, objective measure of the residue-based fit of atom coordinates to electron density. Among protein structure models, protein-ligand complexes are of special interest, given their contribution to understanding the molecular underpinnings of biological activity and to drug design. For consumers of such models, it is not trivial to determine the degree to which ligand-structure modelling is biased by subjective electron-density interpretation. A standalone script, Twilight, is presented for the analysis, visualization and annotation of a pre-filtered set of 2815 protein-ligand complexes deposited with the PDB as of 15 January 2012 with ligand RSCC values that are below a threshold of 0.6. It also provides simplified access to the visualization of any protein-ligand complex available from the PDB and annotated by the Uppsala Electron Density Server. The script runs on various platforms and is available for download at http://www.ruppweb.org/twilight/.
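
    The RSCC is commonly defined as the linear (Pearson) correlation between observed and model-calculated electron density sampled at the same grid points around a residue or ligand. The sketch below assumes that definition and applies the 0.6 cutoff quoted in the abstract; it is not the Twilight script itself, and the function names and density samples are hypothetical.

```python
import math

RSCC_THRESHOLD = 0.6  # cutoff quoted in the abstract


def real_space_cc(rho_obs, rho_calc):
    """Pearson correlation between observed and calculated density samples."""
    if len(rho_obs) != len(rho_calc) or len(rho_obs) < 2:
        raise ValueError("need two equally sized samples of at least 2 points")
    n = len(rho_obs)
    mean_o = sum(rho_obs) / n
    mean_c = sum(rho_calc) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(rho_obs, rho_calc))
    var_o = sum((o - mean_o) ** 2 for o in rho_obs)
    var_c = sum((c - mean_c) ** 2 for c in rho_calc)
    if var_o == 0 or var_c == 0:
        raise ValueError("density samples must not be constant")
    return cov / math.sqrt(var_o * var_c)


def is_poorly_supported(rho_obs, rho_calc, threshold=RSCC_THRESHOLD):
    """Flag a ligand whose fit to the density falls below the threshold."""
    return real_space_cc(rho_obs, rho_calc) < threshold


if __name__ == "__main__":
    # Hypothetical density values at four grid points.
    print(is_poorly_supported([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
    print(is_poorly_supported([1.0, 2.0, 3.0, 4.0], [4.0, 1.0, 3.0, 2.0]))
```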

  6. Human Embryonic Kidney 293 Cells: A Vehicle for Biopharmaceutical Manufacturing, Structural Biology, and Electrophysiology.

    PubMed

    Hu, Jianwen; Han, Jizhong; Li, Haoran; Zhang, Xian; Liu, Lan Lan; Chen, Fei; Zeng, Bin

    2018-01-01

    Mammalian cells, e.g., CHO, BHK, HEK293, HT-1080, and NS0 cells, represent important manufacturing platforms in bioengineering. They are widely used for the production of recombinant therapeutic proteins, vaccines, anticancer agents, and other clinically relevant drugs. HEK293 (human embryonic kidney 293) cells and their derived cell lines provide an attractive heterologous system for the development of recombinant proteins or adenovirus productions, not least due to their human-like posttranslational modification of protein molecules to provide the desired biological activity. Secondly, they also exhibit high transfection efficiency yielding high-quality recombinant proteins. They are easy to maintain and express with high fidelity membrane proteins, such as ion channels and transporters, and thus are attractive for structural biology and electrophysiology studies. In this article, we review the literature on HEK293 cells regarding their origins but also stress their advancements into the different cell lines engineered and discuss some significant aspects which make them versatile systems for biopharmaceutical manufacturing, drug screening, structural biology research, and electrophysiology applications. © 2018 S. Karger AG, Basel.

  7. Neisseria conserved protein DMP19 is a DNA mimic protein that prevents DNA binding to a hypothetical nitrogen-response transcription factor

    PubMed Central

    Wang, Hao-Ching; Ko, Tzu-Ping; Wu, Mao-Lun; Ku, Shan-Chi; Wu, Hsing-Ju; Wang, Andrew H.-J.

    2012-01-01

    DNA mimic proteins occupy the DNA binding sites of DNA-binding proteins, and prevent these sites from being accessed by DNA. We show here that the Neisseria conserved hypothetical protein DMP19 acts as a DNA mimic. The crystal structure of DMP19 shows a dsDNA-like negative charge distribution on the surface, suggesting that this protein should be added to the short list of known DNA mimic proteins. The crystal structure of another related protein, NHTF (Neisseria hypothetical transcription factor), provides evidence that it is a member of the xenobiotic-response element (XRE) family of transcriptional factors. NHTF binds to a palindromic DNA sequence containing a 5′-TGTNAN11TNACA-3′ recognition box that controls the expression of an NHTF-related operon in which the conserved nitrogen-response protein [i.e. (Protein-PII) uridylyltransferase] is encoded. The complementary surface charges between DMP19 and NHTF suggest specific charge–charge interaction. In a DNA-binding assay, we found that DMP19 can prevent NHTF from binding to its DNA-binding sites. Finally, we used an in situ gene regulation assay to provide evidence that NHTF is a repressor of its down-stream genes and that DMP19 can neutralize this effect. We therefore conclude that the interaction of DMP19 and NHTF provides a novel gene regulation mechanism in Neisseria spps. PMID:22373915

  8. Protein Folding and Self-Organized Criticality

    NASA Astrophysics Data System (ADS)

    Bajracharya, Arun; Murray, Joelle

    Proteins are known to fold into tertiary structures that determine their functionality in living organisms. However, the complex dynamics of protein folding and the way proteins consistently fold into the same structures are not fully understood. Self-organized criticality (SOC) has provided a framework for understanding complex systems in various domains (earthquakes, forest fires, financial markets, and epidemics) through scale invariance and the associated power law behavior. In this research, we use a simple hydrophobic-polar lattice-bound computational model to investigate self-organized criticality as a possible mechanism for generating complexity in protein folding.
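
    The abstract does not give the details of its hydrophobic-polar (HP) model. As a minimal sketch, assuming the common two-dimensional square-lattice convention in which every pair of H residues that are lattice neighbours but not chain neighbours contributes -1 to the energy, a conformation can be scored as follows. The function names and toy conformations are illustrative only.

```python
def manhattan(a, b):
    """Lattice distance between two (x, y) sites."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def hp_energy(sequence, coords):
    """Energy of an HP chain on a 2D square lattice.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    coords:   list of (x, y) lattice sites, one per residue, in chain order.
    """
    if len(sequence) != len(coords):
        raise ValueError("one lattice site is required per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for a, b in zip(coords, coords[1:]):
        if manhattan(a, b) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    energy = 0
    n = len(sequence)
    for i in range(n):
        # Start at i + 2 so that chain neighbours are never counted.
        for j in range(i + 2, n):
            if sequence[i] == "H" and sequence[j] == "H":
                if manhattan(coords[i], coords[j]) == 1:
                    energy -= 1
    return energy


if __name__ == "__main__":
    # The same toy chain folded into a square and stretched into a line.
    print(hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)]))
    print(hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)]))
```

    In this convention the folded square is favoured over the extended chain because it brings the two H residues into contact.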

  9. Teaching resources. Protein phosphatases.

    PubMed

    Salton, Stephen R

    2005-03-01

    This Teaching Resource provides lecture notes and slides for a class covering the structure and function of protein phosphatases and is part of the course "Cell Signaling Systems: A Course for Graduate Students." The lecture begins with a discussion of the importance of phosphatases in physiology, recognized by the award of a Nobel Prize in 1992, and then proceeds to describe the two types of protein phosphatases: serine/threonine and tyrosine phosphatases. The information covered includes the structure, regulation, and substrate specificity of protein phosphatases, with an emphasis on their importance in disease and clinical settings.

  10. A general protocol for the generation of Nanobodies for structural biology

    PubMed Central

    Pardon, Els; Laeremans, Toon; Triest, Sarah; Rasmussen, Søren G. F.; Wohlkönig, Alexandre; Ruf, Armin; Muyldermans, Serge; Hol, Wim G. J.; Kobilka, Brian K.; Steyaert, Jan

    2015-01-01

    There is growing interest in using antibodies as auxiliary proteins to crystallize proteins. Here, we describe a general protocol for the generation of Nanobodies to be used as crystallization chaperones for the structural investigation of diverse conformational states of flexible (membrane) proteins and complexes thereof. Our technology has the competitive advantage over other recombinant crystallization chaperones in that we fully exploit the natural humoral response against native antigens. Accordingly, we provide detailed protocols for the immunization with native proteins and for the selection by phage display of in vivo matured Nanobodies that bind conformational epitopes of functional proteins. Three representative examples illustrate that the outlined procedures are robust, enabling the structures of the most challenging proteins to be solved by Nanobody-assisted X-ray crystallography in a time span of 6 to 12 months. PMID:24577359

  11. MoonProt: a database for proteins that are known to moonlight

    PubMed Central

    Mani, Mathew; Chen, Chang; Amblee, Vaishak; Liu, Haipeng; Mathur, Tanu; Zwicke, Grant; Zabad, Shadi; Patel, Bansi; Thakkar, Jagravi; Jeffery, Constance J.

    2015-01-01

    Moonlighting proteins comprise a class of multifunctional proteins in which a single polypeptide chain performs multiple biochemical functions that are not due to gene fusions, multiple RNA splice variants or pleiotropic effects. The known moonlighting proteins perform a variety of diverse functions in many different cell types and species, and information about their structures and functions is scattered in many publications. We have constructed the manually curated, searchable, internet-based MoonProt Database (http://www.moonlightingproteins.org) with information about the over 200 proteins that have been experimentally verified to be moonlighting proteins. The availability of this organized information provides a more complete picture of what is currently known about moonlighting proteins. The database will also aid researchers in other fields, including determining the functions of genes identified in genome sequencing projects, interpreting data from proteomics projects and annotating protein sequence and structural databases. In addition, information about the structures and functions of moonlighting proteins can be helpful in understanding how novel protein functional sites evolved on an ancient protein scaffold, which can also help in the design of proteins with novel functions. PMID:25324305

  12. Structural and kinetic analysis of the unnatural fusion protein 4-coumaroyl-CoA ligase::stilbene synthase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Yechun; Yi, Hankuil; Wang, Melissa

    2012-10-24

    To increase the biochemical efficiency of biosynthetic systems, metabolic engineers have explored different approaches for organizing enzymes, including the generation of unnatural fusion proteins. Previous work aimed at improving the biosynthesis of resveratrol, a stilbene associated with a range of health-promoting activities, in yeast used an unnatural engineered fusion protein of Arabidopsis thaliana (thale cress) 4-coumaroyl-CoA ligase (At4CL1) and Vitis vinifera (grape) stilbene synthase (VvSTS) to increase resveratrol levels 15-fold relative to yeast expressing the individual enzymes. Here we present the crystallographic and biochemical analysis of the 4CL::STS fusion protein. Determination of the X-ray crystal structure of 4CL::STS provides the first molecular view of an artificial didomain adenylation/ketosynthase fusion protein. Comparison of the steady-state kinetic properties of At4CL1, VvSTS, and 4CL::STS demonstrates that the fusion protein improves catalytic efficiency of either reaction less than 3-fold. Structural and kinetic analysis suggests that colocalization of the two enzyme active sites within 70 Å of each other provides the basis for enhanced in vivo synthesis of resveratrol.

  13. Navigating through the Jungle of Allergens: Features and Applications of Allergen Databases.

    PubMed

    Radauer, Christian

    2017-01-01

    The increasing number of available data on allergenic proteins demanded the establishment of structured, freely accessible allergen databases. In this review article, features and applications of 6 of the most widely used allergen databases are discussed. The WHO/IUIS Allergen Nomenclature Database is the official resource of allergen designations. Allergome is the most comprehensive collection of data on allergens and allergen sources. AllergenOnline is aimed at providing a peer-reviewed database of allergen sequences for prediction of allergenicity of proteins, such as those planned to be inserted into genetically modified crops. The Structural Database of Allergenic Proteins (SDAP) provides a database of allergen sequences, structures, and epitopes linked to bioinformatics tools for sequence analysis and comparison. The Immune Epitope Database (IEDB) is the largest repository of T-cell, B-cell, and major histocompatibility complex protein epitopes including epitopes of allergens. AllFam classifies allergens into families of evolutionarily related proteins using definitions from the Pfam protein family database. These databases contain mostly overlapping data, but also show differences in terms of their targeted users, the criteria for including allergens, data shown for each allergen, and the availability of bioinformatics tools. © 2017 S. Karger AG, Basel.

  14. Protein Bricks: 2D and 3D Bio-Nanostructures with Shape and Function on Demand.

    PubMed

    Jiang, Jianjuan; Zhang, Shaoqing; Qian, Zhigang; Qin, Nan; Song, Wenwen; Sun, Long; Zhou, Zhitao; Shi, Zhifeng; Chen, Liang; Li, Xinxin; Mao, Ying; Kaplan, David L; Gilbert Corder, Stephanie N; Chen, Xinzhong; Liu, Mengkun; Omenetto, Fiorenzo G; Xia, Xiaoxia; Tao, Tiger H

    2018-05-01

    Precise patterning of polymer-based biomaterials for functional bio-nanostructures has extensive applications including biosensing, tissue engineering, and regenerative medicine. Remarkable progress has been made in both top-down (based on lithographic methods) and bottom-up (via self-assembly) approaches with natural and synthetic biopolymers. However, most methods only yield 2D and pseudo-3D structures with restricted geometries and functionalities. Here, precise nanostructuring of genetically engineered spider silk is reported: ion and electron beam interactions with the protein's matrix are accurately directed at the nanoscale to create well-defined 2D bionanopatterns and to further assemble 3D bionanoarchitectures with shape and function on demand, termed "Protein Bricks." The added control over protein sequence and molecular weight of recombinant spider silk via genetic engineering provides unprecedented lithographic resolution (approaching the molecular limit), sharpness, and biological functions compared to natural proteins. This approach provides a facile method for patterning and immobilizing functional molecules within nanoscopic, hierarchical protein structures, which sheds light on a wide range of biomedical applications such as structure-enhanced fluorescence and biomimetic microenvironments for controlling cell fate. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. Structural and Functional Studies of H. seropedicae RecA Protein – Insights into the Polymerization of RecA Protein as Nucleoprotein Filament

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leite, Wellington C.; Galvão, Carolina W.; Saab, Sérgio C.

    The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three dimensional structure of HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, that are similar to homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins is also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. In conclusion, our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament.

  16. Structural and Functional Studies of H. seropedicae RecA Protein – Insights into the Polymerization of RecA Protein as Nucleoprotein Filament

    PubMed Central

    Galvão, Carolina W.; Saab, Sérgio C.; Iulek, Jorge; Etto, Rafael M.; Steffens, Maria B. R.; Chitteni-Pattu, Sindhu; Stanage, Tyler; Keck, James L.; Cox, Michael M.

    2016-01-01

    The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three dimensional structure of HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, that are similar to homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins is also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. Our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament. PMID:27447485

  17. Key Structures and Interactions for Binding of Mycobacterium tuberculosis Protein Kinase B Inhibitors from Molecular Dynamics Simulation.

    PubMed

    Punkvang, Auradee; Kamsri, Pharit; Saparpakorn, Patchreenart; Hannongbua, Supa; Wolschann, Peter; Irle, Stephan; Pungpo, Pornpan

    2015-07-01

    Substituted aminopyrimidine inhibitors have recently been introduced as antituberculosis agents. These inhibitors show impressive activity against protein kinase B, a Ser/Thr protein kinase that is essential for cell growth of M. tuberculosis. However, X-ray structures of the protein kinase B enzyme in complex with the substituted aminopyrimidine inhibitors are currently unavailable. Consequently, structural details of their binding modes are questionable, which hinders the structure-based design of more potent protein kinase B inhibitors. Here, molecular dynamics simulations, in conjunction with molecular mechanics/Poisson-Boltzmann surface area binding free-energy analysis, were employed to gain insight into the complex structures of the protein kinase B inhibitors and their binding energetics. The complex structures obtained by the molecular dynamics simulations show binding free energies in good agreement with experiment. The detailed analysis of molecular dynamics results shows that Glu93, Val95, and Leu17 are key residues responsible for the binding of the protein kinase B inhibitors. The aminopyrazole group and the pyrimidine core are the crucial moieties of substituted aminopyrimidine inhibitors for interaction with the key residues. Our results provide a structural concept that can be used as a guide for the future design of protein kinase B inhibitors with highly increased antagonistic activity. © 2014 John Wiley & Sons A/S.

  18. The inverted free energy landscape of an intrinsically disordered peptide by simulations and experiments.

    PubMed

    Granata, Daniele; Baftizadeh, Fahimeh; Habchi, Johnny; Galvagnion, Celine; De Simone, Alfonso; Camilloni, Carlo; Laio, Alessandro; Vendruscolo, Michele

    2015-10-26

    The free energy landscape theory has been very successful in rationalizing the folding behaviour of globular proteins, as this representation provides intuitive information on the number of states involved in the folding process, their populations and pathways of interconversion. We extend here this formalism to the case of the Aβ40 peptide, a 40-residue intrinsically disordered protein fragment associated with Alzheimer's disease. By using an advanced sampling technique that enables free energy calculations to reach convergence also in the case of highly disordered states of proteins, we provide a precise structural characterization of the free energy landscape of this peptide. We find that such landscape has inverted features with respect to those typical of folded proteins. While the global free energy minimum consists of highly disordered structures, higher free energy regions correspond to a large variety of transiently structured conformations with secondary structure elements arranged in several different manners, and are not separated from each other by sizeable free energy barriers. From this peculiar structure of the free energy landscape we predict that this peptide should become more structured and not only more compact, with increasing temperatures, and we show that this is the case through a series of biophysical measurements.

  19. The inverted free energy landscape of an intrinsically disordered peptide by simulations and experiments

    PubMed Central

    Granata, Daniele; Baftizadeh, Fahimeh; Habchi, Johnny; Galvagnion, Celine; De Simone, Alfonso; Camilloni, Carlo; Laio, Alessandro; Vendruscolo, Michele

    2015-01-01

    The free energy landscape theory has been very successful in rationalizing the folding behaviour of globular proteins, as this representation provides intuitive information on the number of states involved in the folding process, their populations and pathways of interconversion. We extend here this formalism to the case of the Aβ40 peptide, a 40-residue intrinsically disordered protein fragment associated with Alzheimer’s disease. By using an advanced sampling technique that enables free energy calculations to reach convergence also in the case of highly disordered states of proteins, we provide a precise structural characterization of the free energy landscape of this peptide. We find that such landscape has inverted features with respect to those typical of folded proteins. While the global free energy minimum consists of highly disordered structures, higher free energy regions correspond to a large variety of transiently structured conformations with secondary structure elements arranged in several different manners, and are not separated from each other by sizeable free energy barriers. From this peculiar structure of the free energy landscape we predict that this peptide should become more structured and not only more compact, with increasing temperatures, and we show that this is the case through a series of biophysical measurements. PMID:26498066

  20. Prediction of protein structural classes by Chou's pseudo amino acid composition: approached using continuous wavelet transform and principal component analysis.

    PubMed

    Li, Zhan-Chao; Zhou, Xi-Bin; Dai, Zong; Zou, Xiao-Yong

    2009-07-01

    Prior knowledge of a protein's structural class can provide useful information about its overall structure, so quick and accurate determination of protein structural class by computational methods is very important in protein science. One of the keys to a computational method is accurate protein sample representation. Here, based on the concept of Chou's pseudo-amino acid composition (AAC, Chou, Proteins: structure, function, and genetics, 43:246-255, 2001), a novel method of feature extraction that combined continuous wavelet transform (CWT) with principal component analysis (PCA) was introduced for the prediction of protein structural classes. Firstly, the digital signal was obtained by mapping each amino acid according to various physicochemical properties. Secondly, CWT was utilized to extract a new feature vector based on the wavelet power spectrum (WPS), which contains more abundant information on sequence order in the frequency domain and time domain, and PCA was then used to reorganize the feature vector to decrease information redundancy and computational complexity. Finally, a pseudo-amino acid composition feature vector was further formed to represent the primary sequence by coupling the AAC vector with a set of new WPS feature vectors in an orthogonal space by PCA. As a showcase, the rigorous jackknife cross-validation test was performed on the working datasets. The results indicated that prediction quality has been improved, and the current approach of protein representation may serve as a useful complementary vehicle in classifying other attributes of proteins, such as enzyme family class, subcellular localization, membrane protein types and protein secondary structure, etc.
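
    The AAC vector that the method starts from is the conventional 20-dimensional amino acid composition, that is, the relative frequency of each standard residue in the sequence. The sketch below covers only that first component; the wavelet and PCA steps described in the abstract are not reproduced, and the function name, residue ordering and toy sequence are assumptions made for illustration.

```python
# Fixed alphabetical ordering of the 20 standard amino acids (an assumption;
# any fixed ordering works as long as it is used consistently).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20-dimensional amino acid composition of a sequence.

    Each entry is the fraction of residues of one type, so the entries sum
    to 1. Non-standard characters are ignored.
    """
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    total = len(residues)
    return [residues.count(aa) / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Hypothetical toy sequence, not from the paper's datasets.
    vector = amino_acid_composition("AAGG")
    for aa, fraction in zip(AMINO_ACIDS, vector):
        if fraction > 0:
            print(aa, fraction)
```

    Because plain composition discards residue order, the abstract's approach supplements it with sequence-order features before classification.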

  1. RCK: accurate and efficient inference of sequence- and structure-based protein-RNA binding models from RNAcompete data.

    PubMed

    Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie

    2016-06-15

    Protein-RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein-RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein-RNA binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240 000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, Deepbind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNAcompete dataset. We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data is designed to be unstructured, RCK can still learn structural preferences from it. RCK significantly outperforms both RNAcontext and Deepbind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein-RNA structure-based models on an unprecedented scale. Software and models are freely available at http://rck.csail.mit.edu/ bab@mit.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  2. APID interactomes: providing proteome-based interactomes with controlled quality for multiple species and derived networks

    PubMed Central

    Alonso-López, Diego; Gutiérrez, Miguel A.; Lopes, Katia P.; Prieto, Carlos; Santamaría, Rodrigo; De Las Rivas, Javier

    2016-01-01

    APID (Agile Protein Interactomes DataServer) is an interactive web server that provides unified generation and delivery of protein interactomes mapped to their respective proteomes. This resource is a new, fully redesigned server that includes a comprehensive collection of protein interactomes for more than 400 organisms (25 of which include more than 500 interactions) produced by the integration of only experimentally validated protein–protein physical interactions. For each protein–protein interaction (PPI) the server includes currently reported information about its experimental validation to allow selection and filtering at different quality levels. As a whole, it provides easy access to the interactomes from specific species and includes a global uniform compendium of 90,379 distinct proteins and 678,441 singular interactions. APID integrates and unifies PPIs from major primary databases of molecular interactions, from other specific repositories and also from experimentally resolved 3D structures of protein complexes where more than two proteins were identified. For this purpose, a collection of 8,388 structures were analyzed to identify specific PPIs. APID also includes a new graph tool (based on Cytoscape.js) for visualization and interactive analyses of PPI networks. The server does not require registration and it is freely available for use at http://apid.dep.usal.es. PMID:27131791

  3. Dynamic nuclear polarization methods in solids and solutions to explore membrane proteins and membrane systems.

    PubMed

    Cheng, Chi-Yuan; Han, Songi

    2013-01-01

    Membrane proteins regulate vital cellular processes, including signaling, ion transport, and vesicular trafficking. Obtaining experimental access to their structures, conformational fluctuations, orientations, locations, and hydration in membrane environments, as well as the lipid membrane properties, is critical to understanding their functions. Dynamic nuclear polarization (DNP) of frozen solids can dramatically boost the sensitivity of current solid-state nuclear magnetic resonance tools to enhance access to membrane protein structures in native membrane environments. Overhauser DNP in the solution state can map out the local and site-specific hydration dynamics landscape of membrane proteins and lipid membranes, critically complementing the structural and dynamics information obtained by electron paramagnetic resonance spectroscopy. Here, we provide an overview of how DNP methods in solids and solutions can significantly increase our understanding of membrane protein structures, dynamics, functions, and hydration in complex biological membrane environments.

  4. RepeatsDB-lite: a web server for unit annotation of tandem repeat proteins.

    PubMed

    Hirsh, Layla; Paladin, Lisanna; Piovesan, Damiano; Tosatto, Silvio C E

    2018-05-09

    RepeatsDB-lite (http://protein.bio.unipd.it/repeatsdb-lite) is a web server for the prediction of repetitive structural elements and units in tandem repeat (TR) proteins. TRs are a widespread but poorly annotated class of non-globular proteins carrying heterogeneous functions. RepeatsDB-lite extends the prediction to all TR types and strongly improves the performance both in terms of computational time and accuracy over previous methods, with precision above 95% for solenoid structures. The algorithm exploits an improved TR unit library derived from the RepeatsDB database to perform an iterative structural search and assignment. The web interface provides tools for analyzing the evolutionary relationships between units and for manually refining the prediction by changing unit positions and protein classification. An all-against-all structure-based sequence similarity matrix is calculated and visualized in real-time for every user edit. Reviewed predictions can be submitted to RepeatsDB for review and inclusion.

  5. Correlation of fitness landscapes from three orthologous TIM barrels originates from sequence and structure constraints

    PubMed Central

    Chan, Yvonne H.; Venev, Sergey V.; Zeldovich, Konstantin B.; Matthews, C. Robert

    2017-01-01

    Sequence divergence of orthologous proteins enables adaptation to environmental stresses and promotes evolution of novel functions. Limits on evolution imposed by constraints on sequence and structure were explored using a model TIM barrel protein, indole-3-glycerol phosphate synthase (IGPS). Fitness effects of point mutations in three phylogenetically divergent IGPS proteins during adaptation to temperature stress were probed by auxotrophic complementation of yeast with prokaryotic, thermophilic IGPS. Analysis of beneficial mutations pointed to an unexpected, long-range allosteric pathway towards the active site of the protein. Significant correlations between the fitness landscapes of distant orthologues implicate both sequence and structure as primary forces in defining the TIM barrel fitness landscape and suggest that fitness landscapes can be translocated in sequence space. Exploration of fitness landscapes in the context of a protein fold provides a strategy for elucidating the sequence-structure-fitness relationships in other common motifs. PMID:28262665

  6. Crystal Structure of AGR_C_4470p from Agrobacterium tumefaciens

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vorobiev, S.; Neely, H.; Seetharaman, J.

    2007-01-01

    We report here the crystal structure at 2.0 Å resolution of the AGR_C_4470p protein from the Gram-negative bacterium Agrobacterium tumefaciens. The protein is a tightly associated dimer, each subunit of which bears strong structural homology with the two domains of the heme utilization protein ChuS from Escherichia coli and HemS from Yersinia enterocolitica. Remarkably, the organization of the AGR_C_4470p dimer is the same as that of the two domains in ChuS and HemS, providing structural evidence that these two proteins evolved by gene duplication. However, the binding site for heme, while conserved in HemS and ChuS, is not conserved in AGR_C_4470p, suggesting that it probably has a different function. This is supported by the presence of two homologs of AGR_C_4470p in E. coli, in addition to the ChuS protein.

  7. The Sla2p/HIP1/HIP1R family: similar structure, similar function in endocytosis?

    PubMed

    Gottfried, Irit; Ehrlich, Marcelo; Ashery, Uri

    2010-02-01

    HIP1 (huntingtin interacting protein 1) has two close relatives: HIP1R (HIP1-related) and yeast Sla2p. All three members of the family have a conserved domain structure, suggesting a common function. Over the past decade, a number of studies have characterized these proteins using a combination of biochemical, imaging, structural and genetic techniques. These studies provide valuable information on binding partners, structure and dynamics of HIP1/HIP1R/Sla2p. In general, all suggest a role in CME (clathrin-mediated endocytosis) for the three proteins, though some differences have emerged. In this mini-review we summarize the current views on the roles of these proteins, while emphasizing the unique attributes of each family member.

  8. IMGT/3Dstructure-DB and IMGT/StructuralQuery, a database and a tool for immunoglobulin, T cell receptor and MHC structural data

    PubMed Central

    Kaas, Quentin; Ruiz, Manuel; Lefranc, Marie-Paule

    2004-01-01

    IMGT/3Dstructure-DB and IMGT/Structural-Query are a novel 3D structure database and a new tool for immunological proteins. They are part of IMGT, the international ImMunoGenetics information system®, a high-quality integrated knowledge resource specializing in immunoglobulins (IG), T cell receptors (TR), major histocompatibility complex (MHC) and related proteins of the immune system (RPI) of human and other vertebrate species, which consists of databases, Web resources and interactive on-line tools. IMGT/3Dstructure-DB data are described according to the IMGT Scientific chart rules based on the IMGT-ONTOLOGY concepts. IMGT/3Dstructure-DB provides IMGT gene and allele identification of IG, TR and MHC proteins with known 3D structures, domain delimitations, amino acid positions according to the IMGT unique numbering and renumbered coordinate flat files. Moreover IMGT/3Dstructure-DB provides 2D graphical representations (or Collier de Perles) and results of contact analysis. The IMGT/StructuralQuery tool allows search of this database based on specific structural characteristics. IMGT/3Dstructure-DB and IMGT/StructuralQuery are freely available at http://imgt.cines.fr. PMID:14681396

  9. Effects of Convective Transport of Solute and Impurities on Defect-Causing Kinetics Instabilities in Protein Crystallization

    NASA Technical Reports Server (NTRS)

    Vekilov, Peter G.

    2003-01-01

    Insight into the crystallization processes of biological macromolecules into crystals or aggregates can provide valuable guidelines in many fundamental and applied fields. Such insight will prompt new means to regulate protein phase transitions in-vivo, e.g., polymerization of hemoglobin S in the red cells, crystallization of crystallins in the eye lens, etc. Understanding of protein crystal nucleation will help achieve narrow crystallite size distributions, needed for sustained release of pharmaceutical protein preparations such as insulin or interferon. Traditionally, protein crystallization studies have been related to the pursuit of crystal perfection needed to improve the structure details provided by x-ray, electron or neutron diffraction methods. Crystallization trials for the purposes of structural biology carried out in space have posed an intriguing question related to the inconsistency of the effects of the microgravity growth on the quality of the crystals.

  10. Protein Delivery into Plant Cells: Toward In vivo Structural Biology

    PubMed Central

    Cedeño, Cesyen; Pauwels, Kris; Tompa, Peter

    2017-01-01

    Understanding the biologically relevant structural and functional behavior of proteins inside living plant cells is only possible through the combination of structural biology and cell biology. The state-of-the-art structural biology techniques are typically applied to molecules that are isolated from their native context. Although most experimental conditions can be easily controlled while dealing with an isolated, purified protein, a serious shortcoming of such in vitro work is that we cannot mimic the extremely complex intracellular environment in which the protein exists and functions. Therefore, it is highly desirable to investigate proteins in their natural habitat, i.e., within live cells. This is the major ambition of in-cell NMR, which aims to approach structure-function relationship under true in vivo conditions following delivery of labeled proteins into cells under physiological conditions. With a multidisciplinary approach that includes recombinant protein production, confocal fluorescence microscopy, nuclear magnetic resonance (NMR) spectroscopy and different intracellular protein delivery strategies, we explore the possibility to develop in-cell NMR studies in living plant cells. While we provide a comprehensive framework to set-up in-cell NMR, we identified the efficient intracellular introduction of isotope-labeled proteins as the major bottleneck. Based on experiments with the paradigmatic intrinsically disordered proteins (IDPs) Early Response to Dehydration protein 10 and 14, we also established the subcellular localization of ERD14 under abiotic stress. PMID:28469623

  11. Modeling Protein Excited-state Structures from "Over-length" Chemical Cross-links.

    PubMed

    Ding, Yue-He; Gong, Zhou; Dong, Xu; Liu, Kan; Liu, Zhu; Liu, Chao; He, Si-Min; Dong, Meng-Qiu; Tang, Chun

    2017-01-27

    Chemical cross-linking coupled with mass spectroscopy (CXMS) provides proximity information for the cross-linked residues and is used increasingly for modeling protein structures. However, experimentally identified cross-links are sometimes incompatible with the known structure of a protein, as the distance calculated between the cross-linked residues far exceeds the maximum length of the cross-linker. The discrepancies may persist even after eliminating potentially false cross-links and excluding intermolecular ones. Thus the "over-length" cross-links may arise from alternative excited-state conformation of the protein. Here we present a method and associated software DynaXL for visualizing the ensemble structures of multidomain proteins based on intramolecular cross-links identified by mass spectrometry with high confidence. Representing the cross-linkers and cross-linking reactions explicitly, we show that the protein excited-state structure can be modeled with as few as two over-length cross-links. We demonstrate the generality of our method with three systems: calmodulin, enzyme I, and glutamine-binding protein, and we show that these proteins alternate between different conformations for interacting with other proteins and ligands. Taken together, the over-length chemical cross-links contain valuable information about protein dynamics, and our findings here illustrate the relationship between dynamic domain movement and protein function. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
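
    The over-length criterion described in this abstract can be illustrated with a minimal sketch. This is not DynaXL; it only flags cross-links whose inter-residue distance in a known structure exceeds an assumed maximum linker span. The function name `over_length_links`, the toy coordinates, and the 24 Å cutoff are hypothetical choices made for illustration.

```python
import math


def over_length_links(coords, links, max_len):
    """Return cross-links whose residue-residue distance exceeds max_len.

    coords:  dict mapping residue number -> (x, y, z), e.g. C-alpha positions
    links:   iterable of (residue_i, residue_j) pairs identified experimentally
    max_len: assumed maximum span of the cross-linker, in the units of coords
    """
    flagged = []
    for i, j in links:
        d = math.dist(coords[i], coords[j])  # Euclidean distance
        if d > max_len:
            flagged.append((i, j, d))
    return flagged


# Toy structure (hypothetical coordinates, in angstroms).
coords = {1: (0.0, 0.0, 0.0), 2: (10.0, 0.0, 0.0), 3: (0.0, 30.0, 0.0)}
links = [(1, 2), (1, 3)]

# Assumed cutoff of 24 angstroms; the real value depends on the cross-linker.
print(over_length_links(coords, links, max_len=24.0))  # prints [(1, 3, 30.0)]
```

    Links flagged this way are candidates for the alternative-conformation interpretation discussed in the abstract, once false identifications and intermolecular links have been excluded.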

  12. Membrane protein structure determination — The next generation

    PubMed Central

    Moraes, Isabel; Evans, Gwyndaf; Sanchez-Weatherby, Juan; Newstead, Simon; Stewart, Patrick D. Shaw

    2014-01-01

    The field of Membrane Protein Structural Biology has grown significantly since its first landmark in 1985 with the first three-dimensional atomic resolution structure of a membrane protein. Nearly twenty-six years later, the crystal structure of the beta2 adrenergic receptor in complex with G protein has contributed to another landmark in the field leading to the 2012 Nobel Prize in Chemistry. At present, more than 350 unique membrane protein structures solved by X-ray crystallography (http://blanco.biomol.uci.edu/mpstruc/exp/list, Stephen White Lab at UC Irvine) are available in the Protein Data Bank. The advent of genomics and proteomics initiatives combined with high-throughput technologies, such as automation, miniaturization, integration and third-generation synchrotrons, has enhanced membrane protein structure determination rate. X-ray crystallography is still the only method capable of providing detailed information on how ligands, cofactors, and ions interact with proteins, and is therefore a powerful tool in biochemistry and drug discovery. Yet the growth of membrane protein crystals suitable for X-ray diffraction studies amazingly remains a fine art and a major bottleneck in the field. It is often necessary to apply as many innovative approaches as possible. In this review we draw attention to the latest methods and strategies for the production of suitable crystals for membrane protein structure determination. In addition we also highlight the impact that third-generation synchrotron radiation has made in the field, summarizing the latest strategies used at synchrotron beamlines for screening and data collection from such demanding crystals. This article is part of a Special Issue entitled: Structural and biophysical characterisation of membrane protein-ligand binding. PMID:23860256

  13. Vertebrate Membrane Proteins: Structure, Function, and Insights from Biophysical Approaches

    PubMed Central

    Müller, Daniel J.; Wu, Nan; Palczewski, Krzysztof

    2008-01-01

    Membrane proteins are key targets for pharmacological intervention because they are vital for cellular function. Here, we analyze recent progress made in the understanding of the structure and function of membrane proteins with a focus on rhodopsin and development of atomic force microscopy techniques to study biological membranes. Membrane proteins are compartmentalized to carry out extra- and intracellular processes. Biological membranes are densely populated with membrane proteins that occupy approximately 50% of their volume. In most cases membranes contain lipid rafts, protein patches, or paracrystalline formations that lack the higher-order symmetry that would allow them to be characterized by diffraction methods. Despite many technical difficulties, several crystal structures of membrane proteins that illustrate their internal structural organization have been determined. Moreover, high-resolution atomic force microscopy, near-field scanning optical microscopy, and other lower resolution techniques have been used to investigate these structures. Single-molecule force spectroscopy tracks interactions that stabilize membrane proteins and those that switch their functional state; this spectroscopy can be applied to locate a ligand-binding site. Recent development of this technique also reveals the energy landscape of a membrane protein, defining its folding, reaction pathways, and kinetics. Future development and application of novel approaches during the coming years should provide even greater insights to the understanding of biological membrane organization and function. PMID:18321962

  14. Effects of urea induced protein conformational changes on ion exchange chromatographic behavior.

    PubMed

    Hou, Ying; Hansen, Thomas B; Staby, Arne; Cramer, Steven M

    2010-11-19

    Urea is widely employed to facilitate protein separations in ion exchange chromatography at various scales. In this work, five model proteins were used to examine the chromatographic effects of protein conformational changes induced by urea in ion exchange chromatography. Linear gradient experiments were carried out at various urea concentrations and the protein secondary and tertiary structures were evaluated by far UV CD and fluorescence measurements, respectively. The results indicated that chromatographic retention times were well correlated with structural changes and that they were more sensitive to tertiary structural change. Steric Mass Action (SMA) isotherm parameters were also examined and the results indicated that urea induced protein conformational changes could affect both the characteristic charge and equilibrium constants in these systems. Dynamic light scattering analysis of changes in protein size due to urea-induced unfolding indicated that the size of the protein was not correlated with SMA parameter changes. These results indicate that while urea-induced structural changes can have a marked effect on protein chromatographic behavior in IEX, this behavior can be quite complicated and protein specific. These differences in protein behavior may provide insight into how these partially unfolded proteins are interacting with the resin material. Copyright © 2010 Elsevier B.V. All rights reserved.

  15. Alanine and proline content modulate global sensitivity to discrete perturbations in disordered proteins.

    PubMed

    Perez, Romel B; Tischer, Alexander; Auton, Matthew; Whitten, Steven T

    2014-12-01

    Molecular transduction of biological signals is understood primarily in terms of the cooperative structural transitions of protein macromolecules, providing a mechanism through which discrete local structure perturbations affect global macromolecular properties. The recognition that proteins lacking tertiary stability, commonly referred to as intrinsically disordered proteins (IDPs), mediate key signaling pathways suggests that protein structures without cooperative intramolecular interactions may also have the ability to couple local and global structure changes. Presented here are results from experiments that measured and tested the ability of disordered proteins to couple local changes in structure to global changes in structure. Using the intrinsically disordered N-terminal region of the p53 protein as an experimental model, a set of proline (PRO) and alanine (ALA) to glycine (GLY) substitution variants were designed to modulate backbone conformational propensities without introducing non-native intramolecular interactions. The hydrodynamic radius (R(h)) was used to monitor changes in global structure. Circular dichroism spectroscopy showed that the GLY substitutions decreased polyproline II (PP(II)) propensities relative to the wild type, as expected, and fluorescence methods indicated that substitution-induced changes in R(h) were not associated with folding. The experiments showed that changes in local PP(II) structure cause changes in R(h) that are variable and that depend on the intrinsic chain propensities of PRO and ALA residues, demonstrating a mechanism for coupling local and global structure changes. Molecular simulations that model our results were used to extend the analysis to other proteins and illustrate the generality of the observed PRO and alanine effects on the structures of IDPs. © 2014 Wiley Periodicals, Inc.

  16. Analysis of the Free-Energy Surface of Proteins from Reversible Folding Simulations

    PubMed Central

    Allen, Lucy R.; Krivov, Sergei V.; Paci, Emanuele

    2009-01-01

    Computer generated trajectories can, in principle, reveal the folding pathways of a protein at atomic resolution and possibly suggest general and simple rules for predicting the folded structure of a given sequence. While such reversible folding trajectories can only be determined ab initio using all-atom transferable force-fields for a few small proteins, they can be determined for a large number of proteins using coarse-grained and structure-based force-fields, in which a known folded structure is by construction the absolute energy and free-energy minimum. Here we use a model of the fast folding helical λ-repressor protein to generate trajectories in which native and non-native states are in equilibrium and transitions are accurately sampled. Yet, representation of the free-energy surface, which underlies the thermodynamic and dynamic properties of the protein model, from such a trajectory remains a challenge. Projections over one or a small number of arbitrarily chosen progress variables often hide the most important features of such surfaces. The results unequivocally show that an unprojected representation of the free-energy surface provides important and unbiased information and allows a simple and meaningful description of many-dimensional, heterogeneous trajectories, providing new insight into the possible mechanisms of fast-folding proteins. PMID:19593364
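
    For orientation, the projected representation that this abstract contrasts with its unprojected one is commonly estimated from a histogram of a single progress variable x, as F(x) = -kT ln P(x). The sketch below shows that standard one-dimensional estimate only; it is not the authors' method. The function name, the bin width, and the toy samples are assumptions.

```python
import math
from collections import Counter


def free_energy_profile(samples, bin_width, kT=1.0):
    """Estimate F(x) = -kT * ln P(x) from samples of one progress variable.

    Returns a dict mapping the left edge of each populated bin to its free
    energy, shifted so that the most populated bin has F = 0.
    """
    counts = Counter(math.floor(s / bin_width) for s in samples)
    total = len(samples)
    profile = {
        b * bin_width: -kT * math.log(c / total) for b, c in counts.items()
    }
    f_min = min(profile.values())
    return {x: f - f_min for x, f in sorted(profile.items())}


# Toy trajectory: 8 frames in one state and 2 frames in another.
samples = [0.1] * 8 + [1.1] * 2
profile = free_energy_profile(samples, bin_width=1.0)
print(profile)
```

    In this toy case the less populated bin lies ln(4) kT above the minimum. As the abstract notes, a projection of this kind can hide important features of the underlying surface.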

  17. Analysis of the free-energy surface of proteins from reversible folding simulations.

    PubMed

    Allen, Lucy R; Krivov, Sergei V; Paci, Emanuele

    2009-07-01

    Computer generated trajectories can, in principle, reveal the folding pathways of a protein at atomic resolution and possibly suggest general and simple rules for predicting the folded structure of a given sequence. While such reversible folding trajectories can only be determined ab initio using all-atom transferable force-fields for a few small proteins, they can be determined for a large number of proteins using coarse-grained and structure-based force-fields, in which a known folded structure is by construction the absolute energy and free-energy minimum. Here we use a model of the fast folding helical lambda-repressor protein to generate trajectories in which native and non-native states are in equilibrium and transitions are accurately sampled. Yet, representation of the free-energy surface, which underlies the thermodynamic and dynamic properties of the protein model, from such a trajectory remains a challenge. Projections over one or a small number of arbitrarily chosen progress variables often hide the most important features of such surfaces. The results unequivocally show that an unprojected representation of the free-energy surface provides important and unbiased information and allows a simple and meaningful description of many-dimensional, heterogeneous trajectories, providing new insight into the possible mechanisms of fast-folding proteins.

  18. Predicting residue-wise contact orders in proteins by support vector regression.

    PubMed

    Song, Jiangning; Burrage, Kevin

    2006-10-03

    The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered as a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which is at least comparable to the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
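
    As a rough illustration of the quantities named in this abstract, the sketch below computes a simplified RWCO from coordinates, together with the two evaluation measures (Pearson CC and RMSE). It does not reproduce the SVR predictor. The hard distance cutoff, the minimum sequence separation, and the normalization by chain length are assumptions made for this example, and the paper's exact definition may differ.

```python
import math


def rwco(coords, cutoff=6.0, min_sep=3):
    """Simplified residue-wise contact order for each residue.

    For residue i, sums the sequence separation |i - j| over all residues j
    that are at least min_sep apart in sequence and closer than cutoff in
    space, then divides by the chain length (assumed normalization).
    """
    n = len(coords)
    values = []
    for i in range(n):
        total = 0
        for j in range(n):
            sep = abs(i - j)
            if sep >= min_sep and math.dist(coords[i], coords[j]) < cutoff:
                total += sep
        values.append(total / n)
    return values


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rmse(x, y):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Toy six-residue hairpin (hypothetical coordinates, in angstroms).
hairpin = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
           (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0)]
observed = rwco(hairpin)
print(observed)
```

    In the toy hairpin only the pairs (0, 5) and (1, 4) are in contact, so the terminal residues receive the largest values. Given a predicted profile, `pearson(predicted, observed)` and `rmse(predicted, observed)` correspond to the CC and RMSE measures reported in the abstract.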

  19. Conformational dynamics of activation for the pentameric complex of dimeric G protein-coupled receptor and heterotrimeric G protein

    PubMed Central

    Orban, Tivadar; Jastrzebska, Beata; Gupta, Sayan; Wang, Benlian; Miyagi, Masaru; Chance, Mark R.; Palczewski, Krzysztof

    2012-01-01

    Photoactivation of rhodopsin (Rho), a G protein-coupled receptor (GPCR), causes conformational changes that provide a specific binding site for the rod G protein, Gt. In this work we employed structural mass spectrometry (MS) techniques to elucidate the structural changes accompanying transition of ground state Rho to photoactivated Rho (Rho*) and in the pentameric complex between dimeric Rho* and heterotrimeric Gt. Observed differences in hydroxyl radical labeling and deuterium uptake between Rho* and the (Rho*)2-Gt complex suggest that photoactivation causes structural relaxation of Rho following its initial tightening upon Gt coupling. In contrast, nucleotide-free Gt in the complex is significantly more accessible to deuterium uptake allowing it to accept GTP and mediating complex dissociation. Thus, we provide direct evidence that in the critical step of signal amplification, Rho* and Gt exhibit dissimilar conformational changes when they are coupled in the (Rho*)2-Gt complex. PMID:22579250

  20. Protein structure recognition: From eigenvector analysis to structural threading method

    NASA Astrophysics Data System (ADS)

    Cao, Haibo

    In this work, we try to understand the protein folding problem using pair-wise hydrophobic interaction as the dominant interaction for the protein folding process. We found a strong correlation between amino acid sequence and the corresponding native structure of the protein. Some applications of this correlation discussed in this dissertation include domain partition and a new structural threading method, as well as the performance of this method in the CASP5 competition. In the first part, we give a brief introduction to the protein folding problem. Some essential knowledge and progress from other research groups are discussed. This part includes discussions of interactions among amino acid residues, the lattice HP model, and the designability principle. In the second part, we try to establish the correlation between amino acid sequence and the corresponding native structure of the protein. This correlation was observed in our eigenvector study of the protein contact matrix. We believe the correlation is universal, thus it can be used in automatic partition of protein structures into folding domains. In the third part, we discuss a threading method based on the correlation between amino acid sequence and the dominant eigenvector of the structure contact matrix. A mathematically straightforward iteration scheme provides a self-consistent optimum global sequence-structure alignment. The computational efficiency of this method makes it possible to search whole protein structure databases for structural homology without relying on sequence similarity. The sensitivity and specificity of this method are discussed, along with a case of blind test prediction. In the appendix, we list the overall performance of this threading method in the CASP5 blind test in comparison with other existing approaches.
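
    The dominant eigenvector of a contact matrix, which this abstract refers to, can be obtained by power iteration. The sketch below is a generic illustration and not the dissertation's threading code. The function names, the 6.5 Å contact cutoff, and the identity shift used to stabilize the iteration are assumptions.

```python
import math


def contact_matrix(coords, cutoff=6.5):
    """Binary contact matrix: 1.0 where two distinct residues are within cutoff."""
    n = len(coords)
    return [[1.0 if i != j and math.dist(coords[i], coords[j]) < cutoff else 0.0
             for j in range(n)] for i in range(n)]


def dominant_eigenvector(matrix, iterations=200):
    """Unit-norm eigenvector for the largest eigenvalue of a symmetric,
    non-negative matrix, found by power iteration on (matrix + identity).

    The identity shift leaves the eigenvectors unchanged but prevents the
    oscillation that plain power iteration shows on bipartite contact graphs.
    """
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Toy three-residue chain in which only neighbouring residues are in contact.
toy = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
print([round(x, 4) for x in dominant_eigenvector(toy)])
```

    The central residue, which has the most contacts, gets the largest component. Components of this kind are what a sequence-derived profile would be compared against in an eigenvector-based alignment.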

  1. P-proteins in Arabidopsis are heteromeric structures involved in rapid sieve tube sealing.

    PubMed

    Jekat, Stephan B; Ernst, Antonia M; von Bohl, Andreas; Zielonka, Sascia; Twyman, Richard M; Noll, Gundula A; Prüfer, Dirk

    2013-01-01

    Structural phloem proteins (P-proteins) are characteristic components of the sieve elements in all dicotyledonous and many monocotyledonous angiosperms. Tobacco P-proteins were recently confirmed to be encoded by the widespread sieve element occlusion (SEO) gene family, and tobacco SEO proteins were shown to be directly involved in sieve tube sealing thus preventing the loss of photosynthate. Analysis of the two Arabidopsis SEO proteins (AtSEOa and AtSEOb) indicated that the corresponding P-protein subunits do not act in a redundant manner. However, there are still pending questions regarding the interaction properties and specific functions of AtSEOa and AtSEOb as well as the general function of structural P-proteins in Arabidopsis. In this study, we characterized the Arabidopsis P-proteins in more detail. We used in planta bimolecular fluorescence complementation assays to confirm the predicted heteromeric interactions between AtSEOa and AtSEOb. Arabidopsis mutants depleted for one or both AtSEO proteins lacked the typical P-protein structures normally found in sieve elements, underlining the identity of AtSEO proteins as P-proteins and furthermore providing the means to determine the role of Arabidopsis P-proteins in sieve tube sealing. We therefore developed an assay based on phloem exudation. Mutants with reduced AtSEO expression levels lost twice as much photosynthate following injury as comparable wild-type plants, confirming that Arabidopsis P-proteins are indeed involved in sieve tube sealing.

  2. Structural Influence on the Dominance of Virus-Specific CD4 T Cell Epitopes in Zika Virus Infection.

    PubMed

    Koblischke, Maximilian; Stiasny, Karin; Aberle, Stephan W; Malafa, Stefan; Tschouchnikas, Georgios; Schwaiger, Julia; Kundi, Michael; Heinz, Franz X; Aberle, Judith H

    2018-01-01

    Zika virus (ZIKV) has recently caused explosive outbreaks in Pacific islands and in South and Central America. As with other flaviviruses, protective immunity is strongly dependent on potently neutralizing antibodies (Abs) directed against the viral envelope protein E. Such Ab formation is promoted by CD4 T cells through direct interaction with B cells that present epitopes derived from E or other structural proteins of the virus. Here, we examined the extent and epitope dominance of CD4 T cell responses to capsid (C) and envelope proteins in Zika patients. All patients developed ZIKV-specific CD4 T cell responses, with substantial contributions of C and E. In both proteins, immunodominant epitopes clustered at sites that are structurally conserved among flaviviruses but have highly variable sequences, suggesting a strong impact of protein structural features on immunodominant CD4 T cell responses. Our data are particularly relevant for designing flavivirus vaccines and their evaluation in T cell assays and provide insights into the importance of viral protein structure for epitope selection and antigenicity.

  3. Isotopic Analysis of Sporocarp Protein and Structural Material Improves Resolution of Fungal Carbon Sources

    PubMed Central

    Chen, Janet; Hofmockel, Kirsten S.; Hobbie, Erik A.

    2016-01-01

    Fungal acquisition of resources is difficult to assess in the field. To determine whether fungi received carbon from recent plant photosynthate, litter or soil-derived organic (C:N bonded) nitrogen, we examined differences in δ13C among bulk tissue, structural carbon, and protein extracts of sporocarps of three fungal types: saprotrophic fungi, fungi with hydrophobic ectomycorrhizae, or fungi with hydrophilic ectomycorrhizae. Sporocarps were collected from experimental plots of the Duke Free-air CO2 enrichment experiment during and after CO2 enrichment. The differential 13C labeling of ecosystem pools in CO2 enrichment experiments was tracked into fungi and provided novel insights into organic nitrogen use. Specifically, sporocarp δ13C as well as δ15N of protein and structural material indicated that fungi with hydrophobic ectomycorrhizae used soil-derived organic nitrogen sources for protein carbon, fungi with hydrophilic ectomycorrhizae used recent plant photosynthates for protein carbon and both fungal groups used photosynthates for structural carbon. Saprotrophic fungi depended on litter produced during fumigation for both protein and structural material. PMID:28082951

  4. Automatic Classification of Protein Structure Using the Maximum Contact Map Overlap Metric

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Andonov, Rumen; Djidjev, Hristo Nikolov; Klau, Gunnar W.

    In this paper, we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that our novel measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric in that space allows one to avoid pairwise comparisons on the entire database and, thus, to significantly accelerate exploring the protein space compared to no-metric spaces. We show on a gold standard superfamily classification benchmark set of 6759 proteins that our exact k-nearest neighbor (k-NN) scheme classifies up to 224 out of 236 queries correctly and on a larger, extended version of the benchmark with 60,850 additional structures, up to 1361 out of 1369 queries. Finally, our k-NN classification thus provides a promising approach for the automatic classification of protein structures based on flexible contact map overlap alignments.
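
    To make the notion of contact map overlap concrete, the sketch below counts the contacts shared by two structures under one fixed residue alignment. This is only the quantity being maximized: max-CMO searches over all order-preserving alignments, which this sketch does not attempt, and the distance derived from the overlap in the paper is not reproduced here. The function name, the data layout, and the toy contact sets are assumptions for illustration.

```python
def contact_map_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of structure A that map onto contacts of structure B.

    contacts_a, contacts_b: sets of (i, j) residue-index pairs with i < j.
    alignment: dict mapping residue indices of A to residue indices of B.
    """
    shared = 0
    for i, j in contacts_a:
        # A contact only counts if both of its residues are aligned.
        if i in alignment and j in alignment:
            p, q = alignment[i], alignment[j]
            if (min(p, q), max(p, q)) in contacts_b:
                shared += 1
    return shared

# Hypothetical toy contact maps.
contacts_a = {(0, 1), (1, 2), (0, 3)}
contacts_b = {(0, 1), (1, 2), (2, 3)}

identity_overlap = contact_map_overlap(contacts_a, contacts_b, {0: 0, 1: 1, 2: 2, 3: 3})
partial_overlap = contact_map_overlap(contacts_a, contacts_b, {0: 0, 1: 1, 3: 2})
```

    Different alignments of the same pair give different overlaps, which is why the maximization over alignments is the computationally hard part of the measure.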

  5. Carbonization of a stable β-sheet-rich silk protein into a pseudographitic pyroprotein

    PubMed Central

    Cho, Se Youn; Yun, Young Soo; Lee, Sungho; Jang, Dawon; Park, Kyu-Young; Kim, Jae Kyung; Kim, Byung Hoon; Kang, Kisuk; Kaplan, David L.; Jin, Hyoung-Joon

    2015-01-01

    Silk proteins are of great interest to the scientific community owing to their unique mechanical properties and interesting biological functionality. In addition, the silk proteins are not burned out following heating, rather they are transformed into a carbonaceous solid, pyroprotein; several studies have identified potential carbon precursors for state-of-the-art technologies. However, no mechanism for the carbonization of proteins has yet been reported. Here we examine the structural and chemical changes of silk proteins systematically at temperatures above the onset of thermal degradation. We find that the β-sheet structure is transformed into an sp2-hybridized carbon hexagonal structure by simple heating to 350 °C. The pseudographitic crystalline layers grew to form highly ordered graphitic structures following further heating to 2,800 °C. Our results provide a mechanism for the thermal transition of the protein and demonstrate a potential strategy for designing pyroproteins using a clean system with a catalyst-free aqueous wet process for in vivo applications. PMID:25990218

  6. Automatic Classification of Protein Structure Using the Maximum Contact Map Overlap Metric

    DOE PAGES

    Andonov, Rumen; Djidjev, Hristo Nikolov; Klau, Gunnar W.; ...

    2015-10-09

    In this paper, we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that our novel measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric in that space allows one to avoid pairwise comparisons on the entire database and, thus, to significantly accelerate exploring the protein space compared to no-metric spaces. We show on a gold standard superfamily classification benchmark set of 6759 proteins that our exact k-nearest neighbor (k-NN) scheme classifies up to 224 out of 236 queries correctly and on a larger, extended version of the benchmark with 60,850 additional structures, up to 1361 out of 1369 queries. Finally, our k-NN classification thus provides a promising approach for the automatic classification of protein structures based on flexible contact map overlap alignments.

  7. Sequence-similar, structure-dissimilar protein pairs in the PDB.

    PubMed

    Kosloff, Mickey; Kolodny, Rachel

    2008-05-01

    It is often assumed that in the Protein Data Bank (PDB), two proteins with similar sequences will also have similar structures. Accordingly, it has proved useful to develop subsets of the PDB from which "redundant" structures have been removed, based on a sequence-based criterion for similarity. Similarly, when predicting protein structure using homology modeling, if a template structure for modeling a target sequence is selected by sequence alone, this implicitly assumes that all sequence-similar templates are equivalent. Here, we show that this assumption is often not correct and that standard approaches to create subsets of the PDB can lead to the loss of structurally and functionally important information. We have carried out sequence-based structural superpositions and geometry-based structural alignments of a large number of protein pairs to determine the extent to which sequence similarity ensures structural similarity. We find many examples where two proteins that are similar in sequence have structures that differ significantly from one another. The source of the structural differences usually has a functional basis. The number of such protein pairs that are identified and the magnitude of the dissimilarity depend on the approach that is used to calculate the differences; in particular, sequence-based structure superpositioning will identify a larger number of structurally dissimilar pairs than geometry-based structural alignments. When two sequences can be aligned in a statistically meaningful way, sequence-based structural superpositioning provides a meaningful measure of structural differences. This approach and geometry-based structure alignments reveal somewhat different information and one or the other might be preferable in a given application. Our results suggest that in some cases, notably homology modeling, the common use of nonredundant datasets, culled from the PDB based on sequence, may mask important structural and functional information. We have established a database of sequence-similar, structurally dissimilar protein pairs that will help address this problem (http://luna.bioc.columbia.edu/rachel/seqsimstrdiff.htm).

  8. Fast and accurate non-sequential protein structure alignment using a new asymmetric linear sum assignment heuristic.

    PubMed

    Brown, Peter; Pullan, Wayne; Yang, Yuedong; Zhou, Yaoqi

    2016-02-01

    The three-dimensional tertiary structure of a protein at near-atomic resolution provides insight into its function and evolution. As protein structure determines its functionality, similarity in structure usually implies similarity in function. As such, structure alignment techniques are often useful in the classification of protein function. Given the rapidly growing rate of new, experimentally determined structures being made available from repositories such as the Protein Data Bank, fast and accurate computational structure comparison tools are required. This paper presents SPalignNS, a non-sequential protein structure alignment tool using a novel asymmetrical greedy search technique. The performance of SPalignNS was evaluated against existing sequential and non-sequential structure alignment methods by performing trials with commonly used datasets. These benchmark datasets used to gauge alignment accuracy include (i) 9538 pairwise alignments implied by the HOMSTRAD database of homologous proteins; (ii) a subset of 64 difficult alignments from set (i) that have low structure similarity; (iii) 199 pairwise alignments of proteins with similar structure but different topology; and (iv) a subset of 20 pairwise alignments from the RIPC set. SPalignNS is shown to achieve greater alignment accuracy (lower or comparable root-mean squared distance with increased structure overlap coverage) for all datasets, and the highest agreement with reference alignments from the challenging dataset (iv) above, when compared with both sequentially constrained alignments and other non-sequential alignments. SPalignNS was implemented in C++. The source code, binary executable, and a web server version are freely available at http://sparks-lab.org. Contact: yaoqi.zhou@griffith.edu.au. © The Author 2015. Published by Oxford University Press.

  9. Integration of Structural Dynamics and Molecular Evolution via Protein Interaction Networks: A New Era in Genomic Medicine

    PubMed Central

    Kumar, Avishek; Butler, Brandon M.; Kumar, Sudhir; Ozkan, S. Banu

    2016-01-01

    Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict if they are benign or not. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nsSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light onto disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine. PMID:26684487

  10. Breakdown of the Debye polarization ansatz at protein-water interfaces

    NASA Astrophysics Data System (ADS)

    Fernández Stigliano, Ariel

    2013-06-01

    The topographical and physico-chemical complexity of protein-water interfaces scales down to the sub-nanoscale range. At this level of confinement, we demonstrate that the dielectric structure of interfacial water entails a breakdown of the Debye ansatz that postulates the alignment of polarization with the protein electrostatic field. The tendencies to promote anomalous polarization are determined for each residue type and a particular kind of structural defect is shown to provide the predominant causal context.

  11. Influence of structure properties on protein-protein interactions-QSAR modeling of changes in diffusion coefficients.

    PubMed

    Bauer, Katharina Christin; Hämmerling, Frank; Kittelmann, Jörg; Dürr, Cathrin; Görlich, Fabian; Hubbuch, Jürgen

    2017-04-01

    Information about protein-protein interactions provides valuable knowledge about the phase behavior of protein solutions during the biopharmaceutical production process. To date, it is possible to capture their overall impact by an experimentally determined potential of mean force. For the description of this potential, the second virial coefficient B22, the diffusion interaction parameter kD, the storage modulus G', or the diffusion coefficient D is applied. In silico methods not only have the potential to predict these parameters, but can also provide deeper understanding of the molecular origin of the protein-protein interactions by correlating the data to the protein's three-dimensional structure. This methodology furthermore allows a lower sample consumption and less experimental effort. Of all in silico methods, QSAR modeling, which correlates the properties of the molecule's structure with the experimental behavior, seems to be particularly suitable for this purpose. To verify this, the study reported here dealt with the determination of a QSAR model for the diffusion coefficient of proteins. This model consisted of diffusion coefficients for six different model proteins at various pH values and NaCl concentrations. The generated QSAR model showed a good correlation between experimental and predicted data with a coefficient of determination R² = 0.9 and a good predictability for an external test set with R² = 0.91. The information about the properties affecting protein-protein interactions present in solution was in agreement with experiment and theory. Furthermore, the model was able to give a more detailed picture of the protein properties influencing the diffusion coefficient and the acting protein-protein interactions. Biotechnol. Bioeng. 2017;114: 821-831. © 2016 Wiley Periodicals, Inc.
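
    The coefficient of determination quoted in the abstract above is the standard goodness-of-fit measure between experimental and predicted values. A minimal sketch of how it is computed follows; the function name and the numbers are invented for illustration and are not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical values, e.g. diffusion coefficients in arbitrary units.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
fit_quality = r_squared(observed, predicted)
```

    A value close to 1 means the predictions account for nearly all of the variance in the measurements; evaluating it on an external test set, as the abstract describes, guards against a model that merely memorizes its training data.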

  12. The structural flexibility of the human copper chaperone Atox1: Insights from combined pulsed EPR studies and computations.

    PubMed

    Levy, Ariel R; Turgeman, Meital; Gevorkyan-Aiapetov, Lada; Ruthstein, Sharon

    2017-08-01

    Metallochaperones are responsible for shuttling metal ions to target proteins. Thus, a metallochaperone's structure must be sufficiently flexible both to hold onto its ion while traversing the cytoplasm and to transfer the ion to or from a partner protein. Here, we sought to shed light on the structure of Atox1, a metallochaperone involved in the human copper regulation system. Atox1 shuttles copper ions from the main copper transporter, Ctr1, to the ATP7b transporter in the Golgi apparatus. Conventional biophysical tools such as X-ray or NMR cannot always target the various conformational states of metallochaperones, owing to a requirement for crystallography or low sensitivity and resolution. Electron paramagnetic resonance (EPR) spectroscopy has recently emerged as a powerful tool for resolving biological reactions and mechanisms in solution. When coupled with computational methods, EPR with site-directed spin labeling and nanoscale distance measurements can provide structural information on a protein or protein complex in solution. We use these methods to show that Atox1 can accommodate at least four different conformations in the apo state (unbound to copper), and two different conformations in the holo state (bound to copper). We also demonstrate that the structure of Atox1 in the holo form is more compact than in the apo form. Our data provide insight regarding the structural mechanisms through which Atox1 can fulfill its dual role of copper binding and transfer. © 2017 The Protein Society.

  13. Chlamydia trachomatis protein CT009 is a structural and functional homolog to the key morphogenesis component RodZ and interacts with division septal plane localized MreB

    DOE PAGES

    Kemege, Kyle E.; Hickey, John M.; Barta, Michael L.; ...

    2014-11-10

    Cell division in Chlamydiae is poorly understood as apparent homologs to most conserved bacterial cell division proteins are lacking and presence of elongation (rod shape) associated proteins indicate non-canonical mechanisms may be employed. The rod-shape determining protein MreB has been proposed as playing a unique role in chlamydial cell division. In other organisms, MreB is part of an elongation complex that requires RodZ for proper function. A recent study reported that the protein encoded by ORF CT009 interacts with MreB despite low sequence similarity to RodZ. The studies in this paper expand on those observations through protein structure, mutagenesis and cellular localization analyses. Structural analysis indicated that CT009 shares high level of structural similarity to RodZ, revealing the conserved orientation of two residues critical for MreB interaction. Substitutions eliminated MreB protein interaction and partial complementation provided by CT009 in RodZ deficient Escherichia coli. Cellular localization analysis of CT009 showed uniform membrane staining in Chlamydia. This was in contrast to the localization of MreB, which was restricted to predicted septal planes. Finally, MreB localization to septal planes provides direct experimental observation for the role of MreB in cell division and supports the hypothesis that it serves as a functional replacement for FtsZ in Chlamydia.

  14. Chlamydia trachomatis protein CT009 is a structural and functional homolog to the key morphogenesis component RodZ and interacts with division septal plane localized MreB

    PubMed Central

    Kemege, Kyle E.; Hickey, John M.; Barta, Michael L.; Wickstrum, Jason; Balwalli, Namita; Lovell, Scott; Battaile, Kevin P.; Hefty, P. Scott

    2015-01-01

    Cell division in Chlamydiae is poorly understood as apparent homologs to most conserved bacterial cell division proteins are lacking and presence of elongation (rod shape) associated proteins indicate non-canonical mechanisms may be employed. The rod-shape determining protein MreB has been proposed as playing a unique role in chlamydial cell division. In other organisms, MreB is part of an elongation complex that requires RodZ for proper function. A recent study reported that the protein encoded by ORF CT009 interacts with MreB despite low sequence similarity to RodZ. The studies herein expand on those observations through protein structure, mutagenesis, and cellular localization analyses. Structural analysis indicated that CT009 shares high level of structural similarity to RodZ, revealing the conserved orientation of two residues critical for MreB interaction. Substitutions eliminated MreB protein interaction and partial complementation provided by CT009 in RodZ deficient E. coli. Cellular localization analysis of CT009 showed uniform membrane staining in Chlamydia. This was in contrast to the localization of MreB, which was restricted to predicted septal planes. MreB localization to septal planes provides direct experimental observation for the role of MreB in cell division and supports the hypothesis that it serves as a functional replacement for FtsZ in Chlamydia. PMID:25382739

  15. Chlamydia trachomatis protein CT009 is a structural and functional homolog to the key morphogenesis component RodZ and interacts with division septal plane localized MreB.

    PubMed

    Kemege, Kyle E; Hickey, John M; Barta, Michael L; Wickstrum, Jason; Balwalli, Namita; Lovell, Scott; Battaile, Kevin P; Hefty, P Scott

    2015-02-01

    Cell division in Chlamydiae is poorly understood as apparent homologs to most conserved bacterial cell division proteins are lacking and presence of elongation (rod shape) associated proteins indicate non-canonical mechanisms may be employed. The rod-shape determining protein MreB has been proposed as playing a unique role in chlamydial cell division. In other organisms, MreB is part of an elongation complex that requires RodZ for proper function. A recent study reported that the protein encoded by ORF CT009 interacts with MreB despite low sequence similarity to RodZ. The studies herein expand on those observations through protein structure, mutagenesis and cellular localization analyses. Structural analysis indicated that CT009 shares high level of structural similarity to RodZ, revealing the conserved orientation of two residues critical for MreB interaction. Substitutions eliminated MreB protein interaction and partial complementation provided by CT009 in RodZ deficient Escherichia coli. Cellular localization analysis of CT009 showed uniform membrane staining in Chlamydia. This was in contrast to the localization of MreB, which was restricted to predicted septal planes. MreB localization to septal planes provides direct experimental observation for the role of MreB in cell division and supports the hypothesis that it serves as a functional replacement for FtsZ in Chlamydia. © 2014 John Wiley & Sons Ltd.

  16. Columba: an integrated database of proteins, structures, and annotations.

    PubMed

    Trissl, Silke; Rother, Kristian; Müller, Heiko; Steinke, Thomas; Koch, Ina; Preissner, Robert; Frömmel, Cornelius; Leser, Ulf

    2005-03-31

    Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures. COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable extensible markup language and human-readable format. The structures themselves can be viewed interactively on the web. The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows queries to be combined across a number of structure-related databases not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at http://www.columba-db.de.

  17. Local-global alignment for finding 3D similarities in protein structures

    DOEpatents

    Zemla, Adam T [Brentwood, CA]

    2011-09-20

    A method of finding 3D similarities in protein structures of a first molecule and a second molecule. The method comprises providing preselected information regarding the first molecule and the second molecule. Comparing the first molecule and the second molecule using Longest Continuous Segments (LCS) analysis. Comparing the first molecule and the second molecule using Global Distance Test (GDT) analysis. Comparing the first molecule and the second molecule using Local Global Alignment Scoring function (LGA_S) analysis. Verifying constructed alignment and repeating the steps to find the regions of 3D similarities in protein structures.
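
    As a rough illustration of the Global Distance Test idea named in the abstract above, the sketch below scores two coordinate sets that are assumed to be already superposed and in one-to-one residue correspondence. A full GDT calculation searches over many superpositions to maximize the residue count at each cutoff; that search is omitted here. The cutoffs of 1, 2, 4 and 8 Å are the ones commonly used for a GDT_TS-style score and are not stated in the abstract; the function name and toy coordinates are assumptions.

```python
import math

def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average, over the cutoffs, of the percentage of residues within each cutoff.

    Assumes `model` and `reference` are already superposed and that the
    k-th entries of both lists refer to the same residue.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    distances = [math.dist(a, b) for a, b in zip(model, reference)]
    fractions = [sum(d <= c for d in distances) / len(distances) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Hypothetical toy example: four residues displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (4.0, 1.5, 0.0), (8.0, 3.0, 0.0), (12.0, 10.0, 0.0)]
score = gdt_ts(model, reference)
```

    Because each cutoff contributes separately, a single badly placed residue lowers the score only modestly, unlike a root-mean-square deviation, which one large outlier can dominate.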

  18. Text Mining for Protein Docking

    PubMed Central

    Badal, Varsha D.; Kundrotas, Petras J.; Vakser, Ilya A.

    2015-01-01

    The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound benchmark set, significantly increasing the docking success rate. PMID:26650466

  19. Dimerization of a flocculent protein from Moringa oleifera: experimental evidence and in silico interpretation.

    PubMed

    Pavankumar, Asalapuram R; Kayathri, Rajarathinam; Murugan, Natarajan A; Zhang, Qiong; Srivastava, Vaibhav; Okoli, Chuka; Bulone, Vincent; Rajarao, Gunaratna K; Ågren, Hans

    2014-01-01

    Many proteins exist in dimeric and other oligomeric forms to gain stability and functional advantages. In this study, the dimerization property of a coagulant protein (MO2.1) from Moringa oleifera seeds was addressed through laboratory experiments, protein-protein docking studies and binding free energy calculations. The structure of MO2.1 was predicted by homology modelling, while binding free energy and residues-distance profile analyses provided insight into the energetics and structural factors for dimer formation. Since the coagulation activities of the monomeric and dimeric forms of MO2.1 were comparable, it was concluded that oligomerization does not affect the biological activity of the protein.

  20. Comprehensive comparative analysis and identification of RNA-binding protein domains: multi-class classification and feature selection.

    PubMed

    Jahandideh, Samad; Srinivasasainagendra, Vinodh; Zhi, Degui

    2012-11-07

    RNA-protein interaction plays an important role in various cellular processes, such as protein synthesis, gene regulation, post-transcriptional gene regulation, alternative splicing, and infections by RNA viruses. In this study, using the Gene Ontology Annotated (GOA) and Structural Classification of Proteins (SCOP) databases, an automatic procedure was designed to capture structurally solved RNA-binding protein domains in different subclasses. Subsequently, we applied tuned multi-class SVM (TMCSVM), Random Forest (RF), and multi-class ℓ1/ℓq-regularized logistic regression (MCRLR) for analyzing and classifying RNA-binding protein domains based on a comprehensive set of sequence and structural features. In this study, we compared the prediction accuracy of three different state-of-the-art predictor methods. In our results, TMCSVM outperforms the other methods, which suggests its potential as a useful tool for facilitating the multi-class prediction of RNA-binding protein domains. On the other hand, MCRLR, by elucidating the importance of features for their contribution to the predictive accuracy of RNA-binding protein domain subclasses, helps to provide some biological insights into the roles of sequences and structures in protein-RNA interactions.
