Functional assignment to JEV proteins using SVM
Sahoo, Ganesh Chandra; Dikhit, Manas Ranjan; Das, Pradeep
2008-01-01
Identification of different protein functions facilitates a mechanistic understanding of Japanese encephalitis virus (JEV) infection and opens novel means for drug development. A support vector machine (SVM), useful for predicting the functional class of distantly related proteins, is employed to ascribe a possible functional class to Japanese encephalitis virus proteins. Our study using SVMProt and available JE virus sequences suggests that structural and nonstructural proteins of the JEV genome possibly belong to diverse functional classes that are expected to occur in the life cycle of JE virus. Protein functions common to both structural and non-structural proteins are iron-binding, metal-binding, lipid-binding, copper-binding, transmembrane, outer membrane, and channels/pores - pore-forming toxins (proteins and peptides) group of proteins. Non-structural proteins perform functions like actin binding, zinc-binding, calcium-binding, hydrolases, carbon-oxygen lyases, P-type ATPase, proteins belonging to major facilitator family (MFS), secreting main terminal branch (MTB) family, phosphotransfer-driven group translocators and ATP-binding cassette (ABC) family group of proteins. Structural proteins, besides belonging to the same structural group of proteins (capsid, structural, envelope), also perform functions like nuclear receptor, antibiotic resistance, RNA-binding, DNA-binding, magnesium-binding, isomerase (intra-molecular), and oxidoreductase, and participate in the type II (general) secretory pathway (IISP). PMID:19052658
Visualizing and Clustering Protein Similarity Networks: Sequences, Structures, and Functions.
Mai, Te-Lun; Hu, Geng-Ming; Chen, Chi-Ming
2016-07-01
Research in the recent decade has demonstrated the usefulness of protein network knowledge in furthering the study of molecular evolution of proteins, understanding the robustness of cells to perturbation, and annotating new protein functions. In this study, we aimed to provide a general clustering approach to visualize the sequence-structure-function relationship of protein networks, and investigate possible causes for inconsistency in the protein classifications based on sequences, structures, and functions. Such visualization of protein networks could facilitate our understanding of the overall relationship among proteins and help researchers comprehend various protein databases. As a demonstration, we clustered 1437 enzymes by their sequences and structures using the minimum span clustering (MSC) method. The general structure of this protein network was delineated at two clustering resolutions, and the second level MSC clustering was found to be highly similar to existing enzyme classifications. The clusterings of these enzymes based on sequence, structure, and function information are consistent with one another. For proteases, the Jaccard similarity coefficient is 0.86 between sequence and function classifications, 0.82 between sequence and structure classifications, and 0.78 between structure and function classifications. From our clustering results, we discussed possible examples of divergent evolution and convergent evolution of enzymes. Our clustering approach provides a panoramic view of the sequence-structure-function network of proteins, helps visualize the relation between related proteins intuitively, and is useful in predicting the structure and function of newly determined protein sequences.
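As an illustration of how two classifications of the same proteins can be compared with a Jaccard similarity coefficient, the sketch below uses the pair-counting form: the number of protein pairs grouped together in both classifications, divided by the number of pairs grouped together in at least one. This is an assumption for illustration only; the abstract does not say which variant the authors computed, and the protein identifiers and labels are hypothetical toy data.

```python
from itertools import combinations


def co_clustered_pairs(labels):
    """Return the set of item pairs that share a cluster label."""
    return {
        (a, b)
        for a, b in combinations(sorted(labels), 2)
        if labels[a] == labels[b]
    }


def jaccard_between_clusterings(labels_a, labels_b):
    """Pair-counting Jaccard coefficient between two clusterings of the same items."""
    if labels_a.keys() != labels_b.keys():
        raise ValueError("both clusterings must cover the same items")
    pairs_a = co_clustered_pairs(labels_a)
    pairs_b = co_clustered_pairs(labels_b)
    union = pairs_a | pairs_b
    if not union:
        # Neither clustering groups any two items together: treat as identical.
        return 1.0
    return len(pairs_a & pairs_b) / len(union)


# Hypothetical toy data: five proteins classified by sequence and by function.
by_sequence = {"P1": "s1", "P2": "s1", "P3": "s1", "P4": "s2", "P5": "s2"}
by_function = {"P1": "f1", "P2": "f1", "P3": "f2", "P4": "f2", "P5": "f2"}

print(jaccard_between_clusterings(by_sequence, by_function))
```

A value of 1 means the two classifications group exactly the same pairs together; values near 0 mean they rarely agree on which proteins belong together.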
Dong, Zheng; Zhou, Hongyu; Tao, Peng
2018-02-01
PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore the functional evolutionary relationship among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for members of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build a phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.
Understand protein functions by comparing the similarity of local structural environments.
Chen, Jiawen; Xie, Zhong-Ru; Wu, Yinghao
2017-02-01
The three-dimensional structures of proteins play an essential role in regulating binding between proteins and their partners, offering a direct relationship between structures and functions of proteins. It is widely accepted that the function of a protein can be determined if its structure is similar to other proteins whose functions are known. However, it is also observed that proteins with similar global structures do not necessarily correspond to the same function, while proteins with very different folds can share similar functions. This indicates that function similarity originates from the local structural information of proteins instead of their global shapes. We assume that proteins with similar local environments prefer binding to similar types of molecular targets. To test this assumption, we designed a new structural indicator to define the similarity of local environment between residues in different proteins. This indicator was further used to calculate the probability that a given residue binds to a specific type of structural neighbor, including DNA, RNA, small molecules and proteins. After applying the method to a large-scale non-redundant database of proteins, we show that the positive signal of binding probability calculated from the local structural indicator is statistically meaningful. In summary, our studies suggested that the local environment of residues in a protein is a good indicator to recognize specific binding partners of the protein. The new method could be a potential addition to a suite of existing template-based approaches for protein function prediction. Copyright © 2016 Elsevier B.V. All rights reserved.
von Grotthuss, Marcin; Plewczynski, Dariusz; Ginalski, Krzysztof; Rychlewski, Leszek; Shakhnovich, Eugene I
2006-02-06
The number of protein structures from structural genomics centers in the Protein Data Bank (PDB) is increasing dramatically. Many of these structures are functionally unannotated because they have no sequence similarity to proteins of known function. However, it is possible to successfully infer function using only structural similarity. Here we present the PDB-UF database, a web-accessible collection of predictions of enzymatic properties based on structure-function relationships. The assignments were conducted for three-dimensional protein structures of unknown function that come from structural genomics initiatives. We show that 4 hypothetical proteins (with PDB accession codes: 1VH0, 1NS5, 1O6D, and 1TO0), for which standard BLAST tools such as PSI-BLAST or RPS-BLAST failed to assign any function, are probably methyltransferase enzymes. We suggest that the structure-based prediction of an EC number should be conducted using a different similarity score cutoff for different protein folds. Moreover, performing the annotation using two different algorithms can reduce the rate of false positive assignments. We believe that the presented web-based repository will help to decrease the number of protein structures that have functions marked as "unknown" in the PDB file. http://paradox.harvard.edu/PDB-UF and http://bioinfo.pl/PDB-UF.
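The two recommendations in this abstract (a fold-specific score cutoff, and agreement between two independent algorithms) can be sketched as a simple consensus filter. This is a hypothetical illustration, not the PDB-UF implementation: the fold names, cutoff values, the 0-to-1 score scale, and the function name `consensus_ec` are all assumptions.

```python
# Hypothetical fold-specific cutoffs on a normalized 0..1 similarity score.
DEFAULT_CUTOFF = 0.7
FOLD_CUTOFFS = {"Rossmann-like": 0.8, "TIM barrel": 0.6}


def consensus_ec(fold, hit_a, hit_b,
                 fold_cutoffs=FOLD_CUTOFFS, default_cutoff=DEFAULT_CUTOFF):
    """Return an EC number only when two methods agree and both pass the cutoff.

    Each hit is either None or a tuple (ec_number, score) from one
    structure-comparison method. The cutoff is looked up per fold.
    """
    if hit_a is None or hit_b is None:
        return None
    ec_a, score_a = hit_a
    ec_b, score_b = hit_b
    if ec_a != ec_b:
        return None  # the two algorithms disagree: no assignment
    cutoff = fold_cutoffs.get(fold, default_cutoff)
    if min(score_a, score_b) < cutoff:
        return None  # at least one score is below the fold-specific cutoff
    return ec_a


# Same pair of hits, judged under two different (hypothetical) fold cutoffs.
hits = (("2.1.1.-", 0.65), ("2.1.1.-", 0.70))
print(consensus_ec("TIM barrel", *hits))
print(consensus_ec("Rossmann-like", *hits))
```

Requiring agreement trades coverage for precision: a structure gets no annotation when either method is silent, which is the intended behavior when the goal is to reduce false positives.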
In Silico Analysis for the Study of Botulinum Toxin Structure
NASA Astrophysics Data System (ADS)
Suzuki, Tomonori; Miyazaki, Satoru
2010-01-01
Protein-protein interactions play many important roles in biological function. Knowledge of protein-protein complex structure is required for understanding the function. The determination of protein-protein complex structure by experimental studies remains difficult; therefore, computational prediction of protein structures by structure modeling and docking studies is a valuable method. In addition, MD simulation is one of the most popular methods for protein structure modeling and characterization. Here, we attempt to predict protein-protein complex structure and properties using several bioinformatic methods, and we focus on the botulinum toxin complex as the target structure.
Dynamic New World: Refining Our View of Protein Structure, Function and Evolution
Mannige, Ranjan V.
2014-01-01
Proteins are crucial to the functioning of all lifeforms. Traditional understanding posits that a single protein occupies a single structure (“fold”), which performs a single function. This view is radically challenged with the recognition that high structural dynamism—the capacity to be extra “floppy”—is more prevalent in functional proteins than previously assumed. As reviewed here, this dynamic take on proteins affects our understanding of protein “structure”, function, and evolution, and even gives us a glimpse into protein origination. Specifically, this review will discuss historical developments concerning protein structure, and important new relationships between dynamism and aspects of protein sequence, structure, binding modes, binding promiscuity, evolvability, and origination. Along the way, suggestions will be provided for how key parts of textbook definitions—that so far have excluded membership to intrinsically disordered proteins (IDPs)—could be modified to accommodate our more dynamic understanding of proteins. PMID:28250374
Rincon, Sergio A; Paoletti, Anne
2016-01-01
Unveiling the function of a novel protein is a challenging task that requires careful experimental design. Yeast cytokinesis is a conserved process that involves modular structural and regulatory proteins. For such proteins, an important step is to identify their domains and structural organization. Here we briefly discuss a collection of methods commonly used for sequence alignment and prediction of protein structure that represent powerful tools for the identification of homologous domains and the design of structure-function approaches to test experimentally the function of multi-domain proteins such as those implicated in yeast cytokinesis.
Protein Structure and Function Prediction Using I-TASSER
Yang, Jianyi; Zhang, Yang
2016-01-01
I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets. PMID:26678386
Structure and function of seed storage proteins in faba bean (Vicia faba L.).
Liu, Yujiao; Wu, Xuexia; Hou, Wanwei; Li, Ping; Sha, Weichao; Tian, Yingying
2017-05-01
The protein subunit is the most important basic unit of protein, and its study can unravel the structure and function of seed storage proteins in faba bean. In this study, we identified six specific protein subunits in faba bean (cv. Qinghai 13) by combining liquid chromatography (LC), liquid chromatography-electrospray ionization tandem mass spectrometry (LC-ESI-MS/MS), and bioinformatics. The results suggested a diversity of seed storage proteins in faba bean, and a total of 16 proteins (four GroEL molecular chaperones and 12 plant-specific proteins) were identified from 97-, 96-, 64-, 47-, 42-, and 38-kD-specific protein subunits in faba bean based on the peptide sequence. We also analyzed the composition and abundance of the amino acids, the physicochemical characteristics, secondary structure, three-dimensional structure, transmembrane domain, and possible subcellular localization of these identified proteins in faba bean seed, and finally predicted function and structure. The three-dimensional structures were generated by homology modeling, and the protein function was analyzed based on the annotation from the non-redundant protein database (NR database, NCBI) and function analysis of the optimal model. The objective of this study was to identify the seed storage proteins in faba bean and confirm the structure and function of these proteins. Our results can be useful for the study of protein nutrition and for achieving breeding goals for optimal protein quality in faba bean.
Tuncbag, Nurcan; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem
2011-08-11
Prediction of protein-protein interactions at the structural level on the proteome scale is important because it allows prediction of protein function, helps drug discovery and takes steps toward genome-wide structural systems biology. We provide a protocol (termed PRISM, protein interactions by structural matching) for large-scale prediction of protein-protein interactions and assembly of protein complex structures. The method consists of two components: rigid-body structural comparisons of target proteins to known template protein-protein interfaces and flexible refinement using a docking energy function. The PRISM rationale follows our observation that globally different protein structures can interact via similar architectural motifs. PRISM predicts binding residues by using structural similarity and evolutionary conservation of putative binding residue 'hot spots'. Ultimately, PRISM could help to construct cellular pathways and functional, proteome-scale annotation. PRISM is implemented in Python and runs in a UNIX environment. The program accepts Protein Data Bank-formatted protein structures and is available at http://prism.ccbb.ku.edu.tr/prism_protocol/.
Bandyopadhyay, Deepak; Huan, Jun; Prins, Jan; Snoeyink, Jack; Wang, Wei; Tropsha, Alexander
2009-11-01
Protein function prediction is one of the central problems in computational biology. We present a novel automated protein structure-based function prediction method using libraries of local residue packing patterns that are common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph where residue vertices (residue name used as a vertex label) are connected by geometrical proximity edges. The approach employs two steps. First, it uses a fast subgraph mining algorithm to find all occurrences of family-specific labeled subgraphs for all well characterized protein structural and functional families. Second, it queries a new structure for occurrences of a set of motifs characteristic of a known family, using a graph index to speed up Ullmann's subgraph isomorphism algorithm. The confidence of function inference from structure depends on the number of family-specific motifs found in the query structure compared with their distribution in a large non-redundant database of proteins. This method can assign a new structure to a specific functional family in cases where sequence alignments, sequence patterns, structural superposition and active site templates fail to provide accurate annotation.
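The graph representation described above (residue-name vertex labels, edges between spatially close residues) can be sketched as follows. This minimal example covers only the representation step, not subgraph mining or isomorphism search. It assumes one representative coordinate per residue and a plain distance cutoff of 8.0 Å as the proximity criterion; the abstract does not specify how the authors define proximity, and the coordinates are hypothetical toy values.

```python
from itertools import combinations
from math import dist


def build_residue_graph(residues, cutoff=8.0):
    """Build a labeled proximity graph from residues.

    residues: list of (residue_name, (x, y, z)) tuples, one point per residue.
    cutoff:   assumed distance threshold (in the coordinate units) for an edge.
    Returns (labels, edges), where labels[i] is the vertex label of residue i
    and edges is a set of index pairs (i, j) with i < j.
    """
    labels = [name for name, _ in residues]
    edges = set()
    for i, j in combinations(range(len(residues)), 2):
        if dist(residues[i][1], residues[j][1]) <= cutoff:
            edges.add((i, j))
    return labels, edges


# Hypothetical toy structure: three residues packed together, one far away.
toy_residues = [
    ("HIS", (0.0, 0.0, 0.0)),
    ("ASP", (5.0, 0.0, 0.0)),
    ("SER", (5.0, 6.0, 0.0)),
    ("GLY", (30.0, 0.0, 0.0)),
]

labels, edges = build_residue_graph(toy_residues)
print(labels)
print(sorted(edges))
```

Because vertices carry only the residue name, the same labeled triangle can be searched for in any other structure regardless of sequence position, which is what makes the representation suitable for motif queries.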
A Web-Accessible Protein Structure Prediction Pipeline
2009-06-01
Proteins are the molecular basis of nearly all structural, catalytic, sensory, and regulatory functions in living organisms. The biological ... The structure of a protein is essential in understanding its function at the molecular level ... Characterizing sequence-structure and structure-function relationships has been a goal of molecular biology for more than three decades
Rebelling for a Reason: Protein Structural “Outliers”
Arumugam, Gandhimathi; Nair, Anu G.; Hariharaputran, Sridhar; Ramanathan, Sowdhamini
2013-01-01
Analysis of structural variation in domain superfamilies can reveal constraints in protein evolution which aids protein structure prediction and classification. Structure-based sequence alignment of distantly related proteins, organized in PASS2 database, provides clues about structurally conserved regions among different functional families. Some superfamily members show large structural differences which are functionally relevant. This paper analyses the impact of structural divergence on function for multi-member superfamilies, selected from the PASS2 superfamily alignment database. Functional annotations within superfamilies, with structural outliers or ‘rebels’, are discussed in the context of structural variations. Overall, these data reinforce the idea that functional similarities cannot be extrapolated from mere structural conservation. The implication for fold-function prediction is that the functional annotations can only be inherited with very careful consideration, especially at low sequence identities. PMID:24073209
Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A
2017-04-01
Functional sites define the diversity of protein functions and are the central object of research on the structural and functional organization of proteins. The mechanisms underlying the emergence of protein functional sites and their variability during evolution include duplication, shuffling, insertion and deletion of exons in genes. The study of the correlation between site structure and exon structure serves as the basis for an in-depth understanding of site organization. In this regard, the development of software resources that allow the mutual projection of the exon structure of genes and the primary and tertiary structures of the encoded proteins remains a pressing problem. Previously, we developed the SitEx system, which provides information about protein and gene sequences with mapped exon borders and the amino acid positions of protein functional sites. The database included information on proteins with known 3D structure. However, data on orthologs were not available. Therefore, we added the projection of site positions onto the exon structures of orthologs in SitEx 2.0. We implemented a database search using site conservation variability and site discontinuity through exon structure. Inclusion of the information on orthologs allowed us to expand the possibilities of SitEx usage for solving problems regarding the analysis of the structural and functional organization of proteins. Database URL: http://www-bionet.sscc.ru/sitex/.
G-LoSA for Prediction of Protein-Ligand Binding Sites and Structures.
Lee, Hui Sun; Im, Wonpil
2017-01-01
Recent advances in high-throughput structure determination and computational protein structure prediction have significantly enriched the universe of protein structure. However, there is still a large gap between the number of available protein structures and the number of proteins with accurately annotated function. Computational structure-based protein function prediction has emerged to reduce this knowledge gap. The identification of a ligand binding site and its structure is critical to the determination of a protein's molecular function. We present a computational methodology for predicting small-molecule ligand binding sites and ligand structures using G-LoSA, our protein local structure alignment and similarity measurement tool. All the computational procedures described here can be easily implemented using G-LoSA Toolkit, a package of standalone software programs and preprocessed PDB structure libraries. G-LoSA and G-LoSA Toolkit are freely available to academic users at http://compbio.lehigh.edu/GLoSA. We also present a case study to show the potential of our template-based approach harnessing G-LoSA for protein function prediction.
Density functional study of molecular interactions in secondary structures of proteins.
Takano, Yu; Kusaka, Ayumi; Nakamura, Haruki
2016-01-01
Proteins play diverse and vital roles in biology, which are dominated by their three-dimensional structures. The three-dimensional structure of a protein determines its functions and chemical properties. Protein secondary structures, including α-helices and β-sheets, are key components of the protein architecture. Molecular interactions, in particular hydrogen bonds, play significant roles in the formation of protein secondary structures. Precise and quantitative estimations of these interactions are required to understand the principles underlying the formation of three-dimensional protein structures. In the present study, we have investigated the molecular interactions in α-helices and β-sheets, using ab initio wave function-based methods, the Hartree-Fock method (HF) and the second-order Møller-Plesset perturbation theory (MP2), density functional theory, and molecular mechanics. The characteristic interactions essential for forming the secondary structures are discussed quantitatively.
Detection of functionally important regions in "hypothetical proteins" of known structure.
Nimrod, Guy; Schushan, Maya; Steinberg, David M; Ben-Tal, Nir
2008-12-10
Structural genomics initiatives provide ample structures of "hypothetical proteins" (i.e., proteins of unknown function) at an ever increasing rate. However, without function annotation, this structural goldmine is of little use to biologists who are interested in particular molecular systems. To this end, we used (an improved version of) the PatchFinder algorithm for the detection of functional regions on the protein surface, which could mediate its interactions with, e.g., substrates, ligands, and other proteins. Examination, using a data set of annotated proteins, showed that PatchFinder outperforms similar methods. We collected 757 structures of hypothetical proteins and their predicted functional regions in the N-Func database. Inspection of several of these regions demonstrated that they are useful for function prediction. For example, we suggested an interprotein interface and a putative nucleotide-binding site. A web-server implementation of PatchFinder and the N-Func database are available at http://patchfinder.tau.ac.il/.
Functional classification of protein structures by local structure matching in graph representation.
Mills, Caitlyn L; Garg, Rohan; Lee, Joslynn S; Tian, Liang; Suciu, Alexandru; Cooperman, Gene; Beuning, Penny J; Ondrechen, Mary Jo
2018-03-31
As a result of high-throughput protein structure initiatives, over 14,400 protein structures have been solved by structural genomics (SG) centers and participating research groups. While the totality of SG data represents a tremendous contribution to genomics and structural biology, reliable functional information for these proteins is generally lacking. Better functional predictions for SG proteins will add substantial value to the structural information already obtained. Our method described herein, Graph Representation of Active Sites for Prediction of Function (GRASP-Func), predicts quickly and accurately the biochemical function of proteins by representing residues at the predicted local active site as graphs rather than in Cartesian coordinates. We compare the GRASP-Func method to our previously reported method, structurally aligned local sites of activity (SALSA), using the ribulose phosphate binding barrel (RPBB), 6-hairpin glycosidase (6-HG), and Concanavalin A-like Lectins/Glucanase (CAL/G) superfamilies as test cases. In each of the superfamilies, SALSA and the much faster method GRASP-Func yield similar correct classification of previously characterized proteins, providing a validated benchmark for the new method. In addition, we analyzed SG proteins using our SALSA and GRASP-Func methods to predict function. Forty-one SG proteins in the RPBB superfamily, nine SG proteins in the 6-HG superfamily, and one SG protein in the CAL/G superfamily were successfully classified into one of the functional families in their respective superfamily by both methods. This improved, faster, validated computational method can yield more reliable predictions of function that can be used for a wide variety of applications by the community. © 2018 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
Predicting nucleic acid binding interfaces from structural models of proteins
Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael
2011-01-01
The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the PatchFinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared to patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. PMID:22086767
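One simple way to combine patches from an ensemble of models is a residue-level vote: keep every residue that appears in the patch of at least a given fraction of models, then measure how much of a reference interface the combined patch recovers. The sketch below is an assumption for illustration; the abstract does not describe how the authors' combined patch is built, and the residue numbers, the 0.5 threshold, and the function names are hypothetical.

```python
from collections import Counter


def combined_patch(patches, min_fraction=0.5):
    """Return residues present in at least min_fraction of the per-model patches.

    patches: list of sets of residue numbers, one set per structural model.
    """
    if not patches:
        return set()
    counts = Counter(res for patch in patches for res in set(patch))
    needed = min_fraction * len(patches)
    return {res for res, n in counts.items() if n >= needed}


def overlap_fraction(predicted, reference):
    """Fraction of reference interface residues recovered by the prediction."""
    if not reference:
        return 0.0
    return len(predicted & reference) / len(reference)


# Hypothetical toy data: patches from three models and a reference interface.
model_patches = [{10, 11, 12, 40}, {10, 11, 13}, {10, 12, 13, 41}]
reference_interface = {10, 11, 12, 13, 14}

consensus = combined_patch(model_patches)
print(sorted(consensus))
print(overlap_fraction(consensus, reference_interface))
```

In this toy case the vote discards residues 40 and 41, each supported by a single model, which shows how an ensemble can suppress model-specific noise that any one model would carry.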
An overview of the structures of protein-DNA complexes
Luscombe, Nicholas M; Austin, Susan E; Berman, Helen M; Thornton, Janet M
2000-01-01
On the basis of a structural analysis of 240 protein-DNA complexes contained in the Protein Data Bank (PDB), we have classified the DNA-binding proteins involved into eight different structural/functional groups, which are further classified into 54 structural families. Here we present this classification and review the functions, structures and binding interactions of these protein-DNA complexes. PMID:11104519
Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren
2016-11-01
Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. 
Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.
Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M; Wilkins, Angela D; Venner, Eric; Morgan, Daniel H; Lichtarge, Olivier
2016-01-04
The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
LenVarDB: database of length-variant protein domains.
Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan
2014-01-01
Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.
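The idea of locating length-variant (indel) regions in an alignment can be illustrated with a small sketch. This is not LenVarDB's own pipeline; the function name `gap_runs`, the gap character, and the two toy aligned sequences are assumptions made for illustration only.

```python
def gap_runs(aligned, min_len=1, gap="-"):
    """Return (start, end) column ranges (0-based, end exclusive) of gap runs."""
    runs = []
    start = None
    for col, ch in enumerate(aligned):
        if ch == gap and start is None:
            start = col                      # a gap run opens here
        elif ch != gap and start is not None:
            if col - start >= min_len:
                runs.append((start, col))    # a gap run closes here
            start = None
    if start is not None and len(aligned) - start >= min_len:
        runs.append((start, len(aligned)))   # run reaching the alignment end
    return runs


# Hypothetical pairwise alignment (same number of columns in both rows).
reference = "MKT--LLVAGG"
homologue = "MKTAELL---G"

# Gaps in the reference mark insertions in the homologue, and vice versa.
print("insertions in homologue:", gap_runs(reference))
print("deletions in homologue:", gap_runs(homologue))
```

The column ranges returned this way could then be mapped onto a structure or an alignment view, which is the kind of visualization the abstract describes.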
2016-01-01
Biologically active but floppy proteins represent a new reality of modern protein science. These intrinsically disordered proteins (IDPs) and hybrid proteins containing ordered and intrinsically disordered protein regions (IDPRs) constitute a noticeable part of any given proteome. Functionally, they complement ordered proteins, and their conformational flexibility and structural plasticity allow them to perform impossible tricks and be engaged in biological activities that are inaccessible to well folded proteins with their unique structures. The major goals of this minireview are to show that, despite their simplified amino acid sequences, IDPs/IDPRs are complex entities often resembling chaotic systems, are structurally and functionally heterogeneous, and can be considered an important part of the structure-function continuum. Furthermore, IDPs/IDPRs are everywhere, and are ubiquitously engaged in various interactions characterized by a wide spectrum of binding scenarios and an even wider spectrum of structural and functional outputs. PMID:26851286
From Sequence and Forces to Structure, Function and Evolution of Intrinsically Disordered Proteins
Forman-Kay, Julie D.; Mittag, Tanja
2015-01-01
Intrinsically disordered proteins (IDPs), which lack persistent structure, are a challenge to structural biology due to the inapplicability of standard methods for characterization of folded proteins as well as their deviation from the dominant structure/function paradigm. Their widespread presence and involvement in biological function, however, has spurred the growing acceptance of the importance of IDPs and the development of new tools for studying their structure, dynamics and function. The interplay of folded and disordered domains or regions for function and the existence of a continuum of protein states with respect to conformational energetics, motional timescales and compactness is shaping a unified understanding of structure-dynamics-disorder/function relationships. On the 20th anniversary of this journal, Structure, we provide a historical perspective on the investigation of IDPs and summarize the sequence features and physical forces that underlie their unique structural, functional and evolutionary properties. PMID:24010708
From sequence and forces to structure, function, and evolution of intrinsically disordered proteins.
Forman-Kay, Julie D; Mittag, Tanja
2013-09-03
Intrinsically disordered proteins (IDPs), which lack persistent structure, are a challenge to structural biology due to the inapplicability of standard methods for characterization of folded proteins as well as their deviation from the dominant structure/function paradigm. Their widespread presence and involvement in biological function, however, has spurred the growing acceptance of the importance of IDPs and the development of new tools for studying their structure, dynamics, and function. The interplay of folded and disordered domains or regions for function and the existence of a continuum of protein states with respect to conformational energetics, motional timescales, and compactness are shaping a unified understanding of structure-dynamics-disorder/function relationships. On the 20th anniversary of Structure, we provide a historical perspective on the investigation of IDPs and summarize the sequence features and physical forces that underlie their unique structural, functional, and evolutionary properties. Copyright © 2013 Elsevier Ltd. All rights reserved.
Predicting nucleic acid binding interfaces from structural models of proteins.
Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael
2012-02-01
The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Copyright © 2011 Wiley Periodicals, Inc.
Kemege, Kyle E.; Hickey, John M.; Lovell, Scott; Battaile, Kevin P.; Zhang, Yang; Hefty, P. Scott
2011-01-01
Chlamydia trachomatis is a medically important pathogen that encodes a relatively high percentage of proteins with unknown function. The three-dimensional structure of a protein can be very informative regarding the protein's functional characteristics; however, determining protein structures experimentally can be very challenging. Computational methods that model protein structures with sufficient accuracy to facilitate functional studies have had notable successes. To evaluate the accuracy and potential impact of computational protein structure modeling of hypothetical proteins encoded by Chlamydia, a successful computational method termed I-TASSER was utilized to model the three-dimensional structure of a hypothetical protein encoded by open reading frame (ORF) CT296. CT296 has been reported to exhibit functional properties of a divalent cation transcription repressor (DcrA), with similarity to the Escherichia coli iron-responsive transcriptional repressor, Fur. Unexpectedly, the I-TASSER model of CT296 exhibited no structural similarity to any DNA-interacting proteins or motifs. To validate the I-TASSER-generated model, the structure of CT296 was solved experimentally using X-ray crystallography. Impressively, the ab initio I-TASSER-generated model closely matched (2.72-Å Cα root mean square deviation [RMSD]) the high-resolution (1.8-Å) crystal structure of CT296. Modeled and experimentally determined structures of CT296 share structural characteristics of non-heme Fe(II) 2-oxoglutarate-dependent enzymes, although key enzymatic residues are not conserved, suggesting a unique biochemical process is likely associated with CT296 function. Additionally, functional analyses did not support prior reports that CT296 has properties shared with divalent cation repressors such as Fur. PMID:21965559
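The Cα root mean square deviation quoted above (2.72 Å between model and crystal structure) is the square root of the mean squared distance between matched Cα atoms. The sketch below shows only that final arithmetic; it assumes the two structures are already superposed and residue-matched, and the function name `ca_rmsd` and the toy coordinates are hypothetical, not data from the study.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumes the structures are already superposed and residue-matched;
    no optimal superposition is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in angstroms (hypothetical, three residues only).
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
crystal = [(0.0, 0.0, 0.0), (3.8, 3.0, 0.0), (7.6, 0.0, 4.0)]

print(f"C-alpha RMSD: {ca_rmsd(model, crystal):.2f} A")
```

In practice a superposition step (for example a least-squares fit) would precede this calculation; it is omitted here to keep the sketch short.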
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kemege, Kyle E.; Hickey, John M.; Lovell, Scott
2012-02-13
Chlamydia trachomatis is a medically important pathogen that encodes a relatively high percentage of proteins with unknown function. The three-dimensional structure of a protein can be very informative regarding the protein's functional characteristics; however, determining protein structures experimentally can be very challenging. Computational methods that model protein structures with sufficient accuracy to facilitate functional studies have had notable successes. To evaluate the accuracy and potential impact of computational protein structure modeling of hypothetical proteins encoded by Chlamydia, a successful computational method termed I-TASSER was utilized to model the three-dimensional structure of a hypothetical protein encoded by open reading frame (ORF) CT296. CT296 has been reported to exhibit functional properties of a divalent cation transcription repressor (DcrA), with similarity to the Escherichia coli iron-responsive transcriptional repressor, Fur. Unexpectedly, the I-TASSER model of CT296 exhibited no structural similarity to any DNA-interacting proteins or motifs. To validate the I-TASSER-generated model, the structure of CT296 was solved experimentally using X-ray crystallography. Impressively, the ab initio I-TASSER-generated model closely matched (2.72-Å Cα root mean square deviation [RMSD]) the high-resolution (1.8-Å) crystal structure of CT296. Modeled and experimentally determined structures of CT296 share structural characteristics of non-heme Fe(II) 2-oxoglutarate-dependent enzymes, although key enzymatic residues are not conserved, suggesting a unique biochemical process is likely associated with CT296 function. Additionally, functional analyses did not support prior reports that CT296 has properties shared with divalent cation repressors such as Fur.
A topological approach for protein classification
Cang, Zixuan; Mu, Lin; Wu, Kedi; ...
2015-11-04
Here, protein function and dynamics are closely related to its sequence and structure. However, prediction of protein function and dynamics from its sequence and structure is still a fundamental challenge in molecular biology. Protein classification, which is typically done through measuring the similarity between proteins based on protein sequence or physical information, serves as a crucial step toward the understanding of protein function and dynamics.
Mao, Xiaoying; Hua, Yufei
2012-01-01
In this study, the composition, structure and functional properties of walnut protein concentrate (WPC) and walnut protein isolate (WPI) produced from defatted walnut flour (DFWF) were investigated. The results showed that the composition and structure of WPC and WPI were significantly different. The molecular weight distribution of WPI was uniform, whereas the protein composition of DFWF and WPC was complex, with protein aggregation. H(0) of WPC was significantly higher (p < 0.05) than that of DFWF and WPI, whilst WPI had a higher H(0) than DFWF. The secondary structure of WPI was similar to that of WPC. WPI showed large, flaky, plate-like structures, whereas WPC appeared as a small, flaky and more compact structure. Most functional properties of WPI were better than those of WPC. In comparing most functional properties of WPI and WPC with those of soybean protein concentrate and isolate, WPI and WPC showed higher fat absorption capacity (FAC). Emulsifying properties and foam properties of WPC and WPI at alkaline pH were comparable with those of soybean protein concentrate and isolate. Walnut protein concentrates and isolates can be considered as potential functional food ingredients.
Houston, Simon; Lithgow, Karen Vivien; Osbak, Kara Krista; Kenyon, Chris Richard; Cameron, Caroline E
2018-05-16
Syphilis continues to be a major global health threat with 11 million new infections each year, and a global burden of 36 million cases. The causative agent of syphilis, Treponema pallidum subspecies pallidum, is a highly virulent bacterium; however, the molecular mechanisms underlying T. pallidum pathogenesis remain to be definitively identified. This is due to the fact that T. pallidum is currently uncultivatable, inherently fragile and thus difficult to work with, and phylogenetically distinct with no conventional virulence factor homologs found in other pathogens. In fact, approximately 30% of its predicted protein-coding genes have no known orthologs or assigned functions. Here we employed a structural bioinformatics approach using Phyre2-based tertiary structure modeling to improve our understanding of T. pallidum protein function on a proteome-wide scale. Phyre2-based tertiary structure modeling generated high-confidence predictions for 80% of the T. pallidum proteome (780/978 predicted proteins). Tertiary structure modeling also inferred the same function as primary structure-based annotations from genome sequencing pipelines for 525/605 proteins (87%), which represents 54% (525/978) of all T. pallidum proteins. Of the 175 T. pallidum proteins modeled with high confidence that were not assigned functions in the previously annotated published proteome, 167 (95%) were able to be assigned predicted functions. Twenty-one of the 175 hypothetical proteins modeled with high confidence were also predicted to exhibit significant structural similarity with proteins experimentally confirmed to be required for virulence in other pathogens. Phyre2-based structural modeling is a powerful bioinformatics tool that has provided insight into the potential structure and function of the majority of T. pallidum proteins and helped validate the primary structure-based annotation of more than 50% of all T. pallidum proteins with high confidence. This work represents the first T. pallidum proteome-wide structural modeling study and is one of few studies to apply this approach for the functional annotation of a whole proteome.
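The coverage figures quoted in this abstract can be re-derived from the stated counts. The short check below uses only the numerator and denominator pairs given in the text; the dictionary labels and the helper name `percent` are illustrative choices, not terms from the study.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# (numerator, denominator) pairs copied from the abstract.
ratios = {
    "proteome modeled with high confidence": (780, 978),
    "model agrees with sequence-based annotation": (525, 605),
    "agreeing proteins as share of proteome": (525, 978),
    "previously unannotated proteins given a function": (167, 175),
}

for label, (part, whole) in ratios.items():
    print(f"{label}: {part}/{whole} = {percent(part, whole)}%")
```

Rounded to whole numbers, the four ratios come out at 80%, 87%, 54% and 95%, matching the percentages reported in the abstract.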
Bhagavat, Raghu; Sankar, Santhosh; Srinivasan, Narayanaswamy; Chandra, Nagasuma
2018-03-06
Protein-ligand interactions form the basis of most cellular events. Identifying ligand binding pockets in proteins will greatly facilitate rationalizing and predicting protein function. Ligand binding sites are unknown for many proteins of known three-dimensional (3D) structure, creating a gap in our understanding of protein structure-function relationships. To bridge this gap, we detect pockets in proteins of known 3D structures, using computational techniques. This augmented pocketome (PocketDB) consists of 249,096 pockets, which is about seven times larger than what is currently known. We deduce possible ligand associations for about 46% of the newly identified pockets. The augmented pocketome, when subjected to clustering based on similarities among pockets, yielded 2,161 site types, which are associated with 1,037 ligand types, together providing fold-site-type-ligand-type associations. The PocketDB resource facilitates a structure-based function annotation, delineation of the structural basis of ligand recognition, and provides functional clues for domains of unknown functions, allosteric proteins, and druggable pockets. Copyright © 2018 Elsevier Ltd. All rights reserved.
Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude
2011-06-20
One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
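Once structures are encoded as one-dimensional strings, searching for over-represented seven-residue motifs becomes a word-counting problem. The sketch below is a simplification: it uses a plain frequency ratio instead of the advanced pattern statistics the abstract mentions, and the function names, the letters, and the toy strings are all assumptions rather than real HMM-SA encodings.

```python
from collections import Counter


def count_words(sequences, k=7):
    """Count every overlapping k-letter word in a list of encoded strings."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enrichment(word, family_seqs, background_seqs, k=7):
    """Ratio of a word's relative frequency in a family to the background.

    Returns None when the ratio is undefined (no words, or word absent
    from the background). This is a naive ratio, not a significance test.
    """
    fam = count_words(family_seqs, k)
    bg = count_words(background_seqs, k)
    fam_total = sum(fam.values())
    bg_total = sum(bg.values())
    if fam_total == 0 or bg_total == 0 or bg[word] == 0:
        return None
    return (fam[word] / fam_total) / (bg[word] / bg_total)


# Hypothetical encoded loops: one "superfamily" and a larger background set.
family = ["ABCDEFGH", "XABCDEFG"]
background = family + ["QRSTUVWX", "QRSTUVWXYZ"]

print(enrichment("ABCDEFG", family, background))
```

A ratio above 1 flags a word as more frequent in the family than in the background; a real analysis would replace this ratio with a statistic suited to short sequences, as the authors describe.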
2011-01-01
Background One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Results Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins. PMID:21689388
Lee, Hasup; Baek, Minkyung; Lee, Gyu Rie; Park, Sangwoo; Seok, Chaok
2017-03-01
Many proteins function as homo- or hetero-oligomers; therefore, attempts to understand and regulate protein functions require knowledge of protein oligomer structures. The number of available experimental protein structures is increasing, and oligomer structures can be predicted using the experimental structures of related proteins as templates. However, template-based models may have errors due to sequence differences between the target and template proteins, which can lead to functional differences. Such structural differences may be predicted by loop modeling of local regions or refinement of the overall structure. In CAPRI (Critical Assessment of PRotein Interactions) round 30, we used recently developed features of the GALAXY protein modeling package, including template-based structure prediction, loop modeling, model refinement, and protein-protein docking to predict protein complex structures from amino acid sequences. Out of the 25 CAPRI targets, medium and acceptable quality models were obtained for 14 and 1 target(s), respectively, for which proper oligomer or monomer templates could be detected. Symmetric interface loop modeling on oligomer model structures successfully improved model quality, while loop modeling on monomer model structures failed. Overall refinement of the predicted oligomer structures consistently improved the model quality, in particular in interface contacts. Proteins 2017; 85:399-407. © 2016 Wiley Periodicals, Inc.
Dewhurst, Henry M.; Choudhury, Shilpa; Torres, Matthew P.
2015-01-01
Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)—a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits—conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit–N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. PMID:26070665
Gold, Nicola D; Jackson, Richard M
2006-02-03
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that given a known protein-ligand binding site allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamiles is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.
Uversky, Vladimir N
2016-03-25
Biologically active but floppy proteins represent a new reality of modern protein science. These intrinsically disordered proteins (IDPs) and hybrid proteins containing ordered and intrinsically disordered protein regions (IDPRs) constitute a noticeable part of any given proteome. Functionally, they complement ordered proteins, and their conformational flexibility and structural plasticity allow them to perform impossible tricks and be engaged in biological activities that are inaccessible to well folded proteins with their unique structures. The major goals of this minireview are to show that, despite their simplified amino acid sequences, IDPs/IDPRs are complex entities often resembling chaotic systems, are structurally and functionally heterogeneous, and can be considered an important part of the structure-function continuum. Furthermore, IDPs/IDPRs are everywhere, and are ubiquitously engaged in various interactions characterized by a wide spectrum of binding scenarios and an even wider spectrum of structural and functional outputs. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Advances in structural and functional analysis of membrane proteins by electron crystallography
Wisedchaisri, Goragot; Reichow, Steve L.; Gonen, Tamir
2011-01-01
Summary Electron crystallography is a powerful technique for the study of membrane protein structure and function in the lipid environment. When well-ordered two-dimensional crystals are obtained the structure of both protein and lipid can be determined and lipid-protein interactions analyzed. Protons and ionic charges can be visualized by electron crystallography and the protein of interest can be captured for structural analysis in a variety of physiologically distinct states. This review highlights the strengths of electron crystallography and the momentum that is building up in automation and the development of high throughput tools and methods for structural and functional analysis of membrane proteins by electron crystallography. PMID:22000511
Advances in structural and functional analysis of membrane proteins by electron crystallography.
Wisedchaisri, Goragot; Reichow, Steve L; Gonen, Tamir
2011-10-12
Electron crystallography is a powerful technique for the study of membrane protein structure and function in the lipid environment. When well-ordered two-dimensional crystals are obtained the structure of both protein and lipid can be determined and lipid-protein interactions analyzed. Protons and ionic charges can be visualized by electron crystallography and the protein of interest can be captured for structural analysis in a variety of physiologically distinct states. This review highlights the strengths of electron crystallography and the momentum that is building up in automation and the development of high throughput tools and methods for structural and functional analysis of membrane proteins by electron crystallography. Copyright © 2011 Elsevier Ltd. All rights reserved.
Self-Assembled Materials Made from Functional Recombinant Proteins.
Jang, Yeongseon; Champion, Julie A
2016-10-18
Proteins are potent molecules that can be used as therapeutics, sensors, and biocatalysts with many advantages over small-molecule counterparts due to the specificity of their activity based on their amino acid sequence and folded three-dimensional structure. However, they also have significant limitations in their stability, localization, and recovery when used in soluble form. These opportunities and challenges have motivated the creation of materials from such functional proteins in order to protect and present them in a way that enhances their function. We have designed functional recombinant fusion proteins capable of self-assembling into materials with unique structures that maintain or improve the functionality of the protein. Fusion of either a functional protein or an assembly domain to a leucine zipper domain makes the materials design strategy modular, based on the high affinity between leucine zippers. The self-assembly domains, including elastin-like polypeptides (ELPs) and defined-sequence random coil polypeptides, can be fused with a leucine zipper motif in order to promote assembly of the fusion proteins into larger structures upon specific stimuli such as temperature and ionic strength. Fusion of other functional domains with the counterpart leucine zipper motif endows the self-assembled materials with protein-specific functions such as fluorescence or catalytic activity. In this Account, we describe several examples of materials assembled from functional fusion proteins as well as the structural characterization, functionality, and understanding of the assembly mechanism. The first example is zipper fusion proteins containing ELPs that assemble into particles when introduced to a model extracellular matrix and subsequently disassemble over time to release the functional protein for drug delivery applications. Under different conditions, the same fusion proteins can self-assemble into hollow vesicles. 
The vesicles display a functional protein on the surface and can also carry protein, small-molecule, or nanoparticle cargo in the vesicle lumen. To create a material with a more complex hierarchical structure, we combined calcium phosphate with zipper fusion proteins containing random coil polypeptides to produce hybrid protein-inorganic supraparticles with high surface area and porous structure. The use of a functional enzyme created supraparticles with the ability to degrade inflammatory cytokines. Our characterization of these protein materials revealed that the molecular interactions are complex because of the large size of the protein building blocks, their folded structures, and the number of potential interactions including hydrophobic interactions, electrostatic interactions, van der Waals forces, and specific affinity-based interactions. It is difficult or even impossible to predict the structures a priori. However, once the basic assembly principles are understood, there is opportunity to tune the material properties, such as size, through control of the self-assembly conditions. Our future efforts on the fundamental side will focus on identifying the phase space of self-assembly of these fusion proteins and additional experimental levers with which to control and tune the resulting materials. On the application side, we are investigating an array of different functional proteins to expand the use of these structures in both therapeutic protein delivery and biocatalysis.
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa.
Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A
2015-01-01
Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications for designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. In light of these facts, analysis of the general properties of the structural organization of functional sites at the protein level and at the level of the exon-intron structure of the coding gene remains a topical problem. One approach to this analysis is the projection of amino acid residue positions of the functional sites along with the exon boundaries to the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. The observed tendency of exons that code functional sites to cluster suggests that such clusters could be considered a unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residue functional site, which may be evidence of the existence of evolutionary limitations to the exon shuffling. 
These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity.
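The projection described in this abstract (mapping functional-site residue positions and exon boundaries onto the coding sequence) can be illustrated with a minimal sketch. This is not the authors' software: the function names, the 1-based residue numbering, and the toy exon lengths are all assumptions made for illustration. Phase here is taken as the cumulative coding length modulo 3, so phase 0 marks a boundary that falls between codons.

```python
def exon_end_phases(exon_lengths):
    """Phase of the boundary at the end of each coding exon.

    Phase is the cumulative coding length modulo 3; phase 0 means the
    boundary falls between two codons. (Hypothetical helper.)
    """
    phases = []
    total = 0
    for length in exon_lengths:
        total += length
        phases.append(total % 3)
    return phases


def exons_for_residue(exon_lengths, residue):
    """Indices (0-based) of the exons that encode a 1-based residue.

    A codon split by an intron is reported as belonging to two exons.
    (Hypothetical helper.)
    """
    codon_start = 3 * (residue - 1)   # first nucleotide of the codon
    codon_end = codon_start + 2       # last nucleotide of the codon
    hits = []
    offset = 0                        # coding position where this exon starts
    for index, length in enumerate(exon_lengths):
        if codon_start < offset + length and codon_end >= offset:
            hits.append(index)
        offset += length
    return hits


# Toy gene: three coding exons of 10, 11 and 9 nucleotides (10 codons).
toy_exons = [10, 11, 9]
print(exon_end_phases(toy_exons))
print(exons_for_residue(toy_exons, 4))
```

With such a mapping, one could count how many distinct exons carry the residues of a functional site and tabulate boundary phases for site-coding versus other exons, which is the kind of comparison the abstract reports. The last entry returned by `exon_end_phases` corresponds to the end of the coding sequence rather than to an intron.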
MoonProt: a database for proteins that are known to moonlight
Mani, Mathew; Chen, Chang; Amblee, Vaishak; Liu, Haipeng; Mathur, Tanu; Zwicke, Grant; Zabad, Shadi; Patel, Bansi; Thakkar, Jagravi; Jeffery, Constance J.
2015-01-01
Moonlighting proteins comprise a class of multifunctional proteins in which a single polypeptide chain performs multiple biochemical functions that are not due to gene fusions, multiple RNA splice variants or pleiotropic effects. The known moonlighting proteins perform a variety of diverse functions in many different cell types and species, and information about their structures and functions is scattered in many publications. We have constructed the manually curated, searchable, internet-based MoonProt Database (http://www.moonlightingproteins.org) with information about the over 200 proteins that have been experimentally verified to be moonlighting proteins. The availability of this organized information provides a more complete picture of what is currently known about moonlighting proteins. The database will also aid researchers in other fields, including determining the functions of genes identified in genome sequencing projects, interpreting data from proteomics projects and annotating protein sequence and structural databases. In addition, information about the structures and functions of moonlighting proteins can be helpful in understanding how novel protein functional sites evolved on an ancient protein scaffold, which can also help in the design of proteins with novel functions. PMID:25324305
Miyakawa, Takuya; Sawano, Yoriko; Miyazono, Ken-ichi; Miyauchi, Yumiko; Hatano, Ken-ichi
2013-01-01
STK_08120 is a member of the thermoacidophile-specific DUF3211 protein family from Sulfolobus tokodaii strain 7. Its molecular function remains obscure, and sequence similarities for obtaining functional remarks are not available. In this study, the crystal structure of STK_08120 was determined at 1.79-Å resolution to predict its probable function using structure similarity searches. The structure adopts an α/β structure of a helix-grip fold, which is found in the START domain proteins with cavities for hydrophobic substrates or ligands. The detailed structural features implied that fatty acids are the primary ligand candidates for STK_08120, and binding assays revealed that the protein bound long-chain saturated fatty acids (>C14) and their trans-unsaturated types with an affinity equal to that for major fatty acid binding proteins in mammals and plants. Moreover, the structure of an STK_08120-myristic acid complex revealed a unique binding mode among fatty acid binding proteins. These results suggest that the thermoacidophile-specific protein family DUF3211 functions as a fatty acid carrier with a novel binding mode. PMID:23836863
A scoring function based on solvation thermodynamics for protein structure prediction
Du, Shiqiao; Harano, Yuichi; Kinoshita, Masahiro; Sakurai, Minoru
2012-01-01
We predict protein structure using our recently developed free energy function for describing protein stability, which is focused on solvation thermodynamics. The function is combined with the current most reliable sampling methods, i.e., fragment assembly (FA) and comparative modeling (CM). The prediction is tested using 11 small proteins for which high-resolution crystal structures are available. For 8 of these proteins, sequence similarities are found in the database, and the prediction is performed with CM. Fairly accurate models with average Cα root mean square deviation (RMSD) ∼ 2.0 Å are successfully obtained for all cases. For the rest of the target proteins, we perform the prediction following FA protocols. For 2 cases, we obtain predicted models with an RMSD ∼ 3.0 Å as the best-scored structures. For the other case, the RMSD remains larger than 7 Å. For all the 11 target proteins, our scoring function identifies the experimentally determined native structure as the best structure. Starting from the predicted structure, replica exchange molecular dynamics is performed to further refine the structures. However, we are unable to improve its RMSD toward the experimental structure. The exhaustive sampling by coarse-grained normal mode analysis around the native structures reveals that our function has a linear correlation with RMSDs < 3.0 Å. These results suggest that the function is quite reliable for the protein structure prediction while the sampling method remains one of the major limiting factors in it. The aspects through which the methodology could further be improved are discussed. PMID:27493529
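The Cα RMSD values quoted in this abstract measure how far a predicted model deviates from the experimental structure. As a minimal sketch, assuming the two structures have already been optimally superposed and that the Cα atoms are given in matching residue order (the function name and the toy coordinates are illustrative, not taken from the paper):

```python
import math


def ca_rmsd(model, reference):
    """Root mean square deviation between two lists of (x, y, z) tuples.

    Assumes equal length, matching residue order, and that the two
    coordinate sets are already superposed; no fitting is done here.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Toy example with two C-alpha atoms, coordinates in angstroms.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(ca_rmsd(model, reference))
```

In practice the superposition step (for example a least-squares fit) is performed first; it is omitted here to keep the sketch short.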
Automated prediction of protein function and detection of functional sites from structure.
Pazos, Florencio; Sternberg, Michael J E
2004-10-12
Current structural genomics projects are yielding structures for proteins whose functions are unknown. Accordingly, there is a pressing requirement for computational methods for function prediction. Here we present PHUNCTIONER, an automatic method for structure-based function prediction using automatically extracted functional sites (residues associated to functions). The method relates proteins with the same function through structural alignments and extracts 3D profiles of conserved residues. Functional features to train the method are extracted from the Gene Ontology (GO) database. The method extracts these features from the entire GO hierarchy and hence is applicable across the whole range of function specificity. 3D profiles associated with 121 GO annotations were extracted. We tested the power of the method both for the prediction of function and for the extraction of functional sites. The success of function prediction by our method was compared with the standard homology-based method. In the zone of low sequence similarity (approximately 15%), our method assigns the correct GO annotation in 90% of the protein structures considered, approximately 20% higher than inheritance of function from the closest homologue.
Some of the most interesting CASP11 targets through the eyes of their authors.
Kryshtafovych, Andriy; Moult, John; Baslé, Arnaud; Burgin, Alex; Craig, Timothy K; Edwards, Robert A; Fass, Deborah; Hartmann, Marcus D; Korycinski, Mateusz; Lewis, Richard J; Lorimer, Donald; Lupas, Andrei N; Newman, Janet; Peat, Thomas S; Piepenbrink, Kurt H; Prahlad, Janani; van Raaij, Mark J; Rohwer, Forest; Segall, Anca M; Seguritan, Victor; Sundberg, Eric J; Singh, Abhimanyu K; Wilson, Mark A; Schwede, Torsten
2016-09-01
The Critical Assessment of protein Structure Prediction (CASP) experiment would not have been possible without the prediction targets provided by the experimental structural biology community. In this article, selected crystallographers providing targets for the CASP11 experiment discuss the functional and biological significance of the target proteins, highlight their most interesting structural features, and assess whether these features were correctly reproduced in the predictions submitted to CASP11. Proteins 2016; 84(Suppl 1):34-50. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
Scavuzzo-Duggan, Tess R.; Chaves, Arielle M.; Roberts, Alison W.
2015-07-14
Here, a method for rapid in vivo functional analysis of engineered proteins was developed using Physcomitrella patens. A complementation assay was designed for testing structure/function relationships in cellulose synthase (CESA) proteins. The components of the assay include (1) construction of test vectors that drive expression of epitope-tagged PpCESA5 carrying engineered mutations, (2) transformation of a ppcesa5 knockout line that fails to produce gametophores with test and control vectors, (3) scoring the stable transformants for gametophore production, (4) statistical analysis comparing complementation rates for test vectors to positive and negative control vectors, and (5) analysis of transgenic protein expression by Western blotting. The assay distinguished mutations that generate fully functional, nonfunctional, and partially functional proteins. In conclusion, compared with existing methods for in vivo testing of protein function, this complementation assay provides a rapid method for investigating protein structure/function relationships in plants.
De Novo Proteins with Life-Sustaining Functions Are Structurally Dynamic.
Murphy, Grant S; Greisman, Jack B; Hecht, Michael H
2016-01-29
Designing and producing novel proteins that fold into stable structures and provide essential biological functions are key goals in synthetic biology. In initial steps toward achieving these goals, we constructed a combinatorial library of de novo proteins designed to fold into 4-helix bundles. As described previously, screening this library for sequences that function in vivo to rescue conditionally lethal mutants of Escherichia coli (auxotrophs) yielded several de novo sequences, termed SynRescue proteins, which rescued four different E. coli auxotrophs. In an effort to understand the structural requirements necessary for auxotroph rescue, we investigated the biophysical properties of the SynRescue proteins, using both computational and experimental approaches. Results from circular dichroism, size-exclusion chromatography, and NMR demonstrate that the SynRescue proteins are α-helical and relatively stable. Surprisingly, however, they do not form well-ordered structures. Instead, they form dynamic structures that fluctuate between monomeric and dimeric states. These findings show that a well-ordered structure is not a prerequisite for life-sustaining functions, and suggests that dynamic structures may have been important in the early evolution of protein function. Copyright © 2015 Elsevier Ltd. All rights reserved.
Use of designed sequences in protein structure recognition.
Kumar, Gayatri; Mudgal, Richa; Srinivasan, Narayanaswamy; Sandhya, Sankaran
2018-05-09
Knowledge of the protein structure is a pre-requisite for improved understanding of molecular function. The gap in the sequence-structure space has increased in the post-genomic era. Grouping related protein sequences into families can aid in narrowing the gap. In the Pfam database, structure description is provided for part or full-length proteins of 7726 families. For the remaining 52% of the families, information on 3-D structure is not yet available. We use the computationally designed sequences that are intermediately related to two protein domain families, which are already known to share the same fold. These strategically designed sequences enable detection of distant relationships and here, we have employed them for the purpose of structure recognition of protein families of yet unknown structure. We first measured the success rate of our approach using a dataset of protein families of known fold and achieved a success rate of 88%. Next, for 1392 families of yet unknown structure, we made structural assignments for part/full length of the proteins. Fold association for 423 domains of unknown function (DUFs) are provided as a step towards functional annotation. The results indicate that knowledge-based filling of gaps in protein sequence space is a lucrative approach for structure recognition. Such sequences assist in traversal through protein sequence space and effectively function as 'linkers', where natural linkers between distant proteins are unavailable. This article was reviewed by Oliviero Carugo, Christine Orengo and Srikrishna Subramanian.
The history of the CATH structural classification of protein domains.
Sillitoe, Ian; Dawson, Natalie; Thornton, Janet; Orengo, Christine
2015-12-01
This article presents a historical review of the protein structure classification database CATH. Together with the SCOP database, CATH remains comprehensive and reasonably up-to-date with the now more than 100,000 protein structures in the PDB. We review the expansion of the CATH and SCOP resources to capture predicted domain structures in the genome sequence data and to provide information on the likely functions of proteins mediated by their constituent domains. The establishment of comprehensive function annotation resources has also meant that domain families can be functionally annotated allowing insights into functional divergence and evolution within protein families. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Uversky, Vladimir N
2015-03-01
Intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs) are functional proteins or regions that do not have unique 3D structures under functional conditions. Therefore, from the viewpoint of their lack of stable 3D structure, IDPs/IDPRs are inherently unstable. As much as structure and function of normal ordered globular proteins are determined by their amino acid sequences, the lack of unique 3D structure in IDPs/IDPRs and their disorder-based functionality are also encoded in the amino acid sequences. Because of their specific sequence features and distinctive conformational behavior, these intrinsically unstable proteins or regions have several applications in biotechnology. This review introduces some of the most characteristic features of IDPs/IDPRs (such as peculiarities of amino acid sequences of these proteins and regions, their major structural features, and peculiar responses to changes in their environment) and describes how these features can be used in the biotechnology, for example for the proteome-wide analysis of the abundance of extended IDPs, for recombinant protein isolation and purification, as polypeptide nanoparticles for drug delivery, as solubilization tools, and as thermally sensitive carriers of active peptides and proteins. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Du, Yushen; Wu, Nicholas C.; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting
2016-01-01
Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. PMID:27803181
Mathematical methods for protein science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hart, W.; Istrail, S.; Atkins, J.
1997-12-31
Understanding the structure and function of proteins is a fundamental endeavor in molecular biology. Currently, over 100,000 protein sequences have been determined by experimental methods. The three dimensional structure of the protein determines its function, but there are currently less than 4,000 structures known to atomic resolution. Accordingly, techniques to predict protein structure from sequence have an important role in aiding the understanding of the Genome and the effects of mutations in genetic disease. The authors describe current efforts at Sandia to better understand the structure of proteins through rigorous mathematical analyses of simple lattice models. The efforts have focused on two aspects of protein science: mathematical structure prediction, and inverse protein folding.
Biological and functional relevance of CASP predictions.
Liu, Tianyun; Ish-Shalom, Shirbi; Torng, Wen; Lafita, Aleix; Bock, Christian; Mort, Matthew; Cooper, David N; Bliven, Spencer; Capitani, Guido; Mooney, Sean D; Altman, Russ B
2018-03-01
Our goal is to answer the question: compared with experimental structures, how useful are predicted models for functional annotation? We assessed the functional utility of predicted models by comparing the performances of a suite of methods for functional characterization on the predictions and the experimental structures. We identified 28 sites in 25 protein targets to perform functional assessment. These 28 sites included nine sites with known ligand binding (holo-sites), nine sites that are expected or suggested by experimental authors for small molecule binding (apo-sites), and ten sites containing important motifs, loops, or key residues with important disease-associated mutations. We evaluated the utility of the predictions by comparing their microenvironments to the experimental structures. Overall structural quality correlates with functional utility. However, the best-ranked predictions (global) may not have the best functional quality (local). Our assessment provides an ability to discriminate between predictions with high structural quality. When assessing ligand-binding sites, most prediction methods have higher performance on apo-sites than holo-sites. Some servers show consistently high performance for certain types of functional sites. Finally, many functional sites are associated with protein-protein interaction. We also analyzed biologically relevant features from the protein assemblies of two targets where the active site spanned the protein-protein interface. For the assembly targets, we find that the features in the models are mainly determined by the choice of template. © 2017 The Authors Proteins: Structure, Function and Bioinformatics Published by Wiley Periodicals, Inc.
Feyzi, Samira; Varidi, Mehdi; Zare, Fatemeh; Varidi, Mohammad Javad
2018-03-01
Through protein denaturation, different drying methods can alter the functional properties of proteins as well as their structure. This study therefore focused on the effect of different drying methods on the amino acid content, thermal and functional properties, and protein structure of fenugreek protein isolate. Freeze and spray drying methods resulted in comparable protein solubility, dynamic surface and interfacial tensions, and foaming and emulsifying properties, except for emulsion stability. Vacuum oven drying promoted emulsion stability, surface hydrophobicity and viscosity of fenugreek protein isolate at the expense of its protein solubility. The vacuum oven process caused a higher level of Maillard reaction, followed by the spray drying process, which was confirmed by the lower lysine content, lower lightness and greater browning intensity. ΔH of fenugreek protein isolates was higher than that of soy protein isolate, which confirmed the presence of more ordered structures. Also, the bands attributed to α-helix structures in the FTIR spectrum were in the shorter wave number region for freeze and spray dried fenugreek protein isolates, indicating a greater likelihood of such structures. This research suggests that any drying method must be conducted under gentle conditions in order to sustain the native structure of proteins and promote their functionalities. © 2017 Society of Chemical Industry.
Weininger, Arthur; Weininger, Susan
2015-01-01
The ability to identify the functional correlates of structural and sequence variation in proteins is a critical capability. We related structures of influenza A N10 and N11 proteins that have no established function to structures of proteins with known function by identifying spatially conserved atoms. We identified atoms with common distributed spatial occupancy in PDB structures of N10 protein, N11 protein, an influenza A neuraminidase, an influenza B neuraminidase, and a bacterial neuraminidase. By superposing these spatially conserved atoms, we aligned the structures and associated molecules. We report spatially and sequence invariant residues in the aligned structures. Spatially invariant residues in the N6 and influenza B neuraminidase active sites were found in previously unidentified spatially equivalent sites in the N10 and N11 proteins. We found the corresponding secondary and tertiary structures of the aligned proteins to be largely identical despite significant sequence divergence. We found structural precedent in known non-neuraminidase structures for residues exhibiting structural and sequence divergence in the aligned structures. In N10 protein, we identified staphylococcal enterotoxin I-like domains. In N11 protein, we identified hepatitis E E2S-like domains, SARS spike protein-like domains, and toxin components shared by alpha-bungarotoxin, staphylococcal enterotoxin I, anthrax lethal factor, clostridium botulinum neurotoxin, and clostridium tetanus toxin. The presence of active site components common to the N6, influenza B, and S. pneumoniae neuraminidases in the N10 and N11 proteins, combined with the absence of apparent neuraminidase function, suggests that the role of neuraminidases in H17N10 and H18N11 emerging influenza A viruses may have changed. 
The presentation of E2S-like, SARS spike protein-like, or toxin-like domains by the N10 and N11 proteins in these emerging viruses may indicate that H17N10 and H18N11 sialidase-facilitated cell entry has been supplemented or replaced by sialidase-independent receptor binding to an expanded cell population that may include neurons and T-cells. PMID:25706124
Pan, Joshua; Meyers, Robin M; Michel, Brittany C; Mashtalir, Nazar; Sizemore, Ann E; Wells, Jonathan N; Cassel, Seth H; Vazquez, Francisca; Weir, Barbara A; Hahn, William C; Marsh, Joseph A; Tsherniak, Aviad; Kadoch, Cigall
2018-05-23
Protein complexes are assemblies of subunits that have co-evolved to execute one or many coordinated functions in the cellular environment. Functional annotation of mammalian protein complexes is critical to understanding biological processes, as well as disease mechanisms. Here, we used genetic co-essentiality derived from genome-scale RNAi- and CRISPR-Cas9-based fitness screens performed across hundreds of human cancer cell lines to assign measures of functional similarity. From these measures, we systematically built and characterized functional similarity networks that recapitulate known structural and functional features of well-studied protein complexes and resolve novel functional modules within complexes lacking structural resolution, such as the mammalian SWI/SNF complex. Finally, by integrating functional networks with large protein-protein interaction networks, we discovered novel protein complexes involving recently evolved genes of unknown function. Taken together, these findings demonstrate the utility of genetic perturbation screens alone, and in combination with large-scale biophysical data, to enhance our understanding of mammalian protein complexes in normal and disease states. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Ramakrishnan, Gayatri; Ochoa-Montaño, Bernardo; Raghavender, Upadhyayula S; Mudgal, Richa; Joshi, Adwait G; Chandra, Nagasuma R; Sowdhamini, Ramanathan; Blundell, Tom L; Srinivasan, Narayanaswamy
2015-01-01
The availability of the genome sequence of Mycobacterium tuberculosis H37Rv has encouraged determination of large numbers of protein structures and detailed definition of the biological information encoded therein; yet, the functions of many proteins in M. tuberculosis remain unknown. The emergence of multidrug resistant strains makes it a priority to exploit recent advances in homology recognition and structure prediction to re-analyse its gene products. Here we report the structural and functional characterization of gene products encoded in the M. tuberculosis genome, with the help of sensitive profile-based remote homology search and fold recognition algorithms resulting in an enhanced annotation of the proteome where 95% of the M. tuberculosis proteins were identified wholly or partly with information on structure or function. New information includes association of 244 proteins with 205 domain families and a separate set of new association of folds to 64 proteins. Extending structural information across uncharacterized protein families represented in the M. tuberculosis proteome, by determining superfamily relationships between families of known and unknown structures, has contributed to an enhancement in the knowledge of structural content. In retrospect, such superfamily relationships have facilitated recognition of probable structure and/or function for several uncharacterized protein families, eventually aiding recognition of probable functions for homologous proteins corresponding to such families. Gene products unique to mycobacteria for which no functions could be identified are 183. Of these 18 were determined to be M. tuberculosis specific. Such pathogen-specific proteins are speculated to harbour virulence factors required for pathogenesis. A re-annotated proteome of M. 
tuberculosis, with greater completeness of annotated proteins and domain assigned regions, provides a valuable basis for experimental endeavours designed to obtain a better understanding of pathogenesis and to accelerate the process of drug target discovery. Copyright © 2014 Elsevier Ltd. All rights reserved.
Functional Advantages of Conserved Intrinsic Disorder in RNA-Binding Proteins.
Varadi, Mihaly; Zsolyomi, Fruzsina; Guharoy, Mainak; Tompa, Peter
2015-01-01
Proteins form large macromolecular assemblies with RNA that govern essential molecular processes. RNA-binding proteins have often been associated with conformational flexibility, yet the extent and functional implications of their intrinsic disorder have never been fully assessed. Here, through large-scale analysis of comprehensive protein sequence and structure datasets we demonstrate the prevalence of intrinsic structural disorder in RNA-binding proteins and domains. We addressed their functionality through a quantitative description of the evolutionary conservation of disordered segments involved in binding, and investigated the structural implications of flexibility in terms of conformational stability and interface formation. We conclude that the functional role of intrinsically disordered protein segments in RNA-binding is two-fold: first, these regions establish extended, conserved electrostatic interfaces with RNAs via induced fit. Second, conformational flexibility enables them to target different RNA partners, providing multi-functionality, while also ensuring specificity. These findings emphasize the functional importance of intrinsically disordered regions in RNA-binding proteins.
Possenti, Andrea; Vendruscolo, Michele; Camilloni, Carlo; Tiana, Guido
2018-05-23
Proteins employ the information stored in the genetic code and translated into their sequences to carry out well-defined functions in the cellular environment. The possibility to encode for such functions is controlled by the balance between the amount of information supplied by the sequence and that left after the protein has folded into its structure. We study the amount of information necessary to specify the protein structure, providing an estimate that takes into account the thermodynamic properties of protein folding. We thus show that the information remaining in the protein sequence after encoding for its structure (the 'information gap') is very close to what is needed to encode for its function and interactions. Then, by predicting the information gap directly from the protein sequence, we show that it may be possible to use these insights from information theory to discriminate between ordered and disordered proteins, to identify unknown functions, and to optimize artificially-designed protein sequences. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.
Sudha, Govindarajan; Srinivasan, Narayanaswamy
2016-09-01
A comprehensive analysis of the quaternary features of distantly related homo-oligomeric proteins is the focus of the current study. This study has been performed at the levels of quaternary state, symmetry, and quaternary structure. Quaternary state and quaternary structure refer to the number of subunits and the spatial arrangement of subunits, respectively. Using a large dataset of available 3D structures of biologically relevant assemblies, we show that only 53% of the distantly related homo-oligomeric proteins have the same quaternary state. Considering these homologous homo-oligomers with the same quaternary state, conservation of quaternary structures is observed only in 38% of the pairs. In 36% of the pairs of distantly related homo-oligomers with different quaternary states, the larger assembly in a pair shows high structural similarity with the entire quaternary structure of the related protein with the lower quaternary state; this is referred to as the "Russian doll effect." The differences in quaternary state and structure have been suggested to contribute to the functional diversity. Detailed investigations show that even though the gross functions of many distantly related homo-oligomers are the same, finer level differences in molecular functions are manifested by differences in quaternary states and structures. Comparison of structures of biological assemblies in distantly and closely related homo-oligomeric proteins throughout the study differentiates the effects of sequence divergence on the quaternary structures and function. Knowledge inferred from this study can provide insights for improved protein structure classification and function prediction of homo-oligomers. Proteins 2016; 84:1190-1202. © 2016 Wiley Periodicals, Inc.
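As a quick arithmetic check on how the two percentages above combine: the 38% figure is conditional on pairs that already share a quaternary state, so the share of all distantly related pairs that keep both state and structure is smaller. The sketch below simply multiplies the two reported fractions, assuming both refer to the same set of pairs; the variable names are illustrative only.

```python
# Fractions reported in the abstract above.
same_state = 0.53                  # pairs with the same quaternary state
same_structure_given_state = 0.38  # of those, pairs with conserved quaternary structure

# Joint fraction of all pairs that conserve both state and structure.
both_conserved = same_state * same_structure_given_state

print(f"{both_conserved:.1%} of all pairs conserve both state and structure")
```

Under that assumption, roughly one pair in five conserves both quaternary state and quaternary structure.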
Dewhurst, Henry M; Choudhury, Shilpa; Torres, Matthew P
2015-08-01
Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)--a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits--conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit-N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. 
© 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Matityahu, Avi; Onn, Itay
2018-02-01
The higher-order organization of chromosomes ensures their stability and functionality. However, the molecular mechanism by which higher-order structure is established is poorly understood. Dissecting the activity of the relevant proteins provides information essential for achieving a comprehensive understanding of chromosome structure. Proteins of the structural maintenance of chromosome (SMC) family of ATPases are the core of evolutionarily conserved complexes. SMC complexes are involved in regulating genome dynamics and in maintaining genome stability. The structure of all SMC proteins resembles an elongated rod that contains a central coiled-coil domain, a common protein structural motif in which two α-helices twist together. In recent years, the essential role of the coiled-coil domain in SMC protein activity and regulation has become evident. Here, we discuss recent advances in the function of the SMC coiled coils. We describe the structure of the coiled-coil domain of SMC proteins, and the modifications and interactions that are mediated by it. Furthermore, we assess the role of the coiled-coil domain in conformational switches of SMC proteins, and in determining the architecture of the SMC dimer. Finally, we review the interplay between mutations in the coiled-coil domain and human disorders. We suggest that distinctive properties of coiled coils of different SMC proteins contribute to their distinct functions. The discussion clarifies the mechanisms underlying the activity of SMC proteins, and advocates future studies to elucidate the function of the SMC coiled-coil domain.
Dynamics of endoglucanase catalytic domains: implications towards thermostability
USDA-ARS's Scientific Manuscript database
The function of proteins is controlled by their dynamics inherently determined by their structure. Exploring the protein structure-dynamics relationship is important to develop an understanding of protein function that allows tapping the potential of economically important proteins, such as endogluc...
ERIC Educational Resources Information Center
Giron, Maria D.; Salto, Rafael
2011-01-01
Structure-function relationship studies in proteins are essential in modern Cell Biology. Laboratory exercises that allow students to familiarize themselves with basic mutagenesis techniques are essential in all Genetic Engineering courses to teach the relevance of protein structure. We have implemented a laboratory course based on the…
Adaptability of Protein Structures to Enable Functional Interactions and Evolutionary Implications
Haliloglu, Turkan; Bahar, Ivet
2015-01-01
Several studies in recent years have drawn attention to the ability of proteins to adapt to intermolecular interactions by conformational changes along structure-encoded collective modes of motion. These so-called soft modes, primarily driven by entropic effects, facilitate, if not enable, functional interactions. They represent excursions on the conformational space along principal low-ascent directions/paths away from the original free energy minimum, and they are accessible to the protein even prior to protein-protein/ligand interactions. An emerging concept from these studies is the evolution of structures or modular domains to favor such modes of motion that will be recruited or integrated for enabling functional interactions. Structural dynamics, including the allosteric switches in conformation that are often stabilized upon formation of complexes and multimeric assemblies, emerge as key properties that are evolutionarily maintained to accomplish biological activities, consistent with the paradigm sequence → structure → dynamics → function where ‘dynamics’ bridges structure and function. PMID:26254902
Park, Hahnbeom; Bradley, Philip; Greisen, Per; Liu, Yuan; Mulligan, Vikram Khipple; Kim, David E.; Baker, David; DiMaio, Frank
2017-01-01
Most biomolecular modeling energy functions for structure prediction, sequence design, and molecular docking have been parameterized using existing macromolecular structural data; this contrasts with molecular mechanics force fields, which are largely optimized using small-molecule data. In this study, we describe an integrated method that enables optimization of a biomolecular modeling energy function simultaneously against small-molecule thermodynamic data and high-resolution macromolecular structural data. We use this approach to develop a next-generation Rosetta energy function that utilizes a new anisotropic implicit solvation model, and an improved electrostatics and Lennard-Jones model, illustrating how energy functions can be considerably improved in their ability to describe large-scale energy landscapes by incorporating both small-molecule and macromolecule data. The energy function improves performance in a wide range of protein structure prediction challenges, including monomeric structure prediction, protein-protein and protein-ligand docking, protein sequence design, and prediction of the free energy changes by mutation, while reasonably recapitulating small-molecule thermodynamic properties. PMID:27766851
PredictProtein—an open resource for online prediction of protein structural and functional features
Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard
2014-01-01
PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431
Zebra: a web server for bioinformatic analysis of diverse protein families.
Suplatov, Dmitry; Kirilin, Evgeny; Takhaveev, Vakil; Svedas, Vytas
2014-01-01
During evolution of proteins from a common ancestor, one functional property can be preserved while others can vary, leading to functional diversity. A systematic study of the corresponding adaptive mutations provides a key to one of the most challenging problems of modern structural biology: understanding the impact of amino acid substitutions on protein function. The subfamily-specific positions (SSPs) are conserved within functional subfamilies but are different between them and, therefore, seem to be responsible for functional diversity in protein superfamilies. Consequently, a corresponding method to perform the bioinformatic analysis of sequence and structural data has to be implemented in the common laboratory practice to study the structure-function relationship in proteins and develop novel protein engineering strategies. This paper describes the Zebra web server, a powerful remote platform that implements a novel bioinformatic analysis algorithm to study diverse protein families. It is the first application that provides specificity determinants at different levels of functional classification, therefore addressing complex functional diversity of large superfamilies. Statistical analysis is implemented to automatically select a set of highly significant SSPs to be used as hotspots for directed evolution or rational design experiments and to be analyzed when studying the structure-function relationship. Zebra results are provided in two ways: (1) as a single all-in-one parsable text file and (2) as PyMol sessions with structural representation of SSPs. Zebra web server is available at http://biokinet.belozersky.msu.ru/zebra.
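The definition of subfamily-specific positions given above (conserved within subfamilies, different between them) can be illustrated with a deliberately strict toy rule. The sketch below is not Zebra's algorithm or its statistical scoring; the function name, the all-or-nothing conservation criterion, and the tiny alignment are all assumptions made for illustration.

```python
from collections import Counter

def subfamily_specific_positions(subfamilies):
    """Return 0-based alignment columns that are fully conserved inside every
    subfamily but carry different residues in at least two subfamilies.

    `subfamilies` maps a subfamily name to a list of equal-length aligned
    sequences. This strict rule is a toy criterion, not Zebra's statistics.
    """
    length = len(next(iter(subfamilies.values()))[0])
    ssps = []
    for col in range(length):
        consensus = []
        for seqs in subfamilies.values():
            residues = Counter(seq[col] for seq in seqs)
            if len(residues) != 1:       # not conserved within this subfamily
                break
            consensus.append(next(iter(residues)))
        else:
            if len(set(consensus)) > 1:  # residue differs between subfamilies
                ssps.append(col)
    return ssps

# Hypothetical toy alignment with two subfamilies.
toy = {
    "A": ["MKDLS", "MKDLT", "MKDLS"],
    "B": ["MKELS", "MKELA", "MKELS"],
}
print(subfamily_specific_positions(toy))  # → [2]
```

Only column 2 (D in subfamily A, E in subfamily B) qualifies; column 4 varies inside each subfamily, and the remaining columns are identical across both. A real analysis would tolerate partial conservation and assess significance rather than require exact identity.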
Meeting Report: Structural Determination of Environmentally Responsive Proteins
Reinlib, Leslie
2005-01-01
The three-dimensional structure of gene products continues to be a missing lynchpin between linear genome sequences and our understanding of the normal and abnormal function of proteins and pathways. Enhanced activity in this area is likely to lead to better understanding of how discrete changes in molecular patterns and conformation underlie functional changes in protein complexes and, with it, sensitivity of an individual to an exposure. The National Institute of Environmental Health Sciences convened a workshop of experts in structural determination and environmental health to solicit advice for future research in structural resolution relative to environmentally responsive proteins and pathways. The highest priorities recommended by the workshop were to support studies of structure, analysis, control, and design of conformational and functional states at molecular resolution for environmentally responsive molecules and complexes; promote understanding of dynamics, kinetics, and ligand responses; investigate the mechanisms and steps in posttranslational modifications, protein partnering, impact of genetic polymorphisms on structure/function, and ligand interactions; and encourage integrated experimental and computational approaches. The workshop participants also saw value in improving the throughput and purity of protein samples and macromolecular assemblies; developing optimal processes for design, production, and assembly of macromolecular complexes; encouraging studies on protein–protein and macromolecular interactions; and examining assemblies of individual proteins and their functions in pathways of interest for environmental health. PMID:16263521
Hati, Sanchita; Bhattacharyya, Sudeep
2016-01-01
A project-based biophysical chemistry laboratory course, which is offered to the biochemistry and molecular biology majors in their senior year, is described. In this course, the classroom study of the structure-function of biomolecules is integrated with the discovery-guided laboratory study of these molecules using computer modeling and simulations. In particular, modern computational tools are employed to elucidate the relationship between structure, dynamics, and function in proteins. Computer-based laboratory protocols that we introduced in three modules allow students to visualize the secondary, super-secondary, and tertiary structures of proteins, analyze non-covalent interactions in protein-ligand complexes, develop three-dimensional structural models (homology model) for new protein sequences and evaluate their structural qualities, and study proteins' intrinsic dynamics to understand their functions. In the fourth module, students are assigned to an authentic research problem, where they apply their laboratory skills (acquired in modules 1-3) to answer conceptual biophysical questions. Through this process, students gain in-depth understanding of protein dynamics-the missing link between structure and function. Additionally, the requirement of term papers sharpens students' writing and communication skills. Finally, these projects result in new findings that are communicated in peer-reviewed journals. © 2016 The International Union of Biochemistry and Molecular Biology.
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa
2015-01-01
Background: Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications for designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of the functional sites at the protein level and at the level of the exon-intron structure of the coding gene remains a relevant problem. Results: One approach to this analysis is the projection of amino acid residue positions of the functional sites along with the exon boundaries to the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. We observed a tendency for the exons that code functional sites to cluster, and such clusters could be considered as the unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residue functional site, which may be evidence of the existence of evolutionary limitations to the exon shuffling.
Conclusions: These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity. PMID:26693737
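The projection of functional-site residues onto exon structure, and the notion of boundary phase, can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' software: it takes only the coding lengths of consecutive exons, uses the standard convention that a phase 0 boundary falls between two codons while phase 1 or 2 splits a codon, and works on a hypothetical gene. The function names are invented here.

```python
def intron_phases(coding_exon_lengths):
    """Phase of each intron, given the coding lengths (nt) of consecutive exons.

    Phase 0: the intron falls between two codons; phase 1 or 2: the intron
    splits a codon after its first or second nucleotide.
    """
    phases = []
    total = 0
    for length in coding_exon_lengths[:-1]:  # no intron after the last exon
        total += length
        phases.append(total % 3)
    return phases

def exons_for_residues(coding_exon_lengths, residue_positions):
    """Map 1-based residue positions to the 0-based index of the exon that
    holds the first nucleotide of each residue's codon."""
    boundaries = []
    total = 0
    for length in coding_exon_lengths:
        total += length
        boundaries.append(total)
    result = {}
    for pos in residue_positions:
        first_nt = 3 * (pos - 1) + 1  # 1-based coding nucleotide index
        result[pos] = next(i for i, end in enumerate(boundaries) if first_nt <= end)
    return result

exon_lengths = [90, 61, 98]  # hypothetical gene: 249 coding nt = 83 codons
print(intron_phases(exon_lengths))
print(exons_for_residues(exon_lengths, [10, 30, 31, 80]))
```

For this hypothetical gene the first intron is phase 0 and the second is phase 1, and residues 10 and 30 map to the first exon while residues 31 and 80 map to the second and third. Whether the residues of a functional site share an exon or fall in neighbouring exons can then be read directly from the returned mapping.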
The Origin and Early Evolution of Membrane Proteins
NASA Technical Reports Server (NTRS)
Pohorille, Andrew; Schweighofer, Karl; Wilson, Michael A.
2005-01-01
Membrane proteins mediate functions that are essential to all cells. These functions include transport of ions, nutrients and waste products across cell walls, capture of energy and its transduction into the form usable in chemical reactions, transmission of environmental signals to the interior of the cell, cellular growth and cell volume regulation. In the absence of membrane proteins, ancestors of cells (protocells) would have had only very limited capabilities to communicate with their environment. Thus, it is not surprising that membrane proteins are quite common even in the simplest prokaryotic cells. Considering that contemporary membrane channels are large and complex, both structurally and functionally, a question arises as to how their presumably much simpler ancestors could have emerged, performed functions and diversified in early protobiological evolution. Remarkably, despite their overall complexity, structural motifs in membrane proteins are quite simple, with α-helices being most common. This suggests that these proteins might have evolved from simple building blocks. To explain how these blocks could have organized into functional structures, we performed large-scale, accurate computer simulations of folding peptides at a water-membrane interface, their insertion into the membrane, self-assembly into higher-order structures and function. The results of these simulations, combined with analysis of structural and functional experimental data, led to the first integrated view of the origin and early evolution of membrane proteins.
Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang
2018-03-10
Homology-based transferal remains the major approach to computational protein function annotations, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's homology-based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598, for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotation methods. Detailed data analysis shows that the major advantage of MetaGO lies in the new functional homolog detections from partner's homology-based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence homology-based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/. Copyright © 2018. Published by Elsevier Ltd.
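Two quantities mentioned above, the F-measure and a logistic-regression combination of component confidence scores, can be written down generically. The sketch below is not the MetaGO implementation: the function names, the three component scores, the weights and the bias are all hypothetical, and the F-measure shown is the plain harmonic mean of one precision/recall pair rather than any benchmark-specific variant.

```python
import math

def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def combined_confidence(scores, weights, bias):
    """Logistic-regression style combination of component confidence scores."""
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical scores from sequence-, structure- and network-based components.
scores = [0.6, 0.3, 0.8]
weights = [2.0, 1.0, 1.5]  # illustrative values, not fitted parameters
bias = -2.0

print(round(f_measure(0.5, 0.5), 3))
print(round(combined_confidence(scores, weights, bias), 3))
```

In practice the weights and bias would be fitted on annotated training proteins; here they only show how several per-method scores collapse into a single probability-like confidence for one Gene Ontology term.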
Jmol-Enhanced Biochemistry Research Projects
ERIC Educational Resources Information Center
Saderholm, Matthew; Reynolds, Anthony
2011-01-01
We developed a protein research project for a one-semester biochemistry lecture class to enhance learning and more effectively train students to understand protein structure and function. During this semester-long process, students select a protein with known structure and then research its structure, sequence, and function. This project…
Park, Hahnbeom; Lee, Gyu Rie; Heo, Lim; Seok, Chaok
2014-01-01
Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.
Liu, Suxuan; Xiong, Xinyu; Zhao, Xianxian; Yang, Xiaofeng; Wang, Hong
2015-05-09
Eukaryotic cell membrane dynamics change in curvature during physiological and pathological processes. In the past ten years, a novel protein family, Fes/CIP4 homology-Bin/Amphiphysin/Rvs (F-BAR) domain proteins, has been identified to be the most important coordinators in membrane curvature regulation. The F-BAR domain family is a member of the Bin/Amphiphysin/Rvs (BAR) domain superfamily that is associated with dynamic changes in cell membrane. However, the molecular basis in membrane structure regulation and the biological functions of F-BAR protein are unclear. The pathophysiological role of F-BAR protein is unknown. This review summarizes the current understanding of structure and function in the BAR domain superfamily, classifies F-BAR family proteins into nine subfamilies based on domain structure, and characterizes F-BAR protein structure, domain interaction, and functional relevance. In general, F-BAR protein binds to cell membrane via F-BAR domain association with membrane phospholipids and initiates membrane curvature and scission via Src homology-3 (SH3) domain interaction with its partner proteins. This process causes membrane dynamic changes and leads to seven important cellular biological functions, which include endocytosis, phagocytosis, filopodium, lamellipodium, cytokinesis, adhesion, and podosome formation, via distinct signaling pathways determined by specific domain-binding partners. These cellular functions play important roles in many physiological and pathophysiological processes. We further summarize F-BAR protein expression and mutation changes observed in various diseases and developmental disorders. 
Considering the structural features and functional implications of F-BAR proteins, we anticipate that F-BAR proteins modulate physiological and pathophysiological processes via transferring extracellular materials, regulating cell trafficking and mobility, presenting antigens, mediating extracellular matrix degradation, and transmitting signals for cell proliferation.
Functional Evolution of PLP-dependent Enzymes based on Active-Site Structural Similarities
Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert
2014-01-01
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5’-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the Comparison of Protein Active Site Structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. PMID:24920327
Functional evolution of PLP-dependent enzymes based on active-site structural similarities.
Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert
2014-10-01
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5'-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the comparison of protein active site structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional-fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. © 2014 Wiley Periodicals, Inc.
Development of the field of structural physiology
FUJIYOSHI, Yoshinori
2015-01-01
Electron crystallography is especially useful for studying the structure and function of membrane proteins, key molecules with important functions in neural and other cells. Electron crystallography is now an established technique for analyzing the structures of membrane proteins in lipid bilayers that closely simulate their natural biological environment. Utilizing cryo-electron microscopes with helium-cooled specimen stages that were developed through a personal motivation to understand the functions of neural systems from a structural point of view, the structures of membrane proteins can be analyzed at a resolution better than 3 Å. This review covers four objectives. First, I introduce the new research field of structural physiology. Second, I recount some of the struggles involved in developing cryo-electron microscopes. Third, I review the structural and functional analyses of membrane proteins mainly by electron crystallography using cryo-electron microscopes. Finally, I discuss multifunctional channels named “adhennels” based on structures analyzed using electron and X-ray crystallography. PMID:26560835
3D RNA and functional interactions from evolutionary couplings
Weinreb, Caleb; Riesselman, Adam; Ingraham, John B.; Gross, Torsten; Sander, Chris; Marks, Debora S.
2016-01-01
Summary: Non-coding RNAs are ubiquitous, but the discovery of new RNA gene sequences far outpaces research on their structure and functional interactions. We mine the evolutionary sequence record to derive precise information about function and structure of RNAs and RNA-protein complexes. As in protein structure prediction, we use maximum entropy global probability models of sequence co-variation to infer evolutionarily constrained nucleotide-nucleotide interactions within RNA molecules, and nucleotide-amino acid interactions in RNA-protein complexes. The predicted contacts allow all-atom blinded 3D structure prediction at good accuracy for several known RNA structures and RNA-protein complexes. For unknown structures, we predict contacts in 160 non-coding RNA families. Beyond 3D structure prediction, evolutionary couplings help identify important functional interactions, e.g., at switch points in riboswitches and at a complex nucleation site in HIV. Aided by accelerating sequence accumulation, evolutionary coupling analysis can accelerate the discovery of functional interactions and 3D structures involving RNA. PMID:27087444
The Structure and Function of Non-Collagenous Bone Proteins
NASA Technical Reports Server (NTRS)
Hook, Magnus; McQuillan, David J.
1997-01-01
The research done under the cooperative research agreement for the project titled 'The structure and function of non-collagenous bone proteins' represented the first phase of an ongoing program to define the structural and functional relationships of the principal noncollagenous proteins in bone. An ultimate goal of this research is to enable design and execution of useful pharmacological compounds that will have a beneficial effect in treatment of osteoporosis, both land-based and induced by long-duration space travel. The goals of the now complete first phase were as follows: 1. Establish and/or develop powerful recombinant protein expression systems; 2. Develop and refine isolation and purification of recombinant proteins; 3. Express wild-type non-collagenous bone proteins; 4. Express site-specific mutant proteins and domains of wild-type proteins to enhance likelihood of crystal formation for subsequent solution of structure.
Vertebrate Membrane Proteins: Structure, Function, and Insights from Biophysical Approaches
MÜLLER, DANIEL J.; WU, NAN; PALCZEWSKI, KRZYSZTOF
2008-01-01
Membrane proteins are key targets for pharmacological intervention because they are vital for cellular function. Here, we analyze recent progress made in the understanding of the structure and function of membrane proteins with a focus on rhodopsin and development of atomic force microscopy techniques to study biological membranes. Membrane proteins are compartmentalized to carry out extra- and intracellular processes. Biological membranes are densely populated with membrane proteins that occupy approximately 50% of their volume. In most cases membranes contain lipid rafts, protein patches, or paracrystalline formations that lack the higher-order symmetry that would allow them to be characterized by diffraction methods. Despite many technical difficulties, several crystal structures of membrane proteins that illustrate their internal structural organization have been determined. Moreover, high-resolution atomic force microscopy, near-field scanning optical microscopy, and other lower resolution techniques have been used to investigate these structures. Single-molecule force spectroscopy tracks interactions that stabilize membrane proteins and those that switch their functional state; this spectroscopy can be applied to locate a ligand-binding site. Recent development of this technique also reveals the energy landscape of a membrane protein, defining its folding, reaction pathways, and kinetics. Future development and application of novel approaches during the coming years should provide even greater insights to the understanding of biological membrane organization and function. PMID:18321962
Can misfolded proteins be beneficial? The HAMLET case.
Pettersson-Kastberg, Jenny; Aits, Sonja; Gustafsson, Lotta; Mossberg, Anki; Storm, Petter; Trulsson, Maria; Persson, Filip; Mok, K Hun; Svanborg, Catharina
2009-01-01
By changing the three-dimensional structure, a protein can attain new functions, distinct from those of the native protein. Amyloid-forming proteins are one example, in which conformational change may lead to fibril formation and, in many cases, neurodegenerative disease. We have proposed that partial unfolding provides a mechanism to generate new and useful functional variants from a given polypeptide chain. Here we present HAMLET (Human Alpha-lactalbumin Made LEthal to Tumor cells) as an example where partial unfolding and the incorporation of cofactor create a complex with new, beneficial properties. Native alpha-lactalbumin functions as a substrate specifier in lactose synthesis, but when partially unfolded the protein binds oleic acid and forms the tumoricidal HAMLET complex. When the properties of HAMLET were first described they were surprising, as protein folding intermediates and especially amyloid-forming protein intermediates had been regarded as toxic conformations, but since then structural studies have supported functional diversity arising from a change in fold. The properties of HAMLET suggest a mechanism of structure-function variation, which might help the limited number of human protein genes to generate sufficient structural diversity to meet the diverse functional demands of complex organisms.
Geometrical comparison of two protein structures using Wigner-D functions.
Saberi Fathi, S M; White, Diana T; Tuszynski, Jack A
2014-10-01
In this article, we develop a quantitative comparison method for two arbitrary protein structures. This method uses a root-mean-square deviation characterization and employs a series expansion of the protein's shape function in terms of the Wigner-D functions to define a new criterion, which is called a "similarity value." We further demonstrate that the expansion coefficients for the shape function obtained with the help of the Wigner-D functions correspond to structure factors. Our method addresses the common problem of comparing two proteins with different numbers of atoms. We illustrate it with a worked example. © 2014 Wiley Periodicals, Inc.
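As a point of reference for the root-mean-square deviation mentioned in this abstract, the following is a minimal sketch of plain RMSD between two coordinate sets. It assumes the two structures have the same number of atoms, listed in the same order and already superposed, which is exactly the restriction the abstract says the Wigner-D approach is meant to lift. It does not implement the authors' "similarity value"; the function name `rmsd` and the toy coordinates are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Plain RMSD between two equally sized lists of (x, y, z) points.

    Assumes the points are already paired and superposed (hypothetical helper,
    not the similarity value defined in the paper).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: structure b is structure a shifted by 1 unit along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(a, b))
```

The `ValueError` on unequal lengths marks the case that plain RMSD cannot handle and that motivates a shape-function comparison.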
Watching proteins function with picosecond X-ray crystallography and molecular dynamics simulations.
NASA Astrophysics Data System (ADS)
Anfinrud, Philip
2006-03-01
Time-resolved electron density maps of myoglobin, a ligand-binding heme protein, have been stitched together into movies that unveil with < 2-Å spatial resolution and 150-ps time-resolution the correlated protein motions that accompany and/or mediate ligand migration within the hydrophobic interior of a protein. A joint analysis of all-atom molecular dynamics (MD) calculations and picosecond time-resolved X-ray structures provides single-molecule insights into mechanisms of protein function. Ensemble-averaged MD simulations of the L29F mutant of myoglobin following ligand dissociation reproduce the direction, amplitude, and timescales of crystallographically-determined structural changes. This close agreement with experiments at comparable resolution in space and time validates the individual MD trajectories, which identify and structurally characterize a conformational switch that directs dissociated ligands to one of two nearby protein cavities. This unique combination of simulation and experiment unveils functional protein motions and illustrates at an atomic level relationships among protein structure, dynamics, and function. In collaboration with Friedrich Schotte and Gerhard Hummer, NIH.
Chen, Xianfeng; Weber, Irene; Harrison, Robert W
2008-09-25
Water plays a critical role in the structure and function of proteins, although the experimental properties of water around protein structures are not well understood. The water can be classified by the separation from the protein surface into bulk water and hydration water. Hydration water interacts closely with the protein and contributes to protein folding, stability, and dynamics, as well as interacting with the bulk water. Water potential functions are often parametrized to fit bulk water properties because of the limited experimental data for hydration water. Therefore, the structural and energetic properties of the hydration water were assessed for 105 atomic resolution (
Protein Engineering Approaches in the Post-Genomic Era.
Singh, Raushan K; Lee, Jung-Kul; Selvaraj, Chandrabose; Singh, Ranjitha; Li, Jinglin; Kim, Sang-Yong; Kalia, Vipin C
2018-01-01
Proteins are one of the most multifaceted macromolecules in living systems. Proteins have evolved to function under physiological conditions and, therefore, are not usually tolerant of harsh experimental and environmental conditions. The growing use of proteins in industrial processes as a greener alternative to chemical catalysts often demands constant innovation to improve their performance. Protein engineering aims to design new proteins or modify the sequence of a protein to create proteins with new or desirable functions. With the emergence of structural and functional genomics, protein engineering has been invigorated in the post-genomic era. The three-dimensional structures of proteins with known functions facilitate protein engineering approaches to design variants with desired properties. There are three major approaches of protein engineering research, namely, directed evolution, rational design, and de novo design. Rational design is an effective method of protein engineering when the three-dimensional structure and mechanism of the protein are well known. In contrast, directed evolution does not require extensive information or a three-dimensional structure of the protein of interest. Instead, it involves random mutagenesis and selection to screen enzymes with desired properties. De novo design uses computational protein design algorithms to tailor synthetic proteins by using the three-dimensional structures of natural proteins and their folding rules. The present review highlights and summarizes recent protein engineering approaches, and their challenges and limitations in the post-genomic era.
Year 2 Report: Protein Function Prediction Platform
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhou, C E
2012-04-27
Upon completion of our second year of development in a 3-year development cycle, we have completed a prototype protein structure-function annotation and function prediction system: Protein Function Prediction (PFP) platform (v.0.5). We have met our milestones for Years 1 and 2 and are positioned to continue development in completion of our original statement of work, or a reasonable modification thereof, in service to DTRA Programs involved in diagnostics and medical countermeasures research and development. The PFP platform is a multi-scale computational modeling system for protein structure-function annotation and function prediction. As of this writing, PFP is the only existing fully automated, high-throughput, multi-scale modeling, whole-proteome annotation platform, and represents a significant advance in the field of genome annotation (Fig. 1). PFP modules perform protein functional annotations at the sequence, systems biology, protein structure, and atomistic levels of biological complexity (Fig. 2). Because these approaches provide orthogonal means of characterizing proteins and suggesting protein function, PFP processing maximizes the protein functional information that can currently be gained by computational means. Comprehensive annotation of pathogen genomes is essential for bio-defense applications in pathogen characterization, threat assessment, and medical countermeasure design and development in that it can short-cut the time and effort required to select and characterize protein biomarkers.
DNA mimic proteins: functions, structures, and bioinformatic analysis.
Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J
2014-05-13
DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.
The proteome: structure, function and evolution
Fleming, Keiran; Kelley, Lawrence A; Islam, Suhail A; MacCallum, Robert M; Muller, Arne; Pazos, Florencio; Sternberg, Michael J.E
2006-01-01
This paper reports two studies to model the inter-relationships between protein sequence, structure and function. First, an automated pipeline to provide a structural annotation of proteomes in the major genomes is described. The results are stored in a database at Imperial College, London (3D-GENOMICS) that can be accessed at www.sbg.bio.ic.ac.uk. Analysis of the assignments to structural superfamilies provides evolutionary insights. 3D-GENOMICS is being integrated with related proteome annotation data at University College London and the European Bioinformatics Institute in a project known as e-protein (http://www.e-protein.org/). The second topic is motivated by the developments in structural genomics projects in which the structure of a protein is determined prior to knowledge of its function. We have developed a new approach PHUNCTIONER that uses the gene ontology (GO) classification to supervise the extraction of the sequence signal responsible for protein function from a structure-based sequence alignment. Using GO we can obtain profiles for a range of specificities described in the ontology. In the region of low sequence similarity (around 15%), our method is more accurate than assignment from the closest structural homologue. The method is also able to identify the specific residues associated with the function of the protein family. PMID:16524832
Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S
2015-09-01
The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. 
© 2015 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
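The idea of reading clusters off a similarity network "at a single edge threshold" can be sketched as follows. This is a minimal illustration, assuming a hypothetical list of `(protein_a, protein_b, score)` edges where a higher score means more similar; it is not the DASP, BLAST, or TM-Align pipeline from the study, and the scores below are invented toy values.

```python
def cluster_by_threshold(nodes, edges, threshold):
    """Connected components of the network keeping only edges with score >= threshold.

    Hypothetical helper: uses a small union-find over the node names.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the component root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b, score in edges:
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in clusters.values())


# Toy network: the same proteins fall into different clusters as the threshold moves.
nodes = ["P1", "P2", "P3", "P4"]
edges = [("P1", "P2", 0.9), ("P2", "P3", 0.4), ("P3", "P4", 0.8)]
print(cluster_by_threshold(nodes, edges, 0.5))
print(cluster_by_threshold(nodes, edges, 0.3))
```

Running the same function with edge scores from different metrics (sequence, structure, active site) is one simple way to compare the resulting topologies against a curated functional hierarchy.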
Müller, Boje; Groscurth, Sira; Menzel, Matthias; Rüping, Boris A.; Twyman, Richard M.; Prüfer, Dirk; Noll, Gundula A.
2014-01-01
Background and Aims Forisomes are specialized structural phloem proteins that mediate sieve element occlusion after wounding exclusively in papilionoid legumes, but most studies of forisome structure and function have focused on the Old World clade rather than the early lineages. A comprehensive phylogenetic, molecular, structural and functional analysis of forisomes from species covering a broad spectrum of the papilionoid legumes was therefore carried out, including the first analysis of Dipteryx panamensis forisomes, representing the earliest branch of the Papilionoideae lineage. The aim was to study the molecular, structural and functional conservation among forisomes from different tribes and to establish the roles of individual forisome subunits. Methods Sequence analysis and bioinformatics were combined with structural and functional analysis of native forisomes and artificial forisome-like protein bodies, the latter produced by expressing forisome genes from different legumes in a heterologous background. The structure of these bodies was analysed using a combination of confocal laser scanning microscopy (CLSM), scanning electron microscopy (SEM) and transmission electron microscopy (TEM), and the function of individual subunits was examined by combinatorial expression, micromanipulation and light microscopy. Key Results Dipteryx panamensis native forisomes and homomeric protein bodies assembled from the single sieve element occlusion by forisome (SEO-F) subunit identified in this species were structurally and functionally similar to forisomes from the Old World clade. In contrast, homomeric protein bodies assembled from individual SEO-F subunits from Old World species yielded artificial forisomes differing in proportion to their native counterparts, suggesting that multiple SEO-F proteins are required for forisome assembly in these plants. 
Structural differences between Medicago truncatula native forisomes, homomeric protein bodies and heteromeric bodies containing all possible subunit combinations suggested that combinations of SEO-F proteins may fine-tune the geometric proportions and reactivity of forisomes. Conclusions It is concluded that forisome structure and function have been strongly conserved during evolution and that species-dependent subsets of SEO-F proteins may have evolved to fine-tune the structure of native forisomes. PMID:24694827
DWARF – a data warehouse system for analyzing protein families
Fischer, Markus; Thai, Quan K; Grieb, Melanie; Pleiss, Jürgen
2006-01-01
Background The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformatics approaches to the analysis of large protein families. Description The data warehouse system DWARF integrates data on sequence, structure, and functional annotation for protein fold families. The underlying relational data model consists of three major sections representing entities related to the protein (biochemical function, source organism, classification to homologous families and superfamilies), the protein sequence (position-specific annotation, mutant information), and the protein structure (secondary structure information, superimposed tertiary structure). Tools for extracting, transforming and loading data from publicly available resources (ExPDB, GenBank, DSSP) are provided to populate the database. The data can be accessed by an interface for searching and browsing, and by analysis tools that operate on annotation, sequence, or structure. We applied DWARF to the family of α/β-hydrolases to host the Lipase Engineering database. Release 2.3 contains 6138 sequences and 167 experimentally determined protein structures, which are assigned to 37 superfamilies and 103 homologous families. Conclusion DWARF has been designed for constructing databases of large structurally related protein families and for evaluating their sequence-structure-function relationships by a systematic analysis of sequence, structure and functional annotation. It has been applied to predict biochemical properties from sequence, and serves as a valuable tool for protein engineering. PMID:17094801
Exploring Human Diseases and Biological Mechanisms by Protein Structure Prediction and Modeling.
Wang, Juexin; Luttrell, Joseph; Zhang, Ning; Khan, Saad; Shi, NianQing; Wang, Michael X; Kang, Jing-Qiong; Wang, Zheng; Xu, Dong
2016-01-01
Protein structure prediction and modeling provide a tool for understanding protein functions by computationally constructing protein structures from amino acid sequences and analyzing them. With help from protein prediction tools and web servers, users can obtain the three-dimensional protein structure models and gain knowledge of functions from the proteins. In this chapter, we will provide several examples of such studies. As an example, structure modeling methods were used to investigate the relation between mutation-caused misfolding of protein and human diseases including epilepsy and leukemia. Protein structure prediction and modeling were also applied in nucleotide-gated channels and their interaction interfaces to investigate their roles in brain and heart cells. In molecular mechanism studies of plants, rice salinity tolerance mechanism was studied via structure modeling on crucial proteins identified by systems biology analysis; trait-associated protein-protein interactions were modeled, which sheds some light on the roles of mutations in soybean oil/protein content. In the age of precision medicine, we believe protein structure prediction and modeling will play more and more important roles in investigating biomedical mechanism of diseases and drug design.
Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon
2012-01-01
Background Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) is an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. Results We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. The SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and drug discovery. PMID:23281852
Purely Structural Protein Scoring Functions Using Support Vector Machine and Ensemble Learning.
Mirzaei, Shokoufeh; Sidi, Tomer; Keasar, Chen; Crivelli, Silvia
2016-08-24
The function of a protein is determined by its structure, which creates a need for efficient methods of protein structure determination to advance scientific and medical research. Because current experimental structure determination methods carry a high price tag, computational predictions are highly desirable. Given a protein sequence, computational methods produce numerous 3D structures known as decoys. However, selecting the best-quality decoys is challenging, as end users can handle only a few of them. Therefore, scoring functions are central to decoy selection. They combine measurable features into a single-number indicator of decoy quality. Unfortunately, current scoring functions do not consistently select the best decoys. Machine learning techniques offer great potential to improve decoy scoring. This paper presents two machine-learning based scoring functions to predict the quality of protein structures, i.e., the similarity between the predicted structure and the experimental one without knowing the latter. We use different metrics to compare these scoring functions against three state-of-the-art scores. This is a first attempt at comparing different scoring functions using the same non-redundant dataset for training and testing and the same features. The results show that adding informative features may be more significant than the method used.
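To make the notion of a scoring function concrete, here is a minimal sketch of how several measurable features can be collapsed into one number and used to pick decoys. It uses a fixed linear weighted sum as a stand-in; the paper itself describes support vector machine and ensemble-learning scores, which are not reproduced here. The feature names (`compactness`, `hbond_fraction`), the weights, and the decoy values are all invented for illustration.

```python
def score_decoy(features, weights):
    """Weighted sum of feature values (hypothetical stand-in for a learned score)."""
    return sum(weights[name] * value for name, value in features.items())


def rank_decoys(decoys, weights, top_n=1):
    """Return the names of the top_n decoys, highest score first."""
    ranked = sorted(
        decoys,
        key=lambda name: score_decoy(decoys[name], weights),
        reverse=True,
    )
    return ranked[:top_n]


# Toy data: three decoys described by two made-up features.
weights = {"compactness": 0.5, "hbond_fraction": 1.0}
decoys = {
    "decoy_a": {"compactness": 0.2, "hbond_fraction": 0.9},
    "decoy_b": {"compactness": 0.8, "hbond_fraction": 0.3},
    "decoy_c": {"compactness": 0.6, "hbond_fraction": 0.8},
}
print(rank_decoys(decoys, weights, top_n=2))
```

In a learned scoring function the weights (or a non-linear model in their place) would be fitted against known structure quality on a training set rather than chosen by hand.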
Analysis of sequence repeats of proteins in the PDB.
Mary Rajathei, David; Selvaraj, Samuel
2013-12-01
Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain similar overall folding. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
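The 20-40% identity range quoted above depends on how identity is computed between two aligned repeat copies. The sketch below shows one common convention, assumed here for illustration: identical positions divided by the number of aligned positions where neither copy has a gap. The abstract does not state which convention the authors or RADAR use, and the two aligned strings are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped aligned columns (one assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either repeat copy has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned repeat copies (hypothetical sequences).
repeat_1 = "ACDE-FGHIK"
repeat_2 = "ACNEQFGYLK"
print(round(percent_identity(repeat_1, repeat_2), 1))
```

Other conventions divide by the full alignment length or by the shorter sequence length, so reported identities are only comparable when the denominator is the same.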
Heinz, Eva; Lithgow, Trevor
2014-01-01
Members of the Omp85/TpsB protein superfamily are ubiquitously distributed in Gram-negative bacteria, and function in protein translocation (e.g., FhaC) or the assembly of outer membrane proteins (e.g., BamA). Several recent findings are suggestive of a further level of variation in the superfamily, including the identification of the novel membrane protein assembly factor TamA and protein translocase PlpD. To investigate the diversity and the causal evolutionary events, we undertook a comprehensive comparative sequence analysis of the Omp85/TpsB proteins. A total of 10 protein subfamilies were apparent, distinguished in their domain structure and sequence signatures. In addition to the proteins FhaC, BamA, and TamA, for which structural and functional information is available, there are families of proteins with so far undescribed domain architectures linked to the Omp85 β-barrel domain. This study brings a classification structure to a dynamic protein superfamily of high interest given its essential function for Gram-negative bacteria as well as its diverse domain architecture, and we discuss several scenarios of putative functions of these so far undescribed proteins. PMID:25101071
regSNPs-splicing: a tool for prioritizing synonymous single-nucleotide substitution.
Zhang, Xinjun; Li, Meng; Lin, Hai; Rao, Xi; Feng, Weixing; Yang, Yuedong; Mort, Matthew; Cooper, David N; Wang, Yue; Wang, Yadong; Wells, Clark; Zhou, Yaoqi; Liu, Yunlong
2017-09-01
While synonymous single-nucleotide variants (sSNVs) have largely been unstudied, since they do not alter protein sequence, mounting evidence suggests that they may affect RNA conformation, splicing, and the stability of nascent mRNAs to promote various diseases. Accurately prioritizing deleterious sSNVs from a pool of neutral ones can significantly improve our ability to select functional genetic variants identified from various genome-sequencing projects, and, therefore, advance our understanding of disease etiology. In this study, we develop a computational algorithm to prioritize sSNVs based on their impact on mRNA splicing and protein function. In addition to genomic features that potentially affect splicing regulation, our proposed algorithm also includes dozens of structural features that characterize the effects of alternatively spliced exons on protein function. Our systematic evaluation on thousands of sSNVs suggests that several structural features, including intrinsic disorder protein scores, solvent accessible surface areas, protein secondary structures, and known and predicted protein family domains, show significant differences between disease-causing and neutral sSNVs. Our results suggest that the protein structure features offer an added dimension of information for distinguishing disease-causing from neutral synonymous variants. The inclusion of structural features increases the predictive accuracy for functional sSNV prioritization.
Composite Structural Motifs of Binding Sites for Delineating Biological Functions of Proteins
Kinjo, Akira R.; Nakamura, Haruki
2012-01-01
Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. PMID:22347478
Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L
2011-06-02
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory; still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein.
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
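The idea of quantifying residue frequencies at aligned positions and flagging residues that deviate from a consensus can be illustrated with a minimal Python sketch. This is not the StralSV implementation: it assumes the structure fragments have already been aligned to the reference and are supplied as equal-length strings with "-" for gaps, and the function names, the toy fragments, and the 0.5 consensus threshold are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def residue_frequencies(aligned_fragments: List[str]) -> List[Dict[str, float]]:
    """Return, for each aligned position, the frequency of every residue seen.

    Assumes all fragments have the same length; '-' marks a gap and is ignored.
    """
    length = len(aligned_fragments[0])
    columns = []
    for pos in range(length):
        residues = [frag[pos] for frag in aligned_fragments if frag[pos] != "-"]
        counts = Counter(residues)
        total = sum(counts.values())
        columns.append({aa: n / total for aa, n in counts.items()} if total else {})
    return columns


def deviating_positions(
    reference: str,
    columns: List[Dict[str, float]],
    threshold: float = 0.5,  # hypothetical cutoff for calling a consensus
) -> List[Tuple[int, str, str]]:
    """List (position, reference residue, consensus residue) where they differ."""
    deviations = []
    for pos, freqs in enumerate(columns):
        if not freqs:
            continue
        consensus, freq = max(freqs.items(), key=lambda kv: kv[1])
        if freq >= threshold and reference[pos] != consensus:
            deviations.append((pos, reference[pos], consensus))
    return deviations


# Toy, made-up fragments aligned to a three-residue reference segment.
fragments = ["GDD", "GDD", "GDN", "-DD"]
columns = residue_frequencies(fragments)
print(columns[2])
print(deviating_positions("ADD", columns))
```

In this toy case the reference carries A at a position where every aligned fragment has G, so that position is reported as deviating from the consensus.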
The Papillomavirus E2 proteins
DOE Office of Scientific and Technical Information (OSTI.GOV)
McBride, Alison A., E-mail: amcbride@nih.gov
2013-10-15
The papillomavirus E2 proteins are pivotal to the viral life cycle and have well characterized functions in transcriptional regulation, initiation of DNA replication and partitioning the viral genome. The E2 proteins also function in vegetative DNA replication, post-transcriptional processes and possibly packaging. This review describes structural and functional aspects of the E2 proteins and their binding sites on the viral genome. It is intended to be a reference guide to this viral protein. - Highlights: • Overview of E2 protein functions. • Structural domains of the papillomavirus E2 proteins. • Analysis of E2 binding sites in different genera of papillomaviruses. • Compilation of E2 associated proteins. • Comparison of key mutations in distinct E2 functions.
Lubin, Johnathan W; Rao, Timsi; Mandell, Edward K; Wuttke, Deborah S; Lundblad, Victoria
2013-03-01
Mutations that confer the loss of a single biochemical property (separation-of-function mutations) can often uncover a previously unknown role for a protein in a particular biological process. However, most mutations are identified based on loss-of-function phenotypes, which cannot differentiate between separation-of-function alleles vs. mutations that encode unstable/unfolded proteins. An alternative approach is to use overexpression dominant-negative (ODN) phenotypes to identify mutant proteins that disrupt function in an otherwise wild-type strain when overexpressed. This is based on the assumption that such mutant proteins retain an overall structure that is comparable to that of the wild-type protein and are able to compete with the endogenous protein (Herskowitz 1987). To test this, the in vivo phenotypes of mutations in the Est3 telomerase subunit from Saccharomyces cerevisiae were compared with the in vitro secondary structure of these mutant proteins as analyzed by circular-dichroism spectroscopy, which demonstrates that ODN is a more sensitive assessment of protein stability than the commonly used method of monitoring protein levels from extracts. Reverse mutagenesis of EST3, which targeted different categories of amino acids, also showed that mutating highly conserved charged residues to the oppositely charged amino acid had an increased likelihood of generating a severely defective est3(-) mutation, which nevertheless encoded a structurally stable protein. These results suggest that charge-swap mutagenesis directed at a limited subset of highly conserved charged residues, combined with ODN screening to eliminate partially unfolded proteins, may provide a widely applicable and efficient strategy for generating separation-of-function mutations.
Nagata, Koji
2010-01-01
Peptides and proteins with similar amino acid sequences can have different biological functions. Knowledge of their three-dimensional molecular structures is critically important in identifying their functional determinants. In this review, I describe the results of our and other groups' structure-based functional characterization of insect insulin-like peptides, a crustacean hyperglycemic hormone-family peptide, a mammalian epidermal growth factor-family protein, and an intracellular signaling domain that recognizes proline-rich sequence.
Visualizing water molecules in transmembrane proteins using radiolytic labeling methods
Orban, Tivadar; Gupta, Sayan; Palczewski, Krzysztof; Chance, Mark R.
2010-01-01
Essential to cells and their organelles, water is both shuttled to where it is needed and trapped within cellular compartments and structures. Moreover, ordered waters within protein structures often co-localize with strategically placed polar or charged groups critical for protein function. Yet it is unclear if these ordered water molecules provide structural stabilization, mediate conformational changes in signaling, neutralize charged residues, or carry out a combination of all these functions. Structures of many integral membrane proteins, including G protein-coupled receptors (GPCRs), reveal the presence of ordered water molecules that may act like prosthetic groups in a manner quite unlike bulk water. Identification of ‘ordered’ waters within a crystalline protein structure requires sufficient occupancy of water to enable its detection in the protein's X-ray diffraction pattern and thus the observed waters likely represent a subset of tightly-bound functional waters. In this review, we highlight recent studies that suggest the structures of ordered waters within GPCRs are as conserved (and thus as important) as conserved side chains. In addition, methods of radiolysis, coupled to structural mass spectrometry (protein footprinting), reveal dynamic changes in water structure that mediate transmembrane signaling. The idea of water as a prosthetic group mediating chemical reaction dynamics is not new in fields such as catalysis. However, the concept of water as a mediator of conformational dynamics in signaling is just emerging, owing to advances in both crystallographic structure determination and new methods of protein footprinting. Although oil and water do not mix, understanding the roles of water is essential to understanding the function of membrane proteins. PMID:20047303
Impact of genetic variation on three dimensional structure and function of proteins
Bhattacharya, Roshni; Rose, Peter W.; Burley, Stephen K.
2017-01-01
The Protein Data Bank (PDB; http://wwpdb.org) was established in 1971 as the first open access digital data resource in biology with seven protein structures as its initial holdings. The global PDB archive now contains more than 126,000 experimentally determined atomic level three-dimensional (3D) structures of biological macromolecules (proteins, DNA, RNA), all of which are freely accessible via the Internet. Knowledge of the 3D structure of the gene product can help in understanding its function and role in disease. Of particular interest in the PDB archive are proteins for which 3D structures of genetic variant proteins have been determined, thus revealing atomic-level structural differences caused by the variation at the DNA level. Herein, we present a systematic and qualitative analysis of such cases. We observe a wide range of structural and functional changes caused by single amino acid differences, including changes in enzyme activity, aggregation propensity, structural stability, binding, and dissociation, some in the context of large assemblies. Structural comparison of wild type and mutated proteins, when both are available, provides insights into atomic-level structural differences caused by the genetic variation. PMID:28296894
Computational mining for hypothetical patterns of amino acid side chains in protein data bank (PDB)
NASA Astrophysics Data System (ADS)
Ghani, Nur Syatila Ab; Firdaus-Raih, Mohd
2018-04-01
The three-dimensional structure of a protein can provide insights regarding its function. Functional relationships between proteins can be inferred from fold and sequence similarities. In certain cases, sequence or fold comparison fails to establish homology between proteins with similar mechanisms. Since structure is more conserved than sequence, a constellation of functional residues can be similarly arranged among proteins of similar mechanism. Local structural similarity searches are able to detect such constellations of amino acids among distinct proteins, which can be useful for annotating proteins of unknown function. Detection of such patterns of amino acids on a large scale can increase the repertoire of important 3D motifs, since the currently known 3D motifs cannot keep pace with the ever-increasing number of uncharacterized proteins to be annotated. Here, a computational platform for automated detection of 3D motifs is described. A fuzzy-pattern searching algorithm derived from IMagine an Amino Acid 3D Arrangement search EnGINE (IMAAAGINE) was implemented to develop an automated method for searching for hypothetical patterns of amino acid side chains in the Protein Data Bank (PDB), without the need for prior knowledge of the related sequence or structure of the pattern of interest. We present an example of such a search: the detection of a hypothetical pattern derived from the known C2H2 structural motif of zinc fingers. The conservation of particular patterns of amino acid side chains in unrelated proteins is highlighted. This approach can act as a complementary method for available structure- and sequence-based platforms and may contribute to improving functional association between proteins.
Tracing Primordial Protein Evolution through Structurally Guided Stepwise Segment Elongation
Watanabe, Hideki; Yamasaki, Kazuhiko; Honda, Shinya
2014-01-01
The understanding of how primordial proteins emerged has been a fundamental and longstanding issue in biology and biochemistry. For a better understanding of primordial protein evolution, we synthesized an artificial protein on the basis of an evolutionary hypothesis, segment-based elongation starting from an autonomously foldable short peptide. A 10-residue protein, chignolin, the smallest foldable polypeptide ever reported, was used as a structural support to facilitate higher structural organization and gain-of-function in the development of an artificial protein. Repetitive cycles of segment elongation and subsequent phage display selection successfully produced a 25-residue protein, termed AF.2A1, with nanomolar affinity against the Fc region of immunoglobulin G. AF.2A1 shows exquisite molecular recognition ability such that it can distinguish conformational differences of the same molecule. The structure determined by NMR measurements demonstrated that AF.2A1 forms a globular protein-like conformation with the chignolin-derived β-hairpin and a tryptophan-mediated hydrophobic core. Using sequence analysis and a mutation study, we discovered that the structural organization and gain-of-function emerged from the vicinity of the chignolin segment, revealing that the structural support served as the core in both structural and functional development. Here, we propose an evolutionary model for primordial proteins in which a foldable segment serves as the evolving core to facilitate structural and functional evolution. This study provides insights into primordial protein evolution and also presents a novel methodology for designing small sized proteins useful for industrial and pharmaceutical applications. PMID:24356963
Metamorphic Proteins: Emergence of Dual Protein Folds from One Primary Sequence.
Lella, Muralikrishna; Mahalakshmi, Radhakrishnan
2017-06-20
Every amino acid exhibits a different propensity for distinct structural conformations. Hence, decoding how the primary amino acid sequence undergoes the transition to a defined secondary structure and its final three-dimensional fold is presently considered predictable with reasonable certainty. However, protein sequences that defy the first principles of secondary structure prediction (they attain two different folds) have recently been discovered. Such proteins, aptly named metamorphic proteins, decrease the conformational constraint by increasing flexibility in the secondary structure and thereby result in efficient functionality. In this review, we discuss the major factors driving the conformational switch related both to protein sequence and to structure using illustrative examples. We discuss the concept of an evolutionary transition in sequence and structure, the functional impact of the tertiary fold, and the pressure of intrinsic and external factors that give rise to metamorphic proteins. We mainly focus on the major components of protein architecture, namely, the α-helix and β-sheet segments, which are involved in conformational switching within the same or highly similar sequences. These chameleonic sequences are widespread in both cytosolic and membrane proteins, and these folds are equally important for protein structure and function. We discuss the implications of metamorphic proteins and chameleonic peptide sequences in de novo peptide design.
Kinjo, Akira R; Nakamura, Haruki
2013-01-01
Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyze protein functions is to compare and classify the structures of interaction interfaces of proteins. Here, we describe the procedures for compiling a database of interface structures and efficiently comparing the interface structures. To do so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the PDB exchange dictionary necessary for extracting data that are relevant for analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) which represents a sequence of two or three secondary structure elements including their relative orientations as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has a particularly high propensity to be found in interaction interfaces in general, indicating that any SSS can be used as a binding interface. When individual structural motifs are examined, there are some SSS strings that have high propensity for particular groups of structural motifs. In addition, it is shown that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit a somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description to investigate the universality of ligand binding modes.
Lessons on RNA Silencing Mechanisms in Plants from Eukaryotic Argonaute Structures
Poulsen, Christian; Vaucheret, Hervé; Brodersen, Peter
2013-01-01
RNA silencing refers to a collection of gene regulatory mechanisms that use small RNAs for sequence specific repression. These mechanisms rely on ARGONAUTE (AGO) proteins that directly bind small RNAs and thereby constitute the central component of the RNA-induced silencing complex (RISC). AGO protein function has been probed extensively by mutational analyses, particularly in plants where large allelic series of several AGO proteins have been isolated. Structures of entire human and yeast AGO proteins have only very recently been obtained, and they allow more precise analyses of functional consequences of mutations obtained by forward genetics. To a large extent, these analyses support current models of regions of particular functional importance of AGO proteins. Interestingly, they also identify previously unrecognized parts of AGO proteins with profound structural and functional importance and provide the first hints at structural elements that have important functions specific to individual AGO family members. A particularly important outcome of the analysis concerns the evidence for existence of Gly-Trp (GW) repeat interactors of AGO proteins acting in the plant microRNA pathway. The parallel analysis of AGO structures and plant AGO mutations also suggests that such interactions with GW proteins may be a determinant of whether an endonucleolytically competent RISC is formed. PMID:23303917
The flavivirus capsid protein: Structure, function and perspectives towards drug design.
Oliveira, Edson R A; Mohana-Borges, Ronaldo; de Alencastro, Ricardo B; Horta, Bruno A C
2017-01-02
Flaviviruses, such as dengue and Zika viruses, are etiologic agents transmitted to humans mainly by arthropods and are of great epidemiological interest. The flavivirus capsid protein is a structural element required for viral nucleocapsid assembly that performs the classical function of sheltering the viral genome. After decades of research, many reports have shown its different functionalities and its influence on normal cell function. The subcellular distribution of this protein, which involves accumulation around lipid droplets and nuclear localization, is also consistent with its multifunctional character. As flavivirus diseases are still in need of global control, and in view of the possible key roles that the capsid protein plays in flavivirus biology, novel considerations arise for anti-flavivirus drug research. This review covers the main structural and functional features of the flavivirus C protein, ultimately highlighting prospects in drug discovery based on this viral target.
Nakai, S; Li-Chan, E
1985-10-01
According to the original idea of quantitative structure-activity relationship, electric, hydrophobic, and structural parameters should be taken into consideration for elucidating functionality. Changes in these parameters are reflected in the property of protein solubility upon modification of whey proteins by heating. Although solubility is itself a functional property, it has been utilized to explain other functionalities of proteins. However, better correlations were obtained when hydrophobic parameters of the proteins were used in conjunction with solubility. Various treatments reported in the literature were applied to whey protein concentrate in an attempt to obtain whipping and gelling properties similar to those of egg white. Mapping simplex optimization was used to search for the best results. Improvement in whipping properties by pepsin hydrolysis may have been due to higher protein solubility, and good gelling properties resulting from polyphosphate treatment may have been due to an increase in exposable hydrophobicity. However, the results of angel food cake making were still unsatisfactory.
The hypothetical protein Atu4866 from Agrobacterium tumefaciens adopts a streptavidin-like fold
Ai, Xuanjun; Semesi, Anthony; Yee, Adelinda; Arrowsmith, Cheryl H.; Choy, Wing-Yiu; Li, Shawn S.C.
2008-01-01
Atu4866 is a 79-residue conserved hypothetical protein of unknown function from Agrobacterium tumefaciens. Protein sequence alignments show that it shares ≥60% sequence identity with 20 other hypothetical proteins of bacterial origin. However, the structures and functions of these proteins remain unknown so far. To gain insight into the function of this family of proteins, we have determined the structure of Atu4866 as a target of a structural genomics project using solution NMR spectroscopy. Our results reveal that Atu4866 adopts a streptavidin-like fold featuring a β-barrel/sandwich formed by eight antiparallel β-strands. Further structural analysis identified a continuous patch of conserved residues on the surface of Atu4866 that may constitute a potential ligand-binding site. PMID:18042676
3D-SURFER 2.0: web platform for real-time search and characterization of protein surfaces.
Xiong, Yi; Esquivel-Rodriguez, Juan; Sael, Lee; Kihara, Daisuke
2014-01-01
The increasing number of uncharacterized protein structures necessitates the development of computational approaches for function annotation using the protein tertiary structures. Protein structure database search is the basis of any structure-based functional elucidation of proteins. 3D-SURFER is a web platform for real-time protein surface comparison of a given protein structure against the entire PDB using 3D Zernike descriptors. It can smoothly navigate the protein structure space in real-time from one query structure to another. A major new feature of Release 2.0 is the ability to compare the protein surface of a single chain, a single domain, or a single complex against databases of protein chains, domains, complexes, or a combination of all three in the latest PDB. Additionally, two types of protein structures can now be compared: all-atom-surface and backbone-atom-surface. The server can also accept a batch job for a large number of database searches. Pockets in protein surfaces can be identified by VisGrid and LIGSITE(csc). The server is available at http://kiharalab.org/3d-surfer/.
2014-01-01
Background Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs, but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, and is further supported by the analysis in this paper, that such maps preserve functional co-localization of the protein structure space. Methods Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. Results We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership.
Conclusions This work opens exciting avenues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools. PMID:25080993
Quality assessment of protein model-structures based on structural and functional similarities.
Konopka, Bogumil M; Nebel, Jean-Christophe; Kotulska, Malgorzata
2012-09-21
Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the Worldwide Protein Data Bank and the number of sequenced proteins is constantly widening. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists are able to automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. GOBA (Gene Ontology-Based Assessment) is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using the standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method; the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved mean Pearson correlations of 0.74 and 0.8 with the observed quality of models in our CASP8- and CASP9-based validation sets, respectively.
GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models.
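The validation metric quoted above, a mean Pearson correlation between predicted scores and observed model quality, can be sketched in a few lines of Python. This is only an illustration of the metric, not GOBA itself: it assumes that a correlation is computed per prediction target and then averaged across targets, and the function names and all score values below are made up for the example.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def mean_target_correlation(
    targets: Dict[str, Tuple[List[float], List[float]]]
) -> float:
    """Average the per-target correlation of predicted vs. observed quality."""
    values = [pearson(pred, obs) for pred, obs in targets.values()]
    return sum(values) / len(values)


# Hypothetical targets: (predicted quality scores, observed quality) per model.
targets = {
    "target_1": ([0.2, 0.5, 0.9, 0.4], [0.3, 0.6, 0.8, 0.5]),
    "target_2": ([0.7, 0.1, 0.6], [0.6, 0.2, 0.7]),
}
print(round(mean_target_correlation(targets), 3))
```

Averaging per-target correlations, rather than pooling all models, keeps easy and hard targets from dominating each other; whether the original study pooled or averaged in exactly this way is an assumption here.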
Protein flexibility in the light of structural alphabets
Craveur, Pierrick; Joseph, Agnel P.; Esque, Jeremy; Narwani, Tarun J.; Noël, Floriane; Shinada, Nicolas; Goguet, Matthieu; Leonard, Sylvain; Poulain, Pierre; Bertrand, Olivier; Faure, Guilhem; Rebehmed, Joseph; Ghozlane, Amine; Swapna, Lakshmipuram S.; Bhaskara, Ramachandra M.; Barnoud, Jonathan; Téletchéa, Stéphane; Jallu, Vincent; Cerny, Jiri; Schneider, Bohdan; Etchebest, Catherine; Srinivasan, Narayanaswamy; Gelly, Jean-Christophe; de Brevern, Alexandre G.
2015-01-01
Protein structures are valuable tools to understand protein function. Nonetheless, proteins are often considered as rigid macromolecules while their structures exhibit specific flexibility, which is essential to complete their functions. Analyses of protein structures and dynamics are often performed with a simplified three-state description, i.e., the classical secondary structures. A more precise and complete description of protein backbone conformation can be obtained using libraries of small protein fragments that are able to approximate every part of protein structures. These libraries, called structural alphabets (SAs), have been widely used in the field of structure analysis, from the definition of ligand binding sites to the superimposition of protein structures. SAs are also well suited to analyze the dynamics of protein structures. Here, we review innovative approaches that investigate protein flexibility based on SA descriptions. Coupled to various sources of experimental data (e.g., B-factor) and computational methodology (e.g., molecular dynamics simulation), SAs turn out to be powerful tools to analyze protein dynamics, e.g., to examine allosteric mechanisms in large sets of structures in complexes, or to identify order/disorder transitions. SAs were also shown to be quite efficient in predicting protein flexibility from amino-acid sequence. Finally, in this review, we exemplify the interest of SAs for studying flexibility with different cases of proteins implicated in pathologies and diseases. PMID:26075209
Classification of protein quaternary structure by functional domain composition
Yu, Xiaojing; Wang, Chuan; Li, Yixue
2006-01-01
Background The number and the arrangement of subunits that form a protein are referred to as quaternary structure. Quaternary structure is an important protein attribute that is closely related to its function. Proteins with quaternary structure are called oligomeric proteins. Oligomeric proteins are involved in various biological processes, such as metabolism, signal transduction, and chromosome replication. Thus, it is highly desirable to develop computational methods to automatically classify the quaternary structure of proteins from their sequences. Results To explore this problem, we adopted an approach based on the functional domain composition of proteins. Every protein was represented by a vector calculated from the domains in the PFAM database. The nearest neighbor algorithm (NNA) was used for classifying the quaternary structure of proteins from this information. The jackknife cross-validation test was performed on the non-redundant protein dataset in which the sequence identity was less than 25%. The overall success rate obtained is 75.17%. Additionally, to demonstrate the effectiveness of this method, we predicted the proteins in an independent dataset and achieved an overall success rate of 84.11%. Conclusion Compared with the amino acid composition method and Blast, the results indicate that the domain composition approach may be a more effective and promising high-throughput method in dealing with this complicated problem in bioinformatics. PMID:16584572
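The combination of a domain-composition vector, a nearest neighbor classifier, and a jackknife (leave-one-out) test can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the abstract does not say how the vectors are compared, so cosine similarity on binary presence/absence vectors is assumed here, and the domain names, class labels, and function names are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

Sample = Tuple[List[int], str]  # (domain-composition vector, quaternary class)


def domain_vector(domains: Set[str], vocabulary: Sequence[str]) -> List[int]:
    """Encode a protein as 1/0 flags for the presence of each known domain."""
    return [1 if name in domains else 0 for name in vocabulary]


def cosine_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbor_label(query: Sequence[int], training: List[Sample]) -> str:
    """Return the class of the most similar training protein."""
    best = max(training, key=lambda item: cosine_similarity(query, item[0]))
    return best[1]


def jackknife_success_rate(samples: List[Sample]) -> float:
    """Leave each protein out in turn and classify it from all the others."""
    correct = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_neighbor_label(vector, rest) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical domain vocabulary and toy proteins (not real PFAM entries).
vocabulary = ["domA", "domB", "domC", "domD"]
samples = [
    (domain_vector({"domA", "domB"}, vocabulary), "monomer"),
    (domain_vector({"domA"}, vocabulary), "monomer"),
    (domain_vector({"domC", "domD"}, vocabulary), "homodimer"),
    (domain_vector({"domC"}, vocabulary), "homodimer"),
]
print(jackknife_success_rate(samples))
```

The jackknife test uses every protein once as a held-out query, which is why it is often preferred for small non-redundant datasets: no protein is ever classified using itself as a neighbor.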
Functional Insights from Structural Genomics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Forouhar,F.; Kuzin, A.; Seetharaman, J.
2007-01-01
Structural genomics efforts have produced structural information, either directly or by modeling, for thousands of proteins over the past few years. While many of these proteins have known functions, a large percentage of them have not been characterized at the functional level. The structural information has provided valuable functional insights on some of these proteins, through careful structural analyses, serendipity, and structure-guided functional screening. Some of the success stories based on structures solved at the Northeast Structural Genomics Consortium (NESG) are reported here. These include a novel methyl salicylate esterase with important role in plant innate immunity, a novel RNA methyltransferase (H. influenzae yggJ (HI0303)), a novel spermidine/spermine N-acetyltransferase (B. subtilis PaiA), a novel methyltransferase or AdoMet binding protein (A. fulgidus AF_0241), an ATP:cob(I)alamin adenosyltransferase (B. subtilis YvqK), a novel carboxysome pore (E. coli EutN), a proline racemase homolog with a disrupted active site (B. melitensis BME11586), an FMN-dependent enzyme (S. pneumoniae SP_1951), and a 12-stranded β-barrel with a novel fold (V. parahaemolyticus VPA1032).
Challenges in the Development of Functional Assays of Membrane Proteins
Tiefenauer, Louis; Demarche, Sophie
2012-01-01
Lipid bilayers are natural barriers of biological cells and cellular compartments. Membrane proteins integrated in biological membranes enable vital cell functions such as signal transduction and the transport of ions or small molecules. In order to determine the activity of a protein of interest at defined conditions, the membrane protein has to be integrated into artificial lipid bilayers immobilized on a surface. For the fabrication of such biosensors expertise is required in material science, surface and analytical chemistry, molecular biology and biotechnology. Specifically, techniques are needed for structuring surfaces in the micro- and nanometer scale, chemical modification and analysis, lipid bilayer formation, protein expression, purification and solubilization, and most importantly, protein integration into engineered lipid bilayers. Electrochemical and optical methods are suitable to detect membrane activity-related signals. The importance of structural knowledge to understand membrane protein function is obvious. Presently only a few structures of membrane proteins are solved at atomic resolution. Functional assays together with known structures of individual membrane proteins will contribute to a better understanding of vital biological processes occurring at biological membranes. Such assays will be utilized in the discovery of drugs, since membrane proteins are major drug targets.
Tactile Teaching: Exploring Protein Structure/Function Using Physical Models
ERIC Educational Resources Information Center
Herman, Tim; Morris, Jennifer; Colton, Shannon; Batiza, Ann; Patrick, Michael; Franzen, Margaret; Goodsell, David S.
2006-01-01
The technology now exists to construct physical models of proteins based on atomic coordinates of solved structures. We review here our recent experiences in using physical models to teach concepts of protein structure and function at both the high school and the undergraduate levels. At the high school level, physical models are used in a…
ERIC Educational Resources Information Center
Terrell, Cassidy R.; Listenberger, Laura L.
2017-01-01
Recognizing that undergraduate students can benefit from analysis of 3D protein structure and function, we have developed a multiweek, inquiry-based molecular visualization project for Biochemistry I students. This project uses a virtual model of cyclooxygenase-1 (COX-1) to guide students through multiple levels of protein structure analysis. The…
A structural-alphabet-based strategy for finding structural motifs across protein families
Wu, Chih Yuan; Chen, Yao Chi; Lim, Carmay
2010-01-01
Proteins with insignificant sequence and overall structure similarity may still share locally conserved contiguous structural segments; i.e. structural/3D motifs. Most methods for finding 3D motifs require a known motif to search for other similar structures or functionally/structurally crucial residues. Here, without requiring a query motif or essential residues, a fully automated method for discovering 3D motifs of various sizes across protein families with different folds based on a 16-letter structural alphabet is presented. It was applied to structurally non-redundant proteins bound to DNA, RNA, obligate/non-obligate proteins as well as free DNA-binding proteins (DBPs) and proteins with known structures but unknown function. Its usefulness was illustrated by analyzing the 3D motifs found in DBPs. A non-specific motif was found with a ‘corner’ architecture that confers a stable scaffold and enables diverse interactions, making it suitable for binding not only DNA but also RNA and proteins. Furthermore, DNA-specific motifs present ‘only’ in DBPs were discovered. The motifs found can provide useful guidelines in detecting binding sites and computational protein redesign. PMID:20525797
SA-Mot: a web server for the identification of motifs of interest extracted from protein loops
Regad, Leslie; Saladin, Adrien; Maupetit, Julien; Geneix, Colette; Camproux, Anne-Claude
2011-01-01
The detection of functional motifs is an important step for the determination of protein functions. We present here a new web server SA-Mot (Structural Alphabet Motif) for the extraction and location of structural motifs of interest from protein loops. Contrary to other methods, SA-Mot does not focus only on functional motifs, but it extracts recurrent and conserved structural motifs involved in structural redundancy of loops. SA-Mot uses the structural word notion to extract all structural motifs from uni-dimensional sequences corresponding to loop structures. Then, SA-Mot provides a description of these structural motifs using statistics computed in the loop data set and in SCOP superfamily, sequence and structural parameters. SA-Mot results correspond to an interactive table listing all structural motifs extracted from a target structure and their associated descriptors. Using this information, the users can easily locate loop regions that are important for the protein folding and function. The SA-Mot web server is available at http://sa-mot.mti.univ-paris-diderot.fr. PMID:21665924
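The "structural word" idea mentioned in the abstract above can be sketched as follows: once a loop has been encoded as a one-dimensional string of structural-alphabet letters, every overlapping substring of a fixed length is a candidate word, and counting how many loops contain each word highlights recurrent ones. The word length, the letter strings, and the function names are assumptions for illustration only and do not reproduce the SA-Mot implementation or its statistics.

```python
from collections import Counter


def structural_words(encoded_loop, word_length):
    """All overlapping words of the given length in an encoded loop string."""
    return [
        encoded_loop[i:i + word_length]
        for i in range(len(encoded_loop) - word_length + 1)
    ]


def word_occurrences(encoded_loops, word_length):
    """Number of loops in which each word appears at least once."""
    counts = Counter()
    for loop in encoded_loops:
        # set() so that a word repeated inside one loop is counted once
        counts.update(set(structural_words(loop, word_length)))
    return counts


# Hypothetical loops already encoded with structural-alphabet letters.
LOOPS = ["ABCDA", "BCDAB", "XYZ"]
COUNTS = word_occurrences(LOOPS, word_length=3)
print(COUNTS.most_common(2))
```

Counting loops rather than raw occurrences keeps one long, repetitive loop from dominating the tally; whether that matches the server's own definition of recurrence is not stated in the abstract.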
Martínez-Castilla, León P.; Rodríguez-Sotres, Rogelio
2010-01-01
Background Despite the remarkable progress of bioinformatics, how the primary structure of a protein leads to a three-dimensional fold, and in turn determines its function remains an elusive question. Alignments of sequences with known function can be used to identify proteins with the same or similar function with high success. However, identification of function-related and structure-related amino acid positions is only possible after a detailed study of every protein. Folding pattern diversity seems to be much narrower than sequence diversity, and the amino acid sequences of natural proteins have evolved under a selective pressure comprising structural and functional requirements acting in parallel. Principal Findings The approach described in this work begins by generating a large number of amino acid sequences using ROSETTA [Dantas G et al. (2003) J Mol Biol 332:449–460], a program with notable robustness in the assignment of amino acids to a known three-dimensional structure. The resulting sequence-sets showed no conservation of amino acids at active sites, or protein-protein interfaces. Hidden Markov models built from the resulting sequence sets were used to search sequence databases. Surprisingly, the models retrieved from the database sequences belonged to proteins with the same or a very similar function. Given an appropriate cutoff, the rate of false positives was zero. According to our results, this protocol, here referred to as Rd.HMM, detects fine structural details on the folding patterns, that seem to be tightly linked to the fitness of a structural framework for a specific biological function. Conclusion Because the sequence of the native protein used to create the Rd.HMM model was always amongst the top hits, the procedure is a reliable tool to score, very accurately, the quality and appropriateness of computer-modeled 3D-structures, without the need for spectroscopy data. 
However, Rd.HMM is very sensitive to the conformational features of the models' backbone. PMID:20830209
Text Mining Improves Prediction of Protein Functional Sites
Cohn, Judith D.; Ravikumar, Komandur E.
2012-01-01
We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388
Structure of a Trypanosoma Brucei Alpha/Beta--Hydrolase Fold Protein With Unknown Function
DOE Office of Scientific and Technical Information (OSTI.GOV)
Merritt, E.A.; Holmes, M.; Buckner, F.S.
2009-05-26
The structure of a structural genomics target protein, Tbru020260AAA from Trypanosoma brucei, has been determined to a resolution of 2.2 Å using multiple-wavelength anomalous diffraction at the Se K edge. This protein belongs to Pfam sequence family PF08538 and is only distantly related to previously studied members of the α/β-hydrolase fold family. Structural superposition onto representative α/β-hydrolase fold proteins of known function indicates that a possible catalytic nucleophile, Ser116 in the T. brucei protein, lies at the expected location. However, the present structure and by extension the other trypanosomatid members of this sequence family have neither sequence nor structural similarity at the location of other active-site residues typical for proteins with this fold. Together with the presence of an additional domain between strands β6 and β7 that is conserved in trypanosomatid genomes, this suggests that the function of these homologs has diverged from other members of the fold family.
Vishwanath, Sneha; de Brevern, Alexandre G; Srinivasan, Narayanaswamy
2018-02-01
The majority of the proteins encoded in the genomes of eukaryotes contain more than one domain. Reasons for high prevalence of multi-domain proteins in various organisms have been attributed to higher stability and functional and folding advantages over single-domain proteins. Despite these advantages, many proteins are composed of only one domain while their homologous domains are part of multi-domain proteins. In the study presented here, differences in the properties of protein domains in single-domain and multi-domain systems and their influence on functions are discussed. We studied 20 pairs of identical protein domains, which were crystallized in two forms (a) tethered to other protein domains and (b) tethered to fewer protein domains than (a) or not tethered to any protein domain. Results suggest that tethering of domains in multi-domain proteins influences the structural, dynamic and energetic properties of the constituent protein domains. 50% of the protein domain pairs show significant structural deviations while 90% of the protein domain pairs show differences in dynamics and 12% of the residues show differences in the energetics. To gain further insights on the influence of tethering on the function of the domains, 4 pairs of homologous protein domains, where one of them is a full-length single-domain protein and the other protein domain is a part of a multi-domain protein, were studied. Analyses showed that identical and structurally equivalent functional residues show differential dynamics in homologous protein domains; though comparable dynamics between in-silico generated chimera protein and multi-domain proteins were observed. From these observations, the differences observed in the functions of homologous proteins could be attributed to the presence of tethered domain. Overall, we conclude that tethered domains in multi-domain proteins not only provide stability or folding advantages but also influence pathways resulting in differences in function or regulatory properties. PMID:29432415
Shahbaaz, Mohd; Ahmad, Faizan; Imtaiyaz Hassan, Md
2015-06-01
Haemophilus influenzae is a small pleomorphic Gram-negative bacterium that causes several chronic diseases, including bacteremia, meningitis, cellulitis, epiglottitis, septic arthritis, pneumonia, and empyema. Here we extensively analyzed the sequenced genome of H. influenzae strain Rd KW20 using protein family databases, protein structure prediction, pathways and genome context methods to assign a precise function to proteins whose functions are unknown. These proteins are termed hypothetical proteins (HPs), for which no experimental information is available. Function prediction for these proteins would help to understand precisely the biochemical pathways and mechanism of pathogenesis of Haemophilus influenzae. During the extensive analysis of the H. influenzae genome, we found eight HPs showing lyase activity. Subsequently, we modeled and analyzed the three-dimensional structures of all these HPs to determine their functions more precisely. We found that these HPs possess cystathionine-β-synthase, cyclase, carboxymuconolactone decarboxylase, pseudouridine synthase A and C, D-tagatose-1,6-bisphosphate aldolase and aminodeoxychorismate lyase-like features, indicating their corresponding functions in H. influenzae. Lyases are actively involved in the regulation of biosynthesis of various hormones, metabolic pathways, signal transduction, and DNA repair. Lyases are also considered key players in various biological processes. These enzymes are essential for the survival and pathogenesis of H. influenzae and, therefore, may be considered potential targets for structure-based rational drug design. Our structure-function relationship analysis will be useful to search for and design potential lead molecules based on the structure of these lyases, for drug design and discovery.
Non-Structural Proteins of Arthropod-Borne Bunyaviruses: Roles and Functions
Eifan, Saleh; Schnettler, Esther; Dietrich, Isabelle; Kohl, Alain; Blomström, Anne-Lie
2013-01-01
Viruses within the Bunyaviridae family are tri-segmented, negative-stranded RNA viruses. The family includes several emerging and re-emerging viruses of humans, animals and plants, such as Rift Valley fever virus, Crimean-Congo hemorrhagic fever virus, La Crosse virus, Schmallenberg virus and tomato spotted wilt virus. Many bunyaviruses are arthropod-borne, so-called arboviruses. Depending on the genus, bunyaviruses encode, in addition to the RNA-dependent RNA polymerase and the different structural proteins, one or several non-structural proteins. These non-structural proteins are not always essential for virus growth and replication but can play an important role in viral pathogenesis through their interaction with the host innate immune system. In this review, we will summarize current knowledge and understanding of insect-borne bunyavirus non-structural protein function(s) in vertebrate, plant and arthropod. PMID:24100888
Query3d: a new method for high-throughput analysis of functional residues in protein structures
Ausiello, Gabriele; Via, Allegra; Helmer-Citterich, Manuela
2005-01-01
Background The identification of local similarities between two protein structures can provide clues of a common function. Many different methods exist for searching for similar subsets of residues in proteins of known structure. However, the lack of functional and structural information on single residues, together with the low level of integration of this information in comparison methods, is a limitation that prevents these methods from being fully exploited in high-throughput analyses. Results Here we describe Query3d, a program that is both a structural DBMS (Database Management System) and a local comparison method. The method conserves a copy of all the residues of the Protein Data Bank annotated with a variety of functional and structural information. New annotations can be easily added from a variety of methods and known databases. The algorithm makes it possible to create complex queries based on the residues' function and then to compare only subsets of the selected residues. Functional information is also essential to speed up the comparison and the analysis of the results. Conclusion With Query3d, users can easily obtain statistics on how many and which residues share certain properties in all proteins of known structure. At the same time, the method also finds their structural neighbours in the whole PDB. Programs and data can be accessed through the PdbFun web interface. PMID:16351754
Hensen, Ulf; Meyer, Tim; Haas, Jürgen; Rex, René; Vriend, Gert; Grubmüller, Helmut
2012-01-01
Proteins are usually described and classified according to amino acid sequence, structure or function. Here, we develop a minimally biased scheme to compare and classify proteins according to their internal mobility patterns. This approach is based on the notion that proteins not only fold into recurring structural motifs but might also be carrying out only a limited set of recurring mobility motifs. The complete set of these patterns, which we tentatively call the dynasome, spans a multi-dimensional space with axes, the dynasome descriptors, characterizing different aspects of protein dynamics. The unique dynamic fingerprint of each protein is represented as a vector in the dynasome space. The difference between any two vectors, consequently, gives a reliable measure of the difference between the corresponding protein dynamics. We characterize the properties of the dynasome by comparing the dynamics fingerprints obtained from molecular dynamics simulations of 112 proteins but our approach is, in principle, not restricted to any specific source of data of protein dynamics. We conclude that: 1. the dynasome consists of a continuum of proteins, rather than well separated classes. 2. For the majority of proteins we observe strong correlations between structure and dynamics. 3. Proteins with similar function carry out similar dynamics, which suggests a new method to improve protein function annotation based on protein dynamics. PMID:22606222
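The comparison step described in the abstract above, where each protein is a vector in descriptor space and the difference between two vectors measures how differently the proteins move, can be sketched as a plain Euclidean distance with a nearest-neighbor lookup. The descriptor values and protein names below are invented placeholders, and the choice of an unweighted Euclidean norm is an assumption; the abstract does not give the exact metric or any descriptor scaling.

```python
import math


def fingerprint_distance(a, b):
    """Euclidean length of the difference between two dynamics fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same number of descriptors")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_protein(query, fingerprints):
    """Name of the stored fingerprint closest to the query vector."""
    return min(
        fingerprints,
        key=lambda name: fingerprint_distance(query, fingerprints[name]),
    )


# Hypothetical fingerprints: three made-up descriptor values per protein.
FINGERPRINTS = {
    "protein_a": [0.0, 0.0, 1.0],
    "protein_b": [3.0, 4.0, 1.0],
}

print(fingerprint_distance(FINGERPRINTS["protein_a"], FINGERPRINTS["protein_b"]))
print(closest_protein([2.5, 4.0, 1.0], FINGERPRINTS))
```

Under the abstract's third conclusion, a lookup of this kind could suggest a functional annotation for an uncharacterized protein from its nearest neighbor in descriptor space, although in practice the descriptors would likely need normalization before distances are comparable.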
Dawson, Natalie L; Sillitoe, Ian; Lees, Jonathan G; Lam, Su Datt; Orengo, Christine A
2017-01-01
This chapter describes the generation of the data in the CATH-Gene3D online resource and how it can be used to study protein domains and their evolutionary relationships. Methods will be presented for: comparing protein structures, recognizing homologs, predicting domain structures within protein sequences, and subclassifying superfamilies into functionally pure families, together with a guide on using the webpages.
Brown, Peter; Pullan, Wayne; Yang, Yuedong; Zhou, Yaoqi
2016-02-01
The three-dimensional tertiary structure of a protein at near-atomic resolution provides insight into its function and evolution. Because protein structure determines function, similarity in structure usually implies similarity in function. As such, structure alignment techniques are often useful in the classification of protein function. Given the rapidly growing rate of new, experimentally determined structures being made available from repositories such as the Protein Data Bank, fast and accurate computational structure comparison tools are required. This paper presents SPalignNS, a non-sequential protein structure alignment tool using a novel asymmetrical greedy search technique. The performance of SPalignNS was evaluated against existing sequential and non-sequential structure alignment methods by performing trials with commonly used datasets. These benchmark datasets used to gauge alignment accuracy include (i) 9538 pairwise alignments implied by the HOMSTRAD database of homologous proteins; (ii) a subset of 64 difficult alignments from set (i) that have low structure similarity; (iii) 199 pairwise alignments of proteins with similar structure but different topology; and (iv) a subset of 20 pairwise alignments from the RIPC set. SPalignNS is shown to achieve greater alignment accuracy (lower or comparable root-mean squared distance with increased structure overlap coverage) for all datasets, and the highest agreement with reference alignments from the challenging dataset (iv) above, when compared with both sequentially constrained alignments and other non-sequential alignments. SPalignNS was implemented in C++. The source code, binary executable, and a web server version are freely available at http://sparks-lab.org. © The Author 2015. Published by Oxford University Press.
A decade and a half of protein intrinsic disorder: Biology still waits for physics
Uversky, Vladimir N
2013-01-01
The abundant existence of proteins and regions that possess specific functions without being uniquely folded into unique 3D structures has become accepted by a significant number of protein scientists. Sequences of these intrinsically disordered proteins (IDPs) and IDP regions (IDPRs) are characterized by a number of specific features, such as low overall hydrophobicity and high net charge which makes these proteins predictable. IDPs/IDPRs possess large hydrodynamic volumes, low contents of ordered secondary structure, and are characterized by high structural heterogeneity. They are very flexible, but some may undergo disorder to order transitions in the presence of natural ligands. The degree of these structural rearrangements varies over a very wide range. IDPs/IDPRs are tightly controlled under the normal conditions and have numerous specific functions that complement functions of ordered proteins and domains. When lacking proper control, they have multiple roles in pathogenesis of various human diseases. Gaining structural and functional information about these proteins is a challenge, since they do not typically “freeze” while their “pictures are taken.” However, despite or perhaps because of the experimental challenges, these fuzzy objects with fuzzy structures and fuzzy functions are among the most interesting targets for modern protein research. This review briefly summarizes some of the recent advances in this exciting field and considers some of the basic lessons learned from the analysis of physics, chemistry, and biology of IDPs. PMID:23553817
Gifford, Lida K; Carter, Lester G; Gabanyi, Margaret J; Berman, Helen M; Adams, Paul D
2012-06-01
The Technology Portal of the Protein Structure Initiative Structural Biology Knowledgebase (PSI SBKB; http://technology.sbkb.org/portal/ ) is a web resource providing information about methods and tools that can be used to relieve bottlenecks in many areas of protein production and structural biology research. Several useful features are available on the web site, including multiple ways to search the database of over 250 technological advances, a link to videos of methods on YouTube, and access to a technology forum where scientists can connect, ask questions, get news, and develop collaborations. The Technology Portal is a component of the PSI SBKB ( http://sbkb.org ), which presents integrated genomic, structural, and functional information for all protein sequence targets selected by the Protein Structure Initiative. Created in collaboration with the Nature Publishing Group, the SBKB offers an array of resources for structural biologists, such as a research library, editorials about new research advances, a featured biological system each month, and a functional sleuth for searching protein structures of unknown function. An overview of the various features and examples of user searches highlight the information, tools, and avenues for scientific interaction available through the Technology Portal.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zemla, A; Lang, D; Kostova, T
2010-11-29
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory; still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.
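The residue-frequency idea in the abstract above can be illustrated with a minimal sketch. It is not the StralSV algorithm: it assumes the structure-based local alignments have already been computed and are supplied as equal-length strings (with `-` for gaps). The function names `residue_frequencies` and `flag_deviations` and the 0.5 threshold are hypothetical choices made for this illustration.

```python
from collections import Counter


def residue_frequencies(aligned_fragments):
    """Return one {residue: frequency} dict per alignment position.

    `aligned_fragments` is a list of equal-length strings; '-' marks a gap
    and is ignored when counting. (Input format is an assumption.)
    """
    if not aligned_fragments:
        return []
    length = len(aligned_fragments[0])
    if any(len(fragment) != length for fragment in aligned_fragments):
        raise ValueError("all aligned fragments must have the same length")

    profile = []
    for pos in range(length):
        counts = Counter(f[pos] for f in aligned_fragments if f[pos] != "-")
        total = sum(counts.values())
        profile.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profile


def flag_deviations(reference, profile, threshold=0.5):
    """List positions where the reference differs from a strong consensus.

    A position is flagged when the most frequent residue reaches `threshold`
    (a hypothetical cutoff) and the reference residue is something else.
    """
    flagged = []
    for pos, freqs in enumerate(profile):
        if not freqs:
            continue
        consensus, freq = max(freqs.items(), key=lambda item: item[1])
        if freq >= threshold and reference[pos] != consensus:
            flagged.append((pos, reference[pos], consensus, freq))
    return flagged
```

Positions with a single dominant residue are candidates for structural or functional importance, and flagged positions correspond to the "unusual or unexpected residues" mentioned in the abstract.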
Weerth, R. Sophia; Michalska, Karolina; Bingman, Craig A.; ...
2014-12-18
Proteins belonging to the cupin superfamily have a wide range of catalytic and noncatalytic functions. Cupin proteins commonly have the capacity to bind a metal ion, with the metal frequently determining the function of the protein. We have been investigating the function of homologous cupin proteins that are conserved in more than 40 species of bacteria. To gain insights into the potential function of these proteins, we have solved the structure of Plu4264 from Photorhabdus luminescens TTO1 at a resolution of 1.35 Å and identified manganese as the likely natural metal ligand of the protein. Proteins 2015; 83:383–388.
The evolution of function within the Nudix homology clan
Srouji, John R.; Xu, Anting; Park, Annsea; Kirsch, Jack F.
2017-01-01
ABSTRACT The Nudix homology clan encompasses over 80,000 protein domains from all three domains of life, defined by homology to each other. Proteins with a domain from this clan fall into four general functional classes: pyrophosphohydrolases, isopentenyl diphosphate isomerases (IDIs), adenine/guanine mismatch‐specific adenine glycosylases (A/G‐specific adenine glycosylases), and nonenzymatic activities such as protein/protein interaction and transcriptional regulation. The largest group, pyrophosphohydrolases, encompasses more than 100 distinct hydrolase specificities. To understand the evolution of this vast number of activities, we assembled and analyzed experimental and structural data for 205 Nudix proteins collected from the literature. We corrected erroneous functions or provided more appropriate descriptions for 53 annotations described in the Gene Ontology Annotation database in this family, and propose 275 new experimentally‐based annotations. We manually constructed a structure‐guided sequence alignment of 78 Nudix proteins. Using the structural alignment as a seed, we then made an alignment of 347 “select” Nudix homology domains, curated from structurally determined, functionally characterized, or phylogenetically important Nudix domains. Based on our review of Nudix pyrophosphohydrolase structures and specificities, we further analyzed a loop region downstream of the Nudix hydrolase motif previously shown to contact the substrate molecule and possess known functional motifs. This loop region provides a potential structural basis for the functional radiation and evolution of substrate specificity within the hydrolase family. Finally, phylogenetic analyses of the 347 select protein domains and of the complete Nudix homology clan revealed general monophyly with regard to function and a few instances of probable homoplasy. Proteins 2017; 85:775–811. © 2016 Wiley Periodicals, Inc. PMID:27936487
Caetano-Anollés, Gustavo; Kim, Kyung Mo; Caetano-Anollés, Derek
2012-02-01
The complexity of modern biochemistry developed gradually on early Earth as new molecules and structures populated the emerging cellular systems. Here, we generate a historical account of the gradual discovery of primordial proteins, cofactors, and molecular functions using phylogenomic information in the sequence of 420 genomes. We focus on structural and functional annotations of the 54 most ancient protein domains. We show how primordial functions are linked to folded structures and how their interaction with cofactors expanded the functional repertoire. We also reveal protocell membranes played a crucial role in early protein evolution and show translation started with RNA and thioester cofactor-mediated aminoacylation. Our findings allow elaboration of an evolutionary model of early biochemistry that is firmly grounded in phylogenomic information and biochemical, biophysical, and structural knowledge. The model describes how primordial α-helical bundles stabilized membranes, how these were decorated by layered arrangements of β-sheets and α-helices, and how these arrangements became globular. Ancient forms of aminoacyl-tRNA synthetase (aaRS) catalytic domains and ancient non-ribosomal protein synthetase (NRPS) modules gave rise to primordial protein synthesis and the ability to generate a code for specificity in their active sites. These structures diversified producing cofactor-binding molecular switches and barrel structures. Accretion of domains and molecules gave rise to modern aaRSs, NRPS, and ribosomal ensembles, first organized around novel emerging cofactors (tRNA and carrier proteins) and then more complex cofactor structures (rRNA). The model explains how the generation of protein structures acted as scaffold for nucleic acids and resulted in crystallization of modern translation.
Cuff, Alison L.; Sillitoe, Ian; Lewis, Tony; Clegg, Andrew B.; Rentzsch, Robert; Furnham, Nicholas; Pellegrini-Calace, Marialuisa; Jones, David; Thornton, Janet; Orengo, Christine A.
2011-01-01
CATH version 3.3 (class, architecture, topology, homology) contains 128 688 domains, 2386 homologous superfamilies and 1233 fold groups, and reflects a major focus on classifying structural genomics (SG) structures and transmembrane proteins, both of which are likely to add structural novelty to the database and therefore increase the coverage of protein fold space within CATH. For CATH version 3.4 we have significantly improved the presentation of sequence information and associated functional information for CATH superfamilies. The CATH superfamily pages now reflect both the functional and structural diversity within the superfamily and include structural alignments of close and distant relatives within the superfamily, annotated with functional information and details of conserved residues. A significantly more efficient search function for CATH has been established by implementing the search server Solr (http://lucene.apache.org/solr/). The CATH v3.4 webpages have been built using the Catalyst web framework. PMID:21097779
Bartho, Joseph D.; Bellini, Dom; Wuerges, Jochen; Demitri, Nicola; Toccafondi, Mirco; Schmitt, Armin O.; Zhao, Youfu; Walsh, Martin A.
2017-01-01
AmyR is a stress and virulence associated protein from the plant pathogenic Enterobacteriaceae species Erwinia amylovora, and is a functionally conserved ortholog of YbjN from Escherichia coli. The crystal structure of E. amylovora AmyR reveals a class I type III secretion chaperone-like fold, despite the lack of sequence similarity between these two classes of protein and lacking any evidence of a secretion-associated role. The results indicate that AmyR, and YbjN proteins in general, function through protein-protein interactions without any enzymatic action. The YbjN proteins of Enterobacteriaceae show remarkably low sequence similarity with other members of the YbjN protein family in Eubacteria, yet a high level of structural conservation is observed. Across the YbjN protein family sequence conservation is limited to residues stabilising the protein core and dimerization interface, while interacting regions are only conserved between closely related species. This study presents the first structure of a YbjN protein from Enterobacteriaceae, the most highly divergent and well-studied subgroup of YbjN proteins, and an in-depth sequence and structural analysis of this important but poorly understood protein family. PMID:28426806
Bartho, Joseph D; Bellini, Dom; Wuerges, Jochen; Demitri, Nicola; Toccafondi, Mirco; Schmitt, Armin O; Zhao, Youfu; Walsh, Martin A; Benini, Stefano
2017-01-01
AmyR is a stress and virulence associated protein from the plant pathogenic Enterobacteriaceae species Erwinia amylovora, and is a functionally conserved ortholog of YbjN from Escherichia coli. The crystal structure of E. amylovora AmyR reveals a class I type III secretion chaperone-like fold, despite the lack of sequence similarity between these two classes of protein and lacking any evidence of a secretion-associated role. The results indicate that AmyR, and YbjN proteins in general, function through protein-protein interactions without any enzymatic action. The YbjN proteins of Enterobacteriaceae show remarkably low sequence similarity with other members of the YbjN protein family in Eubacteria, yet a high level of structural conservation is observed. Across the YbjN protein family sequence conservation is limited to residues stabilising the protein core and dimerization interface, while interacting regions are only conserved between closely related species. This study presents the first structure of a YbjN protein from Enterobacteriaceae, the most highly divergent and well-studied subgroup of YbjN proteins, and an in-depth sequence and structural analysis of this important but poorly understood protein family.
Crystal growth of enzymes in low gravity (L-5)
NASA Technical Reports Server (NTRS)
Morita, Yuhei
1993-01-01
Recent developments in protein engineering have expanded the possibilities of studies of enzymes and other proteins. Now such studies are not limited to the elucidation of the relationship between the structure and function of the protein. They also aim at the production of proteins with new and practical functions, based on results obtained during investigation of structure and function. For continuing research in this field, investigation of the tertiary structure of proteins is important. X-ray diffraction of single crystals of protein is usually used for this purpose. The main difficulty is the preparation of the crystals. The theme of the research is to prepare such crystals at very low gravity, with the main purpose being to obtain large single crystals of proteins suitable for x-ray diffraction studies.
Protein sectors: evolutionary units of three-dimensional structure
Halabi, Najeeb; Rivoire, Olivier; Leibler, Stanislas; Ranganathan, Rama
2011-01-01
Proteins display a hierarchy of structural features at primary, secondary, tertiary, and higher-order levels, an organization that guides our current understanding of their biological properties and evolutionary origins. Here, we reveal a structural organization distinct from this traditional hierarchy by statistical analysis of correlated evolution between amino acids. Applied to the S1A serine proteases, the analysis indicates a decomposition of the protein into three quasi-independent groups of correlated amino acids that we term “protein sectors”. Each sector is physically connected in the tertiary structure, has a distinct functional role, and constitutes an independent mode of sequence divergence in the protein family. Functionally relevant sectors are evident in other protein families as well, suggesting that they may be general features of proteins. We propose that sectors represent a structural organization of proteins that reflects their evolutionary histories. PMID:19703402
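To make "correlated evolution between amino acids" concrete, the sketch below scores pairs of alignment columns by mutual information. This is a simpler stand-in and not the statistical analysis used in the paper; the function name `mutual_information` and the toy alignment in the usage note are assumptions for illustration only.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length sequences. High values mean the
    residues at the two positions tend to vary together across sequences.
    """
    n = len(alignment)
    counts_i = Counter(seq[i] for seq in alignment)
    counts_j = Counter(seq[j] for seq in alignment)
    counts_ij = Counter((seq[i], seq[j]) for seq in alignment)

    mi = 0.0
    for (a, b), n_ab in counts_ij.items():
        p_ab = n_ab / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi
```

For the toy alignment `["AKL", "AKL", "GEL", "GEL"]`, columns 0 and 1 always change together and score 1 bit, while the invariant column 2 scores 0 against either. Groups of mutually high-scoring positions are the kind of signal that a sector-style decomposition would then cluster.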
Structural and evolutionary analysis of Leishmania Alba proteins.
da Costa, Kauê Santana; Galúcio, João Marcos Pereira; Leonardo, Elvis Santos; Cardoso, Guelber; Leal, Élcio; Conde, Guilherme; Lameira, Jerônimo
2017-10-01
The Alba superfamily proteins share a common RNA-binding domain. These proteins participate in a variety of regulatory pathways by controlling developmental gene expression. They also interact with ribosomal subunits, translation factors, and other RNA-binding proteins. The Leishmania infantum genome encodes two Alba-domain proteins, LiAlba1 and LiAlba3. In this work, we used homology modeling, protein-protein docking, and molecular dynamics (MD) simulations to explore the details of the Alba1-Alba3-RNA complex from Leishmania infantum at the molecular level. In addition, we compared the structure of LiAlba3 with the human ribonuclease P component, Rpp20. We also mapped the ligand-binding residues on the Alba3 surface to analyze its druggability and performed mutational analyses in Alba3 using alanine scanning to identify residues involved in its function and structural stability. These results suggest that the RGG-box motif of LiAlba1 is important for protein function and stability. Finally, we discuss the function of Alba proteins in the context of pathogen adaptation to host cells. The data provided herein will facilitate further translational research regarding Alba structure and function. Copyright © 2017 Elsevier B.V. All rights reserved.
The Classification of Protein Domains.
Dawson, Natalie; Sillitoe, Ian; Marsden, Russell L; Orengo, Christine A
2017-01-01
The significant expansion in protein sequence and structure data that we are now witnessing brings with it a pressing need to bring order to the protein world. Such order enables us to gain insights into the evolution of proteins, their function and the extent to which the functional repertoire can vary across the three kingdoms of life. This has led to the creation of a wide range of protein family classifications that aim to group proteins based upon their evolutionary relationships. In this chapter we discuss the approaches and methods that are frequently used in the classification of proteins, with a specific emphasis on the classification of protein domains. The construction of both domain sequence and domain structure databases is considered and we show how the use of domain family annotations to assign structural and functional information is enhancing our understanding of genomes.
Wallace, A. C.; Borkakoti, N.; Thornton, J. M.
1997-01-01
It is well established that sequence templates such as those in the PROSITE and PRINTS databases are powerful tools for predicting the biological function and tertiary structure for newly derived protein sequences. The number of X-ray and NMR protein structures is increasing rapidly and it is apparent that a 3D equivalent of the sequence templates is needed. Here, we describe an algorithm called TESS that automatically derives 3D templates from structures deposited in the Brookhaven Protein Data Bank. While a new sequence can be searched for sequence patterns, a new structure can be scanned against these 3D templates to identify functional sites. As examples, 3D templates are derived for enzymes with an O-His-O "catalytic triad" and for the ribonucleases and lysozymes. When these 3D templates are applied to a large data set of nonidentical proteins, several interesting hits are located. This suggests that the development of a 3D template database may help to identify the function of new protein structures, if unknown, as well as to design proteins with specific functions. PMID:9385633
Whitmire, Jeannette M; Merrell, D Scott
2017-01-01
Mutagenesis is a valuable tool to examine the structure-function relationships of bacterial proteins. As such, a wide variety of mutagenesis techniques and strategies have been developed. This chapter details a selection of random mutagenesis methods and site-directed mutagenesis procedures that can be applied to an array of bacterial species. Additionally, the direct application of the techniques to study the Helicobacter pylori Ferric Uptake Regulator (Fur) protein is described. The varied approaches illustrated herein allow the robust investigation of the structural-functional relationships within a protein of interest.
Protein Assembly and Building Blocks: Beyond the Limits of the LEGO Brick Metaphor.
Levy, Yaakov
2017-09-26
Proteins, like other biomolecules, have a modular and hierarchical structure. Various building blocks are used to construct proteins of high structural complexity and diverse functionality. In multidomain proteins, for example, domains are fused to each other in different combinations to achieve different functions. Although the LEGO brick metaphor is justified as a means of simplifying the complexity of three-dimensional protein structures, several fundamental properties (such as allostery or the induced-fit mechanism) make deviation from it necessary to respect the plasticity, softness, and cross-talk that are essential to protein function. In this work, we illustrate recently reported protein behavior in multidomain proteins that deviates from the LEGO brick analogy. While earlier studies showed that a protein domain is often unaffected by being fused to another domain or becomes more stable following the formation of a new interface between the tethered domains, destabilization due to tethering has been reported for several systems. We illustrate that tethering may sometimes result in a multidomain protein behaving as "less than the sum of its parts". We survey these cases for which structure additivity does not guarantee thermodynamic additivity. Protein destabilization due to fusion to other domains may be linked in some cases to biological function and should be taken into account when designing large assemblies.
Xie, Hongbo; Vucetic, Slobodan; Iakoucheva, Lilia M.; Oldfield, Christopher J.; Dunker, A. Keith; Uversky, Vladimir N.; Obradovic, Zoran
2008-01-01
Identifying relationships between function, amino acid sequence and protein structure represents a major challenge. In this study we propose a bioinformatics approach that identifies functional keywords in the Swiss-Prot database that correlate with intrinsic disorder. A statistical evaluation is employed to rank the significance of these correlations. Protein sequence data redundancy and the relationship between protein length and protein structure were taken into consideration to ensure the quality of the statistical inferences. Over 200,000 proteins from Swiss-Prot database were analyzed using this approach. The predictions of intrinsic disorder were carried out using PONDR VL3E predictor of long disordered regions that achieves an accuracy of above 86%. Overall, out of the 710 Swiss-Prot functional keywords that were each associated with at least 20 proteins, 238 were found to be strongly positively correlated with predicted long intrinsically disordered regions, whereas 302 were strongly negatively correlated with such regions. The remaining 170 keywords were ambiguous without strong positive or negative correlation with the disorder predictions. These functions cover a large variety of biological activities and imply that disordered regions are characterized by a wide functional repertoire. Our results agree well with literature findings, as we were able to find at least one illustrative example of functional disorder or order shown experimentally for the vast majority of keywords showing the strongest positive or negative correlation with intrinsic disorder. This work opens a series of three papers, which enriches the current view of protein structure-function relationships, especially with regards to functionalities of intrinsically disordered proteins and provides researchers with a novel tool that could be used to improve the understanding of the relationships between protein structure and function. 
The first paper of the series describes our statistical approach, outlines the major findings and provides illustrative examples of biological processes and functions positively and negatively correlated with intrinsic disorder. PMID:17391014
Xie, Hongbo; Vucetic, Slobodan; Iakoucheva, Lilia M; Oldfield, Christopher J; Dunker, A Keith; Uversky, Vladimir N; Obradovic, Zoran
2007-05-01
Identifying relationships between function, amino acid sequence, and protein structure represents a major challenge. In this study, we propose a bioinformatics approach that identifies functional keywords in the Swiss-Prot database that correlate with intrinsic disorder. A statistical evaluation is employed to rank the significance of these correlations. Protein sequence data redundancy and the relationship between protein length and protein structure were taken into consideration to ensure the quality of the statistical inferences. Over 200,000 proteins from the Swiss-Prot database were analyzed using this approach. The predictions of intrinsic disorder were carried out using PONDR VL3E predictor of long disordered regions that achieves an accuracy of above 86%. Overall, out of the 710 Swiss-Prot functional keywords that were each associated with at least 20 proteins, 238 were found to be strongly positively correlated with predicted long intrinsically disordered regions, whereas 302 were strongly negatively correlated with such regions. The remaining 170 keywords were ambiguous without strong positive or negative correlation with the disorder predictions. These functions cover a large variety of biological activities and imply that disordered regions are characterized by a wide functional repertoire. Our results agree well with literature findings, as we were able to find at least one illustrative example of functional disorder or order shown experimentally for the vast majority of keywords showing the strongest positive or negative correlation with intrinsic disorder. This work opens a series of three papers, which enriches the current view of protein structure-function relationships, especially with regards to functionalities of intrinsically disordered proteins, and provides researchers with a novel tool that could be used to improve the understanding of the relationships between protein structure and function. 
The first paper of the series describes our statistical approach, outlines the major findings, and provides illustrative examples of biological processes and functions positively and negatively correlated with intrinsic disorder.
Lipid nanotechnologies for structural studies of membrane-associated proteins.
Stoilova-McPhie, Svetla; Grushin, Kirill; Dalm, Daniela; Miller, Jaimy
2014-11-01
We present a methodology based on lipid nanotube (LNT) and nanodisk technologies optimized in our laboratory for structural studies of membrane-associated proteins at close to physiological conditions. The application of these lipid nanotechnologies for structure determination by cryo-electron microscopy (cryo-EM) is fundamental for understanding and modulating their function. The LNTs in our studies are single bilayer galactosylceramide based nanotubes of ∼20 nm inner diameter and a few microns in length, that self-assemble in aqueous solutions. The lipid nanodisks (NDs) are self-assembled discoid lipid bilayers of ∼10 nm diameter, which are stabilized in aqueous solutions by a belt of amphipathic helical scaffold proteins. By combining LNT and ND technologies, we can examine structurally how the membrane curvature and lipid composition modulate the function of the membrane-associated proteins. As proof of principle, we have engineered these lipid nanotechnologies to mimic the activated platelet's phosphatidylserine rich membrane and have successfully assembled functional membrane-bound coagulation factor VIII in vitro for structure determination by cryo-EM. The macromolecular organization of the proteins bound to ND and LNT is further defined by fitting the known atomic structures within the calculated three-dimensional maps. The combination of LNT and ND technologies offers a means to control the design and assembly of a wide range of functional membrane-associated proteins and complexes for structural studies by cryo-EM. The presented results confirm the suitability of the developed methodology for studying the functional structure of membrane-associated proteins, such as the coagulation factors, at a close to physiological environment. © 2014 Wiley Periodicals, Inc.
Roles of water in protein structure and function studied by molecular liquid theory.
Imai, Takashi
2009-01-01
The roles of water in the structure and function of proteins have not been completely elucidated. Although molecular simulation has been widely used for the investigation of protein structure and function, it is not always useful for elucidating the roles of water because the effect of water ranges from atomic to thermodynamic level. The three-dimensional reference interaction site model (3D-RISM) theory, which is a statistical-mechanical theory of molecular liquids, can yield the solvation structure at the atomic level and calculate the thermodynamic quantities from the intermolecular potentials. In the last few years, the author and coworkers have succeeded in applying the 3D-RISM theory to protein aqueous solution systems and demonstrated that the theory is useful for investigating the roles of water. This article reviews some of the recent applications and findings, which are concerned with molecular recognition by protein, protein folding, and the partial molar volume of protein which is related to the pressure effect on protein.
PASS2: an automated database of protein alignments organised as structural superfamilies.
Bhaduri, Anirban; Pugalenthi, Ganesan; Sowdhamini, Ramanathan
2004-04-02
The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated, structural alignments for evolutionarily related, sequentially distant proteins. An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members both based on sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members. The search engine has been improved for simpler browsing of the database. The database resolves alignments among the structural domains consisting of an evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permit better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html
The BioPlex Network: A Systematic Exploration of the Human Interactome.
Huttlin, Edward L; Ting, Lily; Bruckner, Raphael J; Gebreab, Fana; Gygi, Melanie P; Szpyt, John; Tam, Stanley; Zarraga, Gabriela; Colby, Greg; Baltier, Kurt; Dong, Rui; Guarani, Virginia; Vaites, Laura Pontano; Ordureau, Alban; Rad, Ramin; Erickson, Brian K; Wühr, Martin; Chick, Joel; Zhai, Bo; Kolippakkam, Deepak; Mintseris, Julian; Obar, Robert A; Harris, Tim; Artavanis-Tsakonas, Spyros; Sowa, Mathew E; De Camilli, Pietro; Paulo, Joao A; Harper, J Wade; Gygi, Steven P
2015-07-16
Protein interactions form a network whose structure drives cellular function and whose organization informs biological inquiry. Using high-throughput affinity-purification mass spectrometry, we identify interacting partners for 2,594 human proteins in HEK293T cells. The resulting network (BioPlex) contains 23,744 interactions among 7,668 proteins with 86% previously undocumented. BioPlex accurately depicts known complexes, attaining 80%-100% coverage for most CORUM complexes. The network readily subdivides into communities that correspond to complexes or clusters of functionally related proteins. More generally, network architecture reflects cellular localization, biological process, and molecular function, enabling functional characterization of thousands of proteins. Network structure also reveals associations among thousands of protein domains, suggesting a basis for examining structurally related proteins. Finally, BioPlex, in combination with other approaches, can be used to reveal interactions of biological or clinical significance. For example, mutations in the membrane protein VAPB implicated in familial amyotrophic lateral sclerosis perturb a defined community of interactors. Copyright © 2015 Elsevier Inc. All rights reserved.
The BioPlex Network: A Systematic Exploration of the Human Interactome
Huttlin, Edward L.; Ting, Lily; Bruckner, Raphael J.; Gebreab, Fana; Gygi, Melanie P.; Szpyt, John; Tam, Stanley; Zarraga, Gabriela; Colby, Greg; Baltier, Kurt; Dong, Rui; Guarani, Virginia; Vaites, Laura Pontano; Ordureau, Alban; Rad, Ramin; Erickson, Brian K.; Wühr, Martin; Chick, Joel; Zhai, Bo; Kolippakkam, Deepak; Mintseris, Julian; Obar, Robert A.; Harris, Tim; Artavanis-Tsakonas, Spyros; Sowa, Mathew E.; DeCamilli, Pietro; Paulo, Joao A.; Harper, J. Wade; Gygi, Steven P.
2015-01-01
SUMMARY Protein interactions form a network whose structure drives cellular function and whose organization informs biological inquiry. Using high-throughput affinity-purification mass spectrometry, we identify interacting partners for 2,594 human proteins in HEK293T cells. The resulting network (BioPlex) contains 23,744 interactions among 7,668 proteins with 86% previously undocumented. BioPlex accurately depicts known complexes, attaining 80-100% coverage for most CORUM complexes. The network readily subdivides into communities that correspond to complexes or clusters of functionally related proteins. More generally, network architecture reflects cellular localization, biological process, and molecular function, enabling functional characterization of thousands of proteins. Network structure also reveals associations among thousands of protein domains, suggesting a basis for examining structurally-related proteins. Finally, BioPlex, in combination with other approaches can be used to reveal interactions of biological or clinical significance. For example, mutations in the membrane protein VAPB implicated in familial Amyotrophic Lateral Sclerosis perturb a defined community of interactors. PMID:26186194
Martin, Juliette; Regad, Leslie; Etchebest, Catherine; Camproux, Anne-Claude
2008-11-15
Interresidue contacts in protein structures and at protein-protein interfaces are classically described by the amino acid types of interacting residues, and the local structural context of the contact, if any, is described using secondary structures. In this study, we present an alternate analysis of interresidue contacts using local structures defined by the structural alphabet introduced by Camproux et al. This structural alphabet allows a 3D structure to be described as a sequence of prototype fragments called structural letters, of 27 different types. Each residue can then be assigned to a particular local structure, even in loop regions. The analysis of interresidue contacts within protein structures defined using Voronoï tessellations reveals that pairwise contact specificity is greater in terms of structural letters than amino acids. Using a simple heuristic based on specificity score comparison, we find that 74% of the long-range contacts within protein structures are better described using structural letters than amino acid types. The investigation is extended to a set of protein-protein complexes, showing that similar global rules apply as for intraprotein contacts, with 64% of the interprotein contacts best described by local structures. We then present an evaluation of pairing functions integrating structural letters to decoy scoring and show that some complexes could benefit from the use of structural letter-based pairing functions.
Computational approaches for rational design of proteins with novel functionalities
Tiwari, Manish Kumar; Singh, Ranjitha; Singh, Raushan Kumar; Kim, In-Won; Lee, Jung-Kul
2012-01-01
Proteins are the most multifaceted macromolecules in living systems and have various important functions, including structural, catalytic, sensory, and regulatory functions. Rational design of enzymes is a great challenge to our understanding of protein structure and physical chemistry and has numerous potential applications. Protein design algorithms have been applied to design or engineer proteins that fold, fold faster, catalyze, catalyze faster, signal, and adopt preferred conformational states. The field of de novo protein design, although only a few decades old, is beginning to produce exciting results. Developments in this field are already having a significant impact on biotechnology and chemical biology. The application of powerful computational methods to functional protein design has recently succeeded at engineering target activities. Here, we review recently reported de novo functional proteins that were developed using various protein design approaches, including rational design, computational optimization, and selection from combinatorial libraries, highlighting recent advances and successes. PMID:24688643
The use of experimental structures to model protein dynamics.
Katebi, Ataur R; Sankar, Kannan; Jia, Kejue; Jernigan, Robert L
2015-01-01
The number of solved protein structures submitted to the Protein Data Bank (PDB) has increased dramatically in recent years. For some specific proteins, this number is very high; for example, there are over 550 solved structures for HIV-1 protease, a protein that is essential for the life cycle of human immunodeficiency virus (HIV), which causes acquired immunodeficiency syndrome (AIDS) in humans. The large number of structures for the same protein and its variants includes a sample of different conformational states of the protein. A rich set of structures solved experimentally for the same protein has information buried within the dataset that can explain the functional dynamics and structural mechanism of the protein. To extract the dynamics information and functional mechanism from the experimental structures, this chapter focuses on two methods: Principal Component Analysis (PCA) and Elastic Network Models (ENM). PCA is a widely used statistical dimensionality reduction technique to classify and visualize high-dimensional data. ENMs, on the other hand, are a well-established, simple biophysical method for modeling the functionally important global motions of proteins. This chapter covers the basics of these two. Moreover, an improved ENM version that utilizes the variations found within a given set of structures for a protein is described. As a practical example, we have extracted the functional dynamics and mechanism of the HIV-1 protease dimeric structure by using a set of 329 PDB structures of this protein. We have described, step by step, how to select a set of protein structures, how to extract the needed information from the PDB files for PCA, how to extract the dynamics information using PCA, how to calculate ENM modes, how to measure the congruency between the dynamics computed from the principal components (PCs) and the ENM modes, and how to compute entropies using the PCs. 
We provide the computer programs or references to software tools to accomplish each step and show how to use these programs and tools. We also include computer programs to generate movies based on PCs and ENM modes and describe how to visualize them.
The Use of Experimental Structures to Model Protein Dynamics
Katebi, Ataur R.; Sankar, Kannan; Jia, Kejue; Jernigan, Robert L.
2014-01-01
Summary The number of solved protein structures submitted to the Protein Data Bank (PDB) has increased dramatically in recent years. For some specific proteins, this number is very high – for example, there are over 550 solved structures for HIV-1 protease, a protein that is essential for the life cycle of human immunodeficiency virus (HIV), which causes acquired immunodeficiency syndrome (AIDS) in humans. The large number of structures for the same protein and its variants includes a sample of different conformational states of the protein. A rich set of structures solved experimentally for the same protein has information buried within the dataset that can explain the functional dynamics and structural mechanism of the protein. To extract the dynamics information and functional mechanism from the experimental structures, this chapter focuses on two methods – Principal Component Analysis (PCA) and Elastic Network Models (ENM). PCA is a widely used statistical dimensionality reduction technique to classify and visualize high-dimensional data. ENMs, on the other hand, are a well-established, simple biophysical method for modeling the functionally important global motions of proteins. This chapter covers the basics of these two. Moreover, an improved ENM version that utilizes the variations found within a given set of structures for a protein is described. As a practical example, we have extracted the functional dynamics and mechanism of the HIV-1 protease dimeric structure by using a set of 329 PDB structures of this protein. We have described, step by step, how to select a set of protein structures, how to extract the needed information from the PDB files for PCA, how to extract the dynamics information using PCA, how to calculate ENM modes, how to measure the congruency between the dynamics computed from the principal components (PCs) and the ENM modes, and how to compute entropies using the PCs. 
We provide the computer programs or references to software tools to accomplish each step and show how to use these programs and tools. We also include computer programs to generate movies based on PCs and ENM modes and describe how to visualize them. PMID:25330965
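The PCA and congruency steps listed in this abstract can be illustrated with a small, self-contained sketch. It is not the chapter's software: the coordinates are invented toy values, the structures are assumed to be already superposed and flattened into coordinate vectors, the first PC is found by plain power iteration, and "congruency" is taken here to be the absolute cosine (overlap) between a PC and a mode vector, which is a common choice but an assumption about the chapter's measure. The vector `enm_mode` is a hypothetical stand-in for a mode that a real ENM calculation would supply.

```python
import math

def center(frames):
    """Subtract the mean structure from each flattened coordinate vector."""
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[i] for f in frames) / n for i in range(dim)]
    return [[f[i] - mean[i] for i in range(dim)] for f in frames]

def covariance(centered):
    """Sample covariance matrix of the centered coordinates."""
    n, dim = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
            for i in range(dim)]

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def first_pc(cov, iterations=200):
    """Dominant eigenvector and eigenvalue of cov by power iteration."""
    dim = len(cov)
    v = normalize([float(i + 1) for i in range(dim)])  # arbitrary start vector
    for _ in range(iterations):
        v = normalize([sum(cov[i][j] * v[j] for j in range(dim))
                       for i in range(dim)])
    cv = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
    eigenvalue = sum(v[i] * cv[i] for i in range(dim))  # Rayleigh quotient
    return v, eigenvalue

def overlap(a, b):
    """Absolute cosine between two displacement vectors (1 = same direction)."""
    a, b = normalize(a), normalize(b)
    return abs(sum(x * y for x, y in zip(a, b)))

# Toy ensemble: four "structures" of one atom, varying along the x = y diagonal.
frames = [[0.0, 0.0, 0.0], [1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [3.0, 3.0, 0.0]]
pc1, variance = first_pc(covariance(center(frames)))

enm_mode = [1.0, 1.0, 0.0]  # hypothetical mode vector, for illustration only
print(round(variance, 3), round(overlap(pc1, enm_mode), 3))
```

For real ensembles, an eigendecomposition from a numerical library would replace the power iteration, and every PC of interest would be compared against every low-frequency mode.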
Protein-protein structure prediction by scoring molecular dynamics trajectories of putative poses.
Sarti, Edoardo; Gladich, Ivan; Zamuner, Stefano; Correia, Bruno E; Laio, Alessandro
2016-09-01
The prediction of protein-protein interactions and their structural configuration remains a largely unsolved problem. Most of the algorithms aimed at finding the native conformation of a protein complex starting from the structure of its monomers are based on searching the structure corresponding to the global minimum of a suitable scoring function. However, protein complexes are often highly flexible, with mobile side chains and transient contacts due to thermal fluctuations. Flexibility can be neglected if one aims at finding quickly the approximate structure of the native complex, but may play a role in structure refinement, and in discriminating solutions characterized by similar scores. We here benchmark the capability of some state-of-the-art scoring functions (BACH-SixthSense, PIE/PISA and Rosetta) in discriminating finite-temperature ensembles of structures corresponding to the native state and to non-native configurations. We produce the ensembles by running thousands of molecular dynamics simulations in explicit solvent starting from poses generated by rigid docking and optimized in vacuum. We find that while Rosetta outperformed the other two scoring functions in scoring the structures in vacuum, BACH-SixthSense and PIE/PISA perform better in distinguishing near-native ensembles of structures generated by molecular dynamics in explicit solvent. Proteins 2016; 84:1312-1320. © 2016 Wiley Periodicals, Inc.
An Interactive Introduction to Protein Structure
ERIC Educational Resources Information Center
Lee, W. Theodore
2004-01-01
To improve student understanding of protein structure and the significance of noncovalent interactions in protein structure and function, students are assigned a project to write a paper complemented with computer-generated images. The assignment provides an opportunity for students to select a protein structure that is of interest and detail…
Efficient protein structure search using indexing methods
2013-01-01
Understanding functions of proteins is one of the most important challenges in many studies of biological processes. The function of a protein can be predicted by analyzing the functions of structurally similar proteins, thus finding structurally similar proteins accurately and efficiently from a large set of proteins is crucial. A protein structure can be represented as a vector by 3D-Zernike Descriptor (3DZD) which compactly represents the surface shape of the protein tertiary structure. This simplified representation accelerates the searching process. However, computing the similarity of two protein structures is still computationally expensive, thus it is hard to efficiently process many simultaneous requests of structurally similar protein search. This paper proposes indexing techniques which substantially reduce the search time to find structurally similar proteins. In particular, we first exploit two indexing techniques, i.e., iDistance and iKernel, on the 3DZDs. After that, we extend the techniques to further improve the search speed for protein structures. The extended indexing techniques build and utilize a reduced index constructed from the first few attributes of 3DZDs of protein structures. To retrieve top-k similar structures, top-10 × k similar structures are first found using the reduced index, and top-k structures are selected among them. We also modify the indexing techniques to support θ-based nearest neighbor search, which returns data points closer than θ to the query point. The results show that both iDistance and iKernel significantly enhance the searching speed. In top-k nearest neighbor search, the searching time is reduced by 69.6%, 77%, 77.4% and 87.9%, respectively, using iDistance, iKernel, the extended iDistance, and the extended iKernel. In θ-based nearest neighbor search, the searching time is reduced by 80%, 81%, 95.6% and 95.6% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively. PMID:23691543
Efficient protein structure search using indexing methods.
Kim, Sungchul; Sael, Lee; Yu, Hwanjo
2013-01-01
Understanding functions of proteins is one of the most important challenges in many studies of biological processes. The function of a protein can be predicted by analyzing the functions of structurally similar proteins, thus finding structurally similar proteins accurately and efficiently from a large set of proteins is crucial. A protein structure can be represented as a vector by 3D-Zernike Descriptor (3DZD) which compactly represents the surface shape of the protein tertiary structure. This simplified representation accelerates the searching process. However, computing the similarity of two protein structures is still computationally expensive, thus it is hard to efficiently process many simultaneous requests of structurally similar protein search. This paper proposes indexing techniques which substantially reduce the search time to find structurally similar proteins. In particular, we first exploit two indexing techniques, i.e., iDistance and iKernel, on the 3DZDs. After that, we extend the techniques to further improve the search speed for protein structures. The extended indexing techniques build and utilize a reduced index constructed from the first few attributes of 3DZDs of protein structures. To retrieve top-k similar structures, top-10 × k similar structures are first found using the reduced index, and top-k structures are selected among them. We also modify the indexing techniques to support θ-based nearest neighbor search, which returns data points closer than θ to the query point. The results show that both iDistance and iKernel significantly enhance the searching speed. In top-k nearest neighbor search, the searching time is reduced by 69.6%, 77%, 77.4% and 87.9%, respectively, using iDistance, iKernel, the extended iDistance, and the extended iKernel. In θ-based nearest neighbor search, the searching time is reduced by 80%, 81%, 95.6% and 95.6% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively.
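The two-stage retrieval described above (find the top-10 × k candidates using only the first few descriptor attributes, then select the top-k among them with the full descriptor) can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean distance is assumed as the dissimilarity measure, the candidate stage is a plain linear scan rather than an iDistance or iKernel index, and the function names, the `factor` parameter, and the toy vectors are hypothetical.

```python
import heapq
import math

def distance(a, b, dims=None):
    """Euclidean distance over the first `dims` attributes (all if None)."""
    if dims is None:
        dims = len(a)
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(dims)))

def top_k_two_stage(query, database, k, reduced_dims, factor=10):
    """Return indices of the k entries judged closest to `query`.

    Stage 1 keeps the factor * k nearest entries measured on the first
    `reduced_dims` attributes only; stage 2 re-ranks those candidates
    with the full descriptor.
    """
    candidates = heapq.nsmallest(
        factor * k, range(len(database)),
        key=lambda i: distance(query, database[i], reduced_dims))
    return heapq.nsmallest(
        k, candidates, key=lambda i: distance(query, database[i]))

# Toy 4-attribute descriptors (real 3DZDs are much longer vectors).
database = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 5.0, 5.0],
    [3.0, 3.0, 3.0, 3.0],
]
query = [0.0, 0.0, 0.0, 0.1]
print(top_k_two_stage(query, database, k=2, reduced_dims=2, factor=2))
```

Because the distance over a prefix of the attributes never exceeds the full distance, the reduced stage is cheap and never overestimates, but the candidate set is not guaranteed to contain every true nearest neighbor, so the result is approximate unless the candidate factor is large enough.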
RNA helicase proteins as chaperones and remodelers
Jarmoskaite, Inga; Russell, Rick
2014-01-01
Superfamily 2 helicase proteins are ubiquitous in RNA biology and have an extraordinarily broad set of functional roles. Central among these roles are to promote rearrangements of structured RNAs and to remodel RNA-protein complexes (RNPs), allowing formation of native RNA structure or progression through a functional cycle of structures. While all superfamily 2 helicases share a conserved helicase core, they are divided evolutionarily into several families, and it is principally proteins from three families, the DEAD-box, DEAH/RHA and Ski2-like families, that function to manipulate structured RNAs and RNPs. Strikingly, there are emerging differences in the mechanisms of these proteins, both between families and within the largest family (DEAD-box), and these differences appear to be tuned to their RNA or RNP substrates and their specific roles. This review outlines basic mechanistic features of the three families and surveys individual proteins and the current understanding of their biological substrates and mechanisms. PMID:24635478
Ghosh, Pritha; Mathew, Oommen K; Sowdhamini, Ramanathan
2016-10-07
RNA-binding proteins (RBPs) interact with their cognate RNA(s) to form large biomolecular assemblies. They are versatile in their functionality and are involved in a myriad of processes inside the cell. RBPs with similar structural features and common biological functions are grouped together into families and superfamilies. It will be useful to obtain an early understanding and association of the RNA-binding property of sequences of gene products. Here, we report a web server, RStrucFam, to predict the structure, type of cognate RNA(s) and function(s) of proteins, where possible, from mere sequence information. The web server employs Hidden Markov Model scan (hmmscan) to enable association to a back-end database of structural and sequence families. The database (HMMRBP) comprises 437 HMMs of RBP families of known structure that have been generated using structure-based sequence alignments and 746 sequence-centric RBP family HMMs. The input protein sequence is associated with structural or sequence domain families, if structure or sequence signatures exist. If the protein is associated with a family of known structures, output features such as a multiple structure-based sequence alignment (MSSA) of the query with all other members of that family are provided. Further, cognate RNA partner(s) for that protein, Gene Ontology (GO) annotations, if any, and a homology model of the protein can be obtained. The users can also browse through the database for details pertaining to each family, protein or RNA and their related information based on keyword search or RNA motif search. RStrucFam is a web server that exploits structurally conserved features of RBPs, derived from known family members and imprinted in mathematical profiles, to predict putative RBPs from sequence information. Proteins that fail to associate with such structure-centric families are further queried against the sequence-centric RBP family HMMs in the HMMRBP database. 
Further, all other essential information pertaining to an RBP, such as overall function annotations, is provided. The web server can be accessed at the following link: http://caps.ncbs.res.in/rstrucfam.
Membrane proteins bind lipids selectively to modulate their structure and function.
Laganowsky, Arthur; Reading, Eamonn; Allison, Timothy M; Ulmschneider, Martin B; Degiacomi, Matteo T; Baldwin, Andrew J; Robinson, Carol V
2014-06-05
Previous studies have established that the folding, structure and function of membrane proteins are influenced by their lipid environments and that lipids can bind to specific sites, for example, in potassium channels. Fundamental questions remain however regarding the extent of membrane protein selectivity towards lipids. Here we report a mass spectrometry approach designed to determine the selectivity of lipid binding to membrane protein complexes. We investigate the mechanosensitive channel of large conductance (MscL) from Mycobacterium tuberculosis and aquaporin Z (AqpZ) and the ammonia channel (AmtB) from Escherichia coli, using ion mobility mass spectrometry (IM-MS), which reports gas-phase collision cross-sections. We demonstrate that folded conformations of membrane protein complexes can exist in the gas phase. By resolving lipid-bound states, we then rank bound lipids on the basis of their ability to resist gas phase unfolding and thereby stabilize membrane protein structure. Lipids bind non-selectively and with high avidity to MscL, all imparting comparable stability; however, the highest-ranking lipid is phosphatidylinositol phosphate, in line with its proposed functional role in mechanosensation. AqpZ is also stabilized by many lipids, with cardiolipin imparting the most significant resistance to unfolding. Subsequently, through functional assays we show that cardiolipin modulates AqpZ function. Similar experiments identify AmtB as being highly selective for phosphatidylglycerol, prompting us to obtain an X-ray structure in this lipid membrane-like environment. The 2.3 Å resolution structure, when compared with others obtained without lipid bound, reveals distinct conformational changes that re-position AmtB residues to interact with the lipid bilayer. 
Our results demonstrate that resistance to unfolding correlates with specific lipid-binding events, enabling a distinction to be made between lipids that merely bind and those that modulate membrane protein structure and/or function. We anticipate that these findings will be important not only for defining the selectivity of membrane proteins towards lipids, but also for understanding the role of lipids in modulating protein function or drug binding.
Hidden relationships between metalloproteins unveiled by structural comparison of their metal sites
NASA Astrophysics Data System (ADS)
Valasatava, Yana; Andreini, Claudia; Rosato, Antonio
2015-03-01
Metalloproteins account for a substantial fraction of all proteins. They incorporate metal atoms, which are required for their structure and/or function. Here we describe a new computational protocol to systematically compare and classify metal-binding sites on the basis of their structural similarity. These sites are extracted from the MetalPDB database of minimal functional sites (MFSs) in metal-binding biological macromolecules. Structural similarity is measured by the scoring function of the available MetalS2 program. Hierarchical clustering was used to organize MFSs into clusters, for each of which a representative MFS was identified. The comparison of all representative MFSs provided a thorough structure-based classification of the sites analyzed. As examples, the application of the proposed computational protocol to all heme-binding proteins and zinc-binding proteins of known structure highlighted the existence of structural subtypes, validated known evolutionary links and shed new light on the occurrence of similar sites in systems at different evolutionary distances. The present approach thus makes available an innovative viewpoint on metalloproteins, where the functionally crucial metal sites effectively lead the discovery of structural and functional relationships in a largely protein-independent manner.
Quality assessment of protein model-structures based on structural and functional similarities
2012-01-01
Background Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the World Wide Protein Data Bank and the number of sequenced proteins is constantly widening. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists have the ability to automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. Results GOBA - Gene Ontology-Based Assessment is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using the standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. Conclusions The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved 0.74 and 0.8 as a mean Pearson correlation to the observed quality of models in our CASP8 and CASP9-based validation sets. 
GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models. PMID:22998498
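The validation metric quoted above, a mean Pearson correlation between a quality score and the observed model quality, can be made concrete with a short sketch. All numbers and target names below are invented for illustration and are not GOBA results; the assumption is that the correlation is computed separately for each prediction target and then averaged over targets.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical data: (predicted scores, observed quality) for models of a target.
targets = {
    "target_A": ([0.2, 0.5, 0.9], [30.0, 55.0, 80.0]),
    "target_B": ([0.1, 0.4, 0.6, 0.8], [20.0, 35.0, 70.0, 65.0]),
}

per_target = {name: pearson(s, q) for name, (s, q) in targets.items()}
mean_r = sum(per_target.values()) / len(per_target)
print(per_target, mean_r)
```

Averaging per-target correlations, rather than pooling all models into one correlation, keeps easy and hard targets from dominating each other.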
Khafizov, Kamil; Madrid-Aliste, Carlos; Almo, Steven C; Fiser, Andras
2014-03-11
The exponential growth of protein sequence data provides an ever-expanding body of unannotated and misannotated proteins. The National Institutes of Health-supported Protein Structure Initiative and related worldwide structural genomics efforts facilitate functional annotation of proteins through structural characterization. Recently there have been profound changes in the taxonomic composition of sequence databases, which are effectively redefining the scope and contribution of these large-scale structure-based efforts. The faster-growing bacterial genomic entries have overtaken the eukaryotic entries over the last 5 y, but also have become more redundant. Despite the enormous increase in the number of sequences, the overall structural coverage of proteins--including proteins for which reliable homology models can be generated--on the residue level has increased from 30% to 40% over the last 10 y. Structural genomics efforts contributed ∼50% of this new structural coverage, despite determining only ∼10% of all new structures. Based on current trends, it is expected that ∼55% structural coverage (the level required for significant functional insight) will be achieved within 15 y, whereas without structural genomics efforts, realizing this goal will take approximately twice as long.
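The figures in this abstract are mutually consistent under a simple linear extrapolation, which the sketch below works through. The linear growth model and the assumption that the structural genomics share of new coverage stays at about 50% are simplifications introduced here for illustration; the authors' actual projection method is not given in the abstract.

```python
def years_to_target(current, target, rate_per_year):
    """Years needed to grow from `current` to `target` coverage (in %)."""
    return (target - current) / rate_per_year

# Coverage rose from 30% to 40% over 10 years: 1 percentage point per year.
rate = (40 - 30) / 10

# With structural genomics (SG): 40% -> 55% at the observed rate.
with_sg = years_to_target(40, 55, rate)

# SG contributed about 50% of new coverage, so without it the rate halves.
sg_share = 0.5
without_sg = years_to_target(40, 55, rate * (1 - sg_share))

print(with_sg, without_sg)
```

Under these assumptions the target is reached in 15 years with structural genomics and 30 years without, matching the "within 15 y" and "approximately twice as long" statements.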
Single-Molecule Microscopy and Force Spectroscopy of Membrane Proteins
NASA Astrophysics Data System (ADS)
Engel, Andreas; Janovjak, Harald; Fotiadis, Dimtrios; Kedrov, Alexej; Cisneros, David; Müller, Daniel J.
Single-molecule atomic force microscopy (AFM) provides novel ways to characterize the structure-function relationship of native membrane proteins. High-resolution AFM topographs allow the observation of the structure of single proteins at sub-nanometer resolution as well as their conformational changes, oligomeric state, molecular dynamics and assembly. We review these capabilities using examples of membrane proteins in native and reconstituted membranes. Classification of individual topographs of single proteins makes it possible to understand the principles of motion of their extrinsic domains, to learn about their local structural flexibilities and to find the entropy minima of certain conformations. Combined with the visualization of functionally related conformational changes, these insights help to explain why certain flexibilities are required for the protein to function and how structurally flexible regions allow certain conformational changes. Complementary to AFM imaging, single-molecule force spectroscopy (SMFS) experiments detect molecular interactions established within and between membrane proteins. The sensitivity of this method makes it possible to measure interactions that stabilize secondary structures such as transmembrane α-helices, polypeptide loops and segments within. Changes in temperature or protein-protein assembly do not change the locations of stable structural segments, but influence their stability established by collective molecular interactions. Such changes alter the probability of proteins to choose a certain unfolding pathway. Recent examples have elucidated unfolding and refolding pathways of membrane proteins as well as their energy landscapes.
Leite, Wellington C; Galvão, Carolina W; Saab, Sérgio C; Iulek, Jorge; Etto, Rafael M; Steffens, Maria B R; Chitteni-Pattu, Sindhu; Stanage, Tyler; Keck, James L; Cox, Michael M
2016-01-01
The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three-dimensional structure of the HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, which are similar to those of homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins is also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. Our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament.
NASA Astrophysics Data System (ADS)
Gegner, Julie; Spruill, Natalie; Plesniak, Leigh A.
1999-11-01
The terms "structure" and "function" can assume a variety of meanings. In biochemistry, the "structure" of a protein can refer to its sequence of amino acids, the three-dimensional arrangement of atoms within a subunit, or the arrangement of subunits into a larger oligomeric or filamentous state. Likewise, the function of biological macromolecules can be examined at many levels. The function of a protein can be described by its role in an organism's survival or by a chemical reaction that it promotes. We have designed a three-part biochemical laboratory experiment that characterizes the structure and function of the Escherichia coli RecA protein. The first part examines the importance of RecA in the survival of bacteria that have been exposed to UV light. This is the broadest view of function of the enzyme. Second, the students use an in vitro assay of RecA whereby the protein promotes homologous recombination. Because RecA functions not catalytically, but rather stoichiometrically, in this recombination reaction, the oligomeric state of RecA in complex with DNA must also be discussed. Finally, through molecular modeling of X-ray crystallographic structures, students identify functionally important features of the ATP cofactor binding site of RecA.
Decomposition of Proteins into Dynamic Units from Atomic Cross-Correlation Functions.
Calligari, Paolo; Gerolin, Marco; Abergel, Daniel; Polimeno, Antonino
2017-01-10
In this article, we present a method for clustering atoms in proteins based on the analysis of the correlation times of interatomic distance correlation functions computed from MD simulations. The goal is to provide a coarse-grained description of the protein in terms of fewer elements that can be treated as dynamically independent subunits. Importantly, this domain decomposition method does not take into account structural properties of the protein. Instead, the clustering of protein residues in terms of networks of dynamically correlated domains is defined on the basis of the effective correlation times of the pair distance correlation functions. Because of these properties, our method serves as a complementary analysis to the customary protein decomposition in terms of quasi-rigid, structure-based domains. Results obtained for a prototypical protein structure illustrate the proposed approach.
Lorenzo, J Ramiro; Alonso, Leonardo G; Sánchez, Ignacio E
2015-01-01
Asparagine residues in proteins undergo spontaneous deamidation, a post-translational modification that may act as a molecular clock for the regulation of protein function and turnover. Asparagine deamidation is modulated by protein local sequence, secondary structure and hydrogen bonding. We present NGOME, an algorithm able to predict non-enzymatic deamidation of internal asparagine residues in proteins in the absence of structural data, using sequence-based predictions of secondary structure and intrinsic disorder. Compared to previous algorithms, NGOME does not require three-dimensional structures yet yields better predictions than available sequence-only methods. Four case studies of specific proteins show how NGOME may help the user identify deamidation-prone asparagine residues, often related to protein gain of function, protein degradation or protein misfolding in pathological processes. A fifth case study applies NGOME at a proteomic scale and unveils a correlation between asparagine deamidation and protein degradation in yeast. NGOME is freely available as a webserver at the National EMBnet node Argentina, URL: http://www.embnet.qb.fcen.uba.ar/ in the subpage "Protein and nucleic acid structure and sequence analysis".
The Popeye Domain Containing Genes and Their Function as cAMP Effector Proteins in Striated Muscle.
Brand, Thomas
2018-03-13
The Popeye domain containing (POPDC) genes encode transmembrane proteins, which are abundantly expressed in striated muscle cells. Hallmarks of the POPDC proteins are the presence of three transmembrane domains and the Popeye domain, which makes up a large part of the cytoplasmic portion of the protein and functions as a cAMP-binding domain. Interestingly, despite the prediction of structural similarity between the Popeye domain and other cAMP binding domains, at the protein sequence level they strongly differ from each other, suggesting an independent evolutionary origin of POPDC proteins. Loss-of-function experiments in zebrafish and mouse established an important role of POPDC proteins for cardiac conduction and heart rate adaptation after stress. Loss-of-function mutations in patients have been associated with limb-girdle muscular dystrophy and AV-block. These data suggest an important role of these proteins in the maintenance of structure and function of striated muscle cells.
Zurawski, S M; Zurawski, G
1988-01-01
We have analyzed structure-function relationships of the protein hormone murine interleukin 2 by fine structural deletion mapping. A total of 130 deletion mutant proteins, together with some substitution and insertion mutant proteins, was expressed in Escherichia coli and analyzed for their ability to sustain the proliferation of a cloned murine T cell line. This analysis has permitted a functional map of the protein to be drawn and classifies five segments of the protein, which together contain 48% of the sequence, as unessential to the biological activity of the protein. A further 26% of the protein is classified as important, but not crucial, for the activity. Three regions, consisting of amino acids 32-35, 66-77 and 119-141 contain the remaining 26% of the protein and are critical to the biological activity of the protein. The functional map is discussed in the context of the possible role of the identified critical regions in the structure of the hormone and its binding to the interleukin 2 receptor complex. PMID:3261239
Stability and the Evolvability of Function in a Model Protein
Bloom, Jesse D.; Wilke, Claus O.; Arnold, Frances H.; Adami, Christoph
2004-01-01
Functional proteins must fold with some minimal stability to a structure that can perform a biochemical task. Here we use a simple model to investigate the relationship between the stability requirement and the capacity of a protein to evolve the function of binding to a ligand. Although our model contains no built-in tradeoff between stability and function, proteins evolved function more efficiently when the stability requirement was relaxed. Proteins with both high stability and high function evolved more efficiently when the stability requirement was gradually increased than when there was constant selection for high stability. These results show that in our model, the evolution of function is enhanced by allowing proteins to explore sequences corresponding to marginally stable structures, and that it is easier to improve stability while maintaining high function than to improve function while maintaining high stability. Our model also demonstrates that even in the absence of a fundamental biophysical tradeoff between stability and function, the speed with which function can evolve is limited by the stability requirement imposed on the protein. PMID:15111394
Bunney, Tom D.; Cole, Ambrose R.; Broncel, Malgorzata; Esposito, Diego; Tate, Edward W.; Katan, Matilda
2014-01-01
Summary Protein AMPylation, the transfer of AMP from ATP to protein targets, has been recognized as a new mechanism of host-cell disruption by some bacterial effectors that typically contain a FIC-domain. Eukaryotic genomes also encode one FIC-domain protein, HYPE, which has remained poorly characterized. Here we describe the structure of human HYPE, solved by X-ray crystallography, representing the first structure of a eukaryotic FIC-domain protein. We demonstrate that HYPE forms stable dimers with structurally and functionally integrated FIC-domains and with TPR-motifs exposed for protein-protein interactions. As HYPE also uniquely possesses a transmembrane helix, dimerization is likely to affect its positioning and function in the membrane vicinity. The low rate of autoAMPylation of the wild-type HYPE could be due to autoinhibition, consistent with the mechanism proposed for a number of putative FIC AMPylators. Our findings also provide a basis to further consider possible alternative cofactors of HYPE and distinct modes of target-recognition. PMID:25435325
Bunney, Tom D; Cole, Ambrose R; Broncel, Malgorzata; Esposito, Diego; Tate, Edward W; Katan, Matilda
2014-12-02
Protein AMPylation, the transfer of AMP from ATP to protein targets, has been recognized as a new mechanism of host-cell disruption by some bacterial effectors that typically contain a FIC-domain. Eukaryotic genomes also encode one FIC-domain protein, HYPE, which has remained poorly characterized. Here we describe the structure of human HYPE, solved by X-ray crystallography, representing the first structure of a eukaryotic FIC-domain protein. We demonstrate that HYPE forms stable dimers with structurally and functionally integrated FIC-domains and with TPR-motifs exposed for protein-protein interactions. As HYPE also uniquely possesses a transmembrane helix, dimerization is likely to affect its positioning and function in the membrane vicinity. The low rate of autoAMPylation of the wild-type HYPE could be due to autoinhibition, consistent with the mechanism proposed for a number of putative FIC AMPylators. Our findings also provide a basis to further consider possible alternative cofactors of HYPE and distinct modes of target-recognition.
Structure-Based Phylogenetic Analysis of the Lipocalin Superfamily.
Lakshmi, Balasubramanian; Mishra, Madhulika; Srinivasan, Narayanaswamy; Archunan, Govindaraju
2015-01-01
Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study with 39 protein domains from the lipocalin superfamily suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. The detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster based on their mode of ligand binding, even though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers to assign putative functions to the domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families where members show poor sequence similarity but high structural similarity.
NASA Astrophysics Data System (ADS)
Xu, Xianjin; Yan, Chengfei; Zou, Xiaoqin
2017-08-01
The growing number of protein-ligand complex structures, particularly the structures of proteins co-bound with different ligands, in the Protein Data Bank helps us tackle two major challenges in molecular docking studies: the protein flexibility and the scoring function. Here, we introduced a systematic strategy that uses the information embedded in the known protein-ligand complex structures to improve both binding mode and binding affinity predictions. Specifically, a ligand similarity calculation method was employed to search for a receptor structure with a bound ligand sharing high similarity with the query ligand, for use in docking. The strategy was applied to the two datasets (HSP90 and MAP4K4) in the recent D3R Grand Challenge 2015. In addition, for the HSP90 dataset, a system-specific scoring function (ITScore2_hsp90) was generated by recalibrating our statistical potential-based scoring function (ITScore2) using the known protein-ligand complex structures and the statistical mechanics-based iterative method. For the HSP90 dataset, better performances were achieved for both binding mode and binding affinity predictions compared with the original ITScore2 and with ensemble docking. For the MAP4K4 dataset, although there were only eight known protein-ligand complex structures, our docking strategy achieved a comparable performance with ensemble docking. Our method for receptor conformational selection and iterative method for the development of system-specific statistical potential-based scoring functions can be easily applied to other protein targets that have a number of protein-ligand complex structures available to improve predictions on binding.
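The receptor-selection step described above can be sketched as follows. This is a minimal illustration only: the abstract does not say which ligand similarity measure was used, so the Tanimoto coefficient over sets of ligand features is an assumption, and the names `tanimoto`, `select_receptor`, `complex_A`, and `complex_B` are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two collections of ligand features."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty feature sets: define similarity as 0
    return len(a & b) / len(union)


def select_receptor(query_features, known_complexes):
    """Pick the known complex whose bound ligand is most similar to the query.

    known_complexes maps a structure id to the feature set of its bound ligand.
    Returns (structure_id, similarity).
    """
    best_id, best_sim = None, -1.0
    for structure_id, ligand_features in known_complexes.items():
        sim = tanimoto(query_features, ligand_features)
        if sim > best_sim:
            best_id, best_sim = structure_id, sim
    return best_id, best_sim


# Toy data (hypothetical, not taken from the HSP90 or MAP4K4 datasets).
known = {
    "complex_A": {"aromatic_ring", "amide", "hydroxyl"},
    "complex_B": {"aromatic_ring", "chloro", "amine"},
}
query = {"aromatic_ring", "amide", "hydroxyl", "methyl"}
best = select_receptor(query, known)
```

In this sketch the query ligand would be docked into the receptor conformation taken from the selected complex; a real pipeline would presumably use chemical fingerprints or 3D shape similarity rather than hand-written feature sets.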
Hoyer, Lois L.; Cota, Ernesto
2016-01-01
Approximately two decades have passed since the description of the first gene in the Candida albicans ALS (agglutinin-like sequence) family. Since that time, much has been learned about the composition of the family and the function of its encoded cell-surface glycoproteins. Solution of the structure of the Als adhesive domain provides the opportunity to evaluate the molecular basis for protein function. This review article is formatted as a series of fundamental questions and explores the diversity of the Als proteins, as well as their role in ligand binding, aggregative effects, and attachment to abiotic surfaces. Interaction of Als proteins with each other, their functional equivalence, and the effects of protein abundance on phenotypic conclusions are also examined. Structural features of Als proteins that may facilitate invasive function are considered. Conclusions that are firmly supported by the literature are presented while highlighting areas that require additional investigation to reveal basic features of the Als proteins, their relatedness to each other, and their roles in C. albicans biology. PMID:27014205
Insights from molecular dynamics simulations for computational protein design.
Childers, Matthew Carter; Daggett, Valerie
2017-02-01
A grand challenge in the field of structural biology is to design and engineer proteins that exhibit targeted functions. Although much success on this front has been achieved, design success rates remain low, an ever-present reminder of our limited understanding of the relationship between amino acid sequences and the structures they adopt. In addition to experimental techniques and rational design strategies, computational methods have been employed to aid in the design and engineering of proteins. Molecular dynamics (MD) is one such method that simulates the motions of proteins according to classical dynamics. Here, we review how insights into protein dynamics derived from MD simulations have influenced the design of proteins. One of the greatest strengths of MD is its capacity to reveal information beyond what is available in the static structures deposited in the Protein Data Bank. In this regard, simulations can be used to directly guide protein design by providing atomistic details of the dynamic molecular interactions contributing to protein stability and function. MD simulations can also be used as a virtual screening tool to rank, select, identify, and assess potential designs. MD is uniquely poised to inform protein design efforts where the application requires realistic models of protein dynamics and atomic-level descriptions of the relationship between dynamics and function. Here, we review cases where MD simulations were used to modulate protein stability and protein function by providing information regarding the conformation(s), conformational transitions, interactions, and dynamics that govern stability and function. In addition, we discuss cases where conformations from protein folding/unfolding simulations have been exploited for protein design, yielding novel outcomes that could not be obtained from static structures.
Insights from molecular dynamics simulations for computational protein design
Childers, Matthew Carter; Daggett, Valerie
2017-01-01
A grand challenge in the field of structural biology is to design and engineer proteins that exhibit targeted functions. Although much success on this front has been achieved, design success rates remain low, an ever-present reminder of our limited understanding of the relationship between amino acid sequences and the structures they adopt. In addition to experimental techniques and rational design strategies, computational methods have been employed to aid in the design and engineering of proteins. Molecular dynamics (MD) is one such method that simulates the motions of proteins according to classical dynamics. Here, we review how insights into protein dynamics derived from MD simulations have influenced the design of proteins. One of the greatest strengths of MD is its capacity to reveal information beyond what is available in the static structures deposited in the Protein Data Bank. In this regard, simulations can be used to directly guide protein design by providing atomistic details of the dynamic molecular interactions contributing to protein stability and function. MD simulations can also be used as a virtual screening tool to rank, select, identify, and assess potential designs. MD is uniquely poised to inform protein design efforts where the application requires realistic models of protein dynamics and atomic-level descriptions of the relationship between dynamics and function. Here, we review cases where MD simulations were used to modulate protein stability and protein function by providing information regarding the conformation(s), conformational transitions, interactions, and dynamics that govern stability and function. In addition, we discuss cases where conformations from protein folding/unfolding simulations have been exploited for protein design, yielding novel outcomes that could not be obtained from static structures. PMID:28239489
Genshaft, Alexander; Moser, Joe-Ann S.; D'Antonio, Edward L.; Bowman, Christine M.; Christianson, David W.
2013-01-01
The reversible acetylation of lysine to form N6-acetyllysine in the regulation of protein function is a hallmark of epigenetics. Acetylation of the positively charged amino group of the lysine side chain generates a neutral N-alkylacetamide moiety that serves as a molecular “switch” for the modulation of protein function and protein-protein interactions. We now report the analysis of 381 N6-acetyllysine side chain amide conformations as found in 79 protein crystal structures and 11 protein NMR structures deposited in the Protein Data Bank (PDB) of the Research Collaboratory for Structural Bioinformatics. We find that only 74.3% of N6-acetyllysine residues in protein crystal structures and 46.5% in protein NMR structures contain amide groups with energetically preferred trans or generously trans conformations. Surprisingly, 17.6% of N6-acetyllysine residues in protein crystal structures and 5.3% in protein NMR structures contain amide groups with energetically unfavorable cis or generously cis conformations. Even more surprisingly, 8.1% of N6-acetyllysine residues in protein crystal structures and 48.2% in NMR structures contain amide groups with energetically prohibitive twisted conformations that approach the transition state structure for cis-trans isomerization. In contrast, 109 unique N-alkylacetamide groups contained in 84 highly-accurate small molecule crystal structures retrieved from the Cambridge Structural Database exclusively adopt energetically preferred trans conformations. Therefore, we conclude that cis and twisted N6-acetyllysine amides in protein structures deposited in the PDB are erroneously modeled due to their energetically unfavorable or prohibitive conformations. PMID:23401043
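The three conformational classes reported above (trans, cis, twisted) can be illustrated with a small classifier over the amide torsion angle. This is a sketch under stated assumptions: the abstract does not give the angular cutoffs, so the 30-degree `tolerance` is a hypothetical value, and `classify_amide` is an illustrative name rather than the authors' procedure. The percentages are the ones quoted in the abstract and are only checked for internal consistency.

```python
def classify_amide(omega_deg, tolerance=30.0):
    """Classify an amide torsion angle (degrees) as trans, cis, or twisted.

    tolerance is an assumed cutoff, not a value taken from the study:
    within `tolerance` of 180 degrees counts as trans, within `tolerance`
    of 0 degrees counts as cis, anything else as twisted.
    """
    # Wrap the angle into the interval [-180, 180).
    omega = (omega_deg + 180.0) % 360.0 - 180.0
    if abs(omega) >= 180.0 - tolerance:
        return "trans"
    if abs(omega) <= tolerance:
        return "cis"
    return "twisted"


# Percentages quoted in the abstract, grouped by structure type.
reported = {
    "crystal": {"trans": 74.3, "cis": 17.6, "twisted": 8.1},
    "nmr": {"trans": 46.5, "cis": 5.3, "twisted": 48.2},
}

# Each set of three classes should account for all residues of that type.
totals = {kind: sum(classes.values()) for kind, classes in reported.items()}
```

Both totals come to 100% within floating-point rounding, so the three classes partition the residues in each group.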
Mulnix, Amy B
2003-01-01
Undergraduate biology curricula are being modified to model and teach the activities of scientists better. The assignment described here, one that investigates protein structure and function, was designed for use in a sophomore-level cell physiology course at Earlham College. Students work in small groups to read and present in poster format on the content of a single research article reporting on the structure and/or function of a protein. Goals of the assignment include highlighting the interdependence of protein structure and function; asking students to review, integrate, and apply previously acquired knowledge; and helping students see protein structure/function in a context larger than cell physiology. The assignment also is designed to build skills in reading scientific literature, oral and written communication, and collaboration among peers. Assessment of student perceptions of the assignment in two separate offerings indicates that the project successfully achieves these goals. Data specifically show that students relied heavily on their peers to understand their article. The assignment was also shown to require students to read articles more carefully than previously. In addition, the data suggest that the assignment could be modified and used successfully in other courses and at other institutions.
Li, Yang; Yang, Jianyi
2017-04-24
The prediction of protein-ligand binding affinity has recently been improved remarkably by machine-learning-based scoring functions. For example, using a set of simple descriptors representing the atomic distance counts, the RF-Score improves the Pearson correlation coefficient to about 0.8 on the core set of the PDBbind 2007 database, which is significantly higher than the performance of any conventional scoring function on the same benchmark. A few studies have discussed the performance of machine-learning-based methods, but the reason for this improvement remains unclear. In this study, by systematically controlling the structural and sequence similarity between the training and test proteins of the PDBbind benchmark, we demonstrate that protein structural and sequence similarity has a significant impact on machine-learning-based methods. After removal of training proteins that are highly similar to the test proteins, as identified by structure alignment and sequence alignment, machine-learning-based methods trained on the new training sets no longer outperform the conventional scoring functions. In contrast, the performance of conventional functions like X-Score is relatively stable no matter what training data are used to fit the weights of its energy terms.
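The two ingredients of the experiment described above, removing training proteins that resemble the test proteins and scoring predictions by Pearson correlation, can be sketched as follows. This is an illustrative outline, not the authors' pipeline: the function names, the precomputed `similarity` table, and the cutoff value are all hypothetical, and the toy numbers are not benchmark data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def remove_similar(training_ids, test_ids, similarity, cutoff):
    """Keep only training proteins whose similarity to every test protein is below cutoff.

    similarity maps (training_id, test_id) to a precomputed score in [0, 1],
    e.g. sequence identity or a structure-alignment score (assumed inputs).
    """
    return [
        tr for tr in training_ids
        if all(similarity[(tr, te)] < cutoff for te in test_ids)
    ]


# Toy similarity table (hypothetical values).
similarity = {
    ("train1", "test1"): 0.95, ("train1", "test2"): 0.10,
    ("train2", "test1"): 0.20, ("train2", "test2"): 0.25,
    ("train3", "test1"): 0.05, ("train3", "test2"): 0.40,
}
kept = remove_similar(["train1", "train2", "train3"], ["test1", "test2"],
                      similarity, cutoff=0.30)

# Perfectly linear toy predictions give a correlation of 1.
r = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A model would then be retrained on `kept` only and its predictions on the test set scored with `pearson`, so that any drop in correlation can be attributed to the removed similar proteins.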
Some of the most interesting CASP11 targets through the eyes of their authors
Kryshtafovych, Andriy; Moult, John; Baslé, Arnaud; Burgin, Alex; Craig, Timothy K.; Edwards, Robert A.; Fass, Deborah; Hartmann, Marcus D.; Korycinski, Mateusz; Lewis, Richard J.; Lorimer, Donald; Lupas, Andrei N.; Newman, Janet; Peat, Thomas S.; Piepenbrink, Kurt H.; Prahlad, Janani; van Raaij, Mark J.; Rohwer, Forest; Segall, Anca M.; Seguritan, Victor; Sundberg, Eric J.; Singh, Abhimanyu K.; Wilson, Mark A.
2015-01-01
The Critical Assessment of protein Structure Prediction (CASP) experiment would not have been possible without the prediction targets provided by the experimental structural biology community. In this article, selected crystallographers providing targets for the CASP11 experiment discuss the functional and biological significance of the target proteins, highlight their most interesting structural features, and assess whether these features were correctly reproduced in the predictions submitted to CASP11. Proteins 2016; 84(Suppl 1):34–50. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc. PMID:26473983
Structure-function insights of membrane and soluble proteins revealed by electron crystallography.
Dreaden, Tina M; Devarajan, Bharanidharan; Barry, Bridgette A; Schmidt-Krey, Ingeborg
2013-01-01
Electron crystallography is emerging as an important method in solving protein structures. While it has found extensive applications in the understanding of membrane protein structure and function at a wide range of resolutions, from revealing oligomeric arrangements to atomic models, electron crystallography has also provided invaluable information on the soluble α/β-tubulin which could not be obtained by any other method to date. Examples of critical insights from selected structures of membrane proteins as well as α/β-tubulin are described here, demonstrating the vast potential of electron crystallography that is only beginning to unfold.
Controllable assembly and disassembly of nanoparticle systems via protein and DNA agents
Lee, Soo-Kwan; Gang, Oleg; van der Lelie, Daniel
2014-05-20
The invention relates to the use of peptides, proteins, and other oligomers to provide a means by which normally quenched nanoparticle fluorescence may be recovered upon detection of a target molecule. Further, the inventive technology provides a structure and method to carry out detection of target molecules without the need to label the target molecules before detection. In another aspect, a method for forming arbitrarily shaped two- and three-dimensional protein-mediated nanoparticle structures and the resulting structures are described. Proteins mediating structure formation may themselves be functionalized with a variety of useful moieties, including catalytic functional groups.
Towards fully automated structure-based function prediction in structural genomics: a case study.
Watson, James D; Sanderson, Steve; Ezersky, Alexandra; Savchenko, Alexei; Edwards, Aled; Orengo, Christine; Joachimiak, Andrzej; Laskowski, Roman A; Thornton, Janet M
2007-04-13
As the global Structural Genomics projects have picked up pace, the number of structures annotated in the Protein Data Bank as hypothetical protein or unknown function has grown significantly. A major challenge now involves the development of computational methods to assign functions to these proteins accurately and automatically. As part of the Midwest Center for Structural Genomics (MCSG) we have developed a fully automated functional analysis server, ProFunc, which performs a battery of analyses on a submitted structure. The analyses combine a number of sequence-based and structure-based methods to identify functional clues. After the first stage of the Protein Structure Initiative (PSI), we review the success of the pipeline and the importance of structure-based function prediction. As a dataset, we have chosen all structures solved by the MCSG during the 5 years of the first PSI. Our analysis suggests that two of the structure-based methods are particularly successful and provide examples of local similarity that is difficult to identify using current sequence-based methods. No one method is successful in all cases, so, through the use of a number of complementary sequence and structural approaches, the ProFunc server increases the chances that at least one method will find a significant hit that can help elucidate function. Manual assessment of the results is a time-consuming process and subject to individual interpretation and human error. We present a method based on the Gene Ontology (GO) schema using GO-slims that can allow the automated assessment of hits with a success rate approaching that of expert manual assessment.
Introduction to Protein Structure through Genetic Diseases
ERIC Educational Resources Information Center
Schneider, Tanya L.; Linton, Brian R.
2008-01-01
An illuminating way to learn about protein function is to explore high-resolution protein structures. Analysis of the proteins involved in genetic diseases has been used to introduce students to protein structure and the role that individual mutations can play in the onset of disease. Known mutations can be correlated to changes in protein…
NMR relaxation studies on the hydrate layer of intrinsically unstructured proteins.
Bokor, Mónika; Csizmók, Veronika; Kovács, Dénes; Bánki, Péter; Friedrich, Peter; Tompa, Peter; Tompa, Kálmán
2005-03-01
Intrinsically unstructured/disordered proteins (IUPs) exist in a disordered and largely solvent-exposed, still functional, structural state under physiological conditions. As their function is often directly linked with structural disorder, understanding their structure-function relationship in detail is a great challenge to structural biology. In particular, their hydration and residual structure, both closely linked with their mechanism of action, require close attention. Here we demonstrate that the hydration of IUPs can be adequately approached by a technique so far unexplored with respect to IUPs, solid-state NMR relaxation measurements. This technique provides quantitative information on various features of hydrate water bound to these proteins. By freezing nonhydrate (bulk) water out, we have been able to measure free induction decays pertaining to protons of bound water from which the amount of hydrate water, its activation energy, and correlation times could be calculated. Thus, for three IUPs, the first inhibitory domain of calpastatin, microtubule-associated protein 2c, and plant dehydrin early responsive to dehydration 10, we demonstrate that they bind a significantly larger amount of water than globular proteins, whereas their suboptimal hydration and relaxation parameters are correlated with their differing modes of function. The theoretical treatment and experimental approach presented in this article may have general utility in characterizing proteins that belong to this novel structural class.
From protein structure to function via single crystal optical spectroscopy
Ronda, Luca; Bruno, Stefano; Bettati, Stefano; Storici, Paola; Mozzarelli, Andrea
2015-01-01
The more than 100,000 protein structures determined by X-ray crystallography provide a wealth of information for the characterization of biological processes at the molecular level. However, several crystallographic “artifacts,” including conformational selection, crystallization conditions and radiation damage, may affect the quality and the interpretation of the electron density maps, thus limiting the relevance of structure determinations. Moreover, for most of these structures, no functional data have been obtained in the crystalline state, thus posing serious questions about their validity in inferring protein mechanisms. In order to solve these issues, spectroscopic methods have been applied for the determination of equilibrium and kinetic properties of proteins in the crystalline state. These methods are UV-vis spectrophotometry, spectrofluorimetry, IR, EPR, Raman, and resonance Raman spectroscopy. Some of these approaches have been implemented with on-line instruments at X-ray synchrotron beamlines. Here, we provide an overview of investigations predominantly carried out in our laboratory by single crystal polarized absorption UV-vis microspectrophotometry, the most applied technique for the functional characterization of proteins in the crystalline state. Studies on hemoglobins, pyridoxal 5′-phosphate dependent enzymes and green fluorescent protein in the crystalline state have addressed key biological issues, leading to either straightforward structure-function correlations or limitations to structure-based mechanisms. PMID:25988179
Liu, Mengjie; Duan, Liangwei; Wang, Meifang; Zeng, Hongmei; Liu, Xinqi; Qiu, Dewen
2016-01-01
The protein elicitor MoHrip2, which was extracted from Magnaporthe oryzae as an exocrine protein, triggers the tobacco immune system and enhances blast resistance in rice. However, the detailed mechanisms by which MoHrip2 acts as an elicitor remain unclear. Here, we investigated the structure of MoHrip2 to elucidate its functions based on molecular structure. The three-dimensional structure of MoHrip2 was obtained. Overall, the crystal structure formed a β-barrel structure and showed high similarity to the pathogenesis-related (PR) thaumatin superfamily protein thaumatin-like xylanase inhibitor (TL-XI). To investigate the functional regions responsible for MoHrip2 elicitor activities, the full-length protein and eight truncated proteins were expressed in Escherichia coli and were evaluated for elicitor activity in tobacco. Biological function analysis showed that MoHrip2 triggered the defense system against Botrytis cinerea in tobacco. Moreover, only MoHrip2M14 and other fragments containing the 14 amino acid residues in the middle region of the protein showed the elicitor activity of inducing a hypersensitive response and resistance-related pathways, which was similar to that of full-length MoHrip2. These results revealed that the central 14 amino acid residues were essential for anti-pathogenic activity.
Protein Design Using Unnatural Amino Acids
NASA Astrophysics Data System (ADS)
Bilgiçer, Basar; Kumar, Krishna
2003-11-01
With the increasing availability of whole organism genome sequences, understanding protein structure and function is of capital importance. Recent developments in the methodology of incorporation of unnatural amino acids into proteins allow the exploration of proteins at a very detailed level. Furthermore, de novo design of novel protein structures and function is feasible with unprecedented sophistication. Using examples from the literature, this article describes the available methods for unnatural amino acid incorporation and highlights some recent applications including the design of hyperstable protein folds.
PROFESS: a PROtein Function, Evolution, Structure and Sequence database
Triplet, Thomas; Shortridge, Matthew D.; Griep, Mark A.; Stark, Jaime L.; Powers, Robert; Revesz, Peter
2010-01-01
The proliferation of biological databases and the easy access enabled by the Internet are having a beneficial impact on biological sciences and transforming the way research is conducted. There are ∼1100 molecular biology databases dispersed throughout the Internet. To assist in the functional, structural and evolutionary analysis of the abundant number of novel proteins continually identified from whole-genome sequencing, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database. Our database is designed to be versatile and expandable and will not confine analysis to a pre-existing set of data relationships. A fundamental component of this approach is the development of an intuitive query system that incorporates a variety of similarity functions capable of generating data relationships not conceived during the creation of the database. The utility of PROFESS is demonstrated by the analysis of the structural drift of homologous proteins and the identification of potential pancreatic cancer therapeutic targets based on the observation of protein–protein interaction networks. Database URL: http://cse.unl.edu/∼profess/ PMID:20624718
Accounting for epistatic interactions improves the functional analysis of protein structures.
Wilkins, Angela D; Venner, Eric; Marciano, David C; Erdin, Serkan; Atri, Benu; Lua, Rhonald C; Lichtarge, Olivier
2013-11-01
The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: lichtarge@bcm.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Accounting for epistatic interactions improves the functional analysis of protein structures
Wilkins, Angela D.; Venner, Eric; Marciano, David C.; Erdin, Serkan; Atri, Benu; Lua, Rhonald C.; Lichtarge, Olivier
2013-01-01
Motivation: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: lichtarge@bcm.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24021383
Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S
2017-04-01
Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow identification of mechanistic determinants: amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
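The F-measure mentioned above is the standard weighted harmonic mean of precision and recall. The sketch below shows how it is computed from raw counts; the counts in the example are hypothetical and are not taken from the TuLIP study, which reports rates rather than counts in its abstract.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for illustration only.
score = f_measure(tp=9, fp=1, fn=3)
```

With `beta=1.0` this reduces to 2TP / (2TP + FP + FN), so over-divided clusterings that miss family members (high FN) are penalised through recall.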
Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S
2015-01-01
The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. PMID:26073648
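The idea of reading clusters off a similarity network at a single edge threshold can be sketched as follows. This is a generic illustration, not the DASP or TM-Align pipeline: it assumes that higher edge scores mean greater similarity, and the function name and the toy scores are hypothetical.

```python
def clusters_at_threshold(nodes, edges, threshold):
    """Connected components of the network that keeps edges with score >= threshold.

    `edges` is an iterable of (node_a, node_b, score) tuples.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in edges:
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(sorted(g) for g in groups.values())


# Toy network: two tight pairs joined by one weak edge.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B", 0.9), ("B", "C", 0.4), ("C", "D", 0.8)]
strict = clusters_at_threshold(nodes, edges, 0.5)
loose = clusters_at_threshold(nodes, edges, 0.3)
```

Sweeping the threshold and comparing the resulting components with a curated hierarchy is one way to ask whether a given edge metric tracks function.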
Su, Min-Gang; Weng, Julia Tzu-Ya; Hsu, Justin Bo-Kai; Huang, Kai-Yao; Chi, Yu-Hsiang; Lee, Tzong-Yi
2017-12-21
Protein post-translational modification (PTM) plays an essential role in various cellular processes that modulates the physical and chemical properties, folding, conformation, stability and activity of proteins, thereby modifying the functions of proteins. The improved throughput of mass spectrometry (MS) or MS/MS technology has not only brought about a surge in proteome-scale studies, but also contributed to a fruitful list of identified PTMs. However, with the increase in the number of identified PTMs, perhaps the more crucial question is what kind of biological mechanisms these PTMs are involved in. This is particularly important in light of the fact that most protein-based pharmaceuticals deliver their therapeutic effects through some form of PTM. Yet, our understanding is still limited with respect to the local effects and frequency of PTM sites near pharmaceutical binding sites and the interfaces of protein-protein interaction (PPI). Understanding PTM's function is critical to our ability to manipulate the biological mechanisms of protein. In this study, to understand the regulation of protein functions by PTMs, we mapped 25,835 PTM sites to proteins with available three-dimensional (3D) structural information in the Protein Data Bank (PDB), including 1785 modified PTM sites on the 3D structure. Based on the acquired structural PTM sites, we proposed to use five properties for the structural characterization of PTM substrate sites: the spatial composition of amino acids, residues and side-chain orientations surrounding the PTM substrate sites, as well as the secondary structure, division of acidity and alkaline residues, and solvent-accessible surface area. We further mapped the structural PTM sites to the structures of drug binding and PPI sites, identifying a total of 1917 PTM sites that may affect PPI and 3951 PTM sites associated with drug-target binding. 
An integrated analytical platform (CruxPTM), with a variety of methods and online molecular docking tools for exploring the structural characteristics of PTMs, is presented. In addition, all tertiary structures of PTM sites on proteins can be visualized using the JSmol program. Resolving the function of PTM sites is important for understanding the role that proteins play in biological mechanisms. Our work attempted to delineate the structural correlation between PTM sites and PPI or drug-target binding. CruxPTM could help scientists narrow the scope of their PTM research and enhance the efficiency of PTM identification in the face of big proteome data. CruxPTM is now available at http://csb.cse.yzu.edu.tw/CruxPTM/.
Ruller, Roberto; Silva-Rocha, Rafael; Silva, Artur; Cruz Schneider, Maria Paula; Ward, Richard John
2011-01-01
Protein engineering is a powerful tool, which correlates protein structure with specific functions, both in applied biotechnology and in basic research. Here, we present a practical teaching course for engineering the green fluorescent protein (GFP) from Aequorea victoria by a random mutagenesis strategy using error-prone polymerase chain reaction. Screening of bacterial colonies transformed with random mutant libraries identified GFP variants with increased fluorescence yields. Mapping the three-dimensional structure of these mutants demonstrated how alterations in structural features such as the environment around the fluorophore and properties of the protein surface can influence functional properties such as the intensity of fluorescence and protein solubility. Copyright © 2011 Wiley Periodicals, Inc.
del Sol, Antonio; Araúzo-Bravo, Marcos J; Amoros, Dolors; Nussinov, Ruth
2007-01-01
Background: Allosteric communications are vital for cellular signaling. Here we explore a relationship between protein architectural organization and shortcuts in signaling pathways. Results: We show that protein domains consist of modules interconnected by residues that mediate signaling through the shortest pathways. These mediating residues tend to be located at the inter-modular boundaries, which are more rigid and display a larger number of long-range interactions than intra-modular regions. The inter-modular boundaries contain most of the residues centrally conserved in the protein fold, which may be crucial for information transfer between amino acids. Our approach to modular decomposition relies on a representation of protein structures as residue-interacting networks, and removal of the most central residue contacts, which are assumed to be crucial for allosteric communications. The modular decomposition of 100 multi-domain protein structures indicates that modules constitute the building blocks of domains. The analysis of 13 allosteric proteins revealed that modules characterize experimentally identified functional regions. Based on the study of an additional functionally annotated dataset of 115 proteins, we propose that high-modularity modules include functional sites and are the basic functional units. We provide examples (the Gαs subunit and P450 cytochromes) to illustrate that the modular architecture of active sites is linked to their functional specialization. Conclusion: Our method decomposes protein structures into modules, allowing the study of signal transmission between functional sites. A modular configuration might be advantageous: it allows signaling proteins to expand their regulatory linkages and may elicit a broader range of control mechanisms either via modular combinations or through modulation of inter-modular linkages. PMID:17531094
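Decomposing a residue network by removing its most central contacts can be sketched with a Girvan-Newman-style procedure based on shortest-path edge betweenness. This is an assumed stand-in for illustration; the abstract does not state which centrality measure or stopping rule the authors used, and the function names and the toy graph below are hypothetical.

```python
from collections import deque


def edge_betweenness(adj):
    """Shortest-path edge betweenness (Brandes-style) for an unweighted,
    undirected graph given as {node: set_of_neighbours} without self-loops."""
    bet = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist = {s: 0}
        sigma = {n: 0.0 for n in adj}   # number of shortest paths from s
        sigma[s] = 1.0
        preds = {n: [] for n in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    # Each unordered residue pair was counted once from each endpoint.
    return {e: b / 2.0 for e, b in bet.items()}


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def split_into_modules(adj, n_modules=2):
    """Remove the most central contact repeatedly until `n_modules` pieces exist."""
    adj = {n: set(nb) for n, nb in adj.items()}  # work on a copy
    while len(components(adj)) < n_modules:
        bet = edge_betweenness(adj)
        if not bet:
            break
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Toy contact network: two triangles of residues joined by one contact (3-4).
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
bridge_score = edge_betweenness(toy)[frozenset((3, 4))]
modules = sorted(sorted(c) for c in split_into_modules(toy))
```

In the toy graph the single inter-triangle contact carries every shortest path between the two halves, so it is removed first and the two triangles emerge as modules.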
A new definition and properties of the similarity value between two protein structures.
Saberi Fathi, S M
2016-10-01
Knowledge regarding the 3D structure of a protein provides useful information about the protein's functional properties. In particular, structural similarity between proteins can be used as a good predictor of functional similarity. One method that uses the 3D geometrical structure of proteins in order to compare them is the similarity value (SV). In this paper, we introduce a new definition of the SV measure for comparing two proteins. To this end, we consider the mass of the protein's atoms and concentrate on the number of the protein's atoms to be compared. This defines a new measure, called the weighted similarity value (WSV), adding physical properties to geometrical properties. We also show that our results are in good agreement with the results obtained by TM-score and DaliLite. WSV can be of use in protein classification and in drug discovery.
Unusual biophysics of intrinsically disordered proteins.
Uversky, Vladimir N
2013-05-01
Research over the past decade and a half leaves no doubt that a complete understanding of protein functionality requires close consideration of the fact that many functional proteins do not have well-folded structures. These intrinsically disordered proteins (IDPs) and proteins with intrinsically disordered protein regions (IDPRs) are highly abundant in nature and play a number of crucial roles in a living cell. Their functions, which are typically associated with a wide range of intermolecular interactions where IDPs possess remarkable binding promiscuity, complement the functional repertoire of ordered proteins. All this requires close attention to the peculiarities of the biophysics of these proteins. In this review, some key biophysical features of IDPs are covered. In addition to the peculiar sequence characteristics of IDPs, these biophysical features include sequential, structural, and spatiotemporal heterogeneity of IDPs; their rough and relatively flat energy landscapes; their ability to undergo both induced folding and induced unfolding; the ability to interact specifically with structurally unrelated partners; the ability to gain different structures upon binding to different partners; and the ability to retain an essential amount of disorder even in the bound form. IDPs are also characterized by the "turned-out" response to changes in their environment, where they gain some structure under conditions resulting in denaturation or even unfolding of ordered proteins. It is proposed that the heterogeneous spatiotemporal structure of IDPs/IDPRs can be described as a set of foldons, inducible foldons, semi-foldons, non-foldons, and unfoldons. They may lose their function when folded, and activation of some IDPs is associated with the awaking of the dormant disorder. It is possible that IDPs represent the "edge of chaos" systems which operate in a region between order and complete randomness or chaos, where the complexity is maximal. 
This article is part of a Special Issue entitled: The emerging dynamic view of proteins: Protein plasticity in allostery, evolution and self-assembly. Copyright © 2012 Elsevier B.V. All rights reserved.
Optimizing physical energy functions for protein folding.
Fujitsuka, Yoshimi; Takada, Shoji; Luthey-Schulten, Zaida A; Wolynes, Peter G
2004-01-01
We optimize a physical energy function for proteins with the use of the available structural database and perform three benchmark tests of the performance: (1) recognition of native structures in the background of predefined decoy sets of Levitt, (2) de novo structure prediction using fragment assembly sampling, and (3) molecular dynamics simulations. The energy parameter optimization is based on the energy landscape theory and uses a Monte Carlo search to find a set of parameters that seeks the largest ratio δE_s/ΔE for all proteins in a training set simultaneously. Here, δE_s is the stability gap between the native and the average in the denatured states and ΔE is the energy fluctuation among these states. Some of the energy parameters optimized are found to show significant correlation with experimentally observed quantities: (1) In the recognition test, the optimized function assigns the lowest energy to either the native or a near-native structure among many decoy structures for all the proteins studied. (2) Structure prediction with the fragment assembly sampling gives structure models with root mean square deviation less than 6 Å in one of the top five cluster centers for five of six proteins studied. (3) Structure prediction using molecular dynamics simulation gives poorer performance, implying the importance of having a more precise description of local structures. The physical energy function solely inferred from a structural database neither utilizes sequence information from the family of the target nor the outcome of the secondary structure prediction but can produce the correct native fold for many small proteins. Copyright 2003 Wiley-Liss, Inc.
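The quantity being maximised, the stability gap divided by the energy fluctuation of the denatured (decoy) ensemble, can be sketched for a single protein as follows. This is a minimal illustration under stated assumptions: the gap is taken as mean decoy energy minus native energy (positive when the native state is more stable), the fluctuation as the population standard deviation of decoy energies, and the function name and numbers are hypothetical. The Monte Carlo parameter search itself is not shown.

```python
from statistics import mean, pstdev


def gap_to_fluctuation_ratio(native_energy, decoy_energies):
    """Stability gap over energy fluctuation for one protein.

    gap   = <E_decoy> - E_native        (assumed sign convention)
    spread = standard deviation of the decoy energies
    """
    gap = mean(decoy_energies) - native_energy
    spread = pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies must not all be identical")
    return gap / spread


# Hypothetical energies in arbitrary units.
ratio = gap_to_fluctuation_ratio(native_energy=-10.0, decoy_energies=[-6.0, -2.0])
```

A parameter search in this spirit would re-evaluate such ratios across every protein in the training set after each trial change and keep changes that raise them; how the per-protein ratios are combined is not stated in the abstract.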
Modularity in protein structures: study on all-alpha proteins.
Khan, Taushif; Ghosh, Indira
2015-01-01
Modularity is regarded as one of the most important features of the robust and efficient design of proteins. The architecture and topology of proteins play a vital role by providing the robust scaffolds needed to support an organism's growth and survival under constant evolutionary pressure. These complex biomolecules can be represented by several layers of modular architecture, but it is pivotal to understand and explore the smallest biologically relevant structural component. In the present study, we have developed a component-based method, using a protein's secondary structures and their arrangements (i.e. patterns), in order to investigate its structural space. Our results on all-alpha proteins show that the known structural space is densely populated by a limited set of structural patterns. We have also noticed that these frequently observed structural patterns are present as modules or "building blocks" in large proteins (i.e. higher secondary structure content). From structural descriptor analysis, observed patterns are found to be within similar deviation; however, frequent patterns are found to occur distinctly in diverse functions, e.g. in enzymatic classes and reactions. In this study, we introduce a simple approach to explore protein structural space using combinatorial- and graph-based geometry methods, which can be used to describe modularity in protein structures. Moreover, the analysis indicates that protein function seems to be the driving force that shapes the known structure space.
MacRae, T H
2000-06-01
Small heat shock/alpha-crystallin proteins are defined by a conserved sequence of approximately 90 amino acid residues, termed the alpha-crystallin domain, which is bounded by variable amino- and carboxy-terminal extensions. These proteins form oligomers, most of uncertain quaternary structure, and oligomerization is a prerequisite to their function as molecular chaperones. Sequence modelling and physical analyses show that the secondary structure of small heat shock/alpha-crystallin proteins is predominantly beta-pleated sheet. Crystallography, site-directed spin-labelling and yeast two-hybrid selection demonstrate regions of secondary structure within the alpha-crystallin domain that interact during oligomer assembly, a process also dependent on the amino terminus. Oligomers are dynamic, exhibiting subunit exchange and organizational plasticity, perhaps leading to functional diversity. Exposure of hydrophobic residues by structural modification facilitates chaperoning where denaturing proteins in the molten globule state associate with oligomers. The flexible carboxy-terminal extension contributes to chaperone activity by enhancing the solubility of small heat shock/alpha-crystallin proteins. Site-directed mutagenesis has yielded proteins where the effect of the change on structure and function depends upon the residue modified, the organism under study and the analytical techniques used. Most revealing, substitution of a conserved arginine residue within the alpha-crystallin domain has a major impact on quaternary structure and chaperone action, probably through realignment of beta-sheets. These mutations are linked to inherited diseases. Oligomer size is regulated by a stress-responsive cascade including MAPKAP kinase 2/3 and p38. Phosphorylation of small heat shock/alpha-crystallin proteins has important consequences within stressed cells, especially for microfilaments.
CASTp 3.0: computed atlas of surface topography of proteins.
Tian, Wei; Chen, Chang; Lei, Xue; Zhao, Jieling; Liang, Jie
2018-06-01
Geometric and topological properties of protein structures, including surface pockets, interior cavities and cross channels, are of fundamental importance for proteins to carry out their functions. Computed Atlas of Surface Topography of proteins (CASTp) is a web server that provides online services for locating, delineating and measuring these geometric and topological properties of protein structures. It has been widely used since its inception in 2003. In this article, we present the latest version of the web server, CASTp 3.0. CASTp 3.0 continues to provide reliable and comprehensive identifications and quantifications of protein topography. In addition, it now provides: (i) imprints of the negative volumes of pockets, cavities and channels, (ii) topographic features of biological assemblies in the Protein Data Bank, (iii) improved visualization of protein structures and pockets, and (iv) more intuitive structural and annotated information, including information of secondary structure, functional sites, variant sites and other annotations of protein residues. The CASTp 3.0 web server is freely accessible at http://sts.bioe.uic.edu/castp/.
Kim, Sanggil; Ko, Wooseok; Sung, Bong Hyun; Kim, Sun Chang; Lee, Hyun Soo
2016-11-15
Proteins often function as complex structures in conjunction with other proteins. Because these complex structures are essential for sophisticated functions, developing protein-protein conjugates has gained research interest. In this study, site-specific protein-protein conjugation was performed by genetically incorporating an azide-containing amino acid into one protein and a bicyclononyne (BCN)-containing amino acid into the other. Three to four sites in each of the proteins were tested for conjugation efficiency, and three combinations showed excellent conjugation efficiency. The genetic incorporation of unnatural amino acids (UAAs) is technically simple and produces the mutant protein in high yield. In addition, the conjugation reaction can be conducted by simple mixing, and does not require additional reagents or linker molecules. Therefore, this method may prove very useful for generating protein-protein conjugates and protein complexes of biochemical significance. Copyright © 2016. Published by Elsevier Ltd.
Challenges in NMR-based structural genomics
NASA Astrophysics Data System (ADS)
Sue, Shih-Che; Chang, Chi-Fon; Huang, Yao-Te; Chou, Ching-Yu; Huang, Tai-huang
2005-05-01
Understanding the functions of the vast number of proteins encoded in many genomes that have been completely sequenced recently is the main challenge for biologists in the post-genomics era. Since the function of a protein is determined by its exact three-dimensional structure it is paramount to determine the 3D structures of all proteins. This need has driven structural biologists to undertake the structural genomics project aimed at determining the structures of all known proteins. Several centers for structural genomics studies have been established throughout the world. Nuclear magnetic resonance (NMR) spectroscopy has played a major role in determining protein structures in atomic details and in a physiologically relevant solution state. Since the number of new genes being discovered daily far exceeds the number of structures determined by both NMR and X-ray crystallography, a high-throughput method for speeding up the process of protein structure determination is essential for the success of the structural genomics effort. In this article we will describe NMR methods currently being employed for protein structure determination. We will also describe methods under development which may drastically increase the throughput, as well as point out areas where opportunities exist for biophysicists to make significant contribution in this important field.
Khafizov, Kamil; Madrid-Aliste, Carlos; Almo, Steven C.; Fiser, Andras
2014-01-01
The exponential growth of protein sequence data provides an ever-expanding body of unannotated and misannotated proteins. The National Institutes of Health-supported Protein Structure Initiative and related worldwide structural genomics efforts facilitate functional annotation of proteins through structural characterization. Recently there have been profound changes in the taxonomic composition of sequence databases, which are effectively redefining the scope and contribution of these large-scale structure-based efforts. The faster-growing bacterial genomic entries have overtaken the eukaryotic entries over the last 5 y, but also have become more redundant. Despite the enormous increase in the number of sequences, the overall structural coverage of proteins—including proteins for which reliable homology models can be generated—on the residue level has increased from 30% to 40% over the last 10 y. Structural genomics efforts contributed ∼50% of this new structural coverage, despite determining only ∼10% of all new structures. Based on current trends, it is expected that ∼55% structural coverage (the level required for significant functional insight) will be achieved within 15 y, whereas without structural genomics efforts, realizing this goal will take approximately twice as long. PMID:24567391
Thermostability promotes the cooperative function of split adenylate kinases.
Nguyen, Peter Q; Liu, Shirley; Thompson, Jeremy C; Silberg, Jonathan J
2008-05-01
Proteins can often be cleaved to create inactive polypeptides that associate into functional complexes through non-covalent interactions, but little is known about what influences the cooperative function of the ensuing protein fragments. Here, we examine whether protein thermostability affects protein fragment complementation by characterizing the function of split adenylate kinases from the mesophile Bacillus subtilis (AKBs) and the hyperthermophile Thermotoga neapolitana (AKTn). Complementation studies revealed that the split AKTn supported the growth of Escherichia coli with a temperature-sensitive AK, but not the fragmented AKBs. However, weak complementation occurred when the AKBs fragments were fused to polypeptides that strongly associate, and this was enhanced by a Q16L mutation that thermostabilizes the full-length protein. To examine how the split AK homologs differ in structure and function, their catalytic activity, zinc content, and circular dichroism spectra were characterized. The reconstituted AKTn had higher levels of zinc, greater secondary structure, and >10^3-fold more activity than the AKBs pair, albeit 17-fold less active than full-length AKTn. These findings provide evidence that the design of protein fragments that cooperatively function can be improved by choosing proteins with the greatest thermostability for bisection, and they suggest that this arises because hyperthermophilic protein fragments exhibit greater residual structure compared to their mesophilic counterparts.
Network Analysis of Protein Adaptation: Modeling the Functional Impact of Multiple Mutations
Beleva Guthrie, Violeta; Masica, David L; Fraser, Andrew; Federico, Joseph; Fan, Yunfan; Camps, Manel; Karchin, Rachel
2018-01-01
The evolution of new biochemical activities frequently involves complex dependencies between mutations and rapid evolutionary radiation. Mutation co-occurrence and covariation have previously been used to identify compensating mutations that are the result of physical contacts and preserve protein function and fold. Here, we model pairwise functional dependencies and higher order interactions that enable evolution of new protein functions. We use a network model to find complex dependencies between mutations resulting from evolutionary trade-offs and pleiotropic effects. We present a method to construct these networks and to identify functionally interacting mutations in both extant and reconstructed ancestral sequences (Network Analysis of Protein Adaptation, NAPA). The time ordering of mutations can be incorporated into the networks through phylogenetic reconstruction. We apply NAPA to three distantly homologous β-lactamase protein clusters (TEM, CTX-M-3, and OXA-51), each of which has experienced recent evolutionary radiation under substantially different selective pressures. By analyzing the network properties of each protein cluster, we identify key adaptive mutations, positive pairwise interactions, different adaptive solutions to the same selective pressure, and complex evolutionary trajectories likely to increase protein fitness. We also present evidence that incorporating information from phylogenetic reconstruction and ancestral sequence inference can reduce the number of spurious links in the network while preserving overall network community structure. The analysis does not require structural or biochemical data. In contrast to function-preserving mutation dependencies, which are frequently from structural contacts, gain-of-function mutation dependencies are most commonly between residues distal in protein structure. PMID:29522102
Agrawal, Neeraj J; Helk, Bernhard; Trout, Bernhardt L
2014-01-21
Identifying hot-spot residues - residues that are critical to protein-protein binding - can help to elucidate a protein's function and assist in designing therapeutic molecules to target those residues. We present a novel computational tool, termed spatial-interaction-map (SIM), to predict the hot-spot residues of an evolutionarily conserved protein-protein interaction from the structure of an unbound protein alone. SIM can predict the protein hot-spot residues with an accuracy of 36-57%. Thus, the SIM tool can be used to predict the as yet unknown hot-spot residues for many proteins for which the structures of the protein-protein complexes are not available, thereby providing a clue to their functions and an opportunity to design therapeutic molecules to target these proteins. Copyright © 2013 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Sequential Release of Proteins from Structured Multishell Microcapsules.
Shimanovich, Ulyana; Michaels, Thomas C T; De Genst, Erwin; Matak-Vinkovic, Dijana; Dobson, Christopher M; Knowles, Tuomas P J
2017-10-09
In nature, a wide range of functional materials is based on proteins. Increasing attention is also turning to the use of proteins as artificial biomaterials in the form of films, gels, particles, and fibrils that offer great potential for applications in areas ranging from molecular medicine to materials science. To date, however, most such applications have been limited to single component materials despite the fact that their natural analogues are composed of multiple types of proteins with a variety of functionalities that are coassembled in a highly organized manner on the micrometer scale, a process that is currently challenging to achieve in the laboratory. Here, we demonstrate the fabrication of multicomponent protein microcapsules where the different components are positioned in a controlled manner. We use molecular self-assembly to generate multicomponent structures on the nanometer scale and droplet microfluidics to bring together the different components on the micrometer scale. Using this approach, we synthesize a wide range of multiprotein microcapsules containing three well-characterized proteins: glucagon, insulin, and lysozyme. The localization of each protein component in multishell microcapsules has been detected by labeling protein molecules with different fluorophores, and the final three-dimensional microcapsule structure has been resolved by using confocal microscopy together with image analysis techniques. In addition, we show that these structures can be used to tailor the release of such functional proteins in a sequential manner. Moreover, our observations demonstrate that the protein release mechanism from multishell capsules is driven by the kinetic control of mass transport of the cargo and by the dissolution of the shells. 
The ability to generate artificial materials that incorporate a variety of different proteins with distinct functionalities increases the breadth of the potential applications of artificial protein-based materials and provides opportunities to design more refined functional protein delivery systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leite, Wellington C.; Galvão, Carolina W.; Saab, Sérgio C.
The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three-dimensional structure of HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, that are similar to homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins is also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. In conclusion, our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament.
Galvão, Carolina W.; Saab, Sérgio C.; Iulek, Jorge; Etto, Rafael M.; Steffens, Maria B. R.; Chitteni-Pattu, Sindhu; Stanage, Tyler; Keck, James L.; Cox, Michael M.
2016-01-01
The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three dimensional structure of HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, that are similar to homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins are also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. Our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament. PMID:27447485
Native State Volume Fluctuations in Proteins as a Mechanism for Dynamic Allostery.
Law, Anthony B; Sapienza, Paul J; Zhang, Jun; Zuo, Xiaobing; Petit, Chad M
2017-03-15
Allostery enables tight regulation of protein function in the cellular environment. Although existing models of allostery are firmly rooted in the current structure-function paradigm, the mechanistic basis for allostery in the absence of structural change remains unclear. In this study, we show that a typical globular protein is able to undergo significant changes in volume under native conditions while exhibiting no additional changes in protein structure. These native state volume fluctuations were found to correlate with changes in internal motions that were previously recognized as a source of allosteric entropy. This finding offers a novel mechanistic basis for allostery in the absence of canonical structural change. The unexpected observation that function can be derived from expanded, low density protein states has broad implications for our understanding of allostery and suggests that the general concept of the native state be expanded to allow for more variable physical dimensions with looser packing.
Taylor, Gregory K.; Stoddard, Barry L.
2012-01-01
Homing endonucleases (HEs) are highly specific DNA-cleaving enzymes that are encoded by invasive DNA elements (usually mobile introns or inteins) within the genomes of phage, bacteria, archaea, protista and eukaryotic organelles. Six unique structural HE families, which collectively span four distinct nuclease catalytic motifs, have been characterized to date. Members of each family display structural homology and functional relationships to a wide variety of proteins from various organisms. The biological functions of those proteins are highly disparate and include non-specific DNA-degradation enzymes, restriction endonucleases, DNA-repair enzymes, resolvases, intron splicing factors and transcription factors. These relationships suggest that modern day HEs share common ancestors with proteins involved in genome fidelity, maintenance and gene expression. This review summarizes the results of structural studies of HEs and corresponding proteins from host organisms that have illustrated the manner in which these factors are related. PMID:22406833
Fast protein tertiary structure retrieval based on global surface shape similarity.
Sael, Lee; Li, Bin; La, David; Fang, Yi; Ramani, Karthik; Rustamov, Raif; Kihara, Daisuke
2008-09-01
Characterization and identification of similar tertiary structure of proteins provide rich information for investigating function and evolution. The importance of structure similarity searches is increasing as structure databases continue to expand, partly due to the structural genomics projects. A crucial drawback of conventional protein structure comparison methods, which compare structures by their main-chain orientation or the spatial arrangement of secondary structure, is that a database search is too slow to be done in real-time. Here we introduce a global surface shape representation by three-dimensional (3D) Zernike descriptors, which represent a protein structure compactly as a series expansion of 3D functions. With this simplified representation, a search against a few thousand structures takes less than a minute. To investigate the agreement between the surface representation defined by the 3D Zernike descriptor and the conventional main-chain based representation, a benchmark was performed against a protein classification generated by the combinatorial extension algorithm. Despite the different representation, the 3D Zernike descriptor retrieved proteins of the same conformation defined by combinatorial extension in 89.6% of the cases within the top five closest structures. The real-time protein structure search by 3D Zernike descriptor will open up new possibilities for large-scale global and local protein surface shape comparison. © 2008 Wiley-Liss, Inc.
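The retrieval step described in this abstract (rank database structures by descriptor similarity and inspect the closest five) can be illustrated with a minimal sketch. It assumes each structure has already been reduced to a fixed-length numeric descriptor vector and that vectors are compared by Euclidean distance; the function names and the toy three-element vectors are hypothetical and are not taken from the authors' software.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(u) != len(v):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def top_k(query, database, k=5):
    """Return the names of the k database entries closest to the query, nearest first."""
    ranked = sorted(database.items(), key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical toy descriptors; real 3D Zernike descriptors are much longer.
database = {
    "protA": [0.10, 0.80, 0.30],
    "protB": [0.12, 0.79, 0.33],
    "protC": [0.90, 0.10, 0.50],
}
query = [0.11, 0.80, 0.31]
print(top_k(query, database, k=2))
```

Because each comparison is a single vector distance rather than a structural alignment, a linear scan over a few thousand entries is cheap, which is consistent with the search times reported in the abstract.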
Zhou, Ren-Bin; Lu, Hui-Meng; Liu, Jie; Shi, Jian-Yu; Zhu, Jing; Lu, Qin-Qin; Yin, Da-Chuan
2016-01-01
Recombinant expression of proteins has become an indispensable tool in modern day research. The large yields of recombinantly expressed proteins accelerate the structural and functional characterization of proteins. Nevertheless, there are reports in the literature that recombinant proteins show some differences in structure and function compared with the native ones. More than 100,000 structures (from both recombinant and native sources) are now publicly available in the Protein Data Bank (PDB) archive, which makes it possible to investigate whether any proteins in the RCSB PDB archive have identical sequences but differ in structure. In this paper, we present the results of a systematic comparative study of the 3D structures of identical naturally purified versus recombinantly expressed proteins. The structural data and sequence information of the proteins were mined from the RCSB PDB archive. The combinatorial extension (CE), FATCAT-flexible and TM-Align methods were employed to align the protein structures. The root-mean-square distance (RMSD), TM-score, P-value, Z-score, secondary structural elements and hydrogen bonds were used to assess the structure similarity. A thorough analysis of the PDB archive generated five hundred seventeen pairs of native and recombinant proteins that have identical sequences. There were no pairs of proteins that had the same sequence and a significantly different structural fold, which supports the hypothesis that proteins expressed in a heterologous host usually fold correctly into their native forms. PMID:27517583
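Two of the similarity measures named in this abstract, RMSD and TM-score, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the coordinates are assumed to be already superposed and paired residue by residue, the TM-score is evaluated for that one fixed superposition (the published score is the maximum over superpositions), and the lower bound of 0.5 Å on the distance scale d0 is an assumption added here to keep short chains well behaved.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two already-superposed, residue-paired coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def tm_score(distances, target_length):
    """TM-score of one fixed superposition.

    distances: distances in angstroms between aligned residue pairs.
    target_length: number of residues in the reference (target) protein.
    """
    if target_length <= 15:
        raise ValueError("the d0 formula is only defined for chains longer than 15")
    # Length-dependent distance scale; the 0.5 floor is an assumption of this sketch.
    d0 = max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# Hypothetical example: a 100-residue target with 50 perfectly aligned residues.
print(tm_score([0.0] * 50, 100))
```

Unlike RMSD, the TM-score is normalised by the target length and down-weights large deviations, so a few badly placed residues do not dominate the result.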
Liao, Fei; Yuan, Hong; Du, Ke-Jie; You, Yong; Gao, Shu-Qin; Wen, Ge-Bo; Lin, Ying-Wu; Tan, Xiangshi
2016-10-20
A hydrogen-bond (H-bond) network, specifically a Tyr-associated H-bond network, plays key roles in regulating the structure and function of proteins, as exemplified by abundant heme proteins in nature. To explore an approach for fine-tuning the structure and function of artificial heme proteins, we herein used myoglobin (Mb) as a model protein and introduced a Tyr residue in the secondary sphere of the heme active site at two different positions (107 and 138). We performed X-ray crystallography, UV-Vis spectroscopy, stopped-flow kinetics, and electron paramagnetic resonance (EPR) studies for the two single mutants, I107Y Mb and F138Y Mb, and compared the results with those for wild-type Mb under the same conditions. The results showed that both Tyr107 and Tyr138 form a distinct H-bond network involving water molecules and neighboring residues, which fine-tunes ligand binding to the heme iron and enhances the protein stability, respectively. Moreover, the Tyr107-associated H-bond network was shown to fine-tune both H2O2 binding and activation. With two cases demonstrated for Mb, this study suggests that the Tyr-associated H-bond network has distinct roles in regulating the protein structure, properties and functions, depending on its location in the protein scaffold. Therefore, it is possible to design a Tyr-associated H-bond network in general to create other artificial heme proteins with improved properties and functions.
Protein Delivery into Plant Cells: Toward In vivo Structural Biology
Cedeño, Cesyen; Pauwels, Kris; Tompa, Peter
2017-01-01
Understanding the biologically relevant structural and functional behavior of proteins inside living plant cells is only possible through the combination of structural biology and cell biology. The state-of-the-art structural biology techniques are typically applied to molecules that are isolated from their native context. Although most experimental conditions can be easily controlled while dealing with an isolated, purified protein, a serious shortcoming of such in vitro work is that we cannot mimic the extremely complex intracellular environment in which the protein exists and functions. Therefore, it is highly desirable to investigate proteins in their natural habitat, i.e., within live cells. This is the major ambition of in-cell NMR, which aims to approach structure-function relationship under true in vivo conditions following delivery of labeled proteins into cells under physiological conditions. With a multidisciplinary approach that includes recombinant protein production, confocal fluorescence microscopy, nuclear magnetic resonance (NMR) spectroscopy and different intracellular protein delivery strategies, we explore the possibility to develop in-cell NMR studies in living plant cells. While we provide a comprehensive framework to set-up in-cell NMR, we identified the efficient intracellular introduction of isotope-labeled proteins as the major bottleneck. Based on experiments with the paradigmatic intrinsically disordered proteins (IDPs) Early Response to Dehydration protein 10 and 14, we also established the subcellular localization of ERD14 under abiotic stress. PMID:28469623
Tools to evaluate the conformation of protein products.
Manta, Bruno; Obal, Gonzalo; Ricciardi, Alejandro; Pritsch, Otto; Denicola, Ana
2011-06-01
Production of recombinant proteins is a process intensively used in the research laboratory. In addition, the main biotechnology market products are recombinant proteins and monoclonal antibodies. The biological (and clinical) properties of the protein product strongly depend on the conformation of the polypeptide. Therefore, assessment of the correct conformation of the produced protein is crucial. There is no single method to assess every aspect of protein structure or function. Depending on the protein, the methods of choice vary. There are general methods to evaluate not only mass and primary sequence of the protein, but also higher-order structure. This review outlines the principal techniques for determining the conformation of a protein from structural (biophysical methods) to functional (in vitro binding assays) analyses. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Goblirsch, Brandon; Kurker, Richard C.; Streit, Bennett R.; Wilmot, Carrie M.; DuBois, Jennifer L.
2011-01-01
Heme proteins are extremely diverse, widespread, and versatile biocatalysts, sensors, and molecular transporters. The chlorite dismutase family of hemoproteins received its name due to the ability of the first-isolated members to detoxify anthropogenic ClO2−, a function believed to have evolved only in the last few decades. Family members have since been found in fifteen bacterial and archaeal genera, suggesting ancient roots. A structure- and sequence-based examination of the family is presented, in which key sequence and structural motifs are identified and possible functions for family proteins are proposed. Newly identified structural homologies moreover demonstrate clear connections to two other large, ancient, and functionally mysterious protein families. We propose calling them collectively the CDE superfamily of heme proteins. PMID:21354424
Protein–DNA Interactions: The Story so Far and a New Method for Prediction
Jones, Susan; Thornton, Janet M.
2003-01-01
This review describes methods for the prediction of DNA binding function, and specifically summarizes a new method using 3D structural templates. The new method features the HTH motif that is found in approximately one-third of DNA-binding protein families. A library of 3D structural templates of HTH motifs was derived from proteins in the PDB. Templates were scanned against complete protein structures and the optimal superposition of a template on a structure calculated. Significance thresholds in terms of a minimum root mean squared deviation (rmsd) of an optimal superposition, and a minimum motif accessible surface area (ASA), have been calculated. In this way, it is possible to scan the template library against proteins of unknown function to make predictions about DNA-binding functionality.
T-RMSD: a web server for automated fine-grained protein structural classification.
Magis, Cedrik; Di Tommaso, Paolo; Notredame, Cedric
2013-07-01
This article introduces the T-RMSD web server (tree-based on root-mean-square deviation), a service allowing the online computation of structure-based protein classification. It has been developed to address the relation between structural and functional similarity in proteins, and it allows a fine-grained structural clustering of a given protein family or group of structurally related proteins using distance RMSD (dRMSD) variations. These distances are computed between all pairs of equivalent residues, as defined by the ungapped columns within a given multiple sequence alignment. Using these generated distance matrices (one per equivalent position), T-RMSD produces a structural tree with support values for each cluster node, reminiscent of bootstrap values. These values, associated with the tree topology, allow a quantitative estimate of structural distances between proteins or group of proteins defined by the tree topology. The clusters thus defined have been shown to be structurally and functionally informative. The T-RMSD web server is a free website open to all users and available at http://tcoffee.crg.cat/apps/tcoffee/do:trmsd. PMID:23716642
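The distance RMSD (dRMSD) mentioned in this abstract compares intramolecular distances rather than superposed coordinates, so no structural superposition is needed. The sketch below uses the textbook dRMSD definition over all residue pairs; it assumes the two coordinate lists already contain only equivalent residues (for example, the ungapped alignment columns) in the same order, and it is not a reimplementation of the T-RMSD server, which builds one distance matrix per position and derives a tree from them.

```python
import math
from itertools import combinations

def drmsd(coords_a, coords_b):
    """Distance RMSD between two structures given as lists of (x, y, z) tuples.

    Entry i of each list is assumed to be the same equivalent residue.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of residues")
    if len(coords_a) < 2:
        raise ValueError("at least two residues are required")
    pairs = list(combinations(range(len(coords_a)), 2))
    total = sum(
        (math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j])) ** 2
        for i, j in pairs
    )
    return math.sqrt(total / len(pairs))

# Hypothetical three-residue toy structures.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
b = [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (5.0, 6.0, 5.0)]  # a translated by (5, 5, 5)
c = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0)]  # one residue displaced
print(drmsd(a, b), drmsd(a, c))
```

A rigid translation or rotation leaves every internal distance unchanged, so the first value is zero up to floating-point error, while the displaced residue in `c` gives a positive value.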
Allostery in the ferredoxin protein motif does not involve a conformational switch.
Nechushtai, Rachel; Lammert, Heiko; Michaeli, Dorit; Eisenberg-Domovich, Yael; Zuris, John A; Luca, Maria A; Capraro, Dominique T; Fish, Alex; Shimshon, Odelia; Roy, Melinda; Schug, Alexander; Whitford, Paul C; Livnah, Oded; Onuchic, José N; Jennings, Patricia A
2011-02-08
Regulation of protein function via cracking, or local unfolding and refolding of substructures, is becoming a widely recognized mechanism of functional control. Oftentimes, cracking events are localized to secondary and tertiary structure interactions between domains that control the optimal position for catalysis and/or the formation of protein complexes. Small changes in free energy associated with ligand binding, phosphorylation, etc., can tip the balance and provide a regulatory functional switch. However, understanding the factors controlling function in single-domain proteins is still a significant challenge to structural biologists. We investigated the functional landscape of a single-domain plant-type ferredoxin protein and the effect of a distal loop on the electron-transfer center. We find the global stability and structure are minimally perturbed with mutation, whereas the functional properties are altered. Specifically, truncating the L1,2 loop does not lead to large-scale changes in the structure, determined via X-ray crystallography. Further, the overall thermal stability of the protein is only marginally perturbed by the mutation. However, even though the mutation is distal to the iron-sulfur cluster (∼20 Å), it leads to a significant change in the redox potential of the iron-sulfur cluster (57 mV). Structure-based all-atom simulations indicate correlated dynamical changes between the surface-exposed loop and the iron-sulfur cluster-binding region. Our results suggest intrinsic communication channels within the ferredoxin fold, composed of many short-range interactions, lead to the propagation of long-range signals. Accordingly, protein interface interactions that involve L1,2 could potentially signal functional changes in distal regions, similar to what is observed in other allosteric systems.
Ukleja, Marta; Valpuesta, José María; Dziembowski, Andrzej; Cuellar, Jorge
2016-10-01
Large protein assemblies are usually the effectors of major cellular processes. The intricate cell homeostasis network is divided into numerous interconnected pathways, each controlled by a set of protein machines. One of these master regulators is the CCR4-NOT complex, which ultimately controls protein expression levels. This multisubunit complex assembles around a scaffold platform, which enables a wide variety of well-studied functions from mRNA synthesis to transcript decay, as well as other tasks still being identified. Solving the structure of the entire CCR4-NOT complex will help to define the distribution of its functions. The recently published three-dimensional reconstruction of the complex, in combination with the known crystal structures of some of the components, has begun to address this. Methodological improvements in structural biology, especially in cryoelectron microscopy, encourage further structural and protein-protein interaction studies, which will advance our comprehension of the gene expression machinery. © 2016 WILEY Periodicals, Inc.
Self-assembly in the ferritin nano-cage protein superfamily.
Zhang, Yu; Orner, Brendan P
2011-01-01
Protein self-assembly, through specific, high affinity, and geometrically constraining protein-protein interactions, can control and lead to complex cellular nano-structures. Establishing an understanding of the underlying principles that govern protein self-assembly is not only essential to appreciate the fundamental biological functions of these structures, but could also provide a basis for their enhancement for nano-material applications. The ferritins are a superfamily of well studied proteins that self-assemble into hollow cage-like structures which are ubiquitously found in both prokaryotes and eukaryotes. Structural studies have revealed that many members of the ferritin family can self-assemble into nano-cages of two types. Maxi-ferritins form hollow spheres with octahedral symmetry composed of twenty-four monomers. Mini-ferritins, on the other hand, are tetrahedrally symmetric, hollow assemblies composed of twelve monomers. This review will focus on the structure of members of the ferritin superfamily, the mechanism of ferritin self-assembly and the structure-function relations of these proteins.
De Novo Protein Structure Prediction
NASA Astrophysics Data System (ADS)
Hung, Ling-Hong; Ngan, Shing-Chung; Samudrala, Ram
An unparalleled amount of sequence data is being made available from large-scale genome sequencing efforts. The data provide a shortcut to the determination of the function of a gene of interest, as long as there is an existing sequenced gene with similar sequence and of known function. This has spurred structural genomic initiatives with the goal of determining as many protein folds as possible (Brenner and Levitt, 2000; Burley, 2000; Brenner, 2001; Heinemann et al., 2001). The purpose of this is twofold: First, the structure of a gene product can often lead to direct inference of its function. Second, since the function of a protein is dependent on its structure, direct comparison of the structures of gene products can be more sensitive than the comparison of sequences of genes for detecting homology. Presently, structural determination by crystallography and NMR techniques is still slow and expensive in terms of manpower and resources, despite attempts to automate the processes. Computer structure prediction algorithms, while not providing the accuracy of the traditional techniques, are extremely quick and inexpensive and can provide useful low-resolution data for structure comparisons (Bonneau and Baker, 2001). Given the immense number of structures which the structural genomic projects are attempting to solve, there would be a considerable gain even if the computer structure prediction approach were applicable to a subset of proteins.
Ogawa, Seiji; Watanabe, Toshihide; Moriyuki, Kazumi; Goto, Yoshikazu; Yamane, Shinsaku; Watanabe, Akio; Tsuboi, Kazuma; Kinoshita, Atsushi; Okada, Takuya; Takeda, Hiroyuki; Tani, Kousuke; Maruyama, Toru
2016-05-15
The modification of the novel G protein-biased EP2 agonist 1 has been investigated to improve its G protein activity and develop a better understanding of its structure-functional selectivity relationship (SFSR). The optimization of the substituents on the phenyl ring of 1, followed by the inversion of the hydroxyl group on the cyclopentane moiety led to compound 9, which showed a 100-fold increase in its G protein activity compared with 1 without any increase in β-arrestin recruitment. Furthermore, SFSR studies revealed that the combination of meta and para substituents on the phenyl moiety was crucial to the functional selectivity. Copyright © 2016 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Ray, Gigi B.; Cook, J. Whitney
2005-01-01
A biochemical molecular modeling project on heme proteins suitable for an introductory Biochemistry I class has been designed with a 2-fold objective: i) to reinforce the correlation between protein three-dimensional structure and function through a discovery oriented project, and ii) to introduce students to the fields of bioinorganic and…
Structure refinement of membrane proteins via molecular dynamics simulations.
Dutagaci, Bercem; Heo, Lim; Feig, Michael
2018-07-01
A refinement protocol based on physics-based techniques established for water-soluble proteins is tested for membrane protein structures. Initial structures were generated by homology modeling and sampled via molecular dynamics simulations in explicit lipid bilayer and aqueous solvent systems. Snapshots from the simulations were selected based on scoring with either knowledge-based or implicit membrane-based scoring functions and averaged to obtain refined models. The protocol resulted in consistent and significant refinement of the membrane protein structures, similar to the performance of refinement methods for soluble proteins. Refinement success was similar between sampling in the presence of lipid bilayers and aqueous solvent, but the presence of lipid bilayers may benefit the improvement of lipid-facing residues. Scoring with knowledge-based functions (DFIRE and RWplus) was found to be as good as scoring using implicit membrane-based scoring functions, suggesting that differences in internal packing are more important than orientations relative to the membrane during the refinement of membrane protein homology models. © 2018 Wiley Periodicals, Inc.
Diab, Ahmed; Foca, Adrien; Zoulim, Fabien; Durantel, David; Andrisani, Ourania
2018-01-01
Virally encoded proteins have evolved to perform multiple functions, and the core protein (HBc) of the hepatitis B virus (HBV) is a perfect example. While HBc is the structural component of the viral nucleocapsid, additional novel functions for the nucleus-localized HBc have recently been described. These results extend for HBc, beyond its structural role, a regulatory function in the viral life cycle and potentially a role in pathogenesis. In this article, we review the diverse roles of HBc in HBV replication and pathogenesis, emphasizing how the unique structure of this protein is key to its various functions. We focus in particular on recent advances in understanding the significance of HBc phosphorylations, its interaction with host proteins and the role of HBc in regulating the transcription of host genes. We also briefly allude to the emerging niche for new direct-acting antivirals targeting HBc, known as Core (protein) Allosteric Modulators (CAMs). Copyright © 2017 Elsevier B.V. All rights reserved.
Crystal Structure of a Plant Multidrug and Toxic Compound Extrusion Family Protein.
Tanaka, Yoshiki; Iwaki, Shigehiro; Tsukazaki, Tomoya
2017-09-05
The multidrug and toxic compound extrusion (MATE) family of proteins consists of transporters responsible for multidrug resistance in prokaryotes. In plants, a number of MATE proteins were identified by recent genomic and functional studies, which imply that the proteins have substrate-specific transport functions instead of multidrug extrusion. The three-dimensional structure of eukaryotic MATE proteins, including those of plants, has not been reported, preventing a better understanding of the molecular mechanism of these proteins. Here, we describe the crystal structure of a MATE protein from the plant Camelina sativa at 2.9 Å resolution. Two sets of six transmembrane α helices, assembled pseudo-symmetrically, possess a negatively charged internal pocket with an outward-facing shape. The crystal structure provides insight into the diversity of plant MATE proteins and their substrate recognition and transport through the membrane. Copyright © 2017 Elsevier Ltd. All rights reserved.
Columba: an integrated database of proteins, structures, and annotations.
Trissl, Silke; Rother, Kristian; Müller, Heiko; Steinke, Thomas; Koch, Ina; Preissner, Robert; Frömmel, Cornelius; Leser, Ulf
2005-03-31
Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures. COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable extensible markup language and human-readable format. The structures themselves can be viewed interactively on the web. The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows queries to be combined across a number of structure-related databases not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at http://www.columba-db.de.
Protein functional features are reflected in the patterns of mRNA translation speed.
López, Daniel; Pazos, Florencio
2015-07-09
The degeneracy of the genetic code makes it possible for the same amino acid string to be coded by different messenger RNA (mRNA) sequences. These "synonymous mRNAs" may differ largely in a number of aspects related to their overall translational efficiency, such as secondary structure content and availability of the encoded transfer RNAs (tRNAs). Consequently, they may render different yields of the translated polypeptides. These mRNA features related to translation efficiency also play a role locally, resulting in a non-uniform translation speed along the mRNA, which has been previously related to some protein structural features and also used to explain some dramatic effects of "silent" single-nucleotide polymorphisms (SNPs). In this work we perform the first large-scale analysis of the relationship between three experimental proxies of mRNA local translation efficiency and the local features of the corresponding encoded proteins. We found that a number of protein functional and structural features are reflected in the patterns of ribosome occupancy, secondary structure and tRNA availability along the mRNA. One or more of these proxies of translation speed have distinctive patterns around the mRNA regions coding for certain protein local features. In some cases the three patterns follow a similar trend. We also show specific examples where these patterns of translation speed point to the protein's important structural and functional features. This supports the idea that the genome not only codes the protein functional features as sequences of amino acids, but also as subtle patterns of mRNA properties which, probably through local effects on the translation speed, have some consequence on the final polypeptide. These results open the possibility of predicting a protein's functional regions based on a single genomic sequence, and have implications for heterologous protein expression and fine-tuning protein function.
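The scale of the degeneracy described at the start of this abstract can be made concrete by counting how many distinct coding sequences translate to the same peptide. The sketch below uses the codon counts of the standard genetic code and ignores the stop codon; the function name is hypothetical and the calculation is an illustration, not part of the authors' analysis.

```python
import math

# Number of codons per amino acid in the standard genetic code (one-letter codes).
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}

def count_synonymous_mrnas(peptide):
    """Number of distinct coding sequences (stop codon excluded) for a peptide."""
    return math.prod(CODON_DEGENERACY[aa] for aa in peptide.upper())

print(count_synonymous_mrnas("MKL"))  # 1 * 2 * 6 = 12
```

The count grows multiplicatively with peptide length, so even a short protein has an enormous space of synonymous mRNAs that can differ in local translation speed.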
Combs, Steven A; Mueller, Benjamin K; Meiler, Jens
2018-05-29
Partial covalent interactions (PCIs) in proteins, which include hydrogen bonds, salt bridges, cation-π, and π-π interactions, contribute to thermodynamic stability and facilitate interactions with other biomolecules. Several score functions have been developed within the Rosetta protein modeling framework that identify and evaluate these PCIs through analyzing the geometry between participating atoms. However, we hypothesize that PCIs can be unified through a simplified electron orbital representation. To test this hypothesis, we have introduced orbital-based chemical descriptors for PCIs into Rosetta, called the PCI score function. Optimal geometries for the PCIs are derived from a statistical analysis of high-quality protein structures obtained from the Protein Data Bank (PDB), and the relative orientation of electron-deficient hydrogen atoms and electron-rich lone pair or π orbitals are evaluated. We demonstrate that native-like geometries of hydrogen bonds, salt bridges, cation-π, and π-π interactions are recapitulated during minimization of protein conformation. The packing density of the tested protein structures increased from 0.62 with the standard score function to 0.64 with the PCI score function, closer to the native value of 0.70. Overall, rotamer recovery improved when using the PCI score function (75%) as compared to the standard Rosetta score function (74%). The PCI score function represents an improvement over the standard Rosetta score function for protein model scoring; in addition, it provides a platform for future directions in the analysis of small molecule to protein interactions, which depend on partial covalent interactions.
Binding Ligand Prediction for Proteins Using Partial Matching of Local Surface Patches
Sael, Lee; Kihara, Daisuke
2010-01-01
Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group. PMID:21614188
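The weighted bipartite matching step described in this abstract can be sketched as an assignment problem: each surface patch of the smaller pocket is paired with a distinct patch of the larger pocket so that the summed dissimilarity is minimal. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function name `best_patch_matching` is hypothetical, the distance matrix stands in for distances between 3D Zernike descriptors, and a brute-force search is used for clarity (a Hungarian-style solver would be the usual choice for realistic sizes).

```python
from itertools import permutations


def best_patch_matching(dist):
    """Minimum-cost matching of patches between two pockets.

    dist[i][j] is an assumed dissimilarity between patch i of pocket A and
    patch j of pocket B (for example, a distance between descriptor vectors).
    Assumes pocket A has no more patches than pocket B.
    Returns (total_cost, [(i, j), ...]).
    """
    n_a, n_b = len(dist), len(dist[0])
    best_cost, best_pairs = float("inf"), []
    # Try every way of assigning distinct B patches to the A patches.
    for cols in permutations(range(n_b), n_a):
        cost = sum(dist[i][j] for i, j in enumerate(cols))
        if cost < best_cost:
            best_cost, best_pairs = cost, list(enumerate(cols))
    return best_cost, best_pairs


if __name__ == "__main__":
    # Toy example: 2 patches in pocket A, 3 patches in pocket B.
    toy = [[1.0, 4.0, 5.0],
           [3.0, 0.5, 6.0]]
    print(best_patch_matching(toy))
```

Leaving some patches of the larger pocket unmatched is what makes the comparison partial: two pockets can score as similar even when only part of their surfaces correspond.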
Jensen, Malene Ringkjøbing; Bernadó, Pau; Houben, Klaartje; Blanchard, Laurence; Marion, Dominque; Ruigrok, Rob W H; Blackledge, Martin
2010-08-01
Intrinsically disordered regions of significant length are present throughout eukaryotic genomes, and are particularly prevalent in viral proteins. Due to their inherent flexibility, these proteins inhabit a conformational landscape that is too complex to be described by classical structural biology. The elucidation of the role that conformational flexibility plays in molecular function will redefine our understanding of the molecular basis of biological function, and the development of appropriate technology to achieve this aim remains one of the major challenges for the future of structural biology. NMR is the technique of choice for studying intrinsically disordered proteins, providing information about structure, flexibility and interactions at atomic resolution even in completely disordered proteins. In particular residual dipolar couplings (RDCs) are sensitive and powerful tools for determining local and long-range structural behaviour in flexible proteins. Here we describe recent applications of the use of RDCs to quantitatively describe the level of local structure in intrinsically disordered proteins involved in replication and transcription in Sendai virus.
ERIC Educational Resources Information Center
Lawrence, Sarah H.; Jaffe, Eileen K.
2008-01-01
A morpheein is a homo-oligomeric protein that can exist as an ensemble of physiologically significant and functionally distinct alternate quaternary assemblies. Morpheeins exist in nature and use conformational equilibria between different tertiary structures to form distinct oligomers as a means of regulating their function. Notably, alternate…
The Prediction of Botulinum Toxin Structure Based on in Silico and in Vitro Analysis
NASA Astrophysics Data System (ADS)
Suzuki, Tomonori; Miyazaki, Satoru
2011-01-01
Many biological systems are mediated through protein-protein interactions. Knowledge of protein-protein complex structures is required for understanding their function. Determining the structure of large, flexible protein-protein complexes experimentally remains difficult, costly and time-consuming; computational prediction of protein structures by homology modeling and docking studies is therefore a valuable method. In addition, MD simulation is one of the most powerful methods for observing the dynamics of proteins. Here, we predict the protein-protein complex structure of botulinum toxin to analyze its properties. These bioinformatics methods are useful for reporting the relation between the flexibility of the backbone structure and the activity.
Protein-protein binding before and after photo-modification of albumin
NASA Astrophysics Data System (ADS)
Rozinek, Sarah C.; Glickman, Randolph D.; Thomas, Robert J.; Brancaleon, Lorenzo
2016-03-01
Bioeffects of directed optical energy encompass a wide range of applications. One aspect of photochemical interactions involves irradiating a photosensitizer with visible light in order to induce protein unfolding and consequent changes in function. In the past, irradiation of several dye-protein combinations has revealed effects on protein structure. Beta-lactoglobulin, human serum albumin (HSA) and tubulin have all been photo-modified with meso-tetrakis(4-sulfonatophenyl)porphyrin (TSPP) bound, but only in the case of tubulin has binding caused a verified loss of biological function (loss of ability to form microtubules) as a result of this light-induced structural change. The current work asks whether the photo-induced structural changes that occur in HSA are sufficient to disable its biological function of binding to osteonectin. The albumin-binding protein, osteonectin, is about half the molecular weight of HSA, so the two proteins and their bound product can be separated and quantified by size-exclusion high-performance liquid chromatography. TSPP was first bound to HSA and irradiated, photo-modifying the structure of HSA. Then native HSA and photo-modified HSA (both with TSPP bound) were compared to assess the loss in HSA's innate binding ability as a result of light-induced structure modification.
Tan, Kemin; Chang, Changsoo; Cuff, Marianne; Osipiuk, Jerzy; Landorf, Elizabeth; Mack, Jamey C; Zerbs, Sarah; Joachimiak, Andrzej; Collart, Frank R
2013-10-01
Lignin comprises 15-25% of plant biomass and represents a major environmental carbon source for utilization by soil microorganisms. Access to this energy resource requires the action of fungal and bacterial enzymes to break down the lignin polymer into a complex assortment of aromatic compounds that can be transported into the cells. To improve our understanding of the utilization of lignin by microorganisms, we characterized the molecular properties of solute binding proteins of ATP-binding cassette transporter proteins that interact with these compounds. A combination of functional screens and structural studies characterized the binding specificity of the solute binding proteins for aromatic compounds derived from lignin such as p-coumarate, 3-phenylpropionic acid and compounds with more complex ring substitutions. A ligand screen based on thermal stabilization identified several binding protein clusters that exhibit preferences based on the size or number of aromatic ring substituents. Multiple X-ray crystal structures of protein-ligand complexes for these clusters identified the molecular basis of the binding specificity for the lignin-derived aromatic compounds. The screens and structural data provide new functional assignments for these solute-binding proteins which can be used to infer their transport specificity. This knowledge of the functional roles and molecular binding specificity of these proteins will support the identification of the specific enzymes and regulatory proteins of peripheral pathways that funnel these compounds to central metabolic pathways and will improve the predictive power of sequence-based functional annotation methods for this family of proteins. Copyright © 2013 Wiley Periodicals, Inc.
Tan, Kemin; Chang, Changsoo; Cuff, Marianne; Osipiuk, Jerzy; Landorf, Elizabeth; Mack, Jamey C.; Zerbs, Sarah; Joachimiak, Andrzej; Collart, Frank R.
2013-01-01
Lignin comprises 15-25% of plant biomass and represents a major environmental carbon source for utilization by soil microorganisms. Access to this energy resource requires the action of fungal and bacterial enzymes to break down the lignin polymer into a complex assortment of aromatic compounds that can be transported into the cells. To improve our understanding of the utilization of lignin by microorganisms, we characterized the molecular properties of solute binding proteins of ATP-binding cassette transporter proteins that interact with these compounds. A combination of functional screens and structural studies characterized the binding specificity of the solute binding proteins for aromatic compounds derived from lignin such as p-coumarate, 3-phenylpropionic acid and compounds with more complex ring substitutions. A ligand screen based on thermal stabilization identified several binding protein clusters that exhibit preferences based on the size or number of aromatic ring substituents. Multiple X-ray crystal structures of protein-ligand complexes for these clusters identified the molecular basis of the binding specificity for the lignin-derived aromatic compounds. The screens and structural data provide new functional assignments for these solute-binding proteins which can be used to infer their transport specificity. This knowledge of the functional roles and molecular binding specificity of these proteins will support the identification of the specific enzymes and regulatory proteins of peripheral pathways that funnel these compounds to central metabolic pathways and will improve the predictive power of sequence-based functional annotation methods for this family of proteins. PMID:23606130
Shield, Alison J; Murray, Tracy P; Board, Philip G
2006-09-08
Mutations in the ganglioside-induced differentiation-associated protein 1 (GDAP1) gene have been linked with Charcot-Marie-Tooth (CMT) disease. This protein, and its paralogue GDAP1L1, appear to be structurally related to the cytosolic glutathione S-transferases (GST), including an N-terminal thioredoxin fold domain with conserved active site residues. The specific function of GDAP1 remains unknown. To further characterise their structure and function we purified recombinant human GDAP1 and GDAP1L1 proteins using bacterial expression and immobilised metal affinity chromatography. Like other cytosolic GSTs, GDAP1 protein has a dimeric structure. Although the full-length proteins were largely insoluble, the deletion of a proposed C-terminal transmembrane domain allowed the preparation of soluble protein. The purified proteins were assayed for glutathione-dependent activity against a library of 'prototypic' GST substrates. No evidence of glutathione-dependent activity or an ability to bind glutathione immobilised on agarose was found.
Protein Bricks: 2D and 3D Bio-Nanostructures with Shape and Function on Demand.
Jiang, Jianjuan; Zhang, Shaoqing; Qian, Zhigang; Qin, Nan; Song, Wenwen; Sun, Long; Zhou, Zhitao; Shi, Zhifeng; Chen, Liang; Li, Xinxin; Mao, Ying; Kaplan, David L; Gilbert Corder, Stephanie N; Chen, Xinzhong; Liu, Mengkun; Omenetto, Fiorenzo G; Xia, Xiaoxia; Tao, Tiger H
2018-05-01
Precise patterning of polymer-based biomaterials for functional bio-nanostructures has extensive applications including biosensing, tissue engineering, and regenerative medicine. Remarkable progress is made in both top-down (based on lithographic methods) and bottom-up (via self-assembly) approaches with natural and synthetic biopolymers. However, most methods only yield 2D and pseudo-3D structures with restricted geometries and functionalities. Here, precise nanostructuring on genetically engineered spider silk is reported, achieved by accurately directing ion and electron beam interactions with the protein's matrix at the nanoscale to create well-defined 2D bionanopatterns and further assemble 3D bionanoarchitectures with shape and function on demand, termed "Protein Bricks." The added control over protein sequence and molecular weight of recombinant spider silk via genetic engineering provides unprecedented lithographic resolution (approaching the molecular limit), sharpness, and biological functions compared to natural proteins. This approach provides a facile method for patterning and immobilizing functional molecules within nanoscopic, hierarchical protein structures, which sheds light on a wide range of biomedical applications such as structure-enhanced fluorescence and biomimetic microenvironments for controlling cell fate. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sequence-similar, structure-dissimilar protein pairs in the PDB.
Kosloff, Mickey; Kolodny, Rachel
2008-05-01
It is often assumed that in the Protein Data Bank (PDB), two proteins with similar sequences will also have similar structures. Accordingly, it has proved useful to develop subsets of the PDB from which "redundant" structures have been removed, based on a sequence-based criterion for similarity. Similarly, when predicting protein structure using homology modeling, if a template structure for modeling a target sequence is selected by sequence alone, this implicitly assumes that all sequence-similar templates are equivalent. Here, we show that this assumption is often not correct and that standard approaches to create subsets of the PDB can lead to the loss of structurally and functionally important information. We have carried out sequence-based structural superpositions and geometry-based structural alignments of a large number of protein pairs to determine the extent to which sequence similarity ensures structural similarity. We find many examples where two proteins that are similar in sequence have structures that differ significantly from one another. The source of the structural differences usually has a functional basis. The number of such protein pairs that are identified and the magnitude of the dissimilarity depend on the approach that is used to calculate the differences; in particular, sequence-based structure superpositioning will identify a larger number of structurally dissimilar pairs than geometry-based structural alignments. When two sequences can be aligned in a statistically meaningful way, sequence-based structural superpositioning provides a meaningful measure of structural differences. This approach and geometry-based structure alignments reveal somewhat different information and one or the other might be preferable in a given application. Our results suggest that in some cases, notably homology modeling, the common use of nonredundant datasets, culled from the PDB based on sequence, may mask important structural and functional information. We have established a database of sequence-similar, structurally dissimilar protein pairs that will help address this problem (http://luna.bioc.columbia.edu/rachel/seqsimstrdiff.htm).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Myeongsang; Baek, Inchul; Choi, Hyunsung
Pathological amyloid proteins have been implicated in neurodegenerative diseases, specifically Alzheimer's, Parkinson's, Lewy-body diseases and prion-related diseases. In prion-related diseases, functional tau proteins can be transformed into pathological agents by environmental factors, including oxidative stress, inflammation, Aβ-mediated toxicity and covalent modification. These pathological agents are stable under physiological conditions and are not easily degraded. This resistance to degradation enables the use of tau proteins as functional materials for capturing carbon dioxide. To use amyloid proteins efficiently as functional materials, a basic study of their structural characteristics is necessary. Here, we investigated at the atomistic scale the basic structure of wild-type (WT) tau protein and of tau protein with lysine mutations at glutamic residues (Q2K). We also report the size effect of both the WT and Q2K structures, which allowed us to assess the stability of those amyloid structures. Highlights: • Lysine mutation alters the structural conformation and characteristics of tau. • Over the 15 layers both WT and Q2K models, both tau proteins undergo fractions. • Lysine mutation increases the non-bonded energy and the solvent-accessible surface area. • Structural instability of the Q2K model was shown by analysis of the number of hydrogen bonds.
Illuminating structural proteins in viral "dark matter" with metaproteomics
Brum, Jennifer R.; Ignacio-Espinoza, J. Cesar; Kim, Eun -Hae; ...
2016-02-16
Viruses are ecologically important, yet environmental virology is limited by dominance of unannotated genomic sequences representing taxonomic and functional "viral dark matter." Although recent analytical advances are rapidly improving taxonomic annotations, identifying functional dark matter remains problematic. Here, we apply paired metaproteomics and dsDNA-targeted metagenomics to identify 1,875 virion-associated proteins from the ocean. Over one-half of these proteins were newly functionally annotated and represent abundant and widespread viral metagenome-derived protein clusters (PCs). One primarily unannotated PC dominated the dataset, but structural modeling and genomic context identified this PC as a previously unidentified capsid protein from multiple uncultivated tailed virus families. Furthermore, four of the five most abundant PCs in the metaproteome represent capsid proteins containing the HK97-like protein fold previously found in many viruses that infect all three domains of life. The dominance of these proteins within our dataset, as well as their global distribution throughout the world's oceans and seas, supports prior hypotheses that this HK97-like protein fold is the most abundant biological structure on Earth. Altogether, these culture-independent analyses improve virion-associated protein annotations, facilitate the investigation of proteins within natural viral communities, and offer a high-throughput means of illuminating functional viral dark matter.
Illuminating structural proteins in viral "dark matter" with metaproteomics.
Brum, Jennifer R; Ignacio-Espinoza, J Cesar; Kim, Eun-Hae; Trubl, Gareth; Jones, Robert M; Roux, Simon; VerBerkmoes, Nathan C; Rich, Virginia I; Sullivan, Matthew B
2016-03-01
Viruses are ecologically important, yet environmental virology is limited by dominance of unannotated genomic sequences representing taxonomic and functional "viral dark matter." Although recent analytical advances are rapidly improving taxonomic annotations, identifying functional dark matter remains problematic. Here, we apply paired metaproteomics and dsDNA-targeted metagenomics to identify 1,875 virion-associated proteins from the ocean. Over one-half of these proteins were newly functionally annotated and represent abundant and widespread viral metagenome-derived protein clusters (PCs). One primarily unannotated PC dominated the dataset, but structural modeling and genomic context identified this PC as a previously unidentified capsid protein from multiple uncultivated tailed virus families. Furthermore, four of the five most abundant PCs in the metaproteome represent capsid proteins containing the HK97-like protein fold previously found in many viruses that infect all three domains of life. The dominance of these proteins within our dataset, as well as their global distribution throughout the world's oceans and seas, supports prior hypotheses that this HK97-like protein fold is the most abundant biological structure on Earth. Together, these culture-independent analyses improve virion-associated protein annotations, facilitate the investigation of proteins within natural viral communities, and offer a high-throughput means of illuminating functional viral dark matter.
Illuminating structural proteins in viral “dark matter” with metaproteomics
Brum, Jennifer R.; Ignacio-Espinoza, J. Cesar; Kim, Eun-Hae; Trubl, Gareth; Jones, Robert M.; Roux, Simon; VerBerkmoes, Nathan C.; Rich, Virginia I.; Sullivan, Matthew B.
2016-01-01
Viruses are ecologically important, yet environmental virology is limited by dominance of unannotated genomic sequences representing taxonomic and functional “viral dark matter.” Although recent analytical advances are rapidly improving taxonomic annotations, identifying functional dark matter remains problematic. Here, we apply paired metaproteomics and dsDNA-targeted metagenomics to identify 1,875 virion-associated proteins from the ocean. Over one-half of these proteins were newly functionally annotated and represent abundant and widespread viral metagenome-derived protein clusters (PCs). One primarily unannotated PC dominated the dataset, but structural modeling and genomic context identified this PC as a previously unidentified capsid protein from multiple uncultivated tailed virus families. Furthermore, four of the five most abundant PCs in the metaproteome represent capsid proteins containing the HK97-like protein fold previously found in many viruses that infect all three domains of life. The dominance of these proteins within our dataset, as well as their global distribution throughout the world’s oceans and seas, supports prior hypotheses that this HK97-like protein fold is the most abundant biological structure on Earth. Together, these culture-independent analyses improve virion-associated protein annotations, facilitate the investigation of proteins within natural viral communities, and offer a high-throughput means of illuminating functional viral dark matter. PMID:26884177
Chen, Fu; Sun, Huiyong; Wang, Junmei; Zhu, Feng; Liu, Hui; Wang, Zhe; Lei, Tailong; Li, Youyong; Hou, Tingjun
2018-06-21
Molecular docking provides a computationally efficient way to predict the atomic structural details of protein-RNA interactions (PRI), but accurate prediction of the three-dimensional structures and binding affinities for PRI is still notoriously difficult, partly due to the unreliability of the existing scoring functions for PRI. MM/PBSA and MM/GBSA are more theoretically rigorous than most scoring functions for protein-RNA docking, but their prediction performance for protein-RNA systems remains unclear. Here, we systematically evaluated the capability of MM/PBSA and MM/GBSA to predict the binding affinities and recognize the near-native binding structures for protein-RNA systems with different solvent models and interior dielectric constants (ε_in). For predicting the binding affinities, the predictions given by MM/GBSA based on the minimized structures in explicit solvent and the GBGBn1 model with ε_in = 2 yielded the highest correlation with the experimental data. Moreover, the MM/GBSA calculations based on the minimized structures in implicit solvent and the GBGBn1 model distinguished the near-native binding structures within the top 10 decoys for 118 out of the 149 protein-RNA systems (79.2%). This performance is better than all docking scoring functions studied here. Therefore, the MM/GBSA rescoring is an efficient way to improve the prediction capability of scoring functions for protein-RNA systems. Published by Cold Spring Harbor Laboratory Press for the RNA Society.
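The top-10 success rate quoted in this abstract (118 of 149 systems, 79.2%) is a simple ratio, and the sketch below shows one way such a figure could be computed from per-system ranks. It is an illustration only: the function name `top_n_success_rate` and the rank list are hypothetical, constructed to reproduce the reported counts, and are not the study's data.

```python
def top_n_success_rate(native_ranks, n=10):
    """Fraction of systems whose best near-native decoy ranks within the top n.

    native_ranks: for each system, the 1-based rank of the best-ranked
    near-native decoy after rescoring (hypothetical input format).
    """
    hits = sum(1 for rank in native_ranks if rank <= n)
    return hits / len(native_ranks)


if __name__ == "__main__":
    # Made-up ranks: 118 systems with a hit inside the top 10, 31 outside.
    ranks = [1] * 118 + [11] * 31
    print(f"{top_n_success_rate(ranks) * 100:.1f}%")
```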
Mulnix, Amy B.
2003-01-01
Undergraduate biology curricula are being modified to model and teach the activities of scientists better. The assignment described here, one that investigates protein structure and function, was designed for use in a sophomore-level cell physiology course at Earlham College. Students work in small groups to read and present in poster format on the content of a single research article reporting on the structure and/or function of a protein. Goals of the assignment include highlighting the interdependence of protein structure and function; asking students to review, integrate, and apply previously acquired knowledge; and helping students see protein structure/function in a context larger than cell physiology. The assignment also is designed to build skills in reading scientific literature, oral and written communication, and collaboration among peers. Assessment of student perceptions of the assignment in two separate offerings indicates that the project successfully achieves these goals. Data specifically show that students relied heavily on their peers to understand their article. The assignment was also shown to require students to read articles more carefully than previously. In addition, the data suggest that the assignment could be modified and used successfully in other courses and at other institutions. PMID:14673490
De Jaco, Antonella; Comoletti, Davide; Dubi, Noga; Camp, Shelley; Taylor, Palmer
2016-01-01
The α/β hydrolase fold family is perhaps the largest group of proteins presenting significant structural homology with divergent functions, ranging from catalytic hydrolysis to heterophilic cell adhesive interactions to chaperones in hormone production. All the proteins of the family share a common three-dimensional core structure containing the α/β-hydrolase fold domain that is crucial for proper protein function. Several mutations associated with congenital diseases or disorders have been reported in conserved residues within the α/β-hydrolase fold domain of cholinesterase-like proteins, neuroligins, butyrylcholinesterase and thyroglobulin. These mutations are known to disrupt the architecture of the common structural domain either globally or locally. Characterization of the natural mutations affecting the α/β-hydrolase fold domain in these proteins has shown that they mainly impair processing and trafficking along the secretory pathway causing retention of the mutant protein in the endoplasmic reticulum. Studying the processing of α/β-hydrolase fold mutant proteins should uncover new functions for this domain, that in some cases require structural integrity for both export of the protein from the ER and for facilitating subunit dimerization. A comparative study of homologous mutations in proteins that are closely related family members, along with the definition of new three-dimensional crystal structures, will identify critical residues for the assembly of the α/β-hydrolase fold. PMID:21933121
Computation-Guided Backbone Grafting of a Discontinuous Motif onto a Protein Scaffold
DOE Office of Scientific and Technical Information (OSTI.GOV)
Azoitei, Mihai L.; Correia, Bruno E.; Ban, Yih-En Andrew
2012-02-07
The manipulation of protein backbone structure to control interaction and function is a challenge for protein engineering. We integrated computational design with experimental selection for grafting the backbone and side chains of a two-segment HIV gp120 epitope, targeted by the cross-neutralizing antibody b12, onto an unrelated scaffold protein. The final scaffolds bound b12 with high specificity and with affinity similar to that of gp120, and crystallographic analysis of a scaffold bound to b12 revealed high structural mimicry of the gp120-b12 complex structure. The method can be generalized to design other functional proteins through backbone grafting.
NegGOA: negative GO annotations selection using ontology structure.
Fu, Guangyuan; Wang, Jun; Yang, Bo; Yu, Guoxian
2016-10-01
Predicting the biological functions of proteins is one of the key challenges in the post-genomic era. Computational models have demonstrated the utility of applying machine learning methods to predict protein function. Most prediction methods explicitly require a set of negative examples, that is, proteins known not to carry out a particular function. However, Gene Ontology (GO) almost always only provides the knowledge that proteins carry out a particular function, and functional annotations of proteins are incomplete. GO structurally organizes tens of thousands of GO terms, and a protein is annotated with several (or dozens) of these terms. For these reasons, the negative examples of a protein can greatly help distinguish true positive examples of the protein from such a large candidate GO space. In this paper, we present a novel approach (called NegGOA) to select negative examples. Specifically, NegGOA takes advantage of the ontology structure, available annotations and potentiality of additional annotations of a protein to choose negative examples of the protein. We compare NegGOA with other negative example selection algorithms and find that NegGOA produces far fewer false negatives than they do. We incorporate the selected negative examples into an efficient function prediction model to predict the functions of proteins in Yeast, Human, Mouse and Fly. NegGOA also demonstrates improved accuracy over these competing algorithms across various evaluation metrics. In addition, NegGOA suffers less from incomplete protein annotations than these competing methods. The Matlab and R codes are available at https://sites.google.com/site/guoxian85/neggoa. Contact: gxyu@swu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.
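To illustrate why the ontology structure matters when choosing negative examples, the sketch below applies one simple structural heuristic to a toy ontology. It is explicitly not the NegGOA algorithm, whose scoring is not described in this abstract: it only assumes the widely used true-path rule (an annotation to a term implies annotation to all its ancestors) and additionally treats descendants of annotated terms as uncertain, because more specific annotations may simply be missing. All term names and function names are hypothetical.

```python
def ancestors(term, parents):
    """All ancestors of `term` in a DAG given as {term: [parent, ...]}."""
    seen, stack = set(), [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def candidate_negatives(annotated, parents):
    """Terms that are structurally safe to consider as negatives (heuristic).

    Excludes: annotated terms, their ancestors (true-path rule), and their
    descendants (possibly missing, more specific annotations).
    """
    annotated = set(annotated)
    all_terms = set(parents) | {p for ps in parents.values() for p in ps}
    positives = set(annotated)
    for term in annotated:
        positives |= ancestors(term, parents)
    uncertain = {t for t in all_terms if ancestors(t, parents) & annotated}
    return all_terms - positives - uncertain


if __name__ == "__main__":
    # Toy ontology with made-up term names (not real GO identifiers).
    toy_parents = {
        "root": [],
        "binding": ["root"],
        "catalysis": ["root"],
        "metal_binding": ["binding"],
        "zinc_binding": ["metal_binding"],
        "hydrolase": ["catalysis"],
    }
    print(sorted(candidate_negatives({"metal_binding"}, toy_parents)))
```

A method such as NegGOA goes further than this filter by also weighing available annotations and the potential for additional annotations, as the abstract states.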
Roche, Daniel Barry; Brackenridge, Danielle Allison; McGuffin, Liam James
2015-12-15
Elucidating the biological and biochemical roles of proteins, and subsequently determining their interacting partners, can be difficult and time consuming using in vitro and/or in vivo methods, and consequently the majority of newly sequenced proteins will have unknown structures and functions. However, in silico methods for predicting protein-ligand binding sites and protein biochemical functions offer an alternative practical solution. The characterisation of protein-ligand binding sites is essential for investigating new functional roles, which can impact the major biological research spheres of health, food, and energy security. In this review we discuss the role in silico methods play in 3D modelling of protein-ligand binding sites, along with their role in predicting biochemical functionality. In addition, we describe in detail some of the key alternative in silico prediction approaches that are available, as well as discussing the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) projects, and their impact on developments in the field. Furthermore, we discuss the importance of protein function prediction methods for tackling 21st century problems.
Kaur, Gurmeet; Subramanian, Srikrishna
2016-08-26
Treble clef (TC) zinc fingers constitute a large fold-group of structural zinc-binding protein domains that mediate numerous cellular functions. We have analysed the sequence, structure, and function relationships among all TCs in the Protein Data Bank. This led to the identification of novel TCs, such as lsr2, YggX and TFIIIC τ 60 kDa subunit, and prediction of a nuclease-like function for the DUF1364 family. The structural malleability of TCs is evident from the many examples with variations to the core structural elements of the fold. We observe domains wherein the structural core of the TC fold is circularly permuted, and also some examples where the overall fold resembles both the TC motif and another unrelated fold. All extant TC families do not share a monophyletic origin, as several TC proteins are known to have been present in the last universal common ancestor and the last eukaryotic common ancestor. We identify several TCs where the zinc-chelating site and residues are not merely responsible for structure stabilization but also perform other functions, such as being redox active in C1B domain of protein kinase C, a nucleophilic acceptor in Ada and catalytic in organomercurial lyase, MerB.
NASA Astrophysics Data System (ADS)
Kaur, Gurmeet; Subramanian, Srikrishna
2016-08-01
Treble clef (TC) zinc fingers constitute a large fold-group of structural zinc-binding protein domains that mediate numerous cellular functions. We have analysed the sequence, structure, and function relationships among all TCs in the Protein Data Bank. This led to the identification of novel TCs, such as lsr2, YggX and TFIIIC τ 60 kDa subunit, and prediction of a nuclease-like function for the DUF1364 family. The structural malleability of TCs is evident from the many examples with variations to the core structural elements of the fold. We observe domains wherein the structural core of the TC fold is circularly permuted, and also some examples where the overall fold resembles both the TC motif and another unrelated fold. All extant TC families do not share a monophyletic origin, as several TC proteins are known to have been present in the last universal common ancestor and the last eukaryotic common ancestor. We identify several TCs where the zinc-chelating site and residues are not merely responsible for structure stabilization but also perform other functions, such as being redox active in C1B domain of protein kinase C, a nucleophilic acceptor in Ada and catalytic in organomercurial lyase, MerB.
Biological and functional relevance of CASP predictions
Liu, Tianyun; Ish‐Shalom, Shirbi; Torng, Wen; Lafita, Aleix; Bock, Christian; Mort, Matthew; Cooper, David N; Bliven, Spencer; Capitani, Guido; Mooney, Sean D.
2017-01-01
Our goal is to answer the question: compared with experimental structures, how useful are predicted models for functional annotation? We assessed the functional utility of predicted models by comparing the performances of a suite of methods for functional characterization on the predictions and the experimental structures. We identified 28 sites in 25 protein targets to perform functional assessment. These 28 sites included nine sites with known ligand binding (holo‐sites), nine sites that are expected or suggested by experimental authors for small molecule binding (apo‐sites), and ten sites containing important motifs, loops, or key residues with important disease‐associated mutations. We evaluated the utility of the predictions by comparing their microenvironments to the experimental structures. Overall structural quality correlates with functional utility. However, the best‐ranked predictions (global) may not have the best functional quality (local). Our assessment provides an ability to discriminate between predictions with high structural quality. When assessing ligand‐binding sites, most prediction methods have higher performance on apo‐sites than holo‐sites. Some servers show consistently high performance for certain types of functional sites. Finally, many functional sites are associated with protein‐protein interaction. We also analyzed biologically relevant features from the protein assemblies of two targets where the active site spanned the protein‐protein interface. For the assembly targets, we find that the features in the models are mainly determined by the choice of template. PMID:28975675
Markov State Models Provide Insights into Dynamic Modulation of Protein Function
2015-01-01
Protein function is inextricably linked to protein dynamics. As we move from a static structural picture to a dynamic ensemble view of protein structure and function, novel computational paradigms are required for observing and understanding conformational dynamics of proteins and its functional implications. In principle, molecular dynamics simulations can provide the time evolution of atomistic models of proteins, but the long time scales associated with functional dynamics make it difficult to observe rare dynamical transitions. The issue of extracting essential functional components of protein dynamics from noisy simulation data presents another set of challenges in obtaining an unbiased understanding of protein motions. Therefore, a methodology that provides a statistical framework for efficient sampling and a human-readable view of the key aspects of functional dynamics from data analysis is required. The Markov state model (MSM), which has recently become popular worldwide for studying protein dynamics, is an example of such a framework. In this Account, we review the use of Markov state models for efficient sampling of the hierarchy of time scales associated with protein dynamics, automatic identification of key conformational states, and the degrees of freedom associated with slow dynamical processes. Applications of MSMs for studying long time scale phenomena such as activation mechanisms of cellular signaling proteins have yielded novel insights into protein function. In particular, from MSMs built using large-scale simulations of GPCRs and kinases, we have shown that complex conformational changes in proteins can be described in terms of structural changes in key structural motifs or “molecular switches” within the protein, that the transitions between functionally active and inactive states of proteins proceed via multiple pathways, and that ligand or substrate binding modulates the flux through these pathways. 
Finally, MSMs also provide a theoretical toolbox for studying the effect of nonequilibrium perturbations on conformational dynamics. Considering that protein dynamics in vivo occur under nonequilibrium conditions, MSMs coupled with nonequilibrium statistical mechanics provide a way to connect cellular components to their functional environments. Nonequilibrium perturbations of protein folding MSMs reveal the presence of dynamically frozen glass-like states in their conformational landscape. These frozen states are also observed to be rich in β-sheets, which indicates their possible role in the nucleation of β-sheet rich aggregates such as those observed in amyloid-fibril formation. Finally, we describe how MSMs have been used to understand the dynamical behavior of intrinsically disordered proteins such as amyloid-β, human islet amyloid polypeptide, and p53. While certainly not a panacea for studying functional dynamics, MSMs provide a rigorous theoretical foundation for understanding complex entropically dominated processes and a convenient lens for viewing protein motions. PMID:25625937
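As a rough illustration of the framework this Account reviews, the sketch below builds a small Markov state model from a discrete state trajectory and reads off its slow time scales. It is a minimal sketch under stated assumptions, not the authors' workflow: the three-state toy trajectory, the function names, and the crude count symmetrization used to impose reversibility are all illustrative choices, and a real analysis would first cluster simulation frames into states and test several lag times.

```python
import numpy as np

def count_matrix(dtraj, n_states, lag=1):
    """Count observed transitions i -> j separated by `lag` steps."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1.0
    return counts

def reversible_msm(counts):
    """Row-normalize symmetrized counts (a simple reversibility assumption)."""
    sym = counts + counts.T
    row_sums = sym.sum(axis=1)
    if np.any(row_sums == 0):
        raise ValueError("every state must be visited at least once")
    transition = sym / row_sums[:, None]
    stationary = row_sums / row_sums.sum()
    return transition, stationary

def implied_timescales(transition, lag=1):
    """Relaxation time scales t_k = -lag / ln(lambda_k) for the slow modes."""
    eigvals = np.sort(np.linalg.eigvals(transition).real)[::-1]
    slow = eigvals[1:]                      # drop the stationary eigenvalue (1)
    slow = slow[(slow > 0) & (slow < 1)]    # keep modes with a defined time scale
    return -lag / np.log(slow)

# Hypothetical toy system: three metastable states with rare transitions.
rng = np.random.default_rng(0)
true_T = np.array([[0.98, 0.02, 0.00],
                   [0.01, 0.98, 0.01],
                   [0.00, 0.02, 0.98]])
state = 0
dtraj = [state]
for _ in range(20000):
    state = rng.choice(3, p=true_T[state])
    dtraj.append(state)
dtraj = np.array(dtraj)

T, pi = reversible_msm(count_matrix(dtraj, n_states=3))
timescales = implied_timescales(T)
print("stationary distribution:", pi)
print("implied time scales (steps):", timescales)
```

The stationary distribution gives the equilibrium population of each state, and the implied time scales indicate how slowly the system exchanges between them; both are quantities an MSM exposes that a single trajectory does not show directly.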
Cabeen, Matthew T; Herrmann, Harald; Jacobs-Wagner, Christine
2011-01-01
Crescentin is a bacterial filament-forming protein that exhibits domain organization features found in metazoan intermediate filament (IF) proteins. Structure-function studies of eukaryotic IFs have been hindered by a lack of simple genetic systems and easily quantifiable phenotypes. Here we exploit the characteristic localization of the crescentin structure along the inner curvature of Caulobacter crescentus cells and the loss of cell curvature associated with impaired crescentin function to analyze the importance of the domain organization of crescentin. By combining biochemistry and ultrastructural analysis in vitro with cellular localization and functional studies, we show that crescentin requires its distinctive domain organization, and furthermore that different structural elements have distinct structural and functional contributions. The head domain can be functionally subdivided into two subdomains; the first (amino-terminal) is required for function but not assembly, while the second is necessary for structure assembly. The rod domain is similarly required for structure assembly, and the linker L1 appears important to prevent runaway assembly into nonfunctional aggregates. The data also suggest that the stutter and the tail domain have critical functional roles in stabilizing crescentin structures against disassembly by monovalent cations in the cytoplasm. This study suggests that the IF-like behavior of crescentin is a consequence of its domain organization, implying that the IF protein layout is an adaptable cytoskeletal motif, much like the actin and tubulin folds, that is broadly exploited for various functions throughout life from bacteria to humans. © 2011 Wiley-Liss, Inc. PMID:21360832
The La and related RNA-binding proteins (LARPs): structures, functions, and evolving perspectives.
Maraia, Richard J; Mattijssen, Sandy; Cruz-Gallardo, Isabel; Conte, Maria R
2017-11-01
La was first identified as a polypeptide component of ribonucleic protein complexes targeted by antibodies in autoimmune patients and is now known to be a eukaryote cell-ubiquitous protein. Structure and function studies have shown that La binds to a common terminal motif, UUU-3'-OH, of nascent RNA polymerase III (RNAP III) transcripts and protects them from exonucleolytic decay. For precursor-tRNAs, the most diverse and abundant of these transcripts, La also functions as an RNA chaperone that helps to prevent their misfolding. Related to this, we review evidence that suggests that La and its link to RNAP III were significant in the great expansions of the tRNAomes that occurred in eukaryotes. Four families of La-related proteins (LARPs) emerged during eukaryotic evolution with specialized functions. We provide an overview of the high-resolution structural biology of La and LARPs. LARP7 family members most closely resemble La but function with a single RNAP III nuclear transcript, 7SK, or telomerase RNA. A cytoplasmic isoform of La protein as well as LARPs 6, 4, and 1 function in mRNA metabolism and translation in distinct but similar ways, sometimes with the poly(A)-binding protein, and in some cases by direct binding to poly(A)-RNA. New structures of LARP domains, some complexed with RNA, provide novel insights into the functional versatility of these proteins. We also consider LARPs in relation to ancestral La protein and potential retention of links to specific RNA-related pathways. One such link may be tRNA surveillance and codon usage by LARP-associated mRNAs. WIREs RNA 2017, 8:e1430. doi: 10.1002/wrna.1430 For further resources related to this article, please visit the WIREs website. © 2017 Wiley Periodicals, Inc.
Structural basis for host membrane remodeling induced by protein 2B of hepatitis A virus.
Vives-Adrián, Laia; Garriga, Damià; Buxaderas, Mònica; Fraga, Joana; Pereira, Pedro José Barbosa; Macedo-Ribeiro, Sandra; Verdaguer, Núria
2015-04-01
The complexity of viral RNA synthesis and the numerous participating factors require a mechanism to topologically coordinate and concentrate these multiple viral and cellular components, ensuring a concerted function. Similarly to all other positive-strand RNA viruses, picornaviruses induce rearrangements of host intracellular membranes to create structures that act as functional scaffolds for genome replication. The membrane-targeting proteins 2B and 2C, their precursor 2BC, and protein 3A appear to be primarily involved in membrane remodeling. Little is known about the structure of these proteins and the mechanisms by which they induce massive membrane remodeling. Here we report the crystal structure of the soluble region of hepatitis A virus (HAV) protein 2B, consisting of two domains: a C-terminal helical bundle preceded by an N-terminally curved five-stranded antiparallel β-sheet that displays striking structural similarity to the β-barrel domain of enteroviral 2A proteins. Moreover, the helicoidal arrangement of the protein molecules in the crystal provides a model for 2B-induced host membrane remodeling during HAV infection. No structural information is currently available for the 2B protein of any picornavirus despite it being involved in a critical process in viral factory formation: the rearrangement of host intracellular membranes. Here we present the structure of the soluble domain of the 2B protein of hepatitis A virus (HAV). Its arrangement, both in crystals and in solution under physiological conditions, can help to understand its function and sheds some light on the membrane rearrangement process, a putative target of future antiviral drugs. Moreover, this first structure of a picornaviral 2B protein also unveils a closer evolutionary relationship between the hepatovirus and enterovirus genera within the Picornaviridae family. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Structural Basis for Host Membrane Remodeling Induced by Protein 2B of Hepatitis A Virus
Vives-Adrián, Laia; Garriga, Damià; Buxaderas, Mònica; Fraga, Joana; Pereira, Pedro José Barbosa
2015-01-01
ABSTRACT The complexity of viral RNA synthesis and the numerous participating factors require a mechanism to topologically coordinate and concentrate these multiple viral and cellular components, ensuring a concerted function. Similarly to all other positive-strand RNA viruses, picornaviruses induce rearrangements of host intracellular membranes to create structures that act as functional scaffolds for genome replication. The membrane-targeting proteins 2B and 2C, their precursor 2BC, and protein 3A appear to be primarily involved in membrane remodeling. Little is known about the structure of these proteins and the mechanisms by which they induce massive membrane remodeling. Here we report the crystal structure of the soluble region of hepatitis A virus (HAV) protein 2B, consisting of two domains: a C-terminal helical bundle preceded by an N-terminally curved five-stranded antiparallel β-sheet that displays striking structural similarity to the β-barrel domain of enteroviral 2A proteins. Moreover, the helicoidal arrangement of the protein molecules in the crystal provides a model for 2B-induced host membrane remodeling during HAV infection. IMPORTANCE No structural information is currently available for the 2B protein of any picornavirus despite it being involved in a critical process in viral factory formation: the rearrangement of host intracellular membranes. Here we present the structure of the soluble domain of the 2B protein of hepatitis A virus (HAV). Its arrangement, both in crystals and in solution under physiological conditions, can help to understand its function and sheds some light on the membrane rearrangement process, a putative target of future antiviral drugs. Moreover, this first structure of a picornaviral 2B protein also unveils a closer evolutionary relationship between the hepatovirus and enterovirus genera within the Picornaviridae family. PMID:25589659
Proteins with Novel Structure, Function and Dynamics
NASA Technical Reports Server (NTRS)
Pohorille, Andrew
2014-01-01
Recently, a small enzyme that ligates two RNA fragments at a rate 10^6 times above background was evolved in vitro (Seelig and Szostak, Nature 448:828-831, 2007). This enzyme does not resemble any contemporary protein (Chao et al., Nature Chem. Biol. 9:81-83, 2013). It consists of a dynamic, catalytic loop, a small, rigid core containing two zinc ions coordinated by neighboring amino acids, and two highly flexible tails that might be unimportant for protein function. In contrast to other proteins, this enzyme does not contain ordered secondary structure elements, such as alpha-helix or beta-sheet. The loop is kept together by just two interactions of a charged residue and a histidine with a zinc ion, which they coordinate on the opposite side of the loop. Such a structure appears to be very fragile. Surprisingly, computer simulations indicate otherwise. As the coordinating, charged residue is mutated to alanine, another, nearby charged residue takes its place, thus keeping the structure nearly intact. If this residue is also substituted by alanine, a salt bridge involving two other, charged residues on the opposite sides of the loop keeps the loop in place. These adjustments are facilitated by high flexibility of the protein. Computational predictions have been confirmed experimentally, as both mutants retain full activity and overall structure. These results challenge our notions about what is required for protein activity and about the relationship between protein dynamics, stability and robustness. We hypothesize that small, highly dynamic proteins could be both active and fault tolerant in ways that many other proteins are not, i.e. they can adjust to retain their structure and activity even if subjected to mutations in structurally critical regions. This opens the doors for designing proteins with novel functions, structures and dynamics that have not yet been considered.
Tighter Ligand Binding Can Compensate for Impaired Stability of an RNA-Binding Protein.
Wallis, Christopher P; Richman, Tara R; Filipovska, Aleksandra; Rackham, Oliver
2018-06-15
It has been widely shown that ligand-binding residues, by virtue of their orientation, charge, and solvent exposure, often have a net destabilizing effect on proteins that is offset by stability conferring residues elsewhere in the protein. This structure-function trade-off can constrain possible adaptive evolutionary changes of function and may hamper protein engineering efforts to design proteins with new functions. Here, we present evidence from a large randomized mutant library screen that, in the case of PUF RNA-binding proteins, this structural relationship may be inverted and that active-site mutations that increase protein activity are also able to compensate for impaired stability. We show that certain mutations in RNA-protein binding residues are not necessarily destabilizing and that increased ligand-binding can rescue an insoluble, unstable PUF protein. We hypothesize that these mutations restabilize the protein via thermodynamic coupling of protein folding and RNA binding.
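The hypothesis of thermodynamic coupling between folding and RNA binding can be made concrete with a textbook two-state linkage model. The sketch below is an illustration only and is not taken from the paper: it assumes the ligand binds only the folded state and is in excess, so that the apparent folding equilibrium constant becomes K_fold (1 + [L]/K_d). The free energy, concentration and K_d values are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def fraction_folded(dg_fold, ligand_conc, kd, temperature=298.15):
    """Fraction of protein folded (free or ligand-bound) in a two-state model.

    dg_fold     : folding free energy in kcal/mol (negative means stable)
    ligand_conc : free ligand concentration (same units as kd)
    kd          : dissociation constant of ligand from the folded state
    Assumes the unfolded state does not bind ligand.
    """
    k_fold = math.exp(-dg_fold / (R_KCAL * temperature))
    k_app = k_fold * (1.0 + ligand_conc / kd)   # binding pulls the equilibrium toward folded
    return k_app / (1.0 + k_app)

# Hypothetical destabilized protein: folding is unfavorable by 1 kcal/mol.
dg = 1.0
rna = 1e-6  # 1 micromolar RNA (illustrative value)
no_ligand = fraction_folded(dg, 0.0, kd=1e-7)
weak_binder = fraction_folded(dg, rna, kd=1e-7)
tight_binder = fraction_folded(dg, rna, kd=1e-9)
print(no_ligand, weak_binder, tight_binder)
```

In this toy model a hundredfold tighter K_d shifts a mostly unfolded protein to a mostly folded one, which is the qualitative behavior the abstract proposes for the rescued PUF protein.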
ERIC Educational Resources Information Center
Rundgren, Carl-Johan; Hirsch, Richard; Chang Rundgren, Shu-Nu; Tibell, Lena A. E.
2012-01-01
This study examines how students explain their conceptual understanding of protein function using visualizations. Thirteen upper secondary students, four tertiary students (studying chemical biology), and two experts were interviewed in semi-structured interviews. The interviews were structured around 2D illustrations of proteins and an animated…
Smith, Colin A; Kortemme, Tanja
2011-01-01
Predicting the set of sequences that are tolerated by a protein or protein interface, while maintaining a desired function, is useful for characterizing protein interaction specificity and for computationally designing sequence libraries to engineer proteins with new functions. Here we provide a general method, a detailed set of protocols, and several benchmarks and analyses for estimating tolerated sequences using flexible backbone protein design implemented in the Rosetta molecular modeling software suite. The input to the method is at least one experimentally determined three-dimensional protein structure or high-quality model. The starting structure(s) are expanded or refined into a conformational ensemble using Monte Carlo simulations consisting of backrub backbone and side chain moves in Rosetta. The method then uses a combination of simulated annealing and genetic algorithm optimization methods to enrich for low-energy sequences for the individual members of the ensemble. To emphasize certain functional requirements (e.g. forming a binding interface), interactions between and within parts of the structure (e.g. domains) can be reweighted in the scoring function. Results from each backbone structure are merged together to create a single estimate for the tolerated sequence space. We provide an extensive description of the protocol and its parameters, all source code, example analysis scripts and three tests applying this method to finding sequences predicted to stabilize proteins or protein interfaces. The generality of this method makes many other applications possible, for example stabilizing interactions with small molecules, DNA, or RNA. Through the use of within-domain reweighting and/or multistate design, it may also be possible to use this method to find sequences that stabilize particular protein conformations or binding interactions over others.
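The final merging step, in which results from each backbone are combined into a single estimate of tolerated sequence space, can be sketched as follows. This is a simplified stand-in and not the Rosetta protocol: the function name, the fixed energy window used to select low-energy sequences, and the unweighted pooling are assumptions made for illustration, and the published method's actual weighting scheme is not reproduced here.

```python
from collections import Counter

def merge_tolerated_sequences(results_per_backbone, energy_window=1.0):
    """Pool low-energy sequences from every backbone into a position profile.

    results_per_backbone : list (one entry per backbone) of (sequence, score) pairs
    energy_window        : keep sequences scoring within this range of each
                           backbone's best score (hypothetical cutoff)
    Returns a list with one {amino_acid: frequency} dict per position.
    """
    pooled = []
    for scored in results_per_backbone:
        best = min(score for _, score in scored)
        pooled.extend(seq for seq, score in scored if score <= best + energy_window)

    profile = []
    for pos in range(len(pooled[0])):
        counts = Counter(seq[pos] for seq in pooled)
        total = sum(counts.values())
        profile.append({aa: n / total for aa, n in counts.items()})
    return profile

# Hypothetical design output for two backbones of a three-residue interface.
backbone_a = [("AKL", -10.0), ("AKV", -9.5), ("GKL", -7.0)]
backbone_b = [("ARL", -12.0), ("AKL", -11.8), ("ARV", -8.0)]
profile = merge_tolerated_sequences([backbone_a, backbone_b])
for pos, freqs in enumerate(profile, start=1):
    print(pos, freqs)
```

Selecting relative to each backbone's own best score keeps a backbone with systematically higher energies from being excluded entirely, which is one reason an ensemble can widen the estimated sequence space.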
Modular protein domains: an engineering approach toward functional biomaterials.
Lin, Charng-Yu; Liu, Julie C
2016-08-01
Protein domains and peptide sequences are a powerful tool for conferring specific functions to engineered biomaterials. Protein sequences with a wide variety of functionalities, including structure, bioactivity, protein-protein interactions, and stimuli responsiveness, have been identified, and advances in molecular biology continue to pinpoint new sequences. Protein domains can be combined to make recombinant proteins with multiple functionalities. The high fidelity of the protein translation machinery results in exquisite control over the sequence of recombinant proteins and the resulting properties of protein-based materials. In this review, we discuss protein domains and peptide sequences in the context of functional protein-based materials, composite materials, and their biological applications. Copyright © 2016 Elsevier Ltd. All rights reserved.
Predicting Structure and Function for Novel Proteins of an Extremophilic Iron Oxidizing Bacterium
NASA Astrophysics Data System (ADS)
Wheeler, K.; Zemla, A.; Banfield, J.; Thelen, M.
2007-12-01
Proteins isolated from uncultivated microbial populations represent the functional components of microbial processes and contribute directly to community fitness under natural conditions. Investigations into proteins in the environment are hindered by the lack of genome data, or where available, the high proportion of proteins of unknown function. We have identified thousands of proteins from biofilms in the extremely acidic drainage outflow of an iron mine ecosystem (1). With an extensive genomic and proteomic foundation, we have focused directly on the problem of several hundred proteins of unknown function within this well-defined model system. Here we describe the geobiological insights gained by using a high throughput computational approach for predicting structure and function of 421 novel proteins from the biofilm community. We used a homology based modeling system to compare these proteins to those of known structure (AS2TS) (2). This approach has resulted in the assignment of structures to 360 proteins (85%) and provided functional information for up to 75% of the modeled proteins. Detailed examination of the modeling results enables confident, high-throughput prediction of the roles of many of the novel proteins within the microbial community. For instance, one prediction places a protein in the phosphoenolpyruvate/pyruvate domain superfamily as a carboxylase that fills in a gap in an otherwise complete carbon cycle. Particularly important for a community in such a metal rich environment is the evolution of over 25% of the novel proteins that contain a metal cofactor; of these, one third are likely Fe containing proteins. Two of the most abundant proteins in biofilm samples are unusual c-type cytochromes. Both of these proteins catalyze iron oxidation, a key metabolic reaction supporting the energy requirements of this community. 
Structural models of these cytochromes verify our experimental results on heme binding and electron transfer reactivity, and provide details for a working hypothesis of electron flow within the biofilm's major bacterium. Nearly 7% of the novel proteins contain tetratricopeptide repeat (TPR) modules, a protein-protein interaction domain that participates in signal transduction and a wide variety of other cellular functions. Like many biofilms, the various organisms in this community use unknown mechanisms to communicate, relying upon each other for survival. Especially interesting is evidence that most of these novel TPR proteins are located in the extracellular or membrane fractions, suggesting their role in intracellular communication. (1) Ram et al, 2005, Science 308:1915-20, "Community Proteomics of a Natural Microbial Biofilm" (2) Zemla et al, 2005, Nucleic Acids Res 33 (Web Server issue):W111-5, "AS2TS system for protein structure modeling and analysis" This work was funded by the DOE Genomics: GTL Program and was performed under the auspices of the DOE by the University of California, Lawrence Livermore National Laboratory under contract W-7405-Eng-48.
Esque, Jérémy; Urbain, Aurélie; Etchebest, Catherine; de Brevern, Alexandre G
2015-11-01
Transmembrane proteins (TMPs) are major drug targets, but knowledge of their precise topology remains highly limited compared with globular proteins. In spite of the difficulties in obtaining their structures, an important effort has been made in recent years to increase their number, both experimentally and computationally. In view of this emerging challenge, the development of computational methods to extract knowledge from these data is crucial for better understanding their functions and for improving the quality of structural models. Here, we revisit an efficient unsupervised learning procedure, called Hybrid Protein Model (HPM), which is applied to the analysis of transmembrane proteins belonging to the all-α structural class. The HPM method is an original classification procedure that efficiently combines sequence and structure learning. The procedure was initially applied to the analysis of globular proteins. In the present case, HPM classifies a set of overlapping protein fragments extracted from a non-redundant databank of TMP 3D structures. After fine-tuning of the learning parameters, the optimal classification results in 65 clusters. They best represent similar relationships between sequence and local structure properties of TMPs. Interestingly, HPM distinguishes among the resulting clusters two helical regions with distinct hydrophobic patterns. This underlines the complexity of the topology of these proteins. The HPM classification highlights unusual relationships between amino acids in TMP fragments, which can be useful for building new amino acid substitution matrices. Finally, two challenging applications are described: the first aims at annotating protein function (channel or not), and the second intends to assess the quality of structures (X-ray or models) via a new scoring function deduced from the HPM classification.
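The input to a fragment-based classification of this kind is a set of overlapping windows cut from each chain. The sketch below shows only that preprocessing step, under assumptions: the window length and step are hypothetical parameters (the abstract does not state the fragment length used by HPM), and the clustering itself is not reproduced.

```python
def overlapping_fragments(sequence, length, step=1):
    """Cut a sequence into overlapping windows of fixed length.

    length : window size in residues (hypothetical; not given in the abstract)
    step   : offset between consecutive windows; step < length gives overlap
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]

# Illustrative helix-like stretch; any residue string works the same way.
fragments = overlapping_fragments("LIVALLGAFW", length=5)
print(fragments)
```

With a step of one residue, every position except those at the chain ends is seen in several contexts, which is what lets a classifier relate local sequence to local structure.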
A fully automatic evolutionary classification of protein folds: Dali Domain Dictionary version 3
Dietmann, Sabine; Park, Jong; Notredame, Cedric; Heger, Andreas; Lappe, Michael; Holm, Liisa
2001-01-01
The Dali Domain Dictionary (http://www.ebi.ac.uk/dali/domain) is a numerical taxonomy of all known structures in the Protein Data Bank (PDB). The taxonomy is derived fully automatically from measurements of structural, functional and sequence similarities. Here, we report the extension of the classification to match the traditional four hierarchical levels corresponding to: (i) supersecondary structural motifs (attractors in fold space), (ii) the topology of globular domains (fold types), (iii) remote homologues (functional families) and (iv) homologues with sequence identity above 25% (sequence families). The computational definitions of attractors and functional families are new. In September 2000, the Dali classification contained 10 531 PDB entries comprising 17 101 chains, which were partitioned into five attractor regions, 1375 fold types, 2582 functional families and 3724 domain sequence families. Sequence families were further associated with 99 582 unique homologous sequences in the HSSP database, which increases the number of effectively known structures several-fold. The resulting database contains the description of protein domain architecture, the definition of structural neighbours around each known structure, the definition of structurally conserved cores and a comprehensive library of explicit multiple alignments of distantly related protein families. PMID:11125048
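The lowest level of the hierarchy groups homologues with sequence identity above 25%. A minimal sketch of such a check on a pre-aligned pair is given below. It is an assumption-laden illustration: the abstract does not say how Dali defines the identity denominator, so this version counts identical residues over aligned (non-gap) columns, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

def same_sequence_family(aligned_a, aligned_b, threshold=25.0):
    """Apply the 25% identity cutoff quoted in the abstract."""
    return percent_identity(aligned_a, aligned_b) > threshold

identity = percent_identity("ACDE-FG", "ACDQWFG")
print(round(identity, 1), same_sequence_family("ACDE-FG", "ACDQWFG"))
```

Pairs below the cutoff are not necessarily unrelated; in the classification described above they may still fall into the same functional family as remote homologues.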
Claudin Loss-of-Function Disrupts Tight Junctions and Impairs Amelogenesis
Bardet, Claire; Ribes, Sandy; Wu, Yong; Diallo, Mamadou Tidiane; Salmon, Benjamin; Breiderhoff, Tilman; Houillier, Pascal; Müller, Dominik; Chaussain, Catherine
2017-01-01
Claudins are a family of proteins that form paracellular barriers and pores determining tight junction (TJ) permeability. Claudin-16 and -19 are pore-forming TJ proteins allowing calcium and magnesium reabsorption in the thick ascending limb of Henle's loop (TAL). Loss-of-function mutations in the encoding genes, initially identified to cause Familial Hypomagnesemia with Hypercalciuria and Nephrocalcinosis (FHHNC), were recently shown to be also involved in Amelogenesis Imperfecta (AI). In addition, both claudins were expressed in the murine tooth germ and Claudin-16 knockout (KO) mice displayed abnormal enamel formation. Claudin-3, a ubiquitous claudin expressed in epithelia including kidney, acts as a barrier-forming tight junction protein. We determined that, similarly to claudin-16 and claudin-19, claudin-3 was expressed in the tooth germ, more precisely in the TJ located at the apical end of secretory ameloblasts. The observation of Claudin-3 KO teeth revealed enamel defects associated with impaired TJ structure at the secretory ends of ameloblasts and accumulation of matrix proteins in the forming enamel. Thus, claudin-3 protein loss-of-function disturbs amelogenesis similarly to claudin-16 loss-of-function, highlighting the importance of claudin proteins for the TJ structure. These findings reveal that loss-of-function of either pore- or barrier-forming TJ proteins leads to enamel defects. Hence, the major structural function of claudin proteins appears essential for amelogenesis. PMID:28596736
Resilience of biochemical activity in protein domains in the face of structural divergence.
Zhang, Dapeng; Iyer, Lakshminarayan M; Burroughs, A Maxwell; Aravind, L
2014-06-01
Recent studies point to the prevalence of the evolutionary phenomenon of drastic structural transformation of protein domains while continuing to preserve their basic biochemical function. These transformations span a wide spectrum, including simple domains incorporated into larger structural scaffolds, changes in the structural core, major active site shifts, topological rewiring and extensive structural transmogrifications. Proteins from biological conflict systems, such as toxin-antitoxin, restriction-modification, CRISPR/Cas, polymorphic toxin and secondary metabolism systems commonly display such transformations. These include endoDNases, metal-independent RNases, deaminases, ADP ribosyltransferases, immunity proteins, kinases and E1-like enzymes. In eukaryotes such transformations are seen in domains involved in chromatin-related peptide recognition and protein/DNA-modification. Intense selective pressures from 'arms-race'-like situations in conflict and macromolecular modification systems could favor drastic structural divergence while preserving function. Published by Elsevier Ltd.
Effect of fullerenol surface chemistry on nanoparticle binding-induced protein misfolding
NASA Astrophysics Data System (ADS)
Radic, Slaven; Nedumpully-Govindan, Praveen; Chen, Ran; Salonen, Emppu; Brown, Jared M.; Ke, Pu Chun; Ding, Feng
2014-06-01
Fullerene and its derivatives with different surface chemistry have great potential in biomedical applications. Accordingly, it is important to delineate the impact of these carbon-based nanoparticles on protein structure, dynamics, and subsequently function. Here, we focused on the effect of hydroxylation -- a common strategy for solubilizing and functionalizing fullerene -- on protein-nanoparticle interactions using a model protein, ubiquitin. We applied a set of complementary computational modeling methods, including docking and molecular dynamics simulations with both explicit and implicit solvent, to illustrate the impact of hydroxylated fullerenes on the structure and dynamics of ubiquitin. We found that all derivatives bound to the model protein. Specifically, the more hydrophilic nanoparticles with a higher number of hydroxyl groups bound to the surface of the protein via hydrogen bonds, which stabilized the protein without inducing large conformational changes in the protein structure. In contrast, fullerene derivatives with a smaller number of hydroxyl groups buried their hydrophobic surface inside the protein, thereby causing protein denaturation. Overall, our results revealed a distinct role of surface chemistry on nanoparticle-protein binding and binding-induced protein misfolding. Electronic supplementary information (ESI) is available: Fluorescence spectra, ITC, CD spectra and other data as described in the text. See DOI: 10.1039/c4nr01544d
Nasir, Arshan; Naeem, Aisha; Khan, Muhammad Jawad; Lopez-Nicora, Horacio D.; Caetano-Anollés, Gustavo
2011-01-01
The functional repertoire of a cell is largely embodied in its proteome, the collection of proteins encoded in the genome of an organism. The molecular functions of proteins are the direct consequence of their structure and structure can be inferred from sequence using hidden Markov models of structural recognition. Here we analyze the functional annotation of protein domain structures in almost a thousand sequenced genomes, exploring the functional and structural diversity of proteomes. We find there is a remarkable conservation in the distribution of domains with respect to the molecular functions they perform in the three superkingdoms of life. In general, most of the protein repertoire is spent in functions related to metabolic processes but there are significant differences in the usage of domains for regulatory and extra-cellular processes both within and between superkingdoms. Our results support the hypotheses that the proteomes of superkingdom Eukarya evolved via genome expansion mechanisms that were directed towards innovating new domain architectures for regulatory and extra/intracellular process functions needed for example to maintain the integrity of multicellular structure or to interact with environmental biotic and abiotic factors (e.g., cell signaling and adhesion, immune responses, and toxin production). Proteomes of microbial superkingdoms Archaea and Bacteria retained fewer numbers of domains and maintained simple and smaller protein repertoires. Viruses appear to play an important role in the evolution of superkingdoms. We finally identify a few genomic outliers that deviate significantly from the conserved functional design. These include Nanoarchaeum equitans, proteobacterial symbionts of insects with extremely reduced genomes, Tenericutes and Guillardia theta.
These organisms spend most of their domains on information functions, including translation and transcription, rather than on metabolism and harbor a domain repertoire characteristic of parasitic organisms. In contrast, the functional repertoire of the proteomes of the Planctomycetes-Verrucomicrobia-Chlamydiae superphylum was no different than the rest of bacteria, failing to support claims of them representing a separate superkingdom. In turn, Protista and Bacteria shared similar functional distribution patterns suggesting an ancestral evolutionary link between these groups. PMID:24710297
Schuler, Benjamin; Soranno, Andrea; Hofmann, Hagen; Nettels, Daniel
2016-07-05
The properties of unfolded proteins have long been of interest because of their importance to the protein folding process. Recently, the surprising prevalence of unstructured regions or entirely disordered proteins under physiological conditions has led to the realization that such intrinsically disordered proteins can be functional even in the absence of a folded structure. However, owing to their broad conformational distributions, many of the properties of unstructured proteins are difficult to describe with the established concepts of structural biology. We have thus seen a reemergence of polymer physics as a versatile framework for understanding their structure and dynamics. An important driving force for these developments has been single-molecule spectroscopy, as it allows structural heterogeneity, intramolecular distance distributions, and dynamics to be quantified over a wide range of timescales and solution conditions. Polymer concepts provide an important basis for relating the physical properties of unstructured proteins to folding and function.
Origins of Protein Functions in Cells
NASA Technical Reports Server (NTRS)
Seelig, Burchard; Pohorille, Andrzej
2011-01-01
In modern organisms proteins perform a majority of cellular functions, such as chemical catalysis, energy transduction and transport of material across cell walls. Although great strides have been made towards understanding protein evolution, a meaningful extrapolation from contemporary proteins to their earliest ancestors is virtually impossible. In an alternative approach, the origin of water-soluble proteins was probed through the synthesis and in vitro evolution of very large libraries of random amino acid sequences. In combination with computer modeling and simulations, these experiments allow us to address a number of fundamental questions about the origins of proteins. Can functionality emerge from random sequences of proteins? How did the initial repertoire of functional proteins diversify to facilitate new functions? Did this diversification proceed primarily through drawing novel functionalities from random sequences or through evolution of already existing proto-enzymes? Did protein evolution start from a pool of proteins defined by a frozen accident and other collections of proteins could start a different evolutionary pathway? Although we do not have definitive answers to these questions yet, important clues have been uncovered. In one example (Keefe and Szostak, 2001), novel ATP binding proteins were identified that appear to be unrelated in both sequence and structure to any known ATP binding proteins. One of these proteins was subsequently redesigned computationally to bind GTP through introducing several mutations that introduce targeted structural changes to the protein, improve its binding to guanine and prevent water from accessing the active center. This study facilitates further investigations of individual evolutionary steps that lead to a change of function in primordial proteins. In a second study (Seelig and Szostak, 2007), novel enzymes were generated that can join two pieces of RNA in a reaction for which no natural enzymes are known. 
Recently it was found that, as in the previous case, the proteins have a structure unknown among modern enzymes. In this case, in vitro evolution started from a small, non-enzymatic protein. A similar selection process initiated from a library of random polypeptides is in progress. These results not only allow for estimating the occurrence of function in random protein assemblies but also provide evidence for the possibility of alternative protein worlds. Extant proteins might simply represent a frozen accident in the world of possible proteins. Alternative collections of proteins, even with similar functions, could originate alternative evolutionary paths.
An Integrated Framework Advancing Membrane Protein Modeling and Design
Weitzner, Brian D.; Duran, Amanda M.; Tilley, Drew C.; Elazar, Assaf; Gray, Jeffrey J.
2015-01-01
Membrane proteins are critical functional molecules in the human body, constituting more than 30% of open reading frames in the human genome. Unfortunately, a myriad of difficulties in overexpression and reconstitution into membrane mimetics severely limit our ability to determine their structures. Computational tools are therefore instrumental to membrane protein structure prediction, consequently increasing our understanding of membrane protein function and their role in disease. Here, we describe a general framework facilitating membrane protein modeling and design that combines the scientific principles for membrane protein modeling with the flexible software architecture of Rosetta3. This new framework, called RosettaMP, provides a general membrane representation that interfaces with scoring, conformational sampling, and mutation routines that can be easily combined to create new protocols. To demonstrate the capabilities of this implementation, we developed four proof-of-concept applications for (1) prediction of free energy changes upon mutation; (2) high-resolution structural refinement; (3) protein-protein docking; and (4) assembly of symmetric protein complexes, all in the membrane environment. Preliminary data show that these algorithms can produce meaningful scores and structures. The data also suggest needed improvements to both sampling routines and score functions. Importantly, the applications collectively demonstrate the potential of combining the flexible nature of RosettaMP with the power of Rosetta algorithms to facilitate membrane protein modeling and design. PMID:26325167
Isom, Daniel G; Marguet, Philippe R; Oas, Terrence G; Hellinga, Homme W
2011-04-01
Protein thermodynamic stability is a fundamental physical characteristic that determines biological function. Furthermore, alteration of thermodynamic stability by macromolecular interactions or biochemical modifications is a powerful tool for assessing the relationship between protein structure, stability, and biological function. High-throughput approaches for quantifying protein stability are beginning to emerge that enable thermodynamic measurements on small amounts of material, in short periods of time, and using readily accessible instrumentation. Here we present such a method, fast quantitative cysteine reactivity (fQCR), which exploits the linkage between protein stability, sidechain protection by protein structure, and structural dynamics to characterize the thermodynamic and kinetic properties of proteins. In this approach, the reaction of a protected cysteine and thiol-reactive fluorogenic indicator is monitored over a gradient of temperatures after a short incubation time. These labeling data can be used to determine the midpoint of thermal unfolding, measure the temperature dependence of protein stability, quantify ligand-binding affinity, and, under certain conditions, estimate folding rate constants. Here, we demonstrate the fQCR method by characterizing these thermodynamic and kinetic properties for variants of Staphylococcal nuclease and E. coli ribose-binding protein engineered to contain single, protected cysteines. These straightforward, information-rich experiments are likely to find applications in protein engineering and functional genomics. Copyright © 2010 Wiley-Liss, Inc.
WEBnm@ v2.0: Web server and services for comparing protein flexibility.
Tiwari, Sandhya P; Fuglebakk, Edvin; Hollup, Siv M; Skjærven, Lars; Cragnolini, Tristan; Grindhaug, Svenn H; Tekle, Kidane M; Reuter, Nathalie
2014-12-30
Normal mode analysis (NMA) using elastic network models is a reliable and cost-effective computational method to characterise protein flexibility and by extension, their dynamics. Further insight into the dynamics-function relationship can be gained by comparing protein motions between protein homologs and functional classifications. This can be achieved by comparing normal modes obtained from sets of evolutionarily related proteins. We have developed an automated tool for comparative NMA of a set of pre-aligned protein structures. The user can submit a sequence alignment in the FASTA format and the corresponding coordinate files in the Protein Data Bank (PDB) format. The computed normalised squared atomic fluctuations and atomic deformation energies of the submitted structures can be easily compared on graphs provided by the web user interface. The web server provides pairwise comparison of the dynamics of all proteins included in the submitted set using two measures: the Root Mean Squared Inner Product and the Bhattacharyya Coefficient. The Comparative Analysis has been implemented on our web server for NMA, WEBnm@, which also provides recently upgraded functionality for NMA of single protein structures. This includes new visualisations of protein motion, visualisation of inter-residue correlations and the analysis of conformational change using the overlap analysis. In addition, programmatic access to WEBnm@ is now available through a SOAP-based web service. WEBnm@ is available at http://apps.cbu.uib.no/webnma. WEBnm@ v2.0 is an online tool offering unique capability for comparative NMA on multiple protein structures. Along with a convenient web interface, powerful computing resources, and several methods for mode analyses, WEBnm@ facilitates the assessment of protein flexibility within protein families and superfamilies. These analyses can give a good view of how the structures move and how the flexibility is conserved over the different structures.
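As an illustration of the first comparison measure named in the abstract above, the following is a minimal sketch of the Root Mean Squared Inner Product (RMSIP) in its commonly used form: the square root of the summed squared dot products between two sets of n orthonormal modes, divided by n. The function names and toy vectors are hypothetical, and this is not claimed to be the WEBnm@ implementation; the Bhattacharyya Coefficient is not covered here.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rmsip(modes_a, modes_b):
    """RMSIP between two equally sized sets of orthonormal mode vectors.

    Returns 1.0 when the two sets span the same subspace and 0.0 when
    the subspaces are orthogonal.
    """
    if not modes_a or len(modes_a) != len(modes_b):
        raise ValueError("mode sets must be non-empty and of equal size")
    total = sum(dot(u, v) ** 2 for u in modes_a for v in modes_b)
    return math.sqrt(total / len(modes_a))


# Toy unit vectors standing in for normal modes (hypothetical data).
x = [1.0, 0.0, 0.0]
y = [0.0, 1.0, 0.0]
z = [0.0, 0.0, 1.0]

identical = rmsip([x, y], [x, y])  # same subspace
partial = rmsip([x, y], [x, z])    # one shared direction out of two
disjoint = rmsip([x], [z])         # orthogonal subspaces
```

In practice the mode vectors would have 3N components for N aligned residues, and only the lowest-frequency modes would typically be compared.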
Discovering rules for protein-ligand specificity using support vector inductive logic programming.
Kelley, Lawrence A; Shrimpton, Paul J; Muggleton, Stephen H; Sternberg, Michael J E
2009-09-01
Structural genomics initiatives are rapidly generating vast numbers of protein structures. Comparative modelling is also capable of producing accurate structural models for many protein sequences. However, for many of the known structures, functions are not yet determined, and in many modelling tasks, an accurate structural model does not necessarily tell us about function. Thus, there is a pressing need for high-throughput methods for determining function from structure. The spatial arrangement of key amino acids in a folded protein, on the surface or buried in clefts, is often the determinant of its biological function. A central aim of molecular biology is to understand the relationship between such substructures or surfaces and biological function, leading both to function prediction and to function design. We present a new general method for discovering the features of binding pockets that confer specificity for particular ligands. Using a recently developed machine-learning technique which couples the rule-discovery approach of inductive logic programming with the statistical learning power of support vector machines, we are able to discriminate, with high precision (90%) and recall (86%) between pockets that bind FAD and those that bind NAD on a large benchmark set given only the geometry and composition of the backbone of the binding pocket without the use of docking. In addition, we learn rules governing this specificity which can feed into protein functional design protocols. An analysis of the rules found suggests that key features of the binding pocket may be tied to conformational freedom in the ligand. The representation is sufficiently general to be applicable to any discriminatory binding problem. All programs and data sets are freely available to non-commercial users at http://www.sbg.bio.ic.ac.uk/svilp_ligand/.
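For readers less familiar with the two figures quoted in the abstract above, the sketch below shows how precision and recall are computed from classification counts. The counts are hypothetical, chosen only to give values close to the reported 90% and 86%; they are not taken from the study, and the function name is an assumption for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive
    and false-negative counts."""
    if tp <= 0:
        raise ValueError("need at least one true positive")
    precision = tp / (tp + fp)  # fraction of predicted binders that are correct
    recall = tp / (tp + fn)     # fraction of actual binders that are recovered
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for an FAD-vs-NAD pocket classifier.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=15)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With these made-up counts, precision is exactly 0.90 and recall is about 0.86.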
USDA-ARS's Scientific Manuscript database
Potato leafroll virus (PLRV) produces a readthrough protein (RTP) via translational readthrough of the coat protein amber stop codon. The RTP functions as a structural component of the virion and as a non-incorporated protein in concert with numerous insect and plant proteins to regulate virus movem...
Covering complete proteomes with X-ray structures: A current snapshot
Mizianty, Marcin J.; Fan, Xiao; Yan, Jing; ...
2014-10-23
Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.
Crystal structure of the YDR533c S. cerevisiae protein, a class II member of the Hsp31 family.
Graille, Marc; Quevillon-Cheruel, Sophie; Leulliot, Nicolas; Zhou, Cong-Zhao; Li de la Sierra Gallay, Ines; Jacquamet, Lilian; Ferrer, Jean-Luc; Liger, Dominique; Poupon, Anne; Janin, Joel; van Tilbeurgh, Herman
2004-05-01
The ORF YDR533c from Saccharomyces cerevisiae codes for a 25.5 kDa protein of unknown biochemical function. Transcriptome analysis of yeast has shown that this gene is activated in response to various stress conditions together with proteins belonging to the heat shock family. In order to clarify its biochemical function, we determined the crystal structure of YDR533c to 1.85 Å resolution by the single anomalous diffraction method. The protein possesses an alpha/beta hydrolase fold and a putative Cys-His-Glu catalytic triad common to a large enzyme family containing proteases, amidotransferases, lipases, and esterases. The protein has strong structural resemblance with the E. coli Hsp31 protein and the intracellular protease I from Pyrococcus horikoshii, which are considered class I and class III members of the Hsp31 family, respectively. Detailed structural analysis strongly suggests that the YDR533c protein crystal structure is the first one of a class II member of the Hsp31 family.
NASA Astrophysics Data System (ADS)
Ward, Meaghan E.; Brown, Leonid S.; Ladizhansky, Vladimir
2015-04-01
Studies of the structure, dynamics, and function of membrane proteins (MPs) have long been considered one of the main applications of solid-state NMR (SSNMR). Advances in instrumentation, and the plethora of new SSNMR methodologies developed over the past decade have resulted in a number of high-resolution structures and structural models of both bitopic and polytopic α-helical MPs. The necessity to retain lipids in the sample, the high proportion of one type of secondary structure, differential dynamics, and the possibility of local disorder in the loop regions all create challenges for structure determination. In this Perspective article we describe our recent efforts directed at determining the structure and functional dynamics of Anabaena Sensory Rhodopsin, a heptahelical transmembrane (7TM) protein. We review some of the established and emerging methods which can be utilized for SSNMR-based structure determination, with a particular focus on those used for ASR, a bacterial protein which shares its 7TM architecture with G-protein coupled receptors.
Siponen, Marina I.; Wisniewska, Magdalena; Lehtiö, Lari; Johansson, Ida; Svensson, Linda; Raszewski, Grzegorz; Nilsson, Lennart; Sigvardsson, Mikael; Berglund, Helena
2010-01-01
The early B-cell factor (EBF) transcription factors are central regulators of development in several organs and tissues. This protein family shows low sequence similarity to other protein families, which is why structural information for the functional domains of these proteins is crucial to understand their biochemical features. We have used a modular approach to determine the crystal structures of the structured domains in the EBF family. The DNA binding domain reveals a striking resemblance to the DNA binding domains of the Rel homology superfamily of transcription factors but contains a unique zinc binding structure, termed zinc knuckle. Further the EBF proteins contain an IPT/TIG domain and an atypical helix-loop-helix domain with a novel type of dimerization motif. The data presented here provide insights into unique structural features of the EBF proteins and open possibilities for detailed molecular investigations of this important transcription factor family. PMID:20592035
Future directions of electron crystallography.
Fujiyoshi, Yoshinori
2013-01-01
In biological science, there are still many interesting and fundamental yet difficult questions, such as those in neuroscience, remaining to be answered. Structural and functional studies of membrane proteins, which are key molecules of signal transduction in neural and other cells, are essential for understanding the molecular mechanisms of many fundamental biological processes. Technological and instrumental advancements of electron microscopy have facilitated comprehension of structural studies of biological components, such as membrane proteins. While X-ray crystallography has been the main method of structure analysis of proteins including membrane proteins, electron crystallography is now an established technique to analyze structures of membrane proteins in the lipid bilayer, which is close to their natural biological environment. By utilizing cryo-electron microscopes with helium-cooled specimen stages, structures of membrane proteins were analyzed at a resolution better than 3 Å. Such high-resolution structural analysis of membrane proteins by electron crystallography opens up the new research field of structural physiology. Considering the fact that the structures of integral membrane proteins in their native membrane environment without artifacts from crystal contacts are critical in understanding their physiological functions, electron crystallography will continue to be an important technology for structural analysis. In this chapter, I will present several examples to highlight important advantages and to suggest future directions of this technique.
Barradas-Bautista, Didier; Moal, Iain H; Fernández-Recio, Juan
2017-07-01
Protein-protein interactions play fundamental roles in biological processes including signaling, metabolism, and trafficking. While the structure of a protein complex reveals crucial details about the interaction, it is often difficult to acquire this information experimentally. As the number of interactions discovered increases faster than they can be characterized, protein-protein docking calculations may be able to reduce this disparity by providing models of the interacting proteins. Rigid-body docking is a widely used docking approach, and is often capable of generating a pool of models within which a near-native structure can be found. These models need to be scored in order to select the acceptable ones from the set of poses. Recently, more than 100 scoring functions from the CCharPPI server were evaluated for this task using decoy structures generated with SwarmDock. Here, we extend this analysis to identify the predictive success rates of the scoring functions on decoys from three rigid-body docking programs, ZDOCK, FTDock, and SDOCK, allowing us to assess the transferability of the functions. We also apply a set-theoretic measure to test whether the scoring functions are capable of identifying near-native poses within different subsets of the benchmark. This information can provide guides for the use of the most efficient scoring function for each docking method, as well as instruct future scoring function development efforts. Proteins 2017; 85:1287-1297. © 2017 Wiley Periodicals, Inc.
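The abstract above does not define how its success rates are computed. One common convention in docking benchmarks is the top-N success rate: the fraction of benchmark cases for which at least one near-native pose appears among the N best-scored decoys. The sketch below assumes that convention; the function name and the toy rankings are hypothetical and are not data from the study.

```python
def top_n_success_rate(ranked_cases, n):
    """Fraction of cases with a near-native pose in the top n ranks.

    ranked_cases: one list per benchmark case, ordered best score first,
    where each entry is True if that pose is near-native.
    """
    if not ranked_cases:
        raise ValueError("need at least one benchmark case")
    hits = sum(1 for poses in ranked_cases if any(poses[:n]))
    return hits / len(ranked_cases)


# Four hypothetical benchmark cases, each with four ranked decoys.
cases = [
    [False, True, False, False],   # near-native pose at rank 2
    [False, False, False, True],   # near-native pose at rank 4
    [False, False, False, False],  # no near-native pose generated
    [True, False, False, False],   # near-native pose at rank 1
]

top1 = top_n_success_rate(cases, 1)
top3 = top_n_success_rate(cases, 3)
top4 = top_n_success_rate(cases, 4)
```

Computing the same quantity for each scoring function on decoys from each docking program gives a table from which transferability across programs can be judged.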
Structural and Functional Assessment of APOBEC3G Macromolecular Complexes
Polevoda, Bogdan; McDougall, William M.; Bennett, Ryan P.; Salter, Jason D.; Smith, Harold C.
2016-01-01
There are eleven members in the human APOBEC family of proteins that are evolutionarily related through their zinc-dependent cytidine deaminase domains. The human APOBEC gene clusters arose on chromosomes 6 and 22 through gene duplication and divergence to where current day APOBEC proteins are functionally diverse and broadly expressed in tissues. APOBEC proteins serve enzymatic and non-enzymatic functions in cells. In both cases, formation of higher-order structures driven by APOBEC protein-protein interactions and binding to RNA and/or single stranded DNA are integral to their function. In some circumstances, these interactions are regulatory and modulate APOBEC activities. We are just beginning to understand how macromolecular interactions drive processes such as APOBEC subcellular compartmentalization, formation of holoenzyme complexes, gene targeting, foreign DNA restriction, anti-retroviral activity, formation of ribonucleoprotein particles and APOBEC degradation. Protein-protein and protein-nucleic acid cross-linking methods coupled with mass spectrometry, electrophoretic mobility shift assays, glycerol gradient sedimentation, fluorescence anisotropy and APOBEC deaminase assays are enabling mapping of interacting surfaces that are essential for these functions. The goal of this methods review is, through the example of our research on APOBEC3G, to describe the application of cross-linking methods to characterize and quantify macromolecular interactions and their functional implications. Given the homology in structure and function, it is proposed that these methods will be generally applicable to the discovery process for other APOBEC and RNA and DNA editing and modifying proteins. PMID:26988126
Bhadra, Pratiti; Pal, Debnath
2017-04-01
Dynamics is integral to the function of proteins, yet the use of molecular dynamics (MD) simulation as a technique remains under-explored for molecular function inference. This is more important in the context of genomics projects where novel proteins are determined with limited evolutionary information. Recently we developed a method to match the query protein's flexible segments to infer function using a novel approach combining analysis of residue fluctuation-graphs and auto-correlation vectors derived from coarse-grained (CG) MD trajectory. The method was validated on a diverse dataset with sequence identity between proteins as low as 3%, with high function-recall rates. Here we share its implementation as a publicly accessible web service, named DynFunc (Dynamics Match for Function) to query protein function from ≥1 µs long CG dynamics trajectory information of protein subunits. Users are provided with the custom-developed coarse-grained molecular mechanics (CGMM) forcefield to generate the MD trajectories for their protein of interest. On upload of trajectory information, the DynFunc web server identifies specific flexible regions of the protein linked to putative molecular function. Our unique application does not use evolutionary information to infer molecular function from MD information and can, therefore, work for all proteins, including moonlighting and the novel ones, whenever structural information is available. Our pipeline is expected to be of utility to all structural biologists working with novel proteins and interested in moonlighting functions. Copyright © 2017 Elsevier Ltd. All rights reserved.
Chowdhury, S Roy; Cao, Jin; He, Yufan; Lu, H Peter
2018-03-27
Manipulating protein conformations for exploring protein structure-function relationship has shown great promise. Although protein conformational changes under pulling force manipulation have been extensively studied, protein conformational changes under a compressive force have not been explored quantitatively. The latter is even more biologically significant and relevant in revealing protein functions in living cells associated with protein crowdedness, distribution fluctuations, and cell osmotic stress. Here we report our experimental observations on abrupt ruptures of protein native structures under compressive force, demonstrated and studied by single-molecule AFM-FRET spectroscopic nanoscopy. Our results show that the protein ruptures are abrupt and spontaneous events that occur when the compressive force reaches a threshold of 12-75 pN, a force amplitude accessible from thermal fluctuations in a living cell. The abrupt ruptures are sensitive to local environment, likely a general and important pathway of protein unfolding in living cells.
Conformational Transitions in Molecular Systems
NASA Astrophysics Data System (ADS)
Bachmann, M.; Janke, W.
2008-11-01
Proteins are the "work horses" in biological systems. In almost all functions specific proteins are involved. They control molecular transport processes, stabilize the cell structure, enzymatically catalyze chemical reactions; others act as molecular motors in the complex machinery of molecular synthetization processes. Due to their significance, misfolds and malfunctions of proteins typically entail disastrous diseases, such as Alzheimer's disease and bovine spongiform encephalopathy (BSE). Therefore, the understanding of the trinity of amino acid composition, geometric structure, and biological function is one of the most essential challenges for the natural sciences. Here, we glance at conformational transitions accompanying the structure formation in protein folding processes.
Toward high-resolution computational design of helical membrane protein structure and function
Barth, Patrick; Senes, Alessandro
2016-01-01
The computational design of α-helical membrane proteins is still in its infancy but has made important progress. De novo design has produced stable, specific and active minimalistic oligomeric systems. Computational re-engineering can improve stability and modulate the function of natural membrane proteins. Currently, the major hurdle for the field is not computational, but the experimental characterization of the designs. The emergence of new structural methods for membrane proteins will accelerate progress. PMID:27273630
Barth, Patrick; Senes, Alessandro
2016-06-07
The computational design of α-helical membrane proteins is still in its infancy but has already made great progress. De novo design allows stable, specific and active minimal oligomeric systems to be obtained. Computational reengineering can improve the stability and function of naturally occurring membrane proteins. Currently, the major hurdle for the field is the experimental characterization of the designs. The emergence of new structural methods for membrane proteins will accelerate progress.
Structure-Based Characterization of Multiprotein Complexes
Wiederstein, Markus; Gruber, Markus; Frank, Karl; Melo, Francisco; Sippl, Manfred J.
2014-01-01
Multiprotein complexes govern virtually all cellular processes. Their 3D structures provide important clues to their biological roles, especially through structural correlations among protein molecules and complexes. The detection of such correlations generally requires comprehensive searches in databases of known protein structures by means of appropriate structure-matching techniques. Here, we present a high-speed structure search engine capable of instantly matching large protein oligomers against the complete and up-to-date database of biologically functional assemblies of protein molecules. We use this tool to reveal unseen structural correlations on the level of protein quaternary structure and demonstrate its general usefulness for efficiently exploring complex structural relationships among known protein assemblies. PMID:24954616
Structure, Biology, and Therapeutic Application of Toxin-Antitoxin Systems in Pathogenic Bacteria.
Lee, Ki-Young; Lee, Bong-Jin
2016-10-22
Bacterial toxin-antitoxin (TA) systems have received increasing attention for their diverse identities, structures, and functional implications in cell cycle arrest and survival against environmental stresses such as nutrient deficiency, antibiotic treatments, and immune system attacks. In this review, we describe the biological functions and the auto-regulatory mechanisms of six different types of TA systems, among which the type II TA system has been most extensively studied. The functions of type II toxins include mRNA/tRNA cleavage, gyrase/ribosome poison, and protein phosphorylation, which can be neutralized by their cognate antitoxins. We mainly explore the similar but divergent structures of type II TA proteins from 12 important pathogenic bacteria, including various aspects of protein-protein interactions. Accumulating knowledge about the structure-function correlation of TA systems from pathogenic bacteria has facilitated a novel strategy to develop antibiotic drugs that target specific pathogens. These molecules could increase the intrinsic activity of the toxin by artificially interfering with the intermolecular network of the TA systems.
Li de La Sierra-Gallay, Ines; Collinet, Bruno; Graille, Marc; Quevillon-Cheruel, Sophie; Liger, Dominique; Minard, Philippe; Blondeau, Karine; Henckes, Gilles; Aufrère, Robert; Leulliot, Nicolas; Zhou, Cong-Zhao; Sorel, Isabelle; Ferrer, Jean-Luc; Poupon, Anne; Janin, Joël; van Tilbeurgh, Herman
2004-03-01
The protein product of the YGR205w gene of Saccharomyces cerevisiae was targeted as part of our yeast structural genomics project. YGR205w codes for a small (290 amino acids) protein with unknown structure and function. The only recognizable sequence feature is the presence of a Walker A motif (P loop) indicating a possible nucleotide binding/converting function. We determined the three-dimensional crystal structure of Se-methionine substituted protein using multiple anomalous diffraction. The structure revealed a well known mononucleotide fold and strong resemblance to the structure of small metabolite phosphorylating enzymes such as pantothenate kinase and phosphoribulokinase. Biochemical experiments show that YGR205w specifically binds ATP and, less tightly, ADP. The structure also revealed the presence of two bound sulphate ions, occupying opposite niches in a canyon that corresponds to the active site of the protein. One sulphate is bound to the P-loop in a position that corresponds to the position of the beta-phosphate in mononucleotide protein-ATP complexes, suggesting the protein is indeed a kinase. The nature of the phosphate accepting substrate remains to be determined. Copyright 2004 Wiley-Liss, Inc.
Kryshtafovych, Andriy; Moult, John; Bales, Patrick; Bazan, J. Fernando; Biasini, Marco; Burgin, Alex; Chen, Chen; Cochran, Frank V.; Craig, Timothy K.; Das, Rhiju; Fass, Deborah; Garcia-Doval, Carmela; Herzberg, Osnat; Lorimer, Donald; Luecke, Hartmut; Ma, Xiaolei; Nelson, Daniel C.; van Raaij, Mark J.; Rohwer, Forest; Segall, Anca; Seguritan, Victor; Zeth, Kornelius; Schwede, Torsten
2014-01-01
For the last two decades, CASP has assessed the state of the art in techniques for protein structure prediction and identified areas which required further development. CASP would not have been possible without the prediction targets provided by the experimental structural biology community. In the latest experiment, CASP10, over 100 structures were suggested as prediction targets, some of which appeared to be extraordinarily difficult for modeling. In this paper, authors of some of the most challenging targets discuss which specific scientific question motivated the experimental structure determination of the target protein, which structural features were especially interesting from a structural or functional perspective, and to what extent these features were correctly reproduced in the predictions submitted to CASP10. Specifically, the following targets will be presented: the acid-gated urea channel, a difficult to predict trans-membrane protein from the important human pathogen Helicobacter pylori; the structure of human interleukin IL-34, a recently discovered helical cytokine; the structure of a functionally uncharacterized enzyme OrfY from Thermoproteus tenax formed by a gene duplication and a novel fold; an ORFan domain of mimivirus sulfhydryl oxidase R596; the fibre protein gp17 from bacteriophage T7; the Bacteriophage CBA-120 tailspike protein; a virus coat protein from metagenomic samples of the marine environment; and finally an unprecedented class of structure prediction targets based on engineered disulfide-rich small proteins. PMID:24318984
Protein Structures Revealed at Record Pace
Hura, Greg
2017-12-11
The structure of a protein in days -- not months or years -- ushers in a new era in genomics research. Berkeley Lab scientists have developed a high-throughput protein pipeline that could expedite the development of biofuels and elucidate how proteins carry out life's vital functions.
Protein Structures Revealed at Record Pace
Greg Hura
2017-12-09
The structure of a protein in days -- not months or years -- ushers in a new era in genomics research. Berkeley Lab scientists have developed a high-throughput protein pipeline that could expedite the development of biofuels and elucidate how proteins carry out life's vital functions.
A new family of β-helix proteins with similarities to the polysaccharide lyases
Close, Devin W.; D'Angelo, Sara; Bradbury, Andrew R. M.
2014-09-27
Microorganisms that degrade biomass produce diverse assortments of carbohydrate-active enzymes and binding modules. Despite tremendous advances in the genomic sequencing of these organisms, many genes do not have an ascribed function owing to low sequence identity to genes that have been annotated. Consequently, biochemical and structural characterization of genes with unknown function is required to complement the rapidly growing pool of genomic sequencing data. A protein with previously unknown function (Cthe_2159) was recently isolated in a genome-wide screen using phage display to identify cellulose-binding protein domains from the biomass-degrading bacterium Clostridium thermocellum. Here, the crystal structure of Cthe_2159 is presented and it is shown that it is a unique right-handed parallel β-helix protein. Despite very low sequence identity to known β-helix or carbohydrate-active proteins, Cthe_2159 displays structural features that are very similar to those of polysaccharide lyase (PL) families 1, 3, 6 and 9. Cthe_2159 is conserved across bacteria and some archaea and is a member of the domain of unknown function family DUF4353. This suggests that Cthe_2159 is the first representative of a previously unknown family of cellulose and/or acid-sugar binding β-helix proteins that share structural similarities with PLs. More importantly, these results demonstrate how functional annotation by biochemical and structural analysis remains a critical tool in the characterization of new gene products.
A new family of β-helix proteins with similarities to the polysaccharide lyases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Close, Devin W.; D'Angelo, Sara; Bradbury, Andrew R. M.
Microorganisms that degrade biomass produce diverse assortments of carbohydrate-active enzymes and binding modules. Despite tremendous advances in the genomic sequencing of these organisms, many genes do not have an ascribed function owing to low sequence identity to genes that have been annotated. Consequently, biochemical and structural characterization of genes with unknown function is required to complement the rapidly growing pool of genomic sequencing data. A protein with previously unknown function (Cthe_2159) was recently isolated in a genome-wide screen using phage display to identify cellulose-binding protein domains from the biomass-degrading bacterium Clostridium thermocellum. Here, the crystal structure of Cthe_2159 is presented and it is shown that it is a unique right-handed parallel β-helix protein. Despite very low sequence identity to known β-helix or carbohydrate-active proteins, Cthe_2159 displays structural features that are very similar to those of polysaccharide lyase (PL) families 1, 3, 6 and 9. Cthe_2159 is conserved across bacteria and some archaea and is a member of the domain of unknown function family DUF4353. This suggests that Cthe_2159 is the first representative of a previously unknown family of cellulose and/or acid-sugar binding β-helix proteins that share structural similarities with PLs. More importantly, these results demonstrate how functional annotation by biochemical and structural analysis remains a critical tool in the characterization of new gene products.
EHD proteins: Key conductors of endocytic transport
Naslavsky, Naava; Caplan, Steve
2010-01-01
Regulation of endocytic transport is controlled by an elaborate network of proteins. Rab GTP-binding proteins and their effectors have well-defined roles in mediating specific endocytic transport steps, but until recently, less was known about the four mammalian dynamin-like C-terminal Eps15 Homology Domain (EHD) proteins that also regulate endocytic events. In recent years, however, great strides have been made in understanding the structure and function of these unique proteins. Indeed, a growing body of literature addresses EHD protein structure, interactions with binding partners, functions in mammalian cells, and the generation of various new model systems. Accordingly, this is now an opportune time to pause and review the function and mechanisms of action of EHD proteins, and to highlight some of the challenges and future directions for the field. PMID:21067929
α-Crystallins Are Small Heat Shock Proteins: Functional and Structural Properties.
Tikhomirova, T S; Selivanova, O M; Galzitskaya, O V
2017-02-01
During its life cycle, a cell can be subjected to various external negative effects. Many proteins provide cell protection, including small heat shock proteins (sHsp) that have chaperone-like activity. These proteins have several important functions involving prevention of apoptosis and retention of cytoskeletal integrity; also, sHsp take part in the recovery of enzyme activity. The action mechanism of sHsp is based on the binding of hydrophobic regions exposed on the surface of a molten globule. α-Crystallins, present in chordate cells as two isoforms (αA and αB), are the most studied small heat shock proteins. In this review, we describe the main functions of α-crystallins, features of their secondary and tertiary structures, and examples of their partners in protein-protein interactions.
Functional understanding of the diverse exon-intron structures of human GPCR genes.
Hammond, Dorothy A; Olman, Victor; Xu, Ying
2014-02-01
The GPCR genes have a variety of exon-intron structures even though their proteins are all structurally homologous. We have examined all human GPCR genes with at least two functional protein isoforms, totaling 199, aiming to gain an understanding of what may have contributed to the large diversity of the exon-intron structures of the GPCR genes. The 199 genes have a total of 808 known protein splicing isoforms with experimentally verified functions. Our analysis reveals that 1,301 (80.6%) adjacent exon-exon pairs out of the total of 1,613 in the 199 genes have either exactly one exon skipped or the intron in-between retained in at least one of the 808 protein splicing isoforms. This observation has a p-value of 2.051762 × 10⁻⁹ under the null hypothesis that the observed splicing isoforms are independent of the exon-intron structures. Our interpretation of this observation is that the exon boundaries of the GPCR genes are not randomly determined; instead they may be selected to facilitate specific alternative splicing for functional purposes.
FERM proteins in animal morphogenesis.
Tepass, Ulrich
2009-08-01
Proteins containing a FERM domain are ubiquitous components of the cytocortex of animal cells where they are engaged in structural, transport, and signaling functions. Recent years have seen a wealth of genetic studies in model organisms that explore FERM protein function in development and tissue organization. In addition, mutations in several FERM protein-encoding genes have been associated with human diseases. This review will provide a brief overview of the FERM domain structure and the FERM protein superfamily and then discuss recent advances in our understanding of the mechanism of function and developmental requirement of several FERM proteins including Moesin, Myosin-VIIA, Myosin-XV, Coracle/Band4.1 as well as Yurt and its vertebrate homologs Mosaic Eyes and EPB41L5/YMO1/Limulus.
Porphyrin mediated photo-modification of the structure and function of human serum albumin
NASA Astrophysics Data System (ADS)
Rozinek, Sarah C.
Photosensitization reactions involve irradiating, with visible light, molecules (photosensitizers) that have a high efficiency for either electron transfer or entry into an excited triplet state. Such reactions are applied to photodynamic cancer therapy, many medical laser-treatments, and a potential array of disinfection and pest elimination techniques. To understand the biophysical mechanisms of how these applications are effective at the protein level, the group of Dr. Brancaleon (UTSA) has investigated the irradiation of several dye-protein combinations, and discovered effects on protein structure and function. To further that work, we have investigated irradiation of the protein, human serum albumin (HSA), photosensitized by either protoporphyrin IX (PPIX) or meso-tetrakis(4-sulfonatophenyl)porphyrin (TSPP). HSA is the most abundant plasma protein, making it a likely substrate in PDT, and it possesses a specific binding pocket for iron-PPIX (heme) and possibly other porphyrin derivatives. The results of our research are summarized as follows. First, a thorough characterization of the binding of each photosensitizer to albumin was completed, elucidating a probable binding location for TSPP. Next, fluorescence lifetime emission of the single tryptophan residue, alongside circular dichroism, revealed tertiary structural changes around tryptophan and an overall 20% decrease in protein secondary structure after irradiation with TSPP bound. Finally, to determine if protein function was lost after photosensitization, size exclusion chromatography found modified albumin still recognizable by its receptor-protein, and comparative ex vivo up-take studies revealed that modified albumin is not processed the same way as native albumin in live tapeworm larvae (Mesocestoides corti). Thus we found that visible light can induce partial unfolding of a protein by using a photo-activated ligand. These small structural modifications were sufficient to affect the protein's biological function.
Evolutionarily Conserved Linkage between Enzyme Fold, Flexibility, and Catalysis
Ramanathan, Arvind; Agarwal, Pratul K.
2011-01-01
Proteins are intrinsically flexible molecules. The role of internal motions in a protein's designated function is widely debated. The role of protein structure in enzyme catalysis is well established, and conservation of structural features provides vital clues to their role in function. Recently, it has been proposed that the protein function may involve multiple conformations: the observed deviations are not random thermodynamic fluctuations; rather, flexibility may be closely linked to protein function, including enzyme catalysis. We hypothesize that the argument of conservation of important structural features can also be extended to identification of protein flexibility in interconnection with enzyme function. Three classes of enzymes (prolyl-peptidyl isomerase, oxidoreductase, and nuclease) that catalyze diverse chemical reactions have been examined using detailed computational modeling. For each class, the identification and characterization of the internal protein motions coupled to the chemical step in enzyme mechanisms in multiple species show identical enzyme conformational fluctuations. In addition to the active-site residues, motions of protein surface loop regions (>10 Å away) are observed to be identical across species, and networks of conserved interactions/residues connect these highly flexible surface regions to the active-site residues that make direct contact with substrates. More interestingly, examination of reaction-coupled motions in non-homologous enzyme systems (with no structural or sequence similarity) that catalyze the same biochemical reaction shows motions that induce remarkably similar changes in the enzyme–substrate interactions during catalysis. The results indicate that the reaction-coupled flexibility is a conserved aspect of the enzyme molecular architecture. 
Protein motions in distal areas of homologous and non-homologous enzyme systems mediate similar changes in the active-site enzyme–substrate interactions, thereby impacting the mechanism of catalyzed chemistry. These results have implications for understanding the mechanism of allostery, and for protein engineering and drug design. PMID:22087074
Evolutionarily conserved linkage between enzyme fold, flexibility, and catalysis.
Ramanathan, Arvind; Agarwal, Pratul K
2011-11-01
Proteins are intrinsically flexible molecules. The role of internal motions in a protein's designated function is widely debated. The role of protein structure in enzyme catalysis is well established, and conservation of structural features provides vital clues to their role in function. Recently, it has been proposed that the protein function may involve multiple conformations: the observed deviations are not random thermodynamic fluctuations; rather, flexibility may be closely linked to protein function, including enzyme catalysis. We hypothesize that the argument of conservation of important structural features can also be extended to identification of protein flexibility in interconnection with enzyme function. Three classes of enzymes (prolyl-peptidyl isomerase, oxidoreductase, and nuclease) that catalyze diverse chemical reactions have been examined using detailed computational modeling. For each class, the identification and characterization of the internal protein motions coupled to the chemical step in enzyme mechanisms in multiple species show identical enzyme conformational fluctuations. In addition to the active-site residues, motions of protein surface loop regions (>10 Å away) are observed to be identical across species, and networks of conserved interactions/residues connect these highly flexible surface regions to the active-site residues that make direct contact with substrates. More interestingly, examination of reaction-coupled motions in non-homologous enzyme systems (with no structural or sequence similarity) that catalyze the same biochemical reaction shows motions that induce remarkably similar changes in the enzyme-substrate interactions during catalysis. The results indicate that the reaction-coupled flexibility is a conserved aspect of the enzyme molecular architecture. 
Protein motions in distal areas of homologous and non-homologous enzyme systems mediate similar changes in the active-site enzyme-substrate interactions, thereby impacting the mechanism of catalyzed chemistry. These results have implications for understanding the mechanism of allostery, and for protein engineering and drug design.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramanathan, Arvind; Agarwal, Pratul K
Proteins are intrinsically flexible molecules. The role of internal motions in a protein's designated function is widely debated. The role of protein structure in enzyme catalysis is well established, and conservation of structural features provides vital clues to their role in function. Recently, it has been proposed that the protein function may involve multiple conformations: the observed deviations are not random thermodynamic fluctuations; rather, flexibility may be closely linked to protein function, including enzyme catalysis. We hypothesize that the argument of conservation of important structural features can also be extended to identification of protein flexibility in interconnection with enzyme function. Three classes of enzymes (prolyl-peptidyl isomerase, oxidoreductase, and nuclease) that catalyze diverse chemical reactions have been examined using detailed computational modeling. For each class, the identification and characterization of the internal protein motions coupled to the chemical step in enzyme mechanisms in multiple species show identical enzyme conformational fluctuations. In addition to the active-site residues, motions of protein surface loop regions (>10 Å away) are observed to be identical across species, and networks of conserved interactions/residues connect these highly flexible surface regions to the active-site residues that make direct contact with substrates. More interestingly, examination of reaction-coupled motions in non-homologous enzyme systems (with no structural or sequence similarity) that catalyze the same biochemical reaction shows motions that induce remarkably similar changes in the enzyme-substrate interactions during catalysis. The results indicate that the reaction-coupled flexibility is a conserved aspect of the enzyme molecular architecture. 
Protein motions in distal areas of homologous and non-homologous enzyme systems mediate similar changes in the active-site enzyme-substrate interactions, thereby impacting the mechanism of catalyzed chemistry. These results have implications for understanding the mechanism of allostery, and for protein engineering and drug design.
Zhang, Shuxing; Kaplan, Andrew H.; Tropsha, Alexander
2009-01-01
The Simplicial Neighborhood Analysis of Protein Packing (SNAPP) method was used to predict the effect of mutagenesis on the enzymatic activity of the HIV-1 protease (HIVP). SNAPP relies on a four-body statistical scoring function derived from the analysis of spatially nearest neighbor residue compositional preferences in a diverse and representative subset of protein structures from the Protein Data Bank. The method was applied to the analysis of HIVP mutants with residue substitutions in the hydrophobic core as well as at the interface between the two protease monomers. Both wild type and tethered structures were employed in the calculations. We obtained a strong correlation, with R² as high as 0.96, between ΔSNAPP score (i.e., the difference in SNAPP scores between wild type and mutant proteins) and the protease catalytic activity for tethered structures. A weaker but significant correlation was also obtained for non-tethered structures. Our analysis identified residues both in the hydrophobic core and at the dimeric interface (DI) that are very important for the protease function. This study demonstrates a potential utility of the SNAPP method for rational design of mutagenesis studies and protein engineering. PMID:18498108
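The correlation reported above can be illustrated with a small sketch. It assumes that ΔSNAPP is taken as wild-type score minus mutant score (the abstract does not state the sign convention) and that R² is the squared Pearson correlation from a simple linear fit. The function names are hypothetical, and the numbers used in the usage note are invented to exercise the code; they are not data from the study.

```python
def delta_snapp(wild_type_score, mutant_scores):
    """Difference in score between wild type and each mutant.

    Sign convention (wild type minus mutant) is an assumption.
    """
    return [wild_type_score - m for m in mutant_scores]


def r_squared(xs, ys):
    """Squared Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return (sxy * sxy) / (sxx * syy)
```

For example, `r_squared(delta_snapp(10.0, [9.0, 8.0, 7.0]), [5.0, 3.0, 1.0])` evaluates the fit for three made-up mutants whose activity falls linearly as the score difference grows.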
Looking at the Disordered Proteins through the Computational Microscope.
Das, Payel; Matysiak, Silvina; Mittal, Jeetain
2018-05-23
Intrinsically disordered proteins (IDPs) have attracted wide interest over the past decade due to their surprising prevalence in the proteome and versatile roles in cell physiology and pathology. A large selection of IDPs has been identified as potential targets for therapeutic intervention. Characterizing the structure-function relationship of disordered proteins is therefore an essential but daunting task, as these proteins can adopt transient structures, necessitating a new paradigm for connecting structural disorder to function. Molecular simulation has emerged as a natural complement to experiments for atomic-level characterizations and mechanistic investigations of this intriguing class of proteins. The diverse range of length and time scales involved in IDP function requires performing simulations at multiple levels of resolution. In this Outlook, we focus on summarizing available simulation methods, along with a few interesting example applications. We also provide an outlook on how these simulation methods can be further improved in order to provide a more accurate description of IDP structure, binding, and assembly.
Native state volume fluctuations in proteins as a mechanism for dynamic allostery
Law, Anthony B.; Sapienza, Paul J.; Zhang, Jun; ...
2017-01-17
Allostery enables tight regulation of protein function in the cellular environment. While existing models of allostery are firmly rooted in the current structure-function paradigm, the mechanistic basis for allostery in the absence of structural change remains unclear. In this study, we show that a typical globular protein is able to undergo significant changes in volume under native conditions while exhibiting no additional changes in protein structure. These native state volume fluctuations were found to correlate with changes in internal motions that were previously recognized as a source of allosteric entropy. This finding offers a novel mechanistic basis for allostery in the absence of canonical structural change. As a result, the unexpected observation that function can be derived from expanded, low density protein states has broad implications for our understanding of allostery and suggests that the general concept of the native state be expanded to allow for more variable physical dimensions with looser packing.
The mechanism of protein export enhancement by the SecDF membrane component
Tsukazaki, Tomoya; Nureki, Osamu
2011-01-01
Protein transport across membranes is a fundamental and essential cellular activity in all organisms. In bacteria, protein export across the cytoplasmic membrane, driven by dynamic interplays between the protein-conducting SecYEG channel (Sec translocon) and the SecA ATPase, is enhanced by the proton motive force (PMF) and a membrane-integrated Sec component, SecDF. However, the structure and function of SecDF have remained unclear. We solved the first crystal structure of SecDF, consisting of a pseudo-symmetrical 12-helix transmembrane domain and two protruding periplasmic domains. Based on the structural features, we proposed that SecDF functions as a membrane-integrated chaperone, which drives protein movement without using the major energetic currency, ATP, but with remarkable cycles of conformational changes, powered by the proton gradient across the membrane. By a series of biochemical and biophysical approaches, several functionally important residues in the transmembrane region have been identified and our model of the SecDF function has been verified. PMID:27857601
Soluble expression, purification and characterization of the full length IS2 Transposase.
Lewis, Leslie A; Astatke, Mekbib; Umekubo, Peter T; Alvi, Shaheen; Saby, Robert; Afrose, Jehan
2011-10-27
The two-step transposition pathway of insertion sequences of the IS3 family, and several other families, involves first the formation of a branched figure-of-eight (F-8) structure by an asymmetric single strand cleavage at one optional donor end and joining to the flanking host DNA near the target end. Its conversion to a double stranded minicircle precedes the second insertional step, where both ends function as donors. In IS2, the left end which lacks donor function in Step I acquires it in Step II. The assembly of two intrinsically different protein-DNA complexes in these F-8 generating elements has been intuitively proposed, but a barrier to testing this hypothesis has been the difficulty of isolating a full length, soluble and active transposase that creates fully formed synaptic complexes in vitro with protein bound to both binding and catalytic domains of the ends. We address here a solution to expressing, purifying and structurally analyzing such a protein. A soluble and active IS2 transposase derivative with GFP fused to its C-terminus functions as efficiently as the native protein in in vivo transposition assays. In vitro electrophoretic mobility shift assay data show that the partially purified protein prepared under native conditions binds very efficiently to cognate DNA, utilizing both N- and C-terminal residues. As a precursor to biophysical analyses of these complexes, a fluorescence-based random mutagenesis protocol was developed that enabled a structure-function analysis of the protein with good resolution at the secondary structure level. The results extend previous structure-function work on IS3 family transposases, identifying the binding domain as a three helix H + HTH bundle and explaining the function of an atypical leucine zipper-like motif in IS2. 
In addition, gain- and loss-of-function mutations in the catalytic active site define its role in regional and global binding and identify functional signatures that are common to the three dimensional catalytic core motif of the retroviral integrase superfamily. Intractably insoluble transposases, such as the IS2 transposase, prepared by solubilization protocols are often refractory to whole protein structure-function studies. The results described here have validated the use of GFP-tagging and fluorescence-based random mutagenesis in overcoming this limitation at the secondary structure level.
Soluble expression, purification and characterization of the full length IS2 Transposase
2011-01-01
Background: The two-step transposition pathway of insertion sequences of the IS3 family, and several other families, involves first the formation of a branched figure-of-eight (F-8) structure by an asymmetric single strand cleavage at one optional donor end and joining to the flanking host DNA near the target end. Its conversion to a double stranded minicircle precedes the second insertional step, where both ends function as donors. In IS2, the left end which lacks donor function in Step I acquires it in Step II. The assembly of two intrinsically different protein-DNA complexes in these F-8 generating elements has been intuitively proposed, but a barrier to testing this hypothesis has been the difficulty of isolating a full length, soluble and active transposase that creates fully formed synaptic complexes in vitro with protein bound to both binding and catalytic domains of the ends. We address here a solution to expressing, purifying and structurally analyzing such a protein. Results: A soluble and active IS2 transposase derivative with GFP fused to its C-terminus functions as efficiently as the native protein in in vivo transposition assays. In vitro electrophoretic mobility shift assay data show that the partially purified protein prepared under native conditions binds very efficiently to cognate DNA, utilizing both N- and C-terminal residues. As a precursor to biophysical analyses of these complexes, a fluorescence-based random mutagenesis protocol was developed that enabled a structure-function analysis of the protein with good resolution at the secondary structure level. The results extend previous structure-function work on IS3 family transposases, identifying the binding domain as a three helix H + HTH bundle and explaining the function of an atypical leucine zipper-like motif in IS2. 
In addition, gain- and loss-of-function mutations in the catalytic active site define its role in regional and global binding and identify functional signatures that are common to the three dimensional catalytic core motif of the retroviral integrase superfamily. Conclusions: Intractably insoluble transposases, such as the IS2 transposase, prepared by solubilization protocols are often refractory to whole protein structure-function studies. The results described here have validated the use of GFP-tagging and fluorescence-based random mutagenesis in overcoming this limitation at the secondary structure level. PMID:22032517
Fundamental Characteristics of AAA+ Protein Family Structure and Function.
Miller, Justin M; Enemark, Eric J
2016-01-01
Many complex cellular events depend on multiprotein complexes known as molecular machines to efficiently couple the energy derived from adenosine triphosphate hydrolysis to the generation of mechanical force. Members of the AAA+ ATPase superfamily (ATPases Associated with various cellular Activities) are critical components of many molecular machines. AAA+ proteins are defined by conserved modules that precisely position the active site elements of two adjacent subunits to catalyze ATP hydrolysis. In many cases, AAA+ proteins form a ring structure that translocates a polymeric substrate through the central channel using specialized loops that project into the central channel. We discuss the major features of AAA+ protein structure and function with an emphasis on pivotal aspects elucidated with archaeal proteins.
Computational modeling of Repeat1 region of INI1/hSNF5: An evolutionary link with ubiquitin.
Bhutoria, Savita; Kalpana, Ganjam V; Acharya, Seetharama A
2016-09-01
The structure of a protein can be very informative of its function. However, determining protein structures experimentally can often be very challenging. Computational methods have been used successfully in modeling structures with sufficient accuracy. Here we have used computational tools to predict the structure of an evolutionarily conserved and functionally significant domain of Integrase interactor (INI)1/hSNF5 protein. INI1 is a component of the chromatin remodeling SWI/SNF complex, a tumor suppressor and is involved in many protein-protein interactions. It belongs to SNF5 family of proteins that contain two conserved repeat (Rpt) domains. Rpt1 domain of INI1 binds to HIV-1 Integrase, and acts as a dominant negative mutant to inhibit viral replication. Rpt1 domain also interacts with oncogene c-MYC and modulates its transcriptional activity. We carried out an ab initio modeling of a segment of INI1 protein containing the Rpt1 domain. The structural model suggested the presence of a compact and well defined ββαα topology as core structure in the Rpt1 domain of INI1. This topology in Rpt1 was similar to PFU domain of Phospholipase A2 Activating Protein, PLAA. Interestingly, PFU domain shares similarity with Ubiquitin and has ubiquitin binding activity. Because of the structural similarity between Rpt1 domain of INI1 and PFU domain of PLAA, we propose that Rpt1 domain of INI1 may participate in ubiquitin recognition or binding with ubiquitin or ubiquitin related proteins. This modeling study may shed light on the mode of interactions of Rpt1 domain of INI1 and is likely to facilitate future functional studies of INI1. © 2016 The Protein Society.
Running, William E; Reilly, James P
2010-10-01
Ribosomes occupy a central position in cellular metabolism, converting stored genetic information into active cellular machinery. Ribosomal proteins modulate both the intrinsic function of the ribosome and its interaction with other cellular complexes, such as chaperonins or the signal recognition particle. Chemical modification of proteins combined with mass spectrometric detection of the extent and position of covalent modifications is a rapid, sensitive method for the study of protein structure and flexibility. By altering the pH of the solution, we have induced non-denaturing changes in the structure of bacterial ribosomal proteins and detected these conformational changes by covalent labeling. Changes in ribosomal protein modification across a pH range from 6.6 to 8.3 are unique to each protein, and correlate with their structural environment in the ribosome. Lysine residues whose extent of modification increases as a function of increasing pH are on the surface of proteins, but in close proximity either to glutamate and aspartate residues, or to rRNA backbone phosphates. Increasing pH disrupts tertiary and quaternary interactions mediated by hydrogen bonding or ionic interactions, and regions of protein structure whose conformations are sensitive to these changes are of potential importance in modulating the flexibility of the ribosome or its interaction with other cellular complexes.
Molloy, Kevin; Shehu, Amarda
2013-01-01
Many proteins tune their biological function by transitioning between different functional states, effectively acting as dynamic molecular machines. Detailed structural characterization of transition trajectories is central to understanding the relationship between protein dynamics and function. Computational approaches that build on the Molecular Dynamics framework are in principle able to model transition trajectories in great detail but also at considerable computational cost. Methods that delay consideration of dynamics and focus instead on elucidating energetically-credible conformational paths connecting two functionally-relevant structures provide a complementary approach. Effective sampling-based path planning methods originating in robotics have been recently proposed to produce conformational paths. These methods largely model short peptides or address large proteins by simplifying conformational space. We propose a robotics-inspired method that connects two given structures of a protein by sampling conformational paths. The method focuses on small- to medium-size proteins, efficiently modeling structural deformations through the use of the molecular fragment replacement technique. In particular, the method grows a tree in conformational space rooted at the start structure, steering the tree to a goal region defined around the goal structure. We investigate various bias schemes over a progress coordinate for balance between coverage of conformational space and progress towards the goal. A geometric projection layer promotes path diversity. A reactive temperature scheme allows sampling of rare paths that cross energy barriers. Experiments are conducted on small- to medium-size proteins of length up to 214 amino acids and with multiple known functionally-relevant states, some of which are more than 13 Å apart from each other. Analysis reveals that the method effectively obtains conformational paths connecting structural states that are significantly different. 
A detailed analysis on the depth and breadth of the tree suggests that a soft global bias over the progress coordinate enhances sampling and results in higher path diversity. The explicit geometric projection layer that biases the exploration away from over-sampled regions further increases coverage, often improving proximity to the goal by forcing the exploration to find new paths. The reactive temperature scheme is shown effective in increasing path diversity, particularly in difficult structural transitions with known high-energy barriers.
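The tree-growing idea in the abstract above can be illustrated with a toy sketch. This is not the authors' method: it assumes a 2D point stands in for a conformation, uses straight-line steps instead of molecular fragment replacement, and has no energy function, projection layer, or temperature scheme. The function name `grow_tree` and all parameters (`step`, `goal_bias`, `goal_radius`, the sampling box) are hypothetical and chosen only for illustration.

```python
import math
import random


def grow_tree(start, goal, step=0.5, goal_bias=0.2, goal_radius=0.5,
              max_iters=5000, seed=0):
    """Grow a tree rooted at `start` until a node lands in the goal region.

    Toy stand-in for a conformational search: points are 2D tuples.
    Returns the path from start to the first node inside the goal
    region, or None if the iteration budget runs out.
    """
    rng = random.Random(seed)
    nodes = [start]          # tree vertices
    parent = {0: None}       # index of each vertex's parent
    for _ in range(max_iters):
        # Soft bias: sometimes steer towards the goal, otherwise explore.
        if rng.random() < goal_bias:
            target = goal
        else:
            target = (rng.uniform(-10, 10), rng.uniform(-10, 10))
        # Extend from the existing node closest to the sampled target.
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], target))
        d = math.dist(nodes[i], target)
        if d == 0:
            continue
        t = min(1.0, step / d)  # move at most `step` towards the target
        new = tuple(a + t * (b - a) for a, b in zip(nodes[i], target))
        nodes.append(new)
        parent[len(nodes) - 1] = i
        if math.dist(new, goal) <= goal_radius:
            # Walk parent links back to the root to recover the path.
            path, j = [], len(nodes) - 1
            while j is not None:
                path.append(nodes[j])
                j = parent[j]
            return path[::-1]
    return None
```

Raising `goal_bias` in this sketch trades exploration for faster progress towards the goal, which is the balance the abstract describes tuning through bias schemes over a progress coordinate.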
Choong, Yee Siew; Lim, Theam Soon; Chew, Ai Lan; Aziah, Ismail; Ismail, Asma
2011-04-01
The high typhoid incidence rate in developing and under-developed countries emphasizes the need for a rapid, affordable and accessible diagnostic test for effective therapy and disease management. TYPHIDOT®, a rapid dot enzyme immunoassay test for typhoid, was developed from the discovery of a ∼50 kDa protein specific for Salmonella enterica serovar Typhi. However, the structure of this antigen remains unknown to date. Studies on the structure of this antigen are important to elucidate its function, which will in turn increase the efficiency of the development and improvement of the typhoid detection test. This paper describes the predicted structure and function of the antigenically specific protein. The homology modeling approach was employed to construct the three-dimensional structure of the antigen. The built structure possesses the features of a TolC-like outer membrane protein. Molecular docking simulation was also performed to further probe the functionality of the antigen. Docking results showed that hexamminecobalt, Co(NH₃)₆³⁺, an inhibitor of TolC protein, formed favorable hydrogen bonds with D368 and D371 of the antigen. The single point (D368A, D371A) and double point (D368A and D371A) mutations of the antigen showed a decrease (single point mutations) and loss (double point mutation) of binding affinity towards hexamminecobalt. The architectural features of the built model and the docking simulation support the conclusion that this antigen is indeed a variant of the outer membrane protein TolC. As channel proteins are important for the virulence and survival of bacteria, this ∼50 kDa channel protein is a good specific target for a typhoid detection test. Copyright © 2011 Elsevier Inc. All rights reserved.
Binding Mechanisms of Intrinsically Disordered Proteins: Theory, Simulation, and Experiment
Mollica, Luca; Bessa, Luiza M.; Hanoulle, Xavier; Jensen, Malene Ringkjøbing; Blackledge, Martin; Schneider, Robert
2016-01-01
In recent years, protein science has been revolutionized by the discovery of intrinsically disordered proteins (IDPs). In contrast to the classical paradigm that a given protein sequence corresponds to a defined structure and an associated function, we now know that proteins can be functional in the absence of a stable three-dimensional structure. In many cases, disordered proteins or protein regions become structured, at least locally, upon interacting with their physiological partners. Many, sometimes conflicting, hypotheses have been put forward regarding the interaction mechanisms of IDPs and the potential advantages of disorder for protein-protein interactions. Whether disorder may increase, as proposed, e.g., in the “fly-casting” hypothesis, or decrease binding rates, increase or decrease binding specificity, or what role pre-formed structure might play in interactions involving IDPs (conformational selection vs. induced fit), are subjects of intense debate. Experimentally, these questions remain difficult to address. Here, we review experimental studies of binding mechanisms of IDPs using NMR spectroscopy and transient kinetic techniques, as well as the underlying theoretical concepts and numerical methods that can be applied to describe these interactions at the atomic level. The available literature suggests that the kinetic and thermodynamic parameters characterizing interactions involving IDPs can vary widely and that there may be no single common mechanism that can explain the different binding modes observed experimentally. Rather, disordered proteins appear to make combined use of features such as pre-formed structure and flexibility, depending on the individual system and the functional context. PMID:27668217
Extensions of PDZ domains as important structural and functional elements.
Wang, Conan K; Pan, Lifeng; Chen, Jia; Zhang, Mingjie
2010-08-01
'Divide and conquer' has been the guiding strategy for the study of protein structure and function. Proteins are divided into domains with each domain having a canonical structural definition depending on its type. In this review, we push forward with the interesting observation that many domains have regions outside of their canonical definition that affect their structure and function; we call these regions 'extensions'. We focus on the highly abundant PDZ (PSD-95, DLG1 and ZO-1) domain. Using bioinformatics, we find that many PDZ domains have potential extensions and we developed an openly-accessible website to display our results ( http://bcz102.ust.hk/pdzex/ ). We propose, using well-studied PDZ domains as illustrative examples, that the roles of PDZ extensions can be classified into at least four categories: 1) protein dynamics-based modulation of target binding affinity, 2) provision of binding sites for macro-molecular assembly, 3) structural integration of multi-domain modules, and 4) expansion of the target ligand-binding pocket. Our review highlights the potential structural and functional importance of domain extensions, highlighting the significance of looking beyond the canonical boundaries of protein domains in general.
Less is More: Membrane Protein Digestion Beyond Urea–Trypsin Solution for Next-level Proteomics
Zhang, Xi
2015-01-01
The goal of next-level bottom-up membrane proteomics is protein function investigation, via high-coverage high-throughput peptide-centric quantitation of expression, modifications and dynamic structures at systems scale. Yet efficient digestion of mammalian membrane proteins presents a daunting barrier, and prevalent day-long urea–trypsin in-solution digestion proved insufficient to reach this goal. Many efforts contributed incremental advances over past years, but involved protein denaturation that disconnected measurement from functional states. Beyond denaturation, the recent discovery of structure/proteomics omni-compatible detergent n-dodecyl-β-d-maltopyranoside, combined with pepsin and PNGase F columns, enabled breakthroughs in membrane protein digestion: a 2010 DDM-low-TCEP (DLT) method for H/D-exchange (HDX) using human G protein-coupled receptor, and a 2015 flow/detergent-facilitated protease and de-PTM digestions (FDD) for integrative deep sequencing and quantitation using full-length human ion channel complex. Distinguishing protein solubilization from denaturation, protease digestion reliability from theoretical specificity, and reduction from alkylation, these methods shifted day(s)-long paradigms into minutes, and afforded fully automatable (HDX)-protein-peptide-(tandem mass tag)-HPLC pipelines to instantly measure functional proteins at deep coverage, high peptide reproducibility, low artifacts and minimal leakage. Promoting—not destroying—structures and activities harnessed membrane proteins for the next-level streamlined functional proteomics. This review analyzes recent advances in membrane protein digestion methods and highlights critical discoveries for future proteomics. PMID:26081834
Kim, Do Jin; Bitto, Eduard; Bingman, Craig A; Kim, Hyun-Jung; Han, Byung Woo; Phillips, George N
2015-07-01
Members of the universal stress protein (USP) family are conserved in a phylogenetically diverse range of prokaryotes, fungi, protists, and plants and confer abilities to respond to a wide range of environmental stresses. Arabidopsis thaliana contains 44 USP domain-containing proteins, and the USP domain is found either in a small protein with unknown physiological function or in the N-terminal portion of a multi-domain protein, usually a protein kinase. Here, we report the first crystal structure of a eukaryotic USP-like protein, encoded by the gene At3g01520. The crystal structure of the protein At3g01520 was determined by the single-wavelength anomalous dispersion method and refined to an R factor of 21.8% (Rfree = 26.1%) at 2.5 Å resolution. The crystal structure includes three At3g01520 protein dimers with one AMP molecule bound to each protomer, comprising a Rossmann-like α/β overall fold. The bound AMP and conservation of residues in the ATP-binding loop suggest that the protein At3g01520 also belongs to the ATP-binding USP subfamily. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
SARS-unique fold in the Rousettus bat coronavirus HKU9.
Hammond, Robert G; Tan, Xuan; Johnson, Margaret A
2017-09-01
The coronavirus nonstructural protein 3 (nsp3) is a multifunctional protein that comprises multiple structural domains. This protein assists in viral polyprotein cleavage and host immune interference, and may play other roles in genome replication or transcription. Here, we report the solution NMR structure of a protein from the "SARS-unique region" of the bat coronavirus HKU9. The protein contains a frataxin fold or double-wing motif, which is an α + β fold that is associated with protein/protein interactions, DNA binding, and metal ion binding. High structural similarity to the human severe acute respiratory syndrome (SARS) coronavirus nsp3 is present. A possible functional site that is conserved among some betacoronaviruses has been identified using bioinformatics and biochemical analyses. This structure provides strong experimental support for the recent proposal advanced by us and others that the "SARS-unique" region is not unique to the human SARS virus, but is conserved among several different phylogenetic groups of coronaviruses and provides essential functions. © 2017 The Protein Society.
Structure-Functional Basis of Ion Transport in Sodium–Calcium Exchanger (NCX) Proteins
Giladi, Moshe; Shor, Reut; Lisnyansky, Michal; Khananshvili, Daniel
2016-01-01
The membrane-bound sodium–calcium exchanger (NCX) proteins shape Ca2+ homeostasis in many cell types, thus participating in a wide range of physiological and pathological processes. Determination of the crystal structure of an archaeal NCX (NCX_Mj) paved the way for a thorough and systematic investigation of ion transport mechanisms in NCX proteins. Here, we review the data gathered from the X-ray crystallography, molecular dynamics simulations, hydrogen–deuterium exchange mass-spectrometry (HDX-MS), and ion-flux analyses of mutants. Strikingly, the apo NCX_Mj protein exhibits characteristic patterns in the local backbone dynamics at particular helix segments, thereby possessing characteristic HDX profiles, suggesting structure-dynamic preorganization (geometric arrangements of catalytic residues before the transition state) of conserved α1 and α2 repeats at ion-coordinating residues involved in transport activities. Moreover, dynamic preorganization of local structural entities in the apo protein predefines the status of ion-occlusion and transition states, even though Na+ or Ca2+ binding modifies the preceding backbone dynamics nearby functionally important residues. Future challenges include resolving the structural-dynamic determinants governing the ion selectivity, functional asymmetry and ion-induced alternating access. Taking into account the structural similarities of NCX_Mj with the other proteins belonging to the Ca2+/cation exchanger superfamily, the recent findings can significantly improve our understanding of ion transport mechanisms in NCX and similar proteins. PMID:27879668
Structure-Functional Basis of Ion Transport in Sodium-Calcium Exchanger (NCX) Proteins.
Giladi, Moshe; Shor, Reut; Lisnyansky, Michal; Khananshvili, Daniel
2016-11-22
The membrane-bound sodium-calcium exchanger (NCX) proteins shape Ca²⁺ homeostasis in many cell types, thus participating in a wide range of physiological and pathological processes. Determination of the crystal structure of an archaeal NCX (NCX_Mj) paved the way for a thorough and systematic investigation of ion transport mechanisms in NCX proteins. Here, we review the data gathered from the X-ray crystallography, molecular dynamics simulations, hydrogen-deuterium exchange mass-spectrometry (HDX-MS), and ion-flux analyses of mutants. Strikingly, the apo NCX_Mj protein exhibits characteristic patterns in the local backbone dynamics at particular helix segments, thereby possessing characteristic HDX profiles, suggesting structure-dynamic preorganization (geometric arrangements of catalytic residues before the transition state) of conserved α₁ and α₂ repeats at ion-coordinating residues involved in transport activities. Moreover, dynamic preorganization of local structural entities in the apo protein predefines the status of ion-occlusion and transition states, even though Na⁺ or Ca²⁺ binding modifies the preceding backbone dynamics nearby functionally important residues. Future challenges include resolving the structural-dynamic determinants governing the ion selectivity, functional asymmetry and ion-induced alternating access. Taking into account the structural similarities of NCX_Mj with the other proteins belonging to the Ca²⁺/cation exchanger superfamily, the recent findings can significantly improve our understanding of ion transport mechanisms in NCX and similar proteins.
Bromberg, Yana; Yachdav, Guy; Ofran, Yanay; Schneider, Reinhard; Rost, Burkhard
2009-05-01
The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts towards the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predictions of protein binding sites and functional hotspots, and the evolution of these into the most successful type of prediction of protein function from sequence will be discussed. Second, a new tool, comprehensive in silico mutagenesis, which contributes important novel predictions of function and at the same time prepares for the onset of the next sequencing revolution, will be described. While these two new sub-fields of protein prediction represent the breakthroughs that have been achieved methodologically, it will then be argued that a different development might further change the way biomedical researchers benefit from annotations: modern web software can connect the worldwide web in any browser with the 'Deep Web' (ie, proprietary data resources). The availability of this direct connection, and the resulting access to a wealth of data, may impact drug discovery and development more than any existing method that contributes to protein annotation.
A new multi-scale method to reveal hierarchical modular structures in biological networks.
Jiao, Qing-Ju; Huang, Yan; Shen, Hong-Bin
2016-11-15
Biological networks are effective tools for studying molecular interactions. Modular structure, in which genes or proteins may tend to be associated with functional modules or protein complexes, is a remarkable feature of biological networks. Mining modular structure from biological networks enables us to focus on a set of potentially important nodes, which provides a reliable guide to future biological experiments. The first fundamental challenge in mining modular structure from biological networks is that the quality of the observed network data is usually low owing to noise and incompleteness in the obtained networks. The second problem that poses a challenge to existing approaches to the mining of modular structure is that the organization of both functional modules and protein complexes in networks is far more complicated than was ever thought. For instance, the sizes of different modules vary considerably from each other and they often form multi-scale hierarchical structures. To solve these problems, we propose a new multi-scale protocol for mining modular structure (named ISIMB) driven by a node similarity metric, which works in an iteratively converged space to reduce the effects of the low data quality of the observed network data. The multi-scale node similarity metric couples both the local and the global topology of the network with a resolution regulator. By varying this resolution regulator to give different weightings to the local and global terms in the metric, the ISIMB method is able to fit the shape of modules and to detect them on different scales. Experiments on protein-protein interaction and genetic interaction networks show that our method can not only mine functional modules and protein complexes successfully, but can also predict functional modules from specific to general and reveal the hierarchical organization of protein complexes.
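The abstract above does not give the ISIMB similarity formula, so the following is only a hypothetical illustration of what a node similarity that mixes local and global topology through a resolution parameter might look like. The local term (Jaccard overlap of closed neighbourhoods), the global term (inverse shortest-path distance), the linear mixing, and the names `build_graph`, `bfs_distance` and `similarity` are all assumptions made for this sketch, not the published metric.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected graph as a dict mapping node -> set of neighbours."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def bfs_distance(graph, source, target):
    """Shortest-path length in hops, or None if target is unreachable."""
    seen = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return seen[node]
        for nb in graph[node]:
            if nb not in seen:
                seen[nb] = seen[node] + 1
                queue.append(nb)
    return None


def similarity(graph, u, v, resolution):
    """Mix a local and a global term; `resolution` in [0, 1] weights the global one.

    Hypothetical metric for illustration only (not the ISIMB formula).
    """
    nu, nv = graph[u] | {u}, graph[v] | {v}
    local = len(nu & nv) / len(nu | nv)            # shared-neighbour overlap
    d = bfs_distance(graph, u, v)
    global_term = 0.0 if d is None else 1.0 / (1.0 + d)  # closeness in hops
    return (1.0 - resolution) * local + resolution * global_term
```

With `resolution` near 0 only nodes sharing neighbours score highly, which favours small, tight modules; with `resolution` near 1 more distant nodes also receive non-zero similarity, which favours coarser groupings. Sweeping the parameter is one simple way to probe modules at several scales.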
Centrins in unicellular organisms: functional diversity and specialization.
Zhang, Yu; He, Cynthia Y
2012-07-01
Centrins (also known as caltractins) are conserved, EF hand-containing proteins ubiquitously found in eukaryotes. Similar to calmodulins, the calcium-binding EF hands in centrins fold into two structurally similar domains separated by an alpha-helical linker region, giving the protein a dumbbell shape. The small size (15-22 kDa) and domain organization of centrins and their functional diversity/specialization make them an ideal system to study protein structure-function relationships. Here, we review the work on centrins with a focus on their structures and functions characterized in unicellular organisms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Law, Anthony B.; Sapienza, Paul J.; Zhang, Jun
Allostery enables tight regulation of protein function in the cellular environment. While existing models of allostery are firmly rooted in the current structure-function paradigm, the mechanistic basis for allostery in the absence of structural change remains unclear. In this study, we show that a typical globular protein is able to undergo significant changes in volume under native conditions while exhibiting no additional changes in protein structure. These native state volume fluctuations were found to correlate with changes in internal motions that were previously recognized as a source of allosteric entropy. This finding offers a novel mechanistic basis for allostery in the absence of canonical structural change. As a result, the unexpected observation that function can be derived from expanded, low density protein states has broad implications for our understanding of allostery and suggests that the general concept of the native state be expanded to allow for more variable physical dimensions with looser packing.
The unfoldomics decade: an update on intrinsically disordered proteins.
Dunker, A Keith; Oldfield, Christopher J; Meng, Jingwei; Romero, Pedro; Yang, Jack Y; Chen, Jessica Walton; Vacic, Vladimir; Obradovic, Zoran; Uversky, Vladimir N
2008-09-16
Our first predictor of protein disorder was published just over a decade ago in the Proceedings of the IEEE International Conference on Neural Networks (Romero P, Obradovic Z, Kissinger C, Villafranca JE, Dunker AK (1997) Identifying disordered regions in proteins from amino acid sequence. Proceedings of the IEEE International Conference on Neural Networks, 1: 90-95). By now more than twenty other laboratory groups have joined the efforts to improve the prediction of protein disorder. While the various prediction methodologies used for protein intrinsic disorder resemble those methodologies used for secondary structure prediction, the two types of structures are entirely different. For example, the two structural classes have very different dynamic properties, with the irregular secondary structure class being much less mobile than the disorder class. The prediction of secondary structure has been useful. On the other hand, the prediction of intrinsic disorder has been revolutionary, leading to major modifications of the more than 100 year-old views relating protein structure and function. Experimentalists have been providing evidence over many decades that some proteins lack fixed structure or are disordered (or unfolded) under physiological conditions. In addition, experimentalists are also showing that, for many proteins, their functions depend on the unstructured rather than structured state; such results are in marked contrast to the greater than hundred year old views such as the lock and key hypothesis. Despite extensive data on many important examples, including disease-associated proteins, the importance of disorder for protein function has been largely ignored. Indeed, to our knowledge, current biochemistry books don't present even one acknowledged example of a disorder-dependent function, even though some reports of disorder-dependent functions are more than 50 years old. 
The results from genome-wide predictions of intrinsic disorder and the results from other bioinformatics studies of intrinsic disorder are demanding attention for these proteins. Disorder prediction has been important for showing that the relatively few experimentally characterized examples are members of a very large collection of related disordered proteins that are wide-spread over all three domains of life. Many significant biological functions are now known to depend directly on, or are importantly associated with, the unfolded or partially folded state. Here our goal is to review the key discoveries and to weave these discoveries together to support novel approaches for understanding sequence-function relationships. Intrinsically disordered protein is common across the three domains of life, but especially common among the eukaryotic proteomes. Signaling sequences and sites of posttranslational modifications are frequently, or very likely most often, located within regions of intrinsic disorder. Disorder-to-order transitions are coupled with the adoption of different structures with different partners. Also, the flexibility of intrinsic disorder helps different disordered regions to bind to a common binding site on a common partner. Such capacity for binding diversity plays important roles in both protein-protein interaction networks and likely also in gene regulation networks. Such disorder-based signaling is further modulated in multicellular eukaryotes by alternative splicing, for which such splicing events map to regions of disorder much more often than to regions of structure. Associating alternative splicing with disorder rather than structure alleviates theoretical and experimentally observed problems associated with the folding of different length, isomeric amino acid sequences. 
The combination of disorder and alternative splicing is proposed to provide a mechanism for easily "trying out" different signaling pathways, thereby providing the mechanism for generating signaling diversity and enabling the evolution of cell differentiation and multicellularity. Finally, several recent small molecules of interest as potential drugs have been shown to act by blocking protein-protein interactions based on intrinsic disorder of one of the partners. Study of these examples has led to a new approach for drug discovery, and bioinformatics analysis of the human proteome suggests that various disease-associated proteins are very rich in such disorder-based drug discovery targets.
An Evolution-Based Approach to De Novo Protein Design and Case Study on Mycobacterium tuberculosis
Brender, Jeffrey R.; Czajka, Jeff; Marsh, David; Gray, Felicia; Cierpicki, Tomasz; Zhang, Yang
2013-01-01
Computational protein design is a reverse procedure of protein folding and structure prediction, where constructing structures from evolutionarily related proteins has been demonstrated to be the most reliable method for protein 3-dimensional structure prediction. Following this spirit, we developed a novel method to design new protein sequences based on evolutionarily related protein families. For a given target structure, a set of proteins having similar fold are identified from the PDB library by structural alignments. A structural profile is then constructed from the protein templates and used to guide the conformational search of amino acid sequence space, where physicochemical packing is accommodated by single-sequence based solvation, torsion angle, and secondary structure predictions. The method was tested on a computational folding experiment based on a large set of 87 protein structures covering different fold classes, which showed that the evolution-based design significantly enhances the foldability and biological functionality of the designed sequences compared to the traditional physics-based force field methods. Without using homologous proteins, the designed sequences can be folded with an average root-mean-square-deviation of 2.1 Å to the target. As a case study, the method is extended to redesign all 243 structurally resolved proteins in the pathogenic bacteria Mycobacterium tuberculosis, which is the second leading cause of death from infectious disease. On a smaller scale, five sequences were randomly selected from the design pool and subjected to experimental validation. The results showed that all the designed proteins are soluble with distinct secondary structure and three have well ordered tertiary structure, as demonstrated by circular dichroism and NMR spectroscopy. 
Together, these results demonstrate a new avenue in computational protein design that uses knowledge of evolutionary conservation from protein structural families to engineer new protein molecules of improved fold stability and biological functionality. PMID:24204234
The SARS coronavirus nucleocapsid protein--forms and functions.
Chang, Chung-ke; Hou, Ming-Hon; Chang, Chi-Fon; Hsiao, Chwan-Deng; Huang, Tai-huang
2014-03-01
The nucleocapsid phosphoprotein of the severe acute respiratory syndrome coronavirus (SARS-CoV N protein) packages the viral genome into a helical ribonucleocapsid (RNP) and plays a fundamental role during viral self-assembly. It is a protein with multifarious activities. In this article we review our current understanding of the N protein structure and its interaction with nucleic acid. Highlights of the progress include uncovering the modular organization, determining the structures of the structural domains, realizing the roles of protein disorder in protein-protein and protein-nucleic acid interactions, and visualizing the ribonucleoprotein (RNP) structure inside the virions. It was also demonstrated that the N protein binds to nucleic acid at multiple sites in a coupled-allostery manner. We propose a SARS-CoV RNP model that conforms to existing data and bears resemblance to the existing RNP structures of RNA viruses. The model highlights the critical role of modular organization and intrinsic disorder of the N protein in the formation and functions of the dynamic RNP capsid in RNA viruses. This paper forms part of a symposium in Antiviral Research on "From SARS to MERS: 10 years of research on highly pathogenic human coronaviruses." Copyright © 2014 Elsevier B.V. All rights reserved.
Kavianpour, Hamidreza; Vasighi, Mahdi
2017-02-01
Knowledge of the cellular attributes of proteins plays an important role in pharmacy, medical science and molecular biology. These attributes are closely correlated with the function and three-dimensional structure of proteins. Knowledge of protein structural class is used by various methods to better understand protein functionality and folding patterns. Computational methods and intelligent systems can play an important role in the structural classification of proteins. Most protein sequences are stored in databanks as character strings, and a numerical representation is essential for applying machine learning methods. In this work, a binary representation of protein sequences is introduced based on reduced amino acid alphabets according to the surrounding hydrophobicity index. Many important features that are hidden in these long binary sequences can be clearly displayed through their cellular automata images. The features extracted from these images are used to build a classification model with a support vector machine. Compared with previous studies on several benchmark datasets, the promising classification rates obtained by tenfold cross-validation imply that the current approach can help reveal some inherent features deeply hidden in protein sequences and improve the quality of protein structural class prediction.
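The encoding idea in this abstract (sequence to binary string, binary string to cellular automaton image) can be illustrated with a minimal sketch. The abstract does not give the reduced alphabet, the bit codes, or the automaton rule, so everything below is an assumption made for illustration: the `HYDROPHOBIC` grouping is hypothetical, one bit per residue is a simplification, and elementary rule 90 with periodic boundaries is an arbitrary choice rather than the authors' method.

```python
# Hypothetical two-class reduced alphabet (NOT the surrounding hydrophobicity
# index used in the paper): 1 = hydrophobic, 0 = everything else.
HYDROPHOBIC = set("AVLIMFWC")


def to_binary(sequence):
    """Encode a protein sequence as a list of 0/1 values, one bit per residue."""
    return [1 if aa in HYDROPHOBIC else 0 for aa in sequence.upper()]


def ca_step(cells, rule=90):
    """Apply one step of an elementary cellular automaton with periodic boundaries."""
    n = len(cells)
    nxt = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        neighbourhood = (left << 2) | (centre << 1) | right  # value 0..7
        nxt.append((rule >> neighbourhood) & 1)  # look up the rule bit
    return nxt


def ca_image(sequence, steps=4, rule=90):
    """Stack the initial row and `steps` evolved rows into a 2D binary image."""
    row = to_binary(sequence)
    image = [row]
    for _ in range(steps):
        row = ca_step(row, rule)
        image.append(row)
    return image


if __name__ == "__main__":
    for row in ca_image("MKVLAAGIWE", steps=5):
        print("".join("#" if bit else "." for bit in row))
```

Features for a classifier would then be computed from the resulting image; which features the authors extracted is not stated in the abstract and is not guessed at here.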
Hydrophobic potential of mean force as a solvation function for protein structure prediction.
Lin, Matthew S; Fawzi, Nicolas Lux; Head-Gordon, Teresa
2007-06-01
We have developed a solvation function that combines a Generalized Born model for polarization of protein charge by the high dielectric solvent, with a hydrophobic potential of mean force (HPMF) as a model for hydrophobic interaction, to aid in the discrimination of native structures from other misfolded states in protein structure prediction. We find that our energy function outperforms other reported scoring functions in terms of correct native ranking for 91% of proteins and low Z scores for a variety of decoy sets, including the challenging Rosetta decoys. This work shows that the stabilizing effect of hydrophobic exposure to aqueous solvent that defines the HPMF hydration physics is an apparent improvement over solvent-accessible surface area models that penalize hydrophobic exposure. Decoys generated by thermal sampling around the native-state basin reveal a potentially important role for side-chain entropy in the future development of even more accurate free energy surfaces.
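The two evaluation quantities mentioned above, native ranking and Z score over a decoy set, can be sketched as follows. This is a generic illustration, not the authors' code: the function names and the example energies are made up, lower energy is assumed to be better, and the population standard deviation of the decoys is used although conventions differ between studies.

```python
from statistics import mean, pstdev


def native_z_score(native_energy, decoy_energies):
    """Z score of the native energy relative to the decoy energy distribution.

    More negative values mean the native structure is better separated
    from the decoys (assuming lower energy is better).
    """
    sigma = pstdev(decoy_energies)
    if sigma == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean(decoy_energies)) / sigma


def native_rank(native_energy, decoy_energies):
    """Rank of the native structure (1 = lowest energy in the set)."""
    return 1 + sum(1 for e in decoy_energies if e < native_energy)


if __name__ == "__main__":
    decoys = [-90.0, -95.0, -100.0, -105.0, -110.0]  # invented energies
    print(native_rank(-120.0, decoys), native_z_score(-120.0, decoys))
```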
Functional and Structural Analysis of the Conserved EFhd2 Protein
Acosta, Yancy Ferrer; Rodríguez Cruz, Eva N.; Vaquer, Ana del C.; Vega, Irving E.
2013-01-01
EFhd2 is a novel protein conserved from C. elegans to H. sapiens. This novel protein was originally identified in cells of the immune and central nervous systems. However, it is most abundant in the central nervous system, where it has been found associated with pathological forms of the microtubule-associated protein tau. The physiological or pathological roles of EFhd2 are poorly understood. In this study, a functional and structural analysis was carried out to characterize the molecular requirements for EFhd2’s calcium binding activity. The results showed that mutation of a conserved aspartate on either EF-hand motif disrupted the calcium binding activity, indicating that these motifs work as a pair to form a functional calcium binding domain. Furthermore, characterization of an identified single-nucleotide polymorphism (SNP) that introduces a missense mutation indicates the importance of a conserved phenylalanine for EFhd2 calcium binding activity. Structural analysis revealed that EFhd2 is predominantly composed of alpha helix and random coil structures and that this novel protein is thermostable. EFhd2’s thermostability depends on its N-terminus. In the absence of the N-terminus, calcium binding restored EFhd2’s thermal stability. Overall, these studies contribute to our understanding of EFhd2 functional and structural properties, and introduce it into the family of canonical EF-hand domain containing proteins. PMID:22973849
Protein structure based prediction of catalytic residues.
Fajardo, J Eduardo; Fiser, Andras
2013-02-22
Worldwide structural genomics projects continue to release new protein structures at an unprecedented pace, so far nearly 6000, but only about 60% of these proteins have any sort of functional annotation. We explored a range of features that can be used for the prediction of functional residues given a known three-dimensional structure. These features include various centrality measures of nodes in graphs of interacting residues: closeness, betweenness and page-rank centrality. We also analyzed the distance of functional amino acids to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), and the use of relative entropy as a measure of sequence conservation. From the selected features, neural networks were trained to identify catalytic residues. We found that distance to the GCM together with amino acid type provides a good discriminant function when combined independently with sequence conservation. Using an independent test set of 29 annotated protein structures, the method returned 411 of the initial 9262 residues as the most likely to be involved in function. These 411 output residues contain 70 of the 111 annotated catalytic residues. This represents an approximately 14-fold enrichment of catalytic residues relative to the entire input set (corresponding to a sensitivity of 63% and a precision of 17%), a performance competitive with that of other state-of-the-art methods. We found that several of the graph-based measures utilize the same underlying feature of protein structures, which can be captured more simply and effectively with the distance to GCM definition. This also has the added advantage of simplicity and easy implementation. Meanwhile, sequence conservation remains by far the most influential feature in identifying functional residues. We also found that, due to the rapid changes in size and composition of sequence databases, conservation calculations must be recalibrated for specific reference databases.
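The sensitivity, precision and fold enrichment quoted in this abstract follow directly from the four counts it reports (9262 input residues, 111 annotated catalytic residues, 411 predicted residues, 70 correct predictions). The short sketch below reproduces that arithmetic; the function name is illustrative, and enrichment is assumed here to mean precision divided by the background fraction of catalytic residues.

```python
def enrichment_summary(total_residues, annotated_catalytic, predicted, predicted_catalytic):
    """Compute sensitivity, precision and fold enrichment from raw counts."""
    sensitivity = predicted_catalytic / annotated_catalytic  # recovered true sites
    precision = predicted_catalytic / predicted              # correct predictions
    background = annotated_catalytic / total_residues        # chance rate in the input set
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "enrichment": precision / background,
    }


if __name__ == "__main__":
    # Counts taken from the abstract above.
    summary = enrichment_summary(9262, 111, 411, 70)
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

With these counts the background rate is about 1.2%, so a precision of about 17% corresponds to the roughly 14-fold enrichment stated in the abstract.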
A Parametric Rosetta Energy Function Analysis with LK Peptides on SAM Surfaces.
Lubin, Joseph H; Pacella, Michael S; Gray, Jeffrey J
2018-05-08
Although structures have been determined for many soluble proteins and an increasing number of membrane proteins, experimental structure determination methods are limited for complexes of proteins and solid surfaces. An economical alternative or complement to experimental structure determination is molecular simulation. Rosetta is one software suite that models protein-surface interactions, but Rosetta is normally benchmarked on soluble proteins. For surface interactions, the validity of the energy function is uncertain because it is a combination of independent parameters from energy functions developed separately for solution proteins and mineral surfaces. Here, we assess the performance of the RosettaSurface algorithm and test the accuracy of its energy function by modeling the adsorption of leucine/lysine (LK)-repeat peptides on methyl- and carboxy-terminated self-assembled monolayers (SAMs). We investigated how RosettaSurface predictions for this system compare with the experimental results, which showed that on both surfaces, LK-α peptides folded into helices and LK-β peptides held extended structures. Utilizing this model system, we performed a parametric analysis of Rosetta's Talaris energy function and determined that adjusting solvation parameters offered improved predictive accuracy. Simultaneously increasing lysine carbon hydrophilicity and the hydrophobicity of the surface methyl head groups yielded computational predictions most closely matching the experimental results. De novo models still should be interpreted skeptically unless bolstered in an integrative approach with experimental data.
Grandison, Scott; Roberts, Carl; Morris, Richard J
2009-03-01
Protein structures are not static entities consisting of equally well-determined atomic coordinates. Proteins undergo continuous motion, and as catalytic machines, these movements can be of high relevance for understanding function. In addition to this strong biological motivation for considering shape changes is the necessity to correctly capture different levels of detail and error in protein structures. Some parts of a structural model are often poorly defined, and the atomic displacement parameters provide an excellent means to characterize the confidence in an atom's spatial coordinates. A mathematical framework for studying these shape changes, and handling positional variance is therefore of high importance. We present an approach for capturing various protein structure properties in a concise mathematical framework that allows us to compare features in a highly efficient manner. We demonstrate how three-dimensional Zernike moments can be employed to describe functions, not only on the surface of a protein but throughout the entire molecule. A number of proof-of-principle examples are given which demonstrate how this approach may be used in practice for the representation of movement and uncertainty.
Bressan, Gustavo Costa; Kobarg, Jörg
2010-01-01
The mapping of protein-protein interactions of a given organism is considered fundamental to assigning protein function in the post-genomic era. As part of this effort, screens for pairwise interactions using the yeast two-hybrid system have been widely used to reveal protein interaction networks in different biological systems. Through the identification of protein interaction partners we have successfully obtained interesting functional clues for Ki-1/57, a human protein with no previous functional annotation, in the context of RNA metabolism. We briefly discuss the way we approached protein-protein interaction data to conduct and interpret further molecular biological and cellular studies as well as structural analyses on this protein. Our data suggest that Ki-1/57 belongs to the family of intrinsically unstructured proteins and that the structural flexibility may be crucial for its capacity to interact with many different proteins. A large fraction of these proteins are involved in pre-mRNA splicing control. Finally, Ki-1/57 is localized to several subnuclear domains, all of which have been associated with splicing and other RNA processing events.
Hou, Zhi-Shuai; Ulloa-Aguirre, Alfredo; Tao, Ya-Xiong
2018-06-01
Conformational diseases are caused by structurally abnormal proteins that cannot fold properly and achieve their native conformation. Misfolded proteins frequently originate from genetic mutations that may lead to loss-of-function diseases involving a variety of structurally diverse proteins including enzymes, ion channels, and membrane receptors. Pharmacoperones are small molecules that cross the cell surface plasma membrane and reach their target proteins within the cell, serving as molecular scaffolds to stabilize the native conformation of misfolded or well-folded but destabilized proteins, to prevent their degradation and promote correct trafficking to their functional site of action. Because of their high specificity toward the target protein, pharmacoperones are currently the focus of intense investigation as therapy for several conformational diseases. Areas covered: This review summarizes data on the mechanisms leading to protein misfolding and the use of pharmacoperone drugs as an experimental approach to rescue function of distinct misfolded/misrouted proteins associated with a variety of diseases, such as lysosomal storage diseases, channelopathies, and G protein-coupled receptor misfolding diseases. Expert commentary: The fact that many misfolded proteins may retain function offers a unique therapeutic opportunity to cure disease by directly correcting misrouting through the administration of pharmacoperone drugs, thereby rescuing the function of disease-causing, conformationally abnormal proteins.
DNAproDB: an interactive tool for structural analysis of DNA–protein complexes
Sagendorf, Jared M.
2017-01-01
Abstract Many biological processes are mediated by complex interactions between DNA and proteins. Transcription factors, various polymerases, nucleases and histones recognize and bind DNA with different levels of binding specificity. To understand the physical mechanisms that allow proteins to recognize DNA and achieve their biological functions, it is important to analyze structures of DNA–protein complexes in detail. DNAproDB is a web-based interactive tool designed to help researchers study these complexes. DNAproDB provides an automated structure-processing pipeline that extracts structural features from DNA–protein complexes. The extracted features are organized in structured data files, which are easily parsed with any programming language or viewed in a browser. We processed a large number of DNA–protein complexes retrieved from the Protein Data Bank and created the DNAproDB database to store this data. Users can search the database by combining features of the DNA, protein or DNA–protein interactions at the interface. Additionally, users can upload their own structures for processing privately and securely. DNAproDB provides several interactive and customizable tools for creating visualizations of the DNA–protein interface at different levels of abstraction that can be exported as high quality figures. All functionality is documented and freely accessible at http://dnaprodb.usc.edu. PMID:28431131
Apoferritin fibers: a new template for 1D fluorescent hybrid nanostructures
NASA Astrophysics Data System (ADS)
Jurado, Rocío; Castello, Fabio; Bondia, Patricia; Casado, Santiago; Flors, Cristina; Cuesta, Rafael; Domínguez-Vera, José M.; Orte, Angel; Gálvez, Natividad
2016-05-01
Recently, research in the field of protein amyloid fibers has gained great attention due to the use of these materials as nanoscale templates for the construction of functional hybrid materials. The formation of apoferritin amyloid-like protein fibers is demonstrated herein for the first time. The morphology, size and stiffness of these one-dimensional structures are comparable to the fibers formed by β-lactoglobulin, a protein frequently used as a model in the study of amyloid-like fibrillar proteins. Nanometer-sized globular apoferritin is capable of self-assembling to form 1D micrometer-sized structures after being subjected to a heating process. Depending on the experimental conditions, fibers with different morphologies and sizes are obtained. The wire-like protein structure is rich in functional groups and allows chemical functionalization with diverse quantum dots (QD), as well as with different Alexa Fluor (AF) dyes, leading to hybrid fluorescent fibers with variable emission wavelengths, from green to near infrared, depending on the QD and AFs coupled. For fibers containing the pair AF488 and AF647, efficient fluorescence energy transfer from the covalently coupled donor (AF488) to acceptor tags (AF647) takes place. Apoferritin fibers are proposed here as a new promising template for obtaining hybrid functional materials. Electronic supplementary information (ESI) available: TEM images of ferritin protein fiber formation, and apoferritin after 18 days of heat treatment; FLIM-PIE technique details; fluorescence emission spectra of apoferritin and β-lactoglobulin fibers functionalized with different QDs. See DOI: 10.1039/c6nr01044j
Venko, Katja; Roy Choudhury, A; Novič, Marjana
2017-01-01
The structural and functional details of transmembrane proteins are vastly underexplored, mostly due to experimental difficulties regarding their solubility and stability. Currently, the majority of transmembrane protein structures are still unknown, and this presents a huge experimental and computational challenge. Thanks to X-ray crystallography and NMR spectroscopy, over 3000 structures of membrane proteins have been solved, but only a few hundred of them are unique. Due to the vast biological and pharmaceutical interest in the elucidation of the structure and the functional mechanisms of transmembrane proteins, several computational methods have been developed to overcome the experimental gap. If combined with experimental data, the computational information enables rapid, low-cost and successful predictions of the molecular structure of unsolved proteins. The reliability of the predictions depends on the availability and accuracy of experimental data associated with structural information. In this review, the following methods are proposed for in silico structure elucidation: sequence-dependent predictions of transmembrane regions, predictions of transmembrane helix-helix interactions, helix arrangements in membrane models, and testing their stability with molecular dynamics simulations. We also demonstrate the usage of the computational methods listed above by proposing a model for the molecular structure of the transmembrane protein bilitranslocase. Bilitranslocase is a bilirubin membrane transporter, which shares similar tissue distribution and functional properties with some of the members of the Organic Anion Transporter family and is the only member classified in the Bilirubin Transporter Family. Given its unique properties, bilitranslocase is a potentially interesting drug target.
Recognition of functional sites in protein structures.
Shulman-Peleg, Alexandra; Nussinov, Ruth; Wolfson, Haim J
2004-06-04
Recognition of regions on the surface of one protein that are similar to a binding site of another is crucial for the prediction of molecular interactions and for functional classifications. We first describe a novel method, SiteEngine, that assumes no sequence or fold similarities and is able to recognize proteins that have similar binding sites and may perform similar functions. We achieve high efficiency and speed by introducing a low-resolution surface representation via chemically important surface points, by hashing triangles of physico-chemical properties and by applying hierarchical scoring schemes for a thorough exploration of global and local similarities. We proceed to rigorously apply this method to functional site recognition in three possible ways. First, we search for a given functional site in a large set of complete protein structures. Second, a potential functional site on a protein of interest is compared with known binding sites, to recognize similar features. Third, a complete protein structure is searched for the presence of an a priori unknown functional site, similar to known sites. Our method is robust and efficient enough to allow computationally demanding applications such as the first and the third. From the biological standpoint, the first application may identify secondary binding sites of drugs that may lead to side-effects. The third application finds new potential sites on the protein that may provide targets for drug design. Each of the three applications may aid in assigning a function and in classification of binding patterns. We highlight the advantages and disadvantages of each type of search, provide examples of large-scale searches of the entire Protein Data Bank and make functional predictions.
FunTree: advances in a resource for exploring and contextualising protein function evolution.
Sillitoe, Ian; Furnham, Nicholas
2016-01-04
FunTree is a resource that brings together protein sequence, structure and functional information, including overall chemical reaction and mechanistic data, for structurally defined domain superfamilies. Developed in tandem with the CATH database, the original FunTree contained just 276 superfamilies focused on enzymes. Here, we present an update of FunTree that has expanded to include 2340 superfamilies including both enzymes and proteins with non-enzymatic functions annotated by Gene Ontology (GO) terms. This allows the investigation of how novel functions have evolved within a structurally defined superfamily and provides a means to analyse trends across many superfamilies. This is done not only within the context of a protein's sequence and structure but also the relationships of their functions. New measures of functional similarity have been integrated, including for enzymes comparisons of overall reactions based on overall bond changes, reaction centres (the local environment atoms involved in the reaction) and the sub-structure similarities of the metabolites involved in the reaction and for non-enzymes semantic similarities based on the GO. To identify and highlight changes in function through evolution, ancestral character estimations are made and presented. All this is accessible through a new re-designed web interface that can be found at http://www.funtree.info. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Supra-domains: evolutionary units larger than single protein domains.
Vogel, Christine; Berzuini, Carlo; Bashton, Matthew; Gough, Julian; Teichmann, Sarah A
2004-02-20
Domains are the evolutionary units that comprise proteins, and most proteins are built from more than one domain. Domains can be shuffled by recombination to create proteins with new arrangements of domains. Using structural domain assignments, we examined the combinations of domains in the proteins of 131 completely sequenced organisms. We found two-domain and three-domain combinations that recur in different protein contexts with different partner domains. The domains within these combinations have a particular functional and spatial relationship. These units are larger than individual domains and we term them "supra-domains". Amongst the supra-domains, we identified some 1400 (1203 two-domain and 166 three-domain) combinations that are statistically significantly over-represented relative to the occurrence and versatility of the individual component domains. Over one-third of all structurally assigned multi-domain proteins contain these over-represented supra-domains. This means that investigation of the structural and functional relationships of the domains forming these popular combinations would be particularly useful for an understanding of multi-domain protein function and evolution as well as for genome annotation. These and other supra-domains were analysed for their versatility, duplication, their distribution across the three kingdoms of life and their functional classes. By examining the three-dimensional structures of several examples of supra-domains in different biological processes, we identify two basic types of spatial relationships between the component domains: the combined function of the two domains is such that either the geometry of the two domains is crucial and there is a tight constraint on the interface, or the precise orientation of the domains is less important and they are spatially separate. Frequently, the role of the supra-domain becomes clear only once the three-dimensional structure is known. 
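The notion of a two-domain combination that recurs "in different protein contexts with different partner domains" can be made concrete with a small counting sketch. This is only an illustration under simplifying assumptions: each protein is given as an ordered list of domain labels (the labels below are invented), a "context" is taken to be the pair of flanking domains, and the statistical over-representation test described in the abstract is not reproduced.

```python
from collections import defaultdict


def two_domain_contexts(proteins):
    """Map each adjacent domain pair to the set of (left, right) flanking domains.

    `proteins` is an iterable of ordered domain-label lists; a missing
    neighbour at a chain end is recorded as None.
    """
    contexts = defaultdict(set)
    for domains in proteins:
        for i in range(len(domains) - 1):
            pair = (domains[i], domains[i + 1])
            left = domains[i - 1] if i > 0 else None
            right = domains[i + 2] if i + 2 < len(domains) else None
            contexts[pair].add((left, right))
    return contexts


def recurring_pairs(proteins, min_contexts=2):
    """Return pairs seen in at least `min_contexts` distinct flanking contexts."""
    return {
        pair
        for pair, ctx in two_domain_contexts(proteins).items()
        if len(ctx) >= min_contexts
    }


if __name__ == "__main__":
    # Invented architectures, for illustration only.
    proteins = [["A", "B", "C"], ["D", "A", "B"], ["A", "B"]]
    print(recurring_pairs(proteins))
```

A pair passing this filter is only a candidate; deciding whether it is over-represented relative to the abundance and versatility of its component domains requires the statistical model used in the paper.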
Since this is the case for only a quarter of the supra-domains, we provide a list of the most important unknown supra-domains as potential targets for structural genomics projects.
Posttranslational Modifications Regulate the Postsynaptic Localization of PSD-95.
Vallejo, Daniela; Codocedo, Juan F; Inestrosa, Nibaldo C
2017-04-01
The postsynaptic density (PSD) consists of a lattice-like array of interacting proteins that organizes and stabilizes synaptic receptors, ion channels, structural proteins, and signaling molecules required for normal synaptic transmission and synaptic function. The scaffolding and hub protein postsynaptic density protein-95 (PSD-95) is a major element of central chemical synapses and interacts with glutamate receptors, cell adhesion molecules, and cytoskeletal elements. In fact, PSD-95 can regulate basal synaptic stability as well as the activity-dependent structural plasticity of the PSD and, therefore, of the excitatory chemical synapse. Several studies have shown that PSD-95 is highly enriched at excitatory synapses and have identified multiple protein structural domains and protein-protein interactions that mediate PSD-95 function and trafficking to the postsynaptic region. PSD-95 is also a target of several signaling pathways that induce posttranslational modifications, including palmitoylation, phosphorylation, ubiquitination, nitrosylation, and neddylation; these modifications determine the synaptic stability and function of PSD-95 and thus regulate the fates of individual dendritic spines in the nervous system. In the present work, we review the posttranslational modifications that regulate the synaptic localization of PSD-95 and describe their functional consequences. We also explore the signaling pathways that induce such changes.
Functional dynamics of cell surface membrane proteins
NASA Astrophysics Data System (ADS)
Nishida, Noritaka; Osawa, Masanori; Takeuchi, Koh; Imai, Shunsuke; Stampoulis, Pavlos; Kofuku, Yutaka; Ueda, Takumi; Shimada, Ichio
2014-04-01
Cell surface receptors are integral membrane proteins that receive external stimuli, and transmit signals across plasma membranes. In the conventional view of receptor activation, ligand binding to the extracellular side of the receptor induces conformational changes, which convert the structure of the receptor into an active conformation. However, recent NMR studies of cell surface membrane proteins have revealed that their structures are more dynamic than previously envisioned, and they fluctuate between multiple conformations in an equilibrium on various timescales. In addition, NMR analyses, along with biochemical and cell biological experiments indicated that such dynamical properties are critical for the proper functions of the receptors. In this review, we will describe several NMR studies that revealed direct linkage between the structural dynamics and the functions of the cell surface membrane proteins, such as G-protein coupled receptors (GPCRs), ion channels, membrane transporters, and cell adhesion molecules.
Omnipresence of the polyproline II helix in fibrous and globular proteins.
Esipova, Natalia G; Tumanyan, Vladimir G
2017-02-01
The left-handed helical conformation of a polypeptide chain (polyproline II, PPII) is the third type of protein backbone structure. This conformation occurs universally in fibrous proteins, globular proteins, and biologically active peptides. It has unique physical and chemical properties that determine a wide range of biological functions, from protein folding to tissue differentiation. New examples of the structure keep appearing in spite of difficulties in their detection and investigation. The annotation and prediction of PPII have also been challenging tasks. Recently, many PPII motifs with new and/or unexpected functions have been accumulating in databases. In this review we describe the major structural and dynamic forms of PPII, the diversity of its functions, and its role in different biological processes. Copyright © 2016 Elsevier Ltd. All rights reserved.
The many blades of the β-propeller proteins: conserved but versatile.
Chen, Cammy K-M; Chan, Nei-Li; Wang, Andrew H-J
2011-10-01
The β-propeller is a highly symmetrical structure with 4-10 repeats of a four-stranded antiparallel β-sheet motif. Although β-propeller proteins with different blade numbers all adopt disc-like shapes, they are involved in a diverse set of functions, and defects in this family of proteins have been associated with human diseases. However, it has remained ambiguous how variations in blade number could alter the function of β-propellers. In addition to the regularly arranged β-propeller topology, a recently discovered β-pinwheel propeller has been found. Here, we review the structural and functional diversity of β-propeller proteins, including β-pinwheels, as well as recent advances in the typical and atypical propeller structures. Copyright © 2011 Elsevier Ltd. All rights reserved.
An Algorithm for Protein Helix Assignment Using Helix Geometry
Cao, Chen; Xu, Shutan; Wang, Lincong
2015-01-01
Helices are one of the most common and were among the earliest recognized secondary structure elements in proteins. The assignment of helices in a protein underlies the analysis of its structure and function. Though the mathematical expression for a helical curve is simple, no previous assignment programs have used a genuine helical curve as a model for helix assignment. In this paper we present a two-step assignment algorithm. The first step searches for a series of bona fide helical curves, each of which best fits the coordinates of four successive backbone Cα atoms. The second step uses the best-fit helical curves as input to make the helix assignment. The application to the protein structures in the PDB (Protein Data Bank) demonstrates that the algorithm is able to assign accurately not only regular α-helices but also 3₁₀ and π helices as well as their left-handed versions. One salient feature of the algorithm is that the assigned helices are structurally more uniform than those assigned by previous programs. The structural uniformity should be useful for protein structure classification and prediction, while the accurate assignment of a helix to a particular type underlies structure-function relationships in proteins. PMID:26132394
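As an illustration of what a helical curve through four successive Cα atoms determines, the following minimal sketch recovers twist, rise, and radius from four equally spaced points. It is not the authors' algorithm: it uses a simple second-difference construction, and all function names are hypothetical. The ideal α-helix values used for the check (radius 2.3 Å, rise 1.5 Å, 100° per residue) are commonly quoted approximations for Cα atoms, assumed here only for demonstration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def helix_from_four_points(p0, p1, p2, p3):
    """Return (twist in degrees, signed rise, radius) for four successive points.

    Assumes the points lie on one helix at equal angular spacing and are not
    collinear (collinear points would cause a division by zero).
    """
    # Second differences p[i-1] - 2*p[i] + p[i+1] point from the middle atom
    # towards the helix axis, so both are perpendicular to the axis.
    u1 = tuple(a - 2 * b + c for a, b, c in zip(p0, p1, p2))
    u2 = tuple(a - 2 * b + c for a, b, c in zip(p1, p2, p3))
    cos_t = dot(u1, u2) / (norm(u1) * norm(u2))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding
    twist = math.acos(cos_t)
    # The axis direction is perpendicular to both second differences.
    axis = cross(u1, u2)
    length = norm(axis)
    axis = tuple(x / length for x in axis)
    # Projection of one step onto the axis; with this axis convention a
    # negative value indicates a left-handed helix.
    rise = dot(tuple(b - a for a, b in zip(p1, p2)), axis)
    # |u| = 2 * r * (1 - cos(twist)) for points on a helix of radius r.
    radius = norm(u1) / (2.0 * (1.0 - cos_t))
    return math.degrees(twist), rise, radius


def ideal_helix(n, radius=2.3, twist_deg=100.0, rise=1.5):
    """Generate n points on an ideal helix along the z axis (assumed values)."""
    t = math.radians(twist_deg)
    return [(radius * math.cos(i * t), radius * math.sin(i * t), i * rise)
            for i in range(n)]


twist, rise, radius = helix_from_four_points(*ideal_helix(4))
print(f"twist={twist:.1f} deg, rise={rise:.2f}, radius={radius:.2f}")
```

For noisy experimental coordinates the four points will not lie exactly on a helix, so a real assignment program would need a fitting step rather than this exact construction.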
Wolf, Maxim Y; Wolf, Yuri I; Koonin, Eugene V
2008-01-01
Background: Proteins show a broad range of evolutionary rates. Understanding the factors that are responsible for the characteristic rate of evolution of a given protein arguably is one of the major goals of evolutionary biology. A long-standing general assumption used to be that the evolution rate is, primarily, determined by the specific functional constraints that affect the given protein. These constraints were traditionally thought to depend both on the specific features of the protein's structure and its biological role. The advent of systems biology brought about new types of data, such as expression level and protein-protein interactions, and unexpectedly, a variety of correlations between protein evolution rate and these variables have been observed. The strongest connections by far were repeatedly seen between protein sequence evolution rate and the expression level of the respective gene. It has been hypothesized that this link is due to the selection for the robustness of the protein structure to mistranslation-induced misfolding that is particularly important for highly expressed proteins and is the dominant determinant of the sequence evolution rate. Results: This work is an attempt to assess the relative contributions of protein domain structure and function, on the one hand, and expression level on the other hand, to the rate of sequence evolution. To this end, we performed a genome-wide analysis of the effect of the fusion of a pair of domains in multidomain proteins on the difference in the domain-specific evolutionary rates. The mistranslation-induced misfolding hypothesis would predict that, within multidomain proteins, fused domains, on average, should evolve at substantially closer rates than the same domains in different proteins because, within a multidomain protein, all domains are translated at the same rate.
We performed a comprehensive comparison of the evolutionary rates of mammalian and plant protein domains that are either joined in multidomain proteins or contained in distinct proteins. Substantial homogenization of evolutionary rates in multidomain proteins was, indeed, observed in both animals and plants, although highly significant differences between domain-specific rates remained. The contributions of the translation rate, as determined by the effect of the fusion of a pair of domains within a multidomain protein, and intrinsic, domain-specific structural-functional constraints appear to be comparable in magnitude. Conclusion: Fusion of domains in a multidomain protein results in substantial homogenization of the domain-specific evolutionary rates but significant differences between domain-specific evolution rates remain. Thus, the rate of translation and intrinsic structural-functional constraints both exert sizable and comparable effects on sequence evolution. Reviewers: This article was reviewed by Sergei Maslov, Dennis Vitkup, Claus Wilke (nominated by Orly Alter), and Allan Drummond (nominated by Joel Bader). For the full reviews, please go to the Reviewers' Reports section. PMID:18840284
Taha; Siddiqui, K S; Campanaro, S; Najnin, T; Deshpande, N; Williams, T J; Aldrich-Wright, J; Wilkins, M; Curmi, P M G; Cavicchioli, R
2016-09-01
TRAM domain proteins present in Archaea and Bacteria have a β-barrel shape with anti-parallel β-sheets that form a nucleic acid binding surface; a structure also present in cold shock proteins (Csps). Aside from protein structures, experimental data defining the function of TRAM domains are lacking. Here, we explore the possible functional properties of a single TRAM domain protein, Ctr3 (cold-responsive TRAM domain protein 3) from the Antarctic archaeon Methanococcoides burtonii that has increased abundance during low temperature growth. Ribonucleic acid (RNA) bound by Ctr3 in vitro was determined using RNA-seq. Ctr3 bound M. burtonii RNA with a preference for transfer (t)RNA and 5S ribosomal RNA, and a potential binding motif was identified. In tRNA, the motif represented the C loop; a region that is conserved in tRNA from all domains of life and appears to be solvent exposed, potentially providing access for Ctr3 to bind. Ctr3 and Csps are structurally similar and are both inferred to function in low temperature translation. The broad representation of single TRAM domain proteins within Archaea compared with their apparent absence in Bacteria, and the scarcity of Csps in Archaea but prevalence in Bacteria, suggest they represent distinct evolutionary lineages of functionally equivalent RNA-binding proteins. © 2016 Society for Applied Microbiology and John Wiley & Sons Ltd.
Misra, R D K; Nune, C; Pesacreta, T C; Somani, M C; Karjalainen, L P
2013-01-01
The rapid adsorption of proteins is the starting and primary biological response that occurs when a biomedical device is implanted in the physiological system. The biological response, however, depends on the surface characteristics of the device. Considering the significant interest in nano-/ultrafine surfaces and nanostructured coatings, we describe here the interplay between grain structure and protein adsorption (bovine serum albumin, BSA) and its effect on osteoblast functions, comparing nanograined/ultrafine-grained (NG/UFG) and coarse-grained (CG: grain size in the micrometer range) substrates through an investigation of cell-substrate interactions. Protein adsorption on the NG/UFG surface was beneficial in favorably modulating biological functions including cell attachment, proliferation, and viability, whereas the effect was less pronounced on the protein-adsorbed CG surface. Additionally, immunofluorescence studies demonstrated stronger vinculin signals associated with actin stress fibers in the outer regions of the cells and cellular extensions on the protein-adsorbed NG/UFG surface. The functional response followed the sequence: NG/UFG(BSA) > NG/UFG > CG(BSA) > CG. The differences in the cellular response on bare and protein-adsorbed NG/UFG and CG surfaces are attributed to the cumulative contribution of grain structure and degree of hydrophilicity. The study underscores the potential advantages of protein adsorption on artificial biomedical devices to enhance the bioactivity and regulate biological functions. Copyright © 2012 Wiley Periodicals, Inc.
Wanscher, Anne Sofie Molsted; Williamson, Michael; Ebersole, Tasja Wainani; Streicher, Werner; Wikström, Mats; Cazzamali, Giuseppe
2015-04-01
Insulin-like growth factor binding proteins (IGFBPs) display many functions in humans including regulation of the insulin-like growth factor (IGF) signaling pathway. The various roles of human IGFBPs make them attractive protein candidates in drug discovery. Structural and functional knowledge of human proteins with therapeutic relevance is needed to design and process the next generation of protein therapeutics. In order to conduct structural and functional investigations, large quantities of recombinant proteins are needed. However, finding a suitable recombinant production system for proteins such as full-length human IGFBPs remains a challenge. Here we present a mammalian HEK293 expression method suitable for over-expression of secretory full-length human IGFBP-1 to -7. Protein purification of full-length human IGFBP-1, -2, -3 and -5 was conducted using a two-step chromatography procedure and the final protein yields were between 1 and 12 mg protein per liter culture media. The recombinant IGFBPs contained PTMs and exhibited high-affinity interactions with their natural ligands IGF-1 and IGF-2. Copyright © 2014 Elsevier Inc. All rights reserved.
PDB2Graph: A toolbox for identifying critical amino acids map in proteins based on graph theory.
Niknam, Niloofar; Khakzad, Hamed; Arab, Seyed Shahriar; Naderi-Manesh, Hossein
2016-05-01
The integrative and cooperative nature of protein structure involves the assessment of topological and global features of constituent parts. Network concept takes complete advantage of both of these properties in the analysis concomitantly. High compatibility to structural concepts or physicochemical properties in addition to exploiting a remarkable simplification in the system has made network an ideal tool to explore biological systems. There are numerous examples in which different protein structural and functional characteristics have been clarified by the network approach. Here, we present an interactive and user-friendly Matlab-based toolbox, PDB2Graph, devoted to protein structure network construction, visualization, and analysis. Moreover, PDB2Graph is an appropriate tool for identifying critical nodes involved in protein structural robustness and function based on centrality indices. It maps critical amino acids in protein networks and can greatly aid structural biologists in selecting proper amino acid candidates for manipulating protein structures in a more reasonable and rational manner. To introduce the capability and efficiency of PDB2Graph in detail, the structural modification of Calmodulin through allosteric binding of Ca(2+) is considered. In addition, a mutational analysis for three well-identified model proteins including Phage T4 lysozyme, Barnase and Ribonuclease HI, was performed to inspect the influence of mutating important central residues on protein activity. Copyright © 2016 Elsevier Ltd. All rights reserved.
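The idea of ranking residues by a centrality index in a protein structure network can be sketched as follows. This is a minimal Python illustration, not the PDB2Graph toolbox (which is Matlab-based) and not its API: the function names, the 7 Å Cα contact cutoff, and the toy coordinates are all assumptions made for the example.

```python
import math
from collections import deque


def contact_graph(coords, cutoff=7.0):
    """Connect residues whose representative atoms lie within `cutoff`.

    `coords` is a list of (x, y, z) tuples, one per residue. The 7.0 cutoff
    is an assumed value for illustration only.
    """
    adjacency = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def closeness(adjacency, source):
    """Closeness centrality of `source` over the nodes reachable from it."""
    distance = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives shortest path lengths
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    total = sum(distance.values())
    return (len(distance) - 1) / total if total else 0.0


# Toy arrangement: one central residue surrounded by four others.
toy_coords = [(0, 0, 0), (5, 0, 0), (-5, 0, 0), (0, 5, 0), (0, -5, 0)]
graph = contact_graph(toy_coords)
ranking = sorted(graph, key=lambda node: closeness(graph, node), reverse=True)
print("most central residue index:", ranking[0])
```

With real data the coordinates would come from a parsed structure file, and other indices (degree, betweenness) could be substituted for closeness in the same ranking step.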
Expanding the proteome: disordered and alternatively-folded proteins
Dyson, H. Jane
2011-01-01
Proteins provide much of the scaffolding for life, as well as undertaking a variety of essential catalytic reactions. These characteristic functions have led us to presuppose that proteins are in general functional only when well-structured and correctly folded. As we begin to explore the repertoire of possible protein sequences inherent in the human and other genomes, two stark facts that belie this supposition become clear: firstly, the number of apparent open reading frames in the human genome is significantly smaller than appears to be necessary to code for all of the diverse proteins in higher organisms, and secondly that a significant proportion of the protein sequences that would be coded by the genome would not be expected to form stable three-dimensional structures. Clearly the genome must include coding for a multitude of alternative forms of proteins, some of which may be partly or fully disordered or incompletely structured in their functional states. At the same time as this likelihood was recognized, experimental studies also began to uncover examples of important protein molecules and domains that were incompletely structured or completely disordered in solution, yet remained perfectly functional. In the ensuing years, we have seen an explosion of experimental and genome-annotation studies that have mapped the extent of the intrinsic disorder phenomenon and explored the possible biological rationales for its widespread occurrence. Answers to the question “why would a particular domain need to be unstructured?” are as varied as the systems where such domains are found. This review provides a survey of recent new directions in this field, and includes an evaluation of the role not only of intrinsically disordered proteins but of partially structured and highly dynamic members of the disorder-order continuum. PMID:21729349
Protein Structural Analysis via Mass Spectrometry-Based Proteomics
Artigues, Antonio; Nadeau, Owen W.; Rimmer, Mary Ashley; Villar, Maria T.; Du, Xiuxia; Fenton, Aron W.; Carlson, Gerald M.
2017-01-01
Modern mass spectrometry (MS) technologies have provided a versatile platform that can be combined with a large number of techniques to analyze protein structure and dynamics. These techniques include the three detailed in this chapter: 1) hydrogen/deuterium exchange (HDX), 2) limited proteolysis, and 3) chemical crosslinking (CX). HDX relies on the change in mass of a protein upon its dilution into deuterated buffer, which results in varied deuterium content within its backbone amides. Structural information on surface exposed, flexible or disordered linker regions of proteins can be achieved through limited proteolysis, using a variety of proteases and only small extents of digestion. CX refers to the covalent coupling of distinct chemical species and has been used to analyze the structure, function and interactions of proteins by identifying crosslinking sites that are formed by small multi-functional reagents, termed crosslinkers. Each of these MS applications is capable of revealing structural information for proteins when used either with or without other typical high resolution techniques, including NMR and X-ray crystallography. PMID:27975228
Mechanism of Resilin Elasticity
Qin, Guokui; Hu, Xiao; Cebe, Peggy; Kaplan, David L.
2012-01-01
Resilin is critical in the flight and jumping systems of insects as a polymeric rubber-like protein with outstanding elasticity. However, the underlying molecular mechanisms responsible for resilin elasticity remain undefined. Here we report the structure and function of resilin from Drosophila CG15920. A reversible beta-turn transition was identified in the peptide encoded by exon III and for full length resilin during energy input and release, features that correlate to the rapid deformation of resilin during functions in vivo. Micellar structures and nano-porous patterns formed after beta-turn structures were present via changes in either the thermal or mechanical inputs. A model is proposed to explain the super elasticity and energy conversion mechanisms of resilin, providing important insight into structure-function relationships for this protein. Further, this model offers a view of elastomeric proteins in general where beta-turn related structures serve as fundamental units of the structure and elasticity. PMID:22893127
Liu, Lu-Ning; Su, Hai-Nan; Yan, Shi-Gan; Shao, Si-Mi; Xie, Bin-Bin; Chen, Xiu-Lan; Zhang, Xi-Ying; Zhou, Bai-Cheng; Zhang, Yu-Zhong
2009-07-01
Crystal structures of phycobiliproteins have provided valuable information regarding the conformations and amino acid organizations of peptides and chromophores, and enable us to investigate their structural and functional relationships with respect to environmental variations. In this work, we explored the pH-induced conformational and functional dynamics of R-phycoerythrin (R-PE) by means of absorption, fluorescence and circular dichroism spectra, together with analysis of its crystal structure. R-PE presents stronger functional stability in the pH range of 3.5-10 compared to the structural stability. Beyond this range, pronounced functional and structural changes occur. Crystal structure analysis shows that the tertiary structure of R-PE is fixed by several key anchoring points of the protein. With this specific association, the fundamental structure of R-PE is stabilized to present physiological spectroscopic properties, while local variations in protein peptides are also allowed in response to environmental disturbances. The functional stability and relative structural sensitivity of R-PE allow environmental adaptation.
Modelling protein functional domains in signal transduction using Maude
NASA Technical Reports Server (NTRS)
Sriram, M. G.
2003-01-01
Modelling of protein-protein interactions in signal transduction is receiving increased attention in computational biology. This paper describes recent research in the application of Maude, a symbolic language founded on rewriting logic, to the modelling of functional domains within signalling proteins. Protein functional domains (PFDs) are a critical focus of modern signal transduction research. In general, Maude models can simulate biological signalling networks and produce specific testable hypotheses at various levels of abstraction. Developing symbolic models of signalling proteins containing functional domains is important because of the potential to generate analyses of complex signalling networks based on structure-function relationships.
Structure-based characterization of multiprotein complexes.
Wiederstein, Markus; Gruber, Markus; Frank, Karl; Melo, Francisco; Sippl, Manfred J
2014-07-08
Multiprotein complexes govern virtually all cellular processes. Their 3D structures provide important clues to their biological roles, especially through structural correlations among protein molecules and complexes. The detection of such correlations generally requires comprehensive searches in databases of known protein structures by means of appropriate structure-matching techniques. Here, we present a high-speed structure search engine capable of instantly matching large protein oligomers against the complete and up-to-date database of biologically functional assemblies of protein molecules. We use this tool to reveal unseen structural correlations on the level of protein quaternary structure and demonstrate its general usefulness for efficiently exploring complex structural relationships among known protein assemblies. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Fourier-based classification of protein secondary structures.
Shu, Jian-Jun; Yong, Kian Yan
2017-04-15
The correct prediction of protein secondary structures is one of the key issues in predicting the correct protein folded shape, which is used for determining gene function. Existing methods make use of amino acid properties as indices to classify protein secondary structures, but are faced with a significant number of misclassifications. The paper presents a technique for the classification of protein secondary structures based on protein "signal-plotting" and the use of the Fourier technique for digital signal processing. New indices are proposed to classify protein secondary structures by analyzing hydrophobicity profiles. The approach is simple and straightforward. Results show that more types of protein secondary structures can be classified by means of these newly-proposed indices. Copyright © 2017 Elsevier Inc. All rights reserved.
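To make the idea of Fourier analysis of a hydrophobicity profile concrete, here is a minimal sketch. It does not reproduce the indices proposed in the paper; it only computes the Fourier power of a mean-centred hydrophobicity signal at a chosen angular frequency, using the Kyte-Doolittle scale. The choice of scale, the 100° (helix-like) and 180° (strand-like) frequencies, the function name, and the two toy sequences are assumptions for illustration.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def periodicity_power(sequence, angle_deg):
    """Fourier power of the hydrophobicity profile at `angle_deg` per residue."""
    profile = [KYTE_DOOLITTLE[aa] for aa in sequence]
    mean = sum(profile) / len(profile)
    omega = math.radians(angle_deg)
    # Single-frequency discrete Fourier sum of the mean-centred signal.
    total = sum((value - mean) * cmath.exp(-1j * omega * k)
                for k, value in enumerate(profile))
    return abs(total) ** 2


# Toy sequences (hypothetical): hydrophobic residues repeat roughly every
# 3.6 positions in the first and every 2 positions in the second.
helix_like = "LKKLLKKLLKLLKKLLKK"
strand_like = "VKVKVKVKVKVK"

for name, seq in (("helix-like", helix_like), ("strand-like", strand_like)):
    p100 = periodicity_power(seq, 100.0)
    p180 = periodicity_power(seq, 180.0)
    label = "100 deg" if p100 > p180 else "180 deg"
    print(f"{name}: stronger periodicity at {label}")
```

A classifier built on this idea would compare power across several frequencies and window positions; the single comparison above only shows how the signal is formed and queried.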
Expanded explorations into the optimization of an energy function for protein design
Huang, Yao-ming; Bystroff, Christopher
2014-01-01
Nature possesses a secret formula for the energy as a function of the structure of a protein. In protein design, approximations are made to both the structural representation of the molecule and to the form of the energy equation, such that the existence of a general energy function for proteins is by no means guaranteed. Here we present new insights towards the application of machine learning to the problem of finding a general energy function for protein design. Machine learning requires the definition of an objective function, which carries with it the implied definition of success in protein design. We explored four functions, consisting of two functional forms, each with two criteria for success. Optimization was carried out by a Monte Carlo search through the space of all variable parameters. Cross-validation of the optimized energy function against a test set gave significantly different results depending on the choice of objective function, pointing to relative correctness of the built-in assumptions. Novel energy cross-terms correct for the observed non-additivity of energy terms and an imbalance in the distribution of predicted amino acids. This paper expands on the work presented at ACM-BCB, Orlando, FL, October 2012. PMID:24384706
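A Monte Carlo search over energy-function parameters can be sketched in a few lines. The example below is a toy under stated assumptions, not the authors' procedure: the energy is a weighted sum of feature values, the objective counts native/decoy pairs in which the native structure fails to score lower, and the step size, temperature, seed, and all names are hypothetical.

```python
import math
import random


def energy(weights, features):
    """Linear energy: weighted sum of feature values (an assumed form)."""
    return sum(w * f for w, f in zip(weights, features))


def violations(weights, pairs):
    """Objective: number of (native, decoy) pairs where native is not lower."""
    return sum(1 for native, decoy in pairs
               if energy(weights, native) >= energy(weights, decoy))


def monte_carlo_search(pairs, n_params, steps=2000, step_size=0.5,
                       temperature=0.5, seed=0):
    """Metropolis-style random walk through weight space; returns the best point."""
    rng = random.Random(seed)
    current = [0.0] * n_params
    current_score = violations(current, pairs)
    best, best_score = list(current), current_score
    for _ in range(steps):
        trial = [w + rng.uniform(-step_size, step_size) for w in current]
        trial_score = violations(trial, pairs)
        delta = trial_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score


# Toy training set: each entry is (native features, decoy features).
toy_pairs = [
    ((1.0, 2.0), (2.0, 1.0)),
    ((0.5, 3.0), (2.5, 2.5)),
    ((1.0, 1.0), (1.5, -1.0)),
]
weights, score = monte_carlo_search(toy_pairs, n_params=2)
print("violations at start:", violations([0.0, 0.0], toy_pairs))
print("violations after search:", score)
```

Swapping in a different `violations` function changes the implied definition of success, which is the sensitivity to the objective function that the abstract describes.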
Diversity and functions of protein glycosylation in insects.
Walski, Tomasz; De Schutter, Kristof; Van Damme, Els J M; Smagghe, Guy
2017-04-01
The majority of proteins is modified with carbohydrate structures. This modification, called glycosylation, was shown to be crucial for protein folding, stability and subcellular location, as well as protein-protein interactions, recognition and signaling. Protein glycosylation is involved in multiple physiological processes, including embryonic development, growth, circadian rhythms, cell attachment as well as maintenance of organ structure, immunity and fertility. Although the general principles of glycosylation are similar among eukaryotic organisms, insects synthesize a distinct repertoire of glycan structures compared to plants and vertebrates. Consequently, a number of unique insect glycans mediate functions specific to this class of invertebrates. For instance, the core α1,3-fucosylation of N-glycans is absent in vertebrates, while in insects this modification is crucial for the development of wings and the nervous system. At present, most of the data on insect glycobiology comes from research in Drosophila. Yet, progressively more information on the glycan structures and the importance of glycosylation in other insects like beetles, caterpillars, aphids and bees is becoming available. This review gives a summary of the current knowledge and recent progress related to glycan diversity and function(s) of protein glycosylation in insects. We focus on N- and O-glycosylation, their synthesis, physiological role(s), as well as the molecular and biochemical basis of these processes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Ritchie, Andrew W; Webb, Lauren J
2015-11-05
Biological function emerges in large part from the interactions of biomacromolecules in the complex and dynamic environment of the living cell. For this reason, macromolecular interactions in biological systems are now a major focus of interest throughout the biochemical and biophysical communities. The affinity and specificity of macromolecular interactions are the result of both structural and electrostatic factors. Significant advances have been made in characterizing structural features of stable protein-protein interfaces through the techniques of modern structural biology, but much less is understood about how electrostatic factors promote and stabilize specific functional macromolecular interactions over all possible choices presented to a given molecule in a crowded environment. In this Feature Article, we describe how vibrational Stark effect (VSE) spectroscopy is being applied to measure electrostatic fields at protein-protein interfaces, focusing on measurements of guanosine triphosphate (GTP)-binding proteins of the Ras superfamily binding with structurally related but functionally distinct downstream effector proteins. In VSE spectroscopy, spectral shifts of a probe oscillator's energy are related directly to that probe's local electrostatic environment. By performing this experiment repeatedly throughout a protein-protein interface, an experimental map of measured electrostatic fields generated at that interface is determined. These data can be used to rationalize selective binding of similarly structured proteins in both in vitro and in vivo environments. Furthermore, these data can be used to compare to computational predictions of electrostatic fields to explore the level of simulation detail that is necessary to accurately predict our experimental findings.
The fine art of integral membrane protein crystallisation.
Birch, James; Axford, Danny; Foadi, James; Meyer, Arne; Eckhardt, Annette; Thielmann, Yvonne; Moraes, Isabel
2018-05-18
Integral membrane proteins are among the most fascinating and important biomolecules as they play a vital role in many biological functions. Knowledge of their atomic structures is fundamental to the understanding of their biochemical function and key in many drug discovery programs. However, over the years, structure determination of integral membrane proteins has proven to be far from trivial, hence they are underrepresented in the protein data bank. Low expression levels, insolubility and instability are just a few of the many hurdles one faces when studying these proteins. X-ray crystallography has been the most used method to determine atomic structures of membrane proteins. However, the production of high quality membrane protein crystals is always very challenging, often seen more as art than a rational experiment. Here we review valuable approaches, methods and techniques to successful membrane protein crystallisation. Copyright © 2018 Diamond Light Source LTD. Published by Elsevier Inc. All rights reserved.
Nadzirin, Nurul; Firdaus-Raih, Mohd
2012-10-08
Proteins of uncharacterized function form a large part of many of the currently available biological databases, and this situation exists even in the Protein Data Bank (PDB). Our analysis of recent PDB data revealed that only 42.53% of PDB entries (1084 coordinate files) that were categorized under "unknown function" are true examples of proteins of unknown function at this point in time. The remaining 1465 entries also annotated as such appear to be candidates for re-assessment of their annotations, based on the availability of direct functional characterization experiments for the protein itself, or for homologous sequences or structures, thus enabling computational function inference.
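The two counts and the percentage quoted above can be checked against each other with a short calculation. This assumes, as the abstract implies but does not state outright, that the 1084 and 1465 entries together make up all entries categorized under "unknown function"; the variable names are illustrative.

```python
# Counts taken from the abstract.
true_unknown = 1084      # entries that are genuinely of unknown function
reassessable = 1465      # entries whose annotation could be re-assessed

# Assumption: the two groups together form the whole "unknown function" set.
total = true_unknown + reassessable
share_true_unknown = 100.0 * true_unknown / total

print(f"total entries: {total}")
print(f"share of true unknowns: {share_true_unknown:.2f}%")
```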
Kryshtafovych, Andriy; Moult, John; Bales, Patrick; Bazan, J Fernando; Biasini, Marco; Burgin, Alex; Chen, Chen; Cochran, Frank V; Craig, Timothy K; Das, Rhiju; Fass, Deborah; Garcia-Doval, Carmela; Herzberg, Osnat; Lorimer, Donald; Luecke, Hartmut; Ma, Xiaolei; Nelson, Daniel C; van Raaij, Mark J; Rohwer, Forest; Segall, Anca; Seguritan, Victor; Zeth, Kornelius; Schwede, Torsten
2014-02-01
For the last two decades, CASP has assessed the state of the art in techniques for protein structure prediction and identified areas which required further development. CASP would not have been possible without the prediction targets provided by the experimental structural biology community. In the latest experiment, CASP10, more than 100 structures were suggested as prediction targets, some of which appeared to be extraordinarily difficult for modeling. In this article, authors of some of the most challenging targets discuss which specific scientific question motivated the experimental structure determination of the target protein, which structural features were especially interesting from a structural or functional perspective, and to what extent these features were correctly reproduced in the predictions submitted to CASP10. Specifically, the following targets will be presented: the acid-gated urea channel, a difficult to predict transmembrane protein from the important human pathogen Helicobacter pylori; the structure of human interleukin (IL)-34, a recently discovered helical cytokine; the structure of a functionally uncharacterized enzyme OrfY from Thermoproteus tenax formed by a gene duplication and a novel fold; an ORFan domain of mimivirus sulfhydryl oxidase R596; the fiber protein gene product 17 from bacteriophage T7; the bacteriophage CBA-120 tailspike protein; a virus coat protein from metagenomic samples of the marine environment; and finally, an unprecedented class of structure prediction targets based on engineered disulfide-rich small proteins. Copyright © 2013 The Authors. Wiley Periodicals, Inc.
Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions.
Najibi, Seyed Morteza; Maadooliat, Mehdi; Zhou, Lan; Huang, Jianhua Z; Gao, Xin
2017-01-01
Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using a trigonometric spline, which is more efficient than existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of the adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective on two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.
Pattern similarity study of functional sites in protein sequences: lysozymes and cystatins
Nakai, Shuryo; Li-Chan, Eunice CY; Dou, Jinglie
2005-01-01
Background Although it is generally agreed that topography is more conserved than sequences, proteins sharing the same fold can have different functions, while there are protein families with low sequence similarity. An alternative method for profile analysis of characteristic conserved positions of the motifs within the 3D structures may be needed for functional annotation of protein sequences. Using the approach of quantitative structure-activity relationships (QSAR), we have proposed a new algorithm for postulating functional mechanisms on the basis of pattern similarity and average of property values of side-chains in segments within sequences. This approach was used to search for functional sites of proteins belonging to the lysozyme and cystatin families. Results Hydrophobicity and β-turn propensity of reference segments with 3–7 residues were used for the homology similarity search (HSS) for active sites. Hydrogen bonding was used as the side-chain property for searching the binding sites of lysozymes. The profiles of similarity constants and average values of these parameters as functions of their positions in the sequences could identify both active and substrate binding sites of the lysozyme of Streptomyces coelicolor, which has been reported as a new fold enzyme (Cellosyl). The same approach was successfully applied to cystatins, especially for postulating the mechanisms of amyloidosis of human cystatin C as well as human lysozyme. Conclusion Pattern similarity and average index values of structure-related properties of side chains in short segments of three residues or longer were, for the first time, successfully applied for predicting functional sites in sequences. This new approach may be applicable to studying functional sites in un-annotated proteins, for which complete 3D structures are not yet available. PMID:15904486
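The segment-based search described in the abstract above can be illustrated with a minimal sketch. It slides a window the length of a reference segment along a sequence and reports, for each position, a similarity value and the average property value. Several choices here are assumptions and not taken from the paper: the Kyte-Doolittle hydropathy scale stands in for the hydrophobicity index, Pearson correlation stands in for the "similarity constant", and the names window_profile and pearson are hypothetical.

```python
from statistics import mean

# Kyte-Doolittle hydropathy values (assumed scale; the abstract does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


def window_profile(seq, ref, scale=KD):
    """Return (start, similarity, average) for every window of len(ref) in seq."""
    ref_vals = [scale[a] for a in ref]
    profile = []
    for i in range(len(seq) - len(ref) + 1):
        vals = [scale[a] for a in seq[i:i + len(ref)]]
        profile.append((i, pearson(ref_vals, vals), mean(vals)))
    return profile


# Usage: locate the window whose property pattern best matches the reference.
profile = window_profile("GGAILVKDE", "AILVK")
best = max(profile, key=lambda t: t[1])
print(best)
```

Plotting the similarity and average columns against position would give the kind of profile the abstract refers to; peaks where both are high would be candidate functional sites under these assumptions.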
Characterization of the motion of membrane proteins using high-speed atomic force microscopy
Casuso, Ignacio; Khao, Jonathan; Chami, Mohamed; Paul-Gilloteaux, Perrine; Husain, Mohamed; Duneau, Jean-Pierre; Stahlberg, Henning; Sturgis, James N.; Scheuring, Simon
2012-08-01
For cells to function properly, membrane proteins must be able to diffuse within biological membranes. The functions of these membrane proteins depend on their position and also on protein-protein and protein-lipid interactions. However, so far, it has not been possible to study simultaneously the structure and dynamics of biological membranes. Here, we show that the motion of unlabelled membrane proteins can be characterized using high-speed atomic force microscopy. We find that the molecules of outer membrane protein F (OmpF) are widely distributed in the membrane as a result of diffusion-limited aggregation, and while the overall protein motion scales roughly with the local density of proteins in the membrane, individual protein molecules can also diffuse freely or become trapped by protein-protein interactions. Using these measurements, and the results of molecular dynamics simulations, we determine an interaction potential map and an interaction pathway for a membrane protein, which should provide new insights into the connection between the structures of individual proteins and the structures and dynamics of supramolecular membranes.
Wong, Sienna; Jin, J-P
2017-01-01
Study of the folded structure of proteins provides insights into their biological functions, conformational dynamics and molecular evolution. Current methods of elucidating the folded structure of proteins are laborious, low-throughput, and constrained by various limitations. Arising from these methods is the need for a sensitive, quantitative, rapid and high-throughput method that not only analyses the folded structure of proteins but also monitors dynamic changes under physiological or experimental conditions. In this focused review, we outline the foundation and limitations of current protein structure-determination methods prior to discussing the advantages of an emerging antibody epitope analysis for applications in structural, conformational and evolutionary studies of proteins. We discuss the application of this method using representative examples in monitoring allosteric conformation of regulatory proteins and the determination of the evolutionary lineage of related proteins and protein isoforms. The versatility of the method described herein is validated by the ability to modulate a variety of assay parameters to meet the needs of the user in order to monitor protein conformation. Furthermore, the assay has been used to clarify the lineage of troponin isoforms beyond what has been depicted by sequence homology alone, demonstrating the nonlinear evolutionary relationship between primary structure and tertiary structure of proteins. The antibody epitope analysis method is a highly adaptable technique of protein conformation elucidation, which can be easily applied without the need for specialized equipment or technical expertise. When applied in a systematic and strategic manner, this method has the potential to reveal novel and biomedically meaningful information for structure-function relationship and evolutionary lineage of proteins. Copyright © Bentham Science Publishers.
Shah, Dinen D.; Singh, Surinder M.; Dzieciatkowska, Monika
2017-01-01
Binding immunoglobulin protein (BiP) is a molecular chaperone important for the folding of numerous proteins, which include millions of immunoglobulins in the human body. It also plays a key role in the unfolded protein response (UPR) in the endoplasmic reticulum. Free radical generation is a common phenomenon that occurs in cells under healthy conditions as well as under stress conditions such as ageing, inflammation, alcohol consumption, and smoking. These free radicals attack the cell membranes and generate highly reactive lipid peroxidation products such as 4-oxononenal (4-ONE). BiP is a key protein that is modified by 4-ONE. In this study, we probed how such chemical modification affects the biophysical properties of BiP. Upon modification, BiP shows significant tertiary structural changes with no changes in its secondary structure. The protein loses its thermodynamic stability, particularly that of the nucleotide binding domain (NBD) where ATP binds. In terms of function, the modified BiP completely loses its ATPase activity with decreased ATP binding affinity. However, modified BiP retains its immunoglobulin binding function and its chaperone activity of suppressing non-specific protein aggregation. These results indicate that 4-ONE modification can significantly affect the structure-function of key proteins such as BiP involved in cellular pathways, and provide a molecular basis for how chemical modifications can result in the failure of quality control mechanisms inside the cell. PMID:28886061
Biomaterials Made from Coiled-Coil Peptides.
Conticello, Vincent; Hughes, Spencer; Modlin, Charles
The development of biomaterials designed for specific applications is an important objective in personalized medicine. While the breadth and prominence of biomaterials have increased exponentially over the past decades, critical challenges remain to be addressed, particularly in the development of biomaterials that exhibit highly specific functions. These functional properties are often encoded within the molecular structure of the component molecules. Proteins, as a consequence of their structural specificity, represent useful substrates for the construction of functional biomaterials through rational design. This chapter provides an in-depth survey of biomaterials constructed from coiled-coils, one of the best-understood protein structural motifs. We discuss the utility of this structurally diverse and functionally tunable class of proteins for the creation of novel biomaterials. This discussion illustrates the progress that has been made in the development of coiled-coil biomaterials by showcasing studies that bridge the gap between the academic science and potential technological impact.
Domain atrophy creates rare cases of functional partial protein domains.
Prakash, Ananth; Bateman, Alex
2015-04-30
Protein domains display a range of structural diversity, with numerous additions and deletions of secondary structural elements between related domains. We have observed a small number of cases of surprising large-scale deletions of core elements of structural domains. We propose a new concept called domain atrophy, where protein domains lose a significant number of core structural elements. Here, we implement a new pipeline to systematically identify new cases of domain atrophy across all known protein sequences. The output of this pipeline was carefully checked by hand, which filtered out partial domain instances that were unlikely to represent true domain atrophy due to misannotations or un-annotated sequence fragments. We identify 75 cases of domain atrophy, of which eight cases are found in a three-dimensional protein structure and 67 cases have been inferred based on mapping to a known homologous structure. Domains with structural variations include ancient folds such as the TIM-barrel and Rossmann folds. Most of these domains are observed to show structural loss that does not affect their functional sites. Our analysis has significantly increased the known cases of domain atrophy. We discuss specific instances of domain atrophy and see that there has often been a compensatory mechanism that helps to maintain the stability of the partial domain. Our study indicates that although domain atrophy is an extremely rare phenomenon, protein domains under certain circumstances can tolerate extreme mutations giving rise to partial, but functional, domains.
Exploring Fold Space Preferences of New-born and Ancient Protein Superfamilies
Edwards, Hannah; Abeln, Sanne; Deane, Charlotte M.
2013-01-01
The evolution of proteins is one of the fundamental processes that has delivered the diversity and complexity of life we see around ourselves today. While we tend to define protein evolution in terms of sequence level mutations, insertions and deletions, it is hard to translate these processes to a more complete picture incorporating a polypeptide's structure and function. By considering how protein structures change over time we can gain an entirely new appreciation of their long-term evolutionary dynamics. In this work we seek to identify how populations of proteins at different stages of evolution explore their possible structure space. We apply an annotation of superfamily age to this space and explore the relationship between these ages and a diverse set of properties pertaining to a superfamily's sequence, structure and function. We note several marked differences between the populations of newly evolved and ancient structures, such as in their length distributions, secondary structure content and tertiary packing arrangements. In particular, many of these differences suggest a less elaborate structure for newly evolved superfamilies when compared with their ancient counterparts. We show that the structural preferences we report are not a residual effect of a more fundamental relationship with function. Furthermore, we demonstrate the robustness of our results, using significant variation in the algorithm used to estimate the ages. We present these age estimates as a useful tool to analyse protein populations. In particular, we apply this in a comparison of domains containing greek key or jelly roll motifs. PMID:24244135
A phylogenetic analysis of normal modes evolution in enzymes and its relationship to enzyme function
Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A.
2012-01-01
Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that the functional evolution can be inferred from the changes in the protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced Cα representation of the protein structure while enzymatic function is described by Enzyme Commission (EC) numbers. Similarity of the binding pocket dynamics at each branch of the protein family’s phylogeny was analyzed in two ways: 1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and 2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the alpha-amylase, D-isomer specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal modes analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. PMID:22651983
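The "normal mode overlap" mentioned in the abstract above is commonly computed as the absolute cosine between two mode vectors. The following is a minimal sketch of that quantity only, assuming the modes have already been obtained (for example from an elastic network model) and are supplied as flat lists of displacements over the same atoms. The function names are hypothetical, and the paper's exact definition may differ.

```python
import math


def mode_overlap(u, v):
    """Absolute cosine between two mode vectors (1.0 means the same direction of motion)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return abs(dot) / (norm_u * norm_v)


def cumulative_overlap(u, modes):
    """How well a set of orthonormal modes spans u: sqrt of the summed squared overlaps."""
    return math.sqrt(sum(mode_overlap(u, m) ** 2 for m in modes))


# Usage: compare one mode of an ancestral structure with modes of a descendant.
ancestor_mode = [1.0, 1.0, 0.0]
descendant_modes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(mode_overlap(ancestor_mode, descendant_modes[0]))
print(cumulative_overlap(ancestor_mode, descendant_modes))
```

The sign of a normal mode is arbitrary, which is why the absolute value is taken before comparing modes across structures.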
Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A
2012-09-21
Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that functional evolution can be inferred from the changes in protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced C(α) representation of the protein structure while enzymatic function is described by Enzyme Commission numbers. Similarity of the binding pocket dynamics at each branch of the protein family's phylogeny was analyzed in two ways: (1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and (2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the α-amylase, D-isomer-specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal mode analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. Copyright © 2012 Elsevier Ltd. All rights reserved.
Analysis of self-assembly of S-layer protein slp-B53 from Lysinibacillus sphaericus.
Liu, Jun; Falke, Sven; Drobot, Bjoern; Oberthuer, Dominik; Kikhney, Alexey; Guenther, Tobias; Fahmy, Karim; Svergun, Dmitri; Betzel, Christian; Raff, Johannes
2017-01-01
The formation of stable and functional surface layers (S-layers) via self-assembly of surface-layer proteins on the cell surface is a dynamic and complex process. S-layers facilitate a number of important biological functions, e.g., providing protection and mediating selective exchange of molecules and thereby functioning as molecular sieves. Furthermore, S-layers selectively bind several metal ions including uranium, palladium, gold, and europium, some of them with high affinity. Most current research on surface layers focuses on investigating crystalline arrays of protein subunits in Archaea and bacteria. In this work, several complementary analytical techniques and methods have been applied to examine structure-function relationships and dynamics for assembly of S-layer protein slp-B53 from Lysinibacillus sphaericus: (1) The secondary structure of the S-layer protein was analyzed by circular dichroism spectroscopy; (2) Small-angle X-ray scattering was applied to gain insights into the three-dimensional structure in solution; (3) The interaction with bivalent cations was followed by differential scanning calorimetry; (4) The dynamics and time-dependent assembly of S-layers were followed by applying dynamic light scattering; (5) The two-dimensional structure of the paracrystalline S-layer lattice was examined by atomic force microscopy. The data obtained provide essential structural insights into the mechanism of S-layer self-assembly, particularly with respect to binding of bivalent cations, i.e., Mg²⁺ and Ca²⁺. Furthermore, the results obtained highlight potential applications of S-layers in the fields of micromaterials and nanobiotechnology by providing engineered or individual symmetric thin protein layers, e.g., for protective, antimicrobial, or otherwise functionalized surfaces.
Fundamental Characteristics of AAA+ Protein Family Structure and Function
2016-01-01
Many complex cellular events depend on multiprotein complexes known as molecular machines to efficiently couple the energy derived from adenosine triphosphate hydrolysis to the generation of mechanical force. Members of the AAA+ ATPase superfamily (ATPases Associated with various cellular Activities) are critical components of many molecular machines. AAA+ proteins are defined by conserved modules that precisely position the active site elements of two adjacent subunits to catalyze ATP hydrolysis. In many cases, AAA+ proteins form a ring structure that translocates a polymeric substrate through the central channel using specialized loops that project into the central channel. We discuss the major features of AAA+ protein structure and function with an emphasis on pivotal aspects elucidated with archaeal proteins. PMID:27703410
ECOD: An Evolutionary Classification of Protein Domains
Kinch, Lisa N.; Pei, Jimin; Shi, Shuoyong; Kim, Bong-Hyun; Grishin, Nick V.
2014-01-01
Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies. PMID:25474468
ECOD: an evolutionary classification of protein domains.
Cheng, Hua; Schaeffer, R Dustin; Liao, Yuxing; Kinch, Lisa N; Pei, Jimin; Shi, Shuoyong; Kim, Bong-Hyun; Grishin, Nick V
2014-12-01
Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or "fold"). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.
Ladunga, I
1992-04-01
The markedly nonuniform, even systematic distribution of sequences in the protein "universe" has been analyzed by methods of protein taxonomy. Mapping of the natural hierarchical system of proteins has revealed some dense cores, i.e., well-defined clusterings of proteins that seem to be natural structural groupings, possibly seeds for a future protein taxonomy. The aim was not to force proteins into more or less man-made categories by discriminant analysis, but to find structurally similar groups, possibly of common evolutionary origin. Single-valued distance measures between pairs of superfamilies from the Protein Identification Resource were defined by two χ²-like methods on tripeptide frequencies and the variable-length subsequence identity method derived from dot-matrix comparisons. Distance matrices were processed by several methods of cluster analysis to detect phylogenetic continuum between highly divergent proteins. Only well-defined clusters characterized by relatively unique structural, intracellular environmental, organismal, and functional attribute states were selected as major protein groups, including subsets of viral and Escherichia coli proteins, hormones, inhibitors, plant, ribosomal, serum and structural proteins, amino acid synthases, and clusters dominated by certain oxidoreductases and apolar and DNA-associated enzymes. The limited repertoire of functional patterns due to small genome size, the high rate of recombination, specific features of the bacterial membranes, or of the virus cycle canalize certain proteins of viruses and Gram-negative bacteria, respectively, to organismal groups.
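A distance on tripeptide frequencies, as mentioned in the abstract above, can be sketched as follows. The abstract does not give the formulas for its two chi-square-like measures, so the symmetric chi-square form used here, sum of (p - q)^2 / (p + q) over all tripeptides, is an assumption, and the function names are hypothetical.

```python
from collections import Counter


def tripeptide_freqs(seq):
    """Relative frequencies of overlapping tripeptides in a protein sequence."""
    if len(seq) < 3:
        raise ValueError("sequence must contain at least three residues")
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()}


def chi2_like_distance(seq_a, seq_b):
    """Symmetric chi-square-like distance: 0 for identical profiles, 2 for disjoint ones."""
    fa, fb = tripeptide_freqs(seq_a), tripeptide_freqs(seq_b)
    distance = 0.0
    for tri in set(fa) | set(fb):
        p, q = fa.get(tri, 0.0), fb.get(tri, 0.0)
        distance += (p - q) ** 2 / (p + q)
    return distance


# Usage: pairwise distances like these would fill the matrix passed to cluster analysis.
print(chi2_like_distance("MKVLAAGIVK", "MKVLSAGIVR"))
```

For superfamilies rather than single sequences, the tripeptide counts would presumably be pooled over all member sequences before normalizing.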
Eyrich, V A; Standley, D M; Friesner, R A
1999-05-14
We report the tertiary structure predictions for 95 proteins ranging in size from 17 to 160 residues starting from known secondary structure. Predictions are obtained from global minimization of an empirical potential function followed by the application of a refined atomic overlap potential. The minimization strategy employed represents a variant of the Monte Carlo plus minimization scheme of Li and Scheraga applied to a reduced model of the protein chain. For all of the cases except beta-proteins larger than 75 residues, a native-like structure, usually 4-6 Å root-mean-square deviation from the native, is located. For beta-proteins larger than 75 residues, the energy gap between native-like structures and the lowest energy structures produced in the simulation is large, so that low RMSD structures are not generated starting from an unfolded state. This is attributed to the lack of an explicit hydrogen bond term in the potential function, which we hypothesize is necessary to stabilize large assemblies of beta-strands. Copyright 1999 Academic Press.
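The Monte Carlo plus minimization idea referred to in the abstract above alternates a random perturbation, a local minimization, and a Metropolis acceptance test on the minimized energies. The sketch below shows that loop on a toy one-dimensional double-well function. It is not the authors' reduced protein model or potential: the energy function, step sizes, temperature and all names are illustrative assumptions.

```python
import math
import random


def energy(x):
    # Toy double-well "energy" standing in for an empirical potential (assumption).
    return x ** 4 - 3 * x ** 2 + x


def gradient(x):
    return 4 * x ** 3 - 6 * x + 1


def minimize(x, step=0.01, iters=2000):
    """Plain gradient descent to the nearest local minimum."""
    for _ in range(iters):
        x -= step * gradient(x)
    return x


def mc_plus_minimization(x0, n_steps=50, max_move=2.0, temperature=0.5, seed=0):
    """Perturb, minimize, then accept or reject with the Metropolis criterion."""
    rng = random.Random(seed)
    current = minimize(x0)
    best = current
    for _ in range(n_steps):
        trial = minimize(current + rng.uniform(-max_move, max_move))
        delta = energy(trial) - energy(current)
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = trial
        if energy(current) < energy(best):
            best = current
    return best


# Usage: start in the basin of the shallower minimum and search for a lower one.
best_x = mc_plus_minimization(1.5)
print(best_x, energy(best_x))
```

Because acceptance is judged between minimized states, the walk moves over a staircase of local minima and not over the raw energy surface, which is the main design point of the scheme.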
@TOME-2: a new pipeline for comparative modeling of protein-ligand complexes.
Pons, Jean-Luc; Labesse, Gilles
2009-07-01
@TOME 2.0 is a new web pipeline dedicated to protein structure modeling and small ligand docking based on comparative analyses. @TOME 2.0 allows fold recognition, template selection, structural alignment editing, structure comparisons, 3D-model building and evaluation. These tasks are routinely used in sequence analyses for structure prediction. In our pipeline the necessary software is efficiently interconnected in an original manner to accelerate all the processes. Furthermore, we have also connected comparative docking of small ligands that is performed using protein-protein superposition. The input is a simple protein sequence in one-letter code with no comment. The resulting 3D model, protein-ligand complexes and structural alignments can be visualized through dedicated Web interfaces or can be downloaded for further studies. These original features will aid in the functional annotation of proteins and the selection of templates for molecular modeling and virtual screening. Several examples are described to highlight some of the new functionalities provided by this pipeline. The server and its documentation are freely available at http://abcis.cbs.cnrs.fr/AT2/
Structural Disorder Provides Increased Adaptability for Vesicle Trafficking Pathways
Tompa, Peter
2013-01-01
Vesicle trafficking systems play essential roles in the communication between the organelles of eukaryotic cells and also between cells and their environment. Endocytosis and the late secretory route are mediated by clathrin-coated vesicles, while the COat Protein I and II (COPI and COPII) routes stand for the bidirectional traffic between the ER and the Golgi apparatus. Despite similar fundamental organizations, the molecular machinery, functions, and evolutionary characteristics of the three systems are very different. In this work, we compiled the basic functional protein groups of the three main routes for human and yeast and analyzed them from the structural disorder perspective. We found similar overall disorder content in yeast and human proteins, confirming the well-conserved nature of these systems. Most functional groups contain highly disordered proteins, supporting the general importance of structural disorder in these routes, although some of them seem to heavily rely on disorder, while others do not. Interestingly, the clathrin system is significantly more disordered (∼23%) than the other two, COPI (∼9%) and COPII (∼8%). We show that this structural phenomenon enhances the inherent plasticity and increased evolutionary adaptability of the clathrin system, which distinguishes it from the other two routes. Since multi-functionality (moonlighting) is indicative of both plasticity and adaptability, we studied its prevalence in vesicle trafficking proteins and correlated it with structural disorder. Clathrin adaptors have the highest capability for moonlighting while also comprising the most highly disordered members. The ability to acquire tissue specific functions was also used to approach adaptability: clathrin route genes have the most tissue specific exons encoding for protein segments enriched in structural disorder and interaction sites. 
Overall, our results confirm the general importance of structural disorder in vesicle trafficking and suggest major roles for this structural property in shaping the differences of evolutionary adaptability in the three routes. PMID:23874186
A General, Adaptive, Roadmap-Based Algorithm for Protein Motion Computation.
Molloy, Kevin; Shehu, Amarda
2016-03-01
Precious information on protein function can be extracted from a detailed characterization of protein equilibrium dynamics. This remains elusive in wet and dry laboratories, as function-modulating transitions of a protein between functionally-relevant, thermodynamically-stable and meta-stable structural states often span disparate time scales. In this paper we propose a novel, robotics-inspired algorithm that circumvents time-scale challenges by drawing analogies between protein motion and robot motion. The algorithm adapts the popular roadmap-based framework in robot motion computation to handle the more complex protein conformation space and its underlying rugged energy surface. Given known structures representing stable and meta-stable states of a protein, the algorithm yields a time- and energy-prioritized list of transition paths between the structures, with each path represented as a series of conformations. The algorithm balances computational resources between a global search aimed at obtaining a global view of the network of protein conformations and their connectivity and a detailed local search focused on realizing such connections with physically-realistic models. Promising results are presented on a variety of proteins that demonstrate the general utility of the algorithm and its capability to improve the state of the art without employing system-specific insight.
Araujo, Gabriela C; Silva, Ricardo H T; Scott, Luis P B; Araujo, Alexandre S; Souza, Fatima P; de Oliveira, Ronaldo Junio
2016-12-01
The human respiratory syncytial virus (hRSV) is the major cause of lower respiratory tract infection in children and elderly people worldwide. Its genome encodes 11 proteins, including the SH protein, whose functions are not well known. Studies show that the SH protein increases RSV virulence and permeability to small compounds, suggesting it is involved in the formation of ion channels. Knowledge of SH structure and function is fundamental for a better understanding of its infection mechanism. The aim of this study was to model, characterize, and analyze the structural behavior of the SH protein in the phospholipid bilayer environment. Molecular modeling of the SH pentameric structure was performed, followed by traditional molecular dynamics (MD) simulations of the protein immersed in the lipid bilayer. Molecular dynamics with excited normal modes (MDeNM) was applied to the resulting system in order to investigate long-time-scale pore dynamics. MD simulations support that the SH protein is stable in its pentameric form. Simulations also showed the presence of water molecules within the bilayer by density distribution, thus confirming that the SH protein is a viroporin. This water transport was also observed in MDeNM studies, with histidine residues of the five chains (His22 and His51) playing a key role in pore permeability. The combination of traditional MD and MDeNM was a very efficient protocol to investigate functional conformational changes of transmembrane proteins that act as molecular channels. This protocol can support future investigations of drug candidates that act on the SH protein to inhibit viral infection. Graphical Abstract: The ion channel of the human respiratory syncytial virus (hRSV) small hydrophobic protein (SH) transmembrane domain.
2014-01-01
Background Bacteroides spp. form a significant part of our gut microbiome and are well known for optimized metabolism of diverse polysaccharides. Initial analysis of the archetypal Bacteroides thetaiotaomicron genome identified 172 glycosyl hydrolases and a large number of uncharacterized proteins associated with polysaccharide metabolism. Results BT_1012 from Bacteroides thetaiotaomicron VPI-5482 is a protein of unknown function and a member of a large protein family consisting entirely of uncharacterized proteins. Initial sequence analysis predicted that this protein has two domains, one N-terminal and one C-terminal. A PSI-BLAST search found over 150 full-length homologs and over 90 half-size homologs consisting only of the N-terminal domain. The experimentally determined three-dimensional structure of the BT_1012 protein confirms its two-domain architecture, and structural analysis of both domains suggests their specific functions. The N-terminal domain is a putative catalytic domain with significant similarity to known glycoside hydrolases. The C-terminal domain has a beta-sandwich fold typically found in C-terminal domains of other glycosyl hydrolases; these domains, however, are typically involved in substrate binding. We describe the structure of the BT_1012 protein and discuss its sequence-structure relationship and its possible functional implications. Conclusions Structural and sequence analyses of the BT_1012 protein identify it as a glycosyl hydrolase, expanding an already impressive catalog of enzymes involved in polysaccharide metabolism in Bacteroides spp. Based on this, we have renamed the Pfam families representing the two domains found in the BT_1012 protein, PF13204 and PF12904, as putative glycoside hydrolase and glycoside hydrolase-associated C-terminal domain, respectively. PMID:24742328
Recent developments in structural proteomics for protein structure determination.
Liu, Hsuan-Liang; Hsu, Jyh-Ping
2005-05-01
The major challenges in structural proteomics include identifying all the proteins on the genome-wide scale, determining their structure-function relationships, and outlining the precise three-dimensional structures of the proteins. Protein structures are typically determined by experimental approaches such as X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. However, the three-dimensional structural knowledge obtained by these techniques is still limited. Thus, computational methods such as comparative and de novo approaches and molecular dynamics simulations are intensively used as alternative tools to predict the three-dimensional structures and dynamic behavior of proteins. This review summarizes recent developments in structural proteomics for protein structure determination, including instrumental methods such as X-ray crystallography and NMR spectroscopy, and computational methods such as comparative and de novo structure prediction and molecular dynamics simulations.
Hsing, Michael; Cherkasov, Artem
2008-06-25
Insertions and deletions (indels) represent a common type of sequence variations, which are less studied and pose many important biological questions. Recent research has shown that the presence of sizable indels in protein sequences may be indicative of protein essentiality and their role in protein interaction networks. Examples of utilization of indels for structure-based drug design have also been recently demonstrated. Nonetheless many structural and functional characteristics of indels remain less researched or unknown. We have created a web-based resource, Indel PDB, representing a structural database of insertions/deletions identified from the sequence alignments of highly similar proteins found in the Protein Data Bank (PDB). Indel PDB utilized large amounts of available structural information to characterize 1-, 2- and 3-dimensional features of indel sites. Indel PDB contains 117,266 non-redundant indel sites extracted from 11,294 indel-containing proteins. Unlike loop databases, Indel PDB features more indel sequences with secondary structures including alpha-helices and beta-sheets in addition to loops. The insertion fragments have been characterized by their sequences, lengths, locations, secondary structure composition, solvent accessibility, protein domain association and three dimensional structures. By utilizing the data available in Indel PDB, we have studied and presented here several sequence and structural features of indels. We anticipate that Indel PDB will not only enable future functional studies of indels, but will also assist protein modeling efforts and identification of indel-directed drug binding sites.
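As an illustration of how indel sites can be read off a pairwise alignment, the following minimal sketch scans two already-aligned sequences and reports each run of gaps. It is not the Indel PDB pipeline; the function name `find_indels`, the gap character and the example sequences are assumptions made for this example.

```python
def find_indels(aligned_a, aligned_b, gap="-"):
    """Return (kind, start_column, length) for each gap run in a pairwise alignment.

    'insertion' means residues present in aligned_a but gapped in aligned_b;
    'deletion' means the reverse. Columns are 0-based alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    sites = []
    current = None  # (kind, start_column) of the gap run being extended
    for col, (a, b) in enumerate(zip(aligned_a, aligned_b)):
        if b == gap and a != gap:
            kind = "insertion"
        elif a == gap and b != gap:
            kind = "deletion"
        else:
            kind = None
        # Close the open run when the column type changes.
        if current is not None and current[0] != kind:
            sites.append((current[0], current[1], col - current[1]))
            current = None
        # Open a new run if this column is a gap column.
        if kind is not None and current is None:
            current = (kind, col)
    if current is not None:
        sites.append((current[0], current[1], len(aligned_a) - current[1]))
    return sites


# Hypothetical aligned fragments, for illustration only.
print(find_indels("MKT-AYIAK", "MKTLA--AK"))
```

Mapping each reported column range back to residue numbers in a structure file would be the next step before characterizing secondary structure or solvent accessibility at the site.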
Membrane Protein Structure, Function, and Dynamics: a Perspective from Experiments and Theory
Cournia, Zoe; Allen, Toby W.; Andricioaei, Ioan; ...
2015-06-11
Membrane proteins mediate processes that are fundamental for the flourishing of biological cells. Membrane-embedded transporters move ions and larger solutes across membranes; receptors mediate communication between the cell and its environment; and membrane-embedded enzymes catalyze chemical reactions. Understanding these mechanisms of action requires knowledge of how the proteins couple to their fluid, hydrated lipid membrane environment. Here, we present current studies in computational and experimental membrane protein biophysics, and show how they address outstanding challenges in understanding the complex environmental effects on the structure, function, and dynamics of membrane proteins.
Algorithm, applications and evaluation for protein comparison by Ramanujan Fourier transform.
Zhao, Jian; Wang, Jiasong; Hua, Wei; Ouyang, Pingkai
2015-12-01
The amino acid sequence of a protein determines its chemical properties, chain conformation and biological functions. Protein sequence comparison is of great importance to identify similarities of protein structures and infer their functions. Many properties of a protein correspond to the low-frequency signals within the sequence. Low frequency modes in protein sequences are linked to the secondary structures, membrane protein types, and sub-cellular localizations of the proteins. In this paper, we present Ramanujan Fourier transform (RFT) with a fast algorithm to analyze the low-frequency signals of protein sequences. The RFT method is applied to similarity analysis of protein sequences with the Resonant Recognition Model (RRM). The results show that the proposed fast RFT method on protein comparison is more efficient than commonly used discrete Fourier transform (DFT). RFT can detect common frequencies as significant feature for specific protein families, and the RFT spectrum heat-map of protein sequences demonstrates the information conservation in the sequence comparison. The proposed method offers a new tool for pattern recognition, feature extraction and structural analysis on protein sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
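For readers unfamiliar with the transform, the sketch below computes Ramanujan sums c_q(n) from the standard divisor formula and uses them to estimate Ramanujan-Fourier coefficients of a finite numeric signal. It is a plain, non-optimized illustration, not the fast algorithm of the paper; the finite-length averaging in `rft_coefficient` is an assumption, and the mapping from amino acids to numbers (for example via the Resonant Recognition Model) is deliberately left out.

```python
from math import gcd


def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    if n == 1:
        return 1
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def ramanujan_sum(q, n):
    """c_q(n) = sum over divisors d of gcd(n, q) of mu(q / d) * d."""
    g = gcd(n, q)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)


def euler_phi(q):
    """Euler's totient: count of 1 <= k <= q coprime to q."""
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)


def rft_coefficient(signal, q):
    """Finite-length estimate of the q-th Ramanujan-Fourier coefficient.

    The signal is indexed from n = 1, following the usual convention.
    """
    total = sum(x * ramanujan_sum(q, n) for n, x in enumerate(signal, start=1))
    return total / (euler_phi(q) * len(signal))


# A signal that is itself the period-3 Ramanujan sum should give coefficient 1
# at q = 3 and 0 at q = 2 when the length is a multiple of both periods.
signal = [ramanujan_sum(3, n) for n in range(1, 13)]
print(rft_coefficient(signal, 3), rft_coefficient(signal, 2))
```

Because the Ramanujan sums are integer-valued, the projection needs only integer multiplications per sample, which is one commonly cited reason for preferring this basis over the complex exponentials of the DFT.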
2013-01-01
Background SNPs&GO is a method for the prediction of deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM) and for a given protein, its input comprises: the sequence and/or its three-dimensional structure (when available), a set of target variations and its functional Gene Ontology (GO) terms. The output of the server provides, for each protein variation, the probabilities to be associated to human diseases. Results The server consists of two main components, including updated versions of the sequence-based SNPs&GO (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d programs. Sequence and structure based algorithms are extensively tested on a large set of annotated variations extracted from the SwissVar database. Selecting a balanced dataset with more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, 0.61 correlation coefficient and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. For the subset of ~6,600 variations mapped on protein structures available at the Protein Data Bank (PDB), the structure-based method scores with 84% overall accuracy, 0.68 correlation coefficient, and 0.91 AUC. When tested on a new blind set of variations, the results of the server are 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively. Conclusions WS-SNPs&GO is a valuable tool that includes in a unique framework information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go. PMID:23819482
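The performance figures quoted above (overall accuracy and correlation coefficient) can be computed from a binary confusion matrix. The sketch below assumes that the "correlation coefficient" is the Matthews correlation coefficient, which is the usual choice for this kind of predictor but is not stated explicitly in the abstract; the counts in the example are invented for illustration and are not the paper's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical balanced test set: 50 disease-related and 50 neutral variants.
tp, tn, fp, fn = 40, 45, 5, 10
print(round(accuracy(tp, tn, fp, fn), 2), round(matthews_cc(tp, tn, fp, fn), 2))
```

Reporting both numbers matters on unbalanced sets, where accuracy alone can look high for a predictor that always outputs the majority class while the Matthews coefficient stays near zero.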
Prokaryotic cytoskeletons: protein filaments organizing small cells.
Wagstaff, James; Löwe, Jan
2018-04-01
Most, if not all, bacterial and archaeal cells contain at least one protein filament system. Although these filament systems in some cases form structures that are very similar to eukaryotic cytoskeletons, the term 'prokaryotic cytoskeletons' is used to refer to many different kinds of protein filaments. Cytoskeletons achieve their functions through polymerization of protein monomers and the resulting ability to access length scales larger than the size of the monomer. Prokaryotic cytoskeletons are involved in many fundamental aspects of prokaryotic cell biology and have important roles in cell shape determination, cell division and nonchromosomal DNA segregation. Some of the filament-forming proteins have been classified into a small number of conserved protein families, for example, the almost ubiquitous tubulin and actin superfamilies. To understand what makes filaments special and how the cytoskeletons they form enable cells to perform essential functions, the structure and function of cytoskeletal molecules and their filaments have been investigated in diverse bacteria and archaea. In this Review, we bring these data together to highlight the diverse ways that linear protein polymers can be used to organize other molecules and structures in bacteria and archaea.
Structure and dynamics of Ebola virus matrix protein VP40 by a coarse-grained Monte Carlo simulation
NASA Astrophysics Data System (ADS)
Pandey, Ras; Farmer, Barry
Ebola virus matrix protein VP40 (consisting of 326 residues) plays a critical role in viral assembly and its functions such as regulation of viral transcription, packaging, and budding of mature virions into the plasma membrane of infected cells. How the protein VP40 goes through structural evolution during the viral life cycle remains an open question. Using a coarse-grained Monte Carlo simulation, we investigate the structural evolution of VP40 as a function of temperature with the input of a knowledge-based residue-residue interaction. A number of local and global physical quantities (e.g. mobility profile, contact map, radius of gyration, structure factor) are analyzed with our large-scale simulations. Our preliminary data show that the structure of the protein evolves through different states with well-defined morphologies, which can be identified and quantified via a detailed analysis of the structure factor.
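Of the global quantities listed in this abstract, the radius of gyration is the simplest to state. The sketch below computes it for a coarse-grained chain in which every residue is one bead of equal mass; the equal-mass assumption and the function name `radius_of_gyration` are choices made for this example, not details taken from the study.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of beads from their centroid (equal masses)."""
    n = len(coords)
    if n == 0:
        raise ValueError("need at least one coordinate")
    center = [sum(c[i] for c in coords) / n for i in range(3)]
    mean_sq = sum(
        sum((c[i] - center[i]) ** 2 for i in range(3)) for c in coords
    ) / n
    return math.sqrt(mean_sq)


# Two beads 2 units apart: each is 1 unit from the centroid.
print(radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```

Tracking this value over Monte Carlo steps at each temperature is a common way to distinguish compact from expanded states of a simulated chain.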
Correlation between protein secondary structure, backbone bond angles, and side-chain orientations.
Lundgren, Martin; Niemi, Antti J
2012-08-01
We investigate the fine structure of the sp3 hybridized covalent bond geometry that governs the tetrahedral architecture around the central C(α) carbon of a protein backbone, and for this we develop new visualization techniques to analyze high-resolution x-ray structures in the Protein Data Bank. We observe that there is a correlation between the deformations of the ideal tetrahedral symmetry and the local secondary structure of the protein. We propose a universal coarse-grained energy function to describe the ensuing side-chain geometry in terms of the C(β) carbon orientations. The energy function can model the side-chain geometry with a subatomic precision. As an example we construct the C(α)-C(β) structure of HP35 chicken villin headpiece. We obtain a configuration that deviates less than 0.4 Å in root-mean-square distance from the experimental x-ray structure.
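The root-mean-square distance quoted at the end of this abstract compares two sets of matched atomic positions. A minimal sketch of that measure follows; it assumes the two structures have already been superimposed and that atoms are listed in the same order, and it omits the optimal-superposition step (for example the Kabsch algorithm) that is normally applied first.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between matched, pre-aligned coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Hypothetical two-atom model shifted by 0.4 units along z.
model = [(0.0, 0.0, 0.4), (1.0, 0.0, 0.4)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(rmsd(model, reference))
```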
Conformational Sampling in Template-Free Protein Loop Structure Modeling: An Overview
Li, Yaohang
2013-01-01
Accurately modeling protein loops is an important step to predict three-dimensional structures as well as to understand functions of many proteins. Because of their high flexibility, modeling the three-dimensional structures of loops is difficult and is usually treated as a “mini protein folding problem” under geometric constraints. In the past decade, there has been remarkable progress in template-free loop structure modeling due to advances of computational methods as well as stably increasing number of known structures available in PDB. This mini review provides an overview on the recent computational approaches for loop structure modeling. In particular, we focus on the approaches of sampling loop conformation space, which is a critical step to obtain high resolution models in template-free methods. We review the potential energy functions for loop modeling, loop buildup mechanisms to satisfy geometric constraints, and loop conformation sampling algorithms. The recent loop modeling results are also summarized. PMID:24688696
The Functional Curli Amyloid Is Not Based on In-register Parallel β-Sheet Structure
Shewmaker, Frank; McGlinchey, Ryan P.; Thurber, Kent R.; McPhie, Peter; Dyda, Fred; Tycko, Robert; Wickner, Reed B.
2009-01-01
The extracellular curli proteins of Enterobacteriaceae form fibrous structures that are involved in biofilm formation and adhesion to host cells. These curli fibrils are considered a functional amyloid because they are not a consequence of misfolding, but they have many of the properties of protein amyloid. We confirm that fibrils formed by CsgA and CsgB, the primary curli proteins of Escherichia coli, possess many of the hallmarks typical of amyloid. Moreover we demonstrate that curli fibrils possess the cross-β structure that distinguishes protein amyloid. However, solid state NMR experiments indicate that curli structure is not based on an in-register parallel β-sheet architecture, which is common to many human disease-associated amyloids and the yeast prion amyloids. Solid state NMR and electron microscopy data are consistent with a β-helix-like structure but are not sufficient to establish such a structure definitively. PMID:19574225
Protein engineering and its applications in food industry.
Kapoor, Swati; Rafiq, Aasima; Sharma, Savita
2017-07-24
Protein engineering is a young discipline that has branched out from the field of genetic engineering. It draws on available knowledge of protein structure and function, tools and instruments, software, bioinformatics databases, cloned genes, vectors, recombinant strains and other materials that can lead to a change in the protein backbone. A protein produced properly by a genetic engineering process is one that folds correctly and performs its particular function(s) efficiently even after being subjected to engineering practices. A protein can be modified through its gene or chemically; modification through the gene is easier. Protein engineering is not limited to specific tools: any technique that changes the amino acid composition of a protein and thereby modifies its structure or function falls within its scope. Nevertheless, some common tools are used to reach specific targets. More active industrial and pharmaceutical proteins have been developed through protein engineering, both to introduce new functions and to change how a protein interacts with its surrounding environment. A variety of protein engineering applications have been reported in the literature, ranging from biocatalysis for food and industry to environmental, medical and nanobiotechnology applications. Successful combinations of various protein engineering methods have led to successful results in the food industry and have created scope to maintain the quality of the finished product after processing.
Integrating protein structural dynamics and evolutionary analysis with Bio3D.
Skjærven, Lars; Yao, Xin-Qiu; Scarabelli, Guido; Grant, Barry J
2014-12-10
Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution. Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case. The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform independent R package under a GPL2 license from http://thegrantlab.org/bio3d/ .
Rivera-Najera, Lucero Y.; Saab-Rincón, Gloria; Battaglia, Marina; Amero, Carlos; Pulido, Nancy O.; García-Hernández, Enrique; Solórzano, Rosa M.; Reyes, José L.; Covarrubias, Alejandra A.
2014-01-01
Late embryogenesis-abundant proteins accumulate to high levels in dry seeds. Some of them also accumulate in response to water deficit in vegetative tissues, which leads to a remarkable association between their presence and low water availability conditions. A major sub-group of these proteins, also known as typical LEA proteins, shows high hydrophilicity and a high percentage of glycine and other small amino acid residues, distinctive physicochemical properties that predict a high content of structural disorder. Although all typical LEA proteins share these characteristics, seven groups can be distinguished by sequence similarity, indicating structural and functional diversity among them. Some of these groups have been extensively studied; however, others require a more detailed analysis to advance in their functional understanding. In this work, we report the structural characterization of a group 6 LEA protein from a common bean (Phaseolus vulgaris L.) (PvLEA6) by circular dichroism and nuclear magnetic resonance showing that it is a disordered protein in aqueous solution. Using the same techniques, we show that despite its unstructured nature, the addition of trifluoroethanol exhibited an intrinsic potential in this protein to gain helicity. This property was also promoted by high osmotic potentials or molecular crowding. Furthermore, we demonstrate that PvLEA6 protein is able to form soluble homo-oligomeric complexes that also show high levels of structural disorder. The association between PvLEA6 monomers to form dimers was shown to occur in plant cells by bimolecular fluorescence complementation, pointing to the in vivo functional relevance of this association. PMID:25271167
Aamir, Mohd; Singh, Vinay K.; Meena, Mukesh; Upadhyay, Ram S.; Gupta, Vijai K.; Singh, Surendra
2017-01-01
The WRKY transcription factors (TFs) play a crucial role in plant defense responses against various abiotic and biotic stresses. The role of the WRKY3 and WRKY4 genes in plant defense response against necrotrophic pathogens is well reported. However, their functional annotation in tomato is largely unknown. In the present work, we have characterized the structural and functional attributes of two identified tomato WRKY transcription factors, WRKY3 (SlWRKY3) and WRKY4 (SlWRKY4), using computational approaches. Arabidopsis WRKY3 (AtWRKY3: NP_178433) and WRKY4 (AtWRKY4: NP_172849) protein sequences were retrieved from the TAIR database, and protein BLAST was used to find their sequence homologs in tomato. Sequence alignment, phylogenetic classification, and motif composition analysis revealed remarkable sequence variation between these two WRKYs. Tomato WRKY3 and WRKY4 cluster with Solanum pennellii, showing a monophyletic origin and evolution from their wild homolog. The functional domain regions responsible for sequence-specific DNA binding in both proteins were modeled [using AtWRKY4 (PDB ID: 1WJ2) and AtWRKY1 (PDB ID: 2AYD) as template protein structures] through homology modeling using Discovery Studio 3.0. The generated models were further evaluated for their accuracy and reliability based on qualitative and quantitative parameters. The modeled proteins were found to satisfy all the crucial energy parameters and showed acceptable Ramachandran statistics when compared to the experimentally resolved NMR solution structures and/or X-ray crystal structures (templates). The superimposition of the functional WRKY domains from SlWRKY3 and SlWRKY4 revealed remarkable structural similarity. The sequence-specific DNA binding of the two WRKYs was explored through DNA-protein interaction using the Hex Docking server.
The interaction studies found that SlWRKY4 binds the W-box DNA through the WRKYGQK motif together with Tyr408, Arg409, and Lys419, with the initial flanking sequences also involved in binding. In contrast, SlWRKY3 interacted through RKYGQK along with residues from the zinc finger motifs. Protein-protein interaction studies were done using STRING version 10.0 to explore all the possible protein partners involved in associative functional interaction networks. The Gene Ontology enrichment analysis revealed the functional dimension and characterized the identified WRKYs based on their functional annotation. PMID:28611792
Suzuki, K; Kirisako, T; Kamada, Y; Mizushima, N; Noda, T; Ohsumi, Y
2001-11-01
Macroautophagy is a bulk degradation process induced by starvation in eukaryotic cells. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Several key reactions performed by these proteins have been described, but a comprehensive understanding of the overall network is still lacking. Based on Apg protein localization, we have identified a novel structure that functions in autophagosome formation. This pre-autophagosomal structure, containing at least five Apg proteins, i.e. Apg1p, Apg2p, Apg5p, Aut7p/Apg8p and Apg16p, is localized in the vicinity of the vacuole. Analysis of apg mutants revealed that the formation of both a phosphatidylethanolamine-conjugated Aut7p and an Apg12p-Apg5p conjugate is essential for the localization of Aut7p to the pre-autophagosomal structure. Vps30p/Apg6p and Apg14p, components of an autophagy-specific phosphatidylinositol 3-kinase complex, Apg9p and Apg16p are all required for the localization of Apg5p and Aut7p to the structure. The Apg1p protein kinase complex functions in the late stage of autophagosome formation. Here, we present the classification of Apg proteins into three groups that reflect each step of autophagosome formation. PMID:11689437
La Verde, Valentina; Dominici, Paola; Astegno, Alessandra
2018-04-30
Ca2+ ions play a key role in a wide variety of environmental responses and developmental processes in plants, and several protein families with Ca2+-binding domains have evolved to meet these needs, including calmodulin (CaM) and calmodulin-like proteins (CMLs). These proteins have no catalytic activity, but rather act as sensor relays that regulate downstream targets. While CaM is well studied, CMLs remain poorly characterized at both the structural and functional levels, even though they are the largest class of Ca2+ sensors in plants. The major structural theme in CMLs consists of EF-hands, and variations in these domains are predicted to significantly contribute to the functional versatility of CMLs. Herein, we focus on recent advances in understanding the features of CMLs from biochemical and structural points of view. The analysis of the metal binding and structural properties of CMLs can provide valuable insight into how such a vast array of CML proteins can coexist, with no apparent functional redundancy, and how these proteins contribute to cellular signaling while maintaining properties that are distinct from CaM and other Ca2+ sensors. An overview of the principal techniques used to study the biochemical properties of these interesting Ca2+ sensors is also presented.
Diverse Supramolecular Nanofiber Networks Assembled by Functional Low-Complexity Domains.
An, Bolin; Wang, Xinyu; Cui, Mengkui; Gui, Xinrui; Mao, Xiuhai; Liu, Yan; Li, Ke; Chu, Cenfeng; Pu, Jiahua; Ren, Susu; Wang, Yanyi; Zhong, Guisheng; Lu, Timothy K; Liu, Cong; Zhong, Chao
2017-07-25
Self-assembling supramolecular nanofibers, common in the natural world, are of fundamental interest and technical importance to both nanotechnology and materials science. Despite important advances, synthetic nanofibers still lack the structural and functional diversity of biological molecules, and the controlled assembly of one type of molecule into a variety of fibrous structures with wide-ranging functional attributes remains challenging. Here, we harness the low-complexity (LC) sequence domain of fused in sarcoma (FUS) protein, an essential cellular nuclear protein with slow kinetics of amyloid fiber assembly, to construct random copolymer-like, multiblock, and self-sorted supramolecular fibrous networks with distinct structural features and fluorescent functionalities. We demonstrate the utilities of these networks in the templated, spatially controlled assembly of ligand-decorated gold nanoparticles, quantum dots, nanorods, DNA origami, and hybrid structures. Owing to the distinguishable nanoarchitectures of these nanofibers, this assembly is structure-dependent. By coupling a modular genetic strategy with kinetically controlled complex supramolecular self-assembly, we demonstrate that a single type of protein molecule can be used to engineer diverse one-dimensional supramolecular nanostructures with distinct functionalities.
Protein homology model refinement by large-scale energy optimization.
Park, Hahnbeom; Ovchinnikov, Sergey; Kim, David E; DiMaio, Frank; Baker, David
2018-03-20
Proteins fold to their lowest free-energy structures, and hence the most straightforward way to increase the accuracy of a partially incorrect protein structure model is to search for the lowest-energy nearby structure. This direct approach has met with little success for two reasons: first, energy function inaccuracies can lead to false energy minima, resulting in model degradation rather than improvement; and second, even with an accurate energy function, the search problem is formidable because the energy only drops considerably in the immediate vicinity of the global minimum, and there are a very large number of degrees of freedom. Here we describe a large-scale energy optimization-based refinement method that incorporates advances in both search and energy function accuracy that can substantially improve the accuracy of low-resolution homology models. The method refined low-resolution homology models into correct folds for 50 of 84 diverse protein families and generated improved models in recent blind structure prediction experiments. Analyses of the basis for these improvements reveal contributions from both the improvements in conformational sampling techniques and the energy function.
Recent advances in MeCP2 structure and function
Hite, Kristopher C.; Adams, Valerie H.; Hansen, Jeffrey C.
2010-01-01
Mutations in methyl DNA binding protein 2 (MeCP2) cause the neurodevelopmental disorder Rett syndrome (RTT). The mechanism(s) by which the native MeCP2 protein operates in the cell are not well understood. Historically, MeCP2 has been characterized as a proximal gene silencer with 2 functional domains: a methyl DNA binding domain and a transcription repression domain. However, several lines of new data indicate that MeCP2 structure and function relationships are more complex. In this review, we first discuss recent studies that have advanced understanding of the basic structural biochemistry of MeCP2. This is followed by an analysis of cell-based experiments suggesting MeCP2 is a regulator, rather than a strict silencer, of transcription. The new data establish MeCP2 as a multifunctional nuclear protein, with potentially important roles in chromatin architecture, regulation of RNA splicing, and active transcription. We conclude by discussing clinical correlations between domain-specific mutations and RTT pathology to stress that all structural domains of MeCP2 are required to properly mediate cellular function of the intact protein. PMID:19234536
Less is More: Membrane Protein Digestion Beyond Urea-Trypsin Solution for Next-level Proteomics.
Zhang, Xi
2015-09-01
The goal of next-level bottom-up membrane proteomics is protein function investigation, via high-coverage high-throughput peptide-centric quantitation of expression, modifications and dynamic structures at systems scale. Yet efficient digestion of mammalian membrane proteins presents a daunting barrier, and prevalent day-long urea-trypsin in-solution digestion proved insufficient to reach this goal. Many efforts contributed incremental advances over past years, but involved protein denaturation that disconnected measurement from functional states. Beyond denaturation, the recent discovery of structure/proteomics omni-compatible detergent n-dodecyl-β-d-maltopyranoside, combined with pepsin and PNGase F columns, enabled breakthroughs in membrane protein digestion: a 2010 DDM-low-TCEP (DLT) method for H/D-exchange (HDX) using human G protein-coupled receptor, and a 2015 flow/detergent-facilitated protease and de-PTM digestions (FDD) for integrative deep sequencing and quantitation using full-length human ion channel complex. Distinguishing protein solubilization from denaturation, protease digestion reliability from theoretical specificity, and reduction from alkylation, these methods shifted day(s)-long paradigms into minutes, and afforded fully automatable (HDX)-protein-peptide-(tandem mass tag)-HPLC pipelines to instantly measure functional proteins at deep coverage, high peptide reproducibility, low artifacts and minimal leakage. Promoting, not destroying, structures and activities harnessed membrane proteins for the next-level streamlined functional proteomics. This review analyzes recent advances in membrane protein digestion methods and highlights critical discoveries for future proteomics. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
The TIM Barrel Architecture Facilitated the Early Evolution of Protein-Mediated Metabolism.
Goldman, Aaron David; Beatty, Joshua T; Landweber, Laura F
2016-01-01
The triosephosphate isomerase (TIM) barrel protein fold is a structurally repetitive architecture that is present in approximately 10% of all enzymes. It is generally assumed that this ubiquity in modern proteomes reflects an essential historical role in early protein-mediated metabolism. Here, we provide quantitative and comparative analyses to support several hypotheses about the early importance of the TIM barrel architecture. An information theoretical analysis of protein structures supports the hypothesis that the TIM barrel architecture could arise more easily by duplication and recombination compared to other mixed α/β structures. We show that TIM barrel enzymes corresponding to the most taxonomically broad superfamilies also have the broadest range of functions, often aided by metal and nucleotide-derived cofactors that are thought to reflect an earlier stage of metabolic evolution. By comparison to other putatively ancient protein architectures, we find that the functional diversity of TIM barrel proteins cannot be explained simply by their antiquity. Instead, the breadth of TIM barrel functions can be explained, in part, by the incorporation of a broad range of cofactors, a trend that does not appear to be shared by proteins in general. These results support the hypothesis that the simple and functionally general TIM barrel architecture may have arisen early in the evolution of protein biosynthesis and provided an ideal scaffold to facilitate the metabolic transition from ribozymes, peptides, and geochemical catalysts to modern protein enzymes.
New frontiers: discovering cilia-independent functions of cilia proteins.
Vertii, Anastassiia; Bright, Alison; Delaval, Benedicte; Hehnly, Heidi; Doxsey, Stephen
2015-10-01
In most vertebrates, mitotic spindles and primary cilia arise from a common origin, the centrosome. In non-cycling cells, the centrosome is the template for primary cilia assembly and, thus, is crucial for their associated sensory and signaling functions. During mitosis, the duplicated centrosomes mature into spindle poles, which orchestrate mitotic spindle assembly, chromosome segregation, and orientation of the cell division axis. Intriguingly, both cilia and spindle poles are centrosome-based, functionally distinct structures that require the action of microtubule-mediated, motor-driven transport for their assembly. Cilia proteins have been found at non-cilia sites, where they have distinct functions, illustrating a diverse and growing list of cellular processes and structures that utilize cilia proteins for crucial functions. In this review, we discuss cilia-independent functions of cilia proteins and re-evaluate their potential contributions to "cilia" disorders. © 2015 The Authors.
Real-Time Ligand Binding Pocket Database Search Using Local Surface Descriptors
Chikhi, Rayan; Sael, Lee; Kihara, Daisuke
2010-01-01
Due to the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two-dimensional pseudo-Zernike moments or the 3D Zernike descriptors. These compact representations allow a fast real-time pocket searching against a database. Thorough benchmark studies employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed. PMID:20455259
Real-time ligand binding pocket database search using local surface descriptors.
Chikhi, Rayan; Sael, Lee; Kihara, Daisuke
2010-07-01
Because of the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two-dimensional pseudo-Zernike moments or the three-dimensional Zernike descriptors. These compact representations allow a fast real-time pocket searching against a database. Thorough benchmark studies employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed.
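The database search described in this abstract can be pictured as a nearest-neighbour lookup over fixed-length descriptor vectors. The following is a minimal sketch under stated assumptions, not the authors' implementation: each pocket is assumed to be already reduced to a numeric vector (standing in for pseudo-Zernike moments or 3D Zernike descriptors), plain Euclidean distance is an assumed similarity measure, and the pocket names and the function `nearest_pockets` are hypothetical.

```python
import heapq
from math import dist


def nearest_pockets(query, database, top_k=3):
    """Rank database pockets by Euclidean distance to the query descriptor.

    query    -- descriptor vector of the query pocket (sequence of floats)
    database -- dict mapping a pocket name to its descriptor vector
    Returns a list of (name, distance) tuples, closest first.
    """
    scored = ((name, dist(query, vec)) for name, vec in database.items())
    return heapq.nsmallest(top_k, scored, key=lambda item: item[1])


# Hypothetical three-component descriptors, for illustration only.
example_db = {
    "atp_site": [1.0, 0.0, 0.0],
    "heme_site": [0.0, 1.0, 0.0],
    "nad_site": [0.9, 0.1, 0.0],
}
hits = nearest_pockets([1.0, 0.0, 0.0], example_db, top_k=2)
```

Because every pocket is a short vector, a lookup of this kind costs one distance evaluation per database entry, which is what makes a real-time search plausible; the ligand of the closest hits would then be proposed for the query pocket.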
Structured crowding and its effects on enzyme catalysis.
Ma, Buyong; Nussinov, Ruth
2013-01-01
Macromolecular crowding decreases the diffusion rate, shifts the equilibrium of protein-protein and protein-substrate interactions, and changes protein conformational dynamics. Collectively, these effects contribute to enzyme catalysis. Here we describe how crowding may bias the conformational change and dynamics of enzyme populations and in this way affect catalysis. Crowding effects have been studied using artificial crowding agents and in vivo-like environments. These studies revealed a correlation between protein dynamics and function in the crowded environment. We suggest that crowded environments be classified into uniform crowding and structured crowding. Uniform crowding represents random crowding conditions created by synthetic particles with a narrow size distribution. Structured crowding refers to the highly coordinated cellular environment, where proteins and other macromolecules are clustered and organized. In structured crowded environments the perturbation of protein thermal stability may be lower; however, it may still be able to modulate functions effectively and dynamically. Dynamic, allosteric enzymes could be more sensitive to cellular perturbations if their free energy landscape is flatter around the native state; on the other hand, if their free energy landscape is rougher, with high kinetic barriers separating deep minima, they could be more robust. Above all, cells are structured; and this holds both for the cytosol and for the membrane environment. The crowded environment is organized, which limits the search, and the crowders are not necessarily inert. More likely, they too transmit allosteric effects, and as such play important functional roles. Overall, structured cellular crowding may lead to higher enzyme efficiency and specificity.
Zhang, Xiaoxiao; Farah, Nadya; Rolston, Laura; Ericsson, Daniel J; Catanzariti, Ann-Maree; Bernoux, Maud; Ve, Thomas; Bendak, Katerina; Chen, Chunhong; Mackay, Joel P; Lawrence, Gregory J; Hardham, Adrienne; Ellis, Jeffrey G; Williams, Simon J; Dodds, Peter N; Jones, David A; Kobe, Bostjan
2018-05-01
The effector protein AvrP is secreted by the flax rust fungal pathogen (Melampsora lini) and recognized specifically by the flax (Linum usitatissimum) P disease resistance protein, leading to effector-triggered immunity. To investigate the biological function of this effector and the mechanisms of specific recognition by the P resistance protein, we determined the crystal structure of AvrP. The structure reveals an elongated zinc-finger-like structure with a novel interleaved zinc-binding topology. The residues responsible for zinc binding are conserved in AvrP effector variants and mutations of these motifs result in a loss of P-mediated recognition. The first zinc-coordinating region of the structure displays a positively charged surface and shows some limited similarities to nucleic acid-binding and chromatin-associated proteins. We show that the majority of the AvrP protein accumulates in the plant nucleus when transiently expressed in Nicotiana benthamiana cells, suggesting a nuclear pathogenic function. Polymorphic residues in AvrP and its allelic variants map to the protein surface and could be associated with differences in recognition specificity. Several point mutations of residues on the non-conserved surface patch result in a loss of recognition by P, suggesting that these residues are required for recognition. © 2017 BSPP AND JOHN WILEY & SONS LTD.
Serrano, Pedro; Dutta, Samit K; Proudfoot, Andrew; Mohanty, Biswaranjan; Susac, Lukas; Martin, Bryan; Geralt, Michael; Jaroszewski, Lukasz; Godzik, Adam; Elsliger, Marc; Wilson, Ian A; Wüthrich, Kurt
2016-11-01
For more than a decade, the Joint Center for Structural Genomics (JCSG; www.jcsg.org) worked toward increased three-dimensional structure coverage of the protein universe. This coordinated quest was one of the main goals of the four high-throughput (HT) structure determination centers of the Protein Structure Initiative (PSI; www.nigms.nih.gov/Research/specificareas/PSI). To achieve the goals of the PSI, the JCSG made use of the complementarity of structure determination by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy to increase and diversify the range of targets entering the HT structure determination pipeline. The overall strategy, for both techniques, was to determine atomic resolution structures for representatives of large protein families, as defined by the Pfam database, which had no structural coverage and could make significant contributions to biological and biomedical research. Furthermore, the experimental structures could be leveraged by homology modeling to further expand the structural coverage of the protein universe and increase biological insights. Here, we describe what could be achieved by this structural genomics approach, using as an illustration the contributions from 20 NMR structure determinations out of a total of 98 JCSG NMR structures, which were selected because they are the first three-dimensional structure representations of the respective Pfam protein families. The information from this small sample is representative for the overall results from crystal and NMR structure determination in the JCSG. There are five new folds, which were classified as domains of unknown functions (DUF), three of the proteins could be functionally annotated based on three-dimensional structure similarity with previously characterized proteins, and 12 proteins showed only limited similarity with previous deposits in the Protein Data Bank (PDB) and were classified as DUFs. © 2016 Federation of European Biochemical Societies.
SInCRe—structural interactome computational resource for Mycobacterium tuberculosis
Metri, Rahul; Hariharaputran, Sridhar; Ramakrishnan, Gayatri; Anand, Praveen; Raghavender, Upadhyayula S.; Ochoa-Montaño, Bernardo; Higueruelo, Alicia P.; Sowdhamini, Ramanathan; Chandra, Nagasuma R.; Blundell, Tom L.; Srinivasan, Narayanaswamy
2015-01-01
We have developed an integrated database for Mycobacterium tuberculosis H37Rv (Mtb) that collates information on protein sequences, domain assignments, functional annotation and 3D structural information along with protein–protein and protein–small molecule interactions. SInCRe (Structural Interactome Computational Resource) was developed out of the CamBan (Cambridge and Bangalore) collaboration. The motivation for development of this database is to provide an integrated platform that allows easy access to, and interpretation of, data and results obtained by all the groups in CamBan in the field of Mtb informatics. In-house algorithms and databases developed independently by various academic groups in CamBan are used to generate Mtb-specific datasets and are integrated in this database to provide a structural dimension to studies on tuberculosis. The SInCRe database readily provides information on identification of functional domains, genome-scale modelling of structures of Mtb proteins and characterization of the small-molecule binding sites within Mtb. The resource also provides structure-based function annotation, information on small-molecule binders including FDA (Food and Drug Administration)-approved drugs, protein–protein interactions (PPIs) and natural compounds that potentially bind to pathogen proteins and result in the weakening or elimination of host–pathogen protein–protein interactions. Together they provide prerequisites for identification of off-target binding. Database URL: http://proline.biochem.iisc.ernet.in/sincre PMID:26130660
Kinesin and Dynein Mechanics: Measurement Methods and Research Applications.
Abraham, Zachary; Hawley, Emma; Hayosh, Daniel; Webster-Wood, Victoria A; Akkus, Ozan
2018-02-01
Motor proteins play critical roles in the normal function of cells and proper development of organisms. Among motor proteins, failings in the normal function of two types of proteins, kinesin and dynein, have been shown to lead to many pathologies, including neurodegenerative diseases and cancers. As such, it is critical for researchers to understand the underlying mechanics and behaviors of these proteins, not only to shed light on how failures may lead to disease, but also to guide research toward novel treatment and nano-engineering solutions. To this end, many experimental techniques have been developed to measure the force and motility capabilities of these proteins. This review (a) discusses such techniques, specifically microscopy, atomic force microscopy (AFM), optical trapping, and magnetic tweezers, and (b) compares the resulting nanomechanical properties of motor protein function, such as stalling force, velocity, and dependence on adenosine triphosphate (ATP) concentration. Additionally, this review will highlight the clinical importance of these proteins. Furthermore, as the understanding of the structure and function of motor proteins improves, novel applications are emerging in the field. Specifically, researchers have begun to modify the structure of existing proteins, thereby engineering novel elements to alter and improve native motor protein function, or even allow the motor proteins to perform entirely new tasks as parts of nanomachines. Kinesin and dynein are vital elements for the proper function of cells. While many exciting experiments have shed light on their function, mechanics, and applications, additional research is needed to completely understand their behavior.
Stochastic Protein Multimerization, Cooperativity and Fitness
NASA Astrophysics Data System (ADS)
Hagner, Kyle; Setayeshgar, Sima; Lynch, Michael
Many proteins assemble into multimeric structures that can vary greatly among phylogenetic lineages. As protein-protein interactions (PPI) require productive encounters among subunits, these structural variations are related in part to variation in cellular protein abundance. The protein abundance in turn depends on the intrinsic rates of production and decay of mRNA and protein molecules, as well as rates of cell growth and division. We present a stochastic model for prediction of the multimeric state of a protein as a function of these processes and the free energy associated with binding interfaces. We demonstrate favorable agreement between the model and a wide class of proteins using E. coli proteome data. As such, this platform, which links protein abundance, PPI and quaternary structure in growing and dividing cells can be extended to evolutionary models for the emergence and diversification of multimeric proteins. We investigate cooperativity - a ubiquitous functional property of multimeric proteins - as a possible selective force driving multimerization, demonstrating a reduction in the cost of protein production relative to the overall proteome energy budget that can be tied to fitness.
3D bioprinting of structural proteins.
Włodarczyk-Biegun, Małgorzata K; Del Campo, Aránzazu
2017-07-01
3D bioprinting is a booming method to obtain scaffolds of different materials with predesigned and customized morphologies and geometries. In this review we focus on the experimental strategies and recent achievements in the bioprinting of major structural proteins (collagen, silk, fibrin), as a particularly interesting technology to reconstruct the biochemical and biophysical composition and hierarchical morphology of natural scaffolds. The flexibility in molecular design offered by structural proteins, combined with the flexibility in mixing, deposition, and mechanical processing inherent to bioprinting technologies, enables the fabrication of highly functional scaffolds and tissue mimics with a degree of complexity and organization which has only just started to be explored. Here we describe the printing parameters and physical (mechanical) properties of bioinks based on structural proteins, including the biological function of the printed scaffolds. We describe applied printing techniques and cross-linking methods, highlighting the modifications implemented to improve scaffold properties. The used cell types, cell viability, and possible construct applications are also reported. We envision that the application of printing technologies to structural proteins will enable unprecedented control over their supramolecular organization, conferring printed scaffolds biological properties and functions close to natural systems. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dias, Raquel; Manny, Austin; Kolaczkowski, Oralia; Kolaczkowski, Bryan
2017-06-01
Reconstruction of ancestral protein sequences using phylogenetic methods is a powerful technique for directly examining the evolution of molecular function. Although ancestral sequence reconstruction (ASR) is itself very efficient, downstream functional and structural studies necessary to characterize when and how changes in molecular function occurred are often costly and time-consuming, currently limiting ASR studies to examining a relatively small number of discrete functional shifts. As a result, we have very little direct information about how molecular function evolves across large protein families. Here we develop an approach combining ASR with structure and function prediction to efficiently examine the evolution of ligand affinity across a large family of double-stranded RNA binding proteins (DRBs) spanning animals and plants. We find that the characteristic domain architecture of DRBs, consisting of 2-3 tandem double-stranded RNA binding motifs (dsrms), arose independently in early animal and plant lineages. The affinity with which individual dsrms bind double-stranded RNA appears to have increased and decreased often across both animal and plant phylogenies, primarily through convergent structural mechanisms involving RNA-contact residues within the β1-β2 loop and a small region of α2. These studies provide some of the first direct information about how protein function evolves across large gene families and suggest that changes in molecular function may occur often and unassociated with major phylogenetic events, such as gene or domain duplications. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
A Template-Based Protein Structure Reconstruction Method Using Deep Autoencoder Learning.
Li, Haiou; Lyu, Qiang; Cheng, Jianlin
2016-12-01
Protein structure prediction is an important problem in computational biology, and is widely applied to various biomedical problems such as protein function study, protein design, and drug design. In this work, we developed a novel deep learning approach based on a deeply stacked denoising autoencoder for protein structure reconstruction. We applied our approach to a template-based protein structure prediction using only the 3D structural coordinates of homologous template proteins as input. The templates were identified for a target protein by a PSI-BLAST search. 3DRobot (a program that automatically generates diverse and well-packed protein structure decoys) was used to generate initial decoy models for the target from the templates. A stacked denoising autoencoder was trained on the decoys to obtain a deep learning model for the target protein. The trained deep model was then used to reconstruct the final structural model for the target sequence. With target proteins that have highly similar template proteins as benchmarks, the GDT-TS score of the predicted structures is greater than 0.7, suggesting that the deep autoencoder is a promising method for protein structure reconstruction.
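The GDT-TS figure quoted in this abstract is conventionally the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the fraction of residues whose C-alpha atom lies within the cutoff of its position in the reference structure. The sketch below is a simplification under stated assumptions: the two coordinate sets are assumed to be already superimposed and in the same residue order, whereas the full measure searches over superpositions to maximise each fraction. The function name `gdt_ts` and the 0 to 1 scale (chosen to match the 0.7 threshold quoted above) are choices made for this illustration.

```python
from math import dist


def gdt_ts(model_ca, native_ca, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT-TS on a 0-1 scale for pre-superimposed C-alpha coordinates.

    model_ca, native_ca -- equal-length sequences of (x, y, z) tuples in angstroms
    """
    if len(model_ca) != len(native_ca) or len(native_ca) == 0:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Per-residue deviation between the model and the reference structure.
    deviations = [dist(m, n) for m, n in zip(model_ca, native_ca)]
    # Fraction of residues within each cutoff, then the mean over cutoffs.
    fractions = [
        sum(d <= cutoff for d in deviations) / len(deviations)
        for cutoff in cutoffs
    ]
    return sum(fractions) / len(fractions)
```

A model that places every residue within 1 Å of the reference scores 1.0 under this definition, and residues that deviate by more than 8 Å contribute nothing to any of the four fractions.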
Yu, Isseki; Mori, Takaharu; Ando, Tadashi; Harada, Ryuhei; Jung, Jaewoon; Sugita, Yuji; Feig, Michael
2016-11-01
Biological macromolecules function in highly crowded cellular environments. The structure and dynamics of proteins and nucleic acids are well characterized in vitro, but in vivo crowding effects remain unclear. Using molecular dynamics simulations of a comprehensive atomistic model cytoplasm we found that protein-protein interactions may destabilize native protein structures, whereas metabolite interactions may induce more compact states due to electrostatic screening. Protein-protein interactions also resulted in significant variations in reduced macromolecular diffusion under crowded conditions, while metabolites exhibited significant two-dimensional surface diffusion and altered protein-ligand binding that may reduce the effective concentration of metabolites and ligands in vivo. Metabolic enzymes showed weak non-specific association in cellular environments attributed to solvation and entropic effects. These effects are expected to have broad implications for the in vivo functioning of biomolecules. This work is a first step towards physically realistic in silico whole-cell models that connect molecular with cellular biology.
Langó, Tamás; Róna, Gergely; Hunyadi-Gulyás, Éva; Turiák, Lilla; Varga, Julia; Dobson, László; Várady, György; Drahos, László; Vértessy, Beáta G; Medzihradszky, Katalin F; Szakács, Gergely; Tusnády, Gábor E
2017-02-13
Transmembrane proteins play a crucial role in signaling, ion transport, nutrient uptake, as well as in maintaining the dynamic equilibrium between the internal and external environment of cells. Despite their important biological functions and abundance, less than 2% of all determined structures are transmembrane proteins. Given the persisting technical difficulties associated with high resolution structure determination of transmembrane proteins, additional methods, including computational and experimental techniques remain vital in promoting our understanding of their topologies, 3D structures, functions and interactions. Here we report a method for the high-throughput determination of extracellular segments of transmembrane proteins based on the identification of surface labeled and biotin captured peptide fragments by LC/MS/MS. We show that reliable identification of extracellular protein segments increases the accuracy and reliability of existing topology prediction algorithms. Using the experimental topology data as constraints, our improved prediction tool provides accurate and reliable topology models for hundreds of human transmembrane proteins.
NASA Astrophysics Data System (ADS)
Kutuzova, G. D.; Ugarova, N. N.; Berezin, Ilya V.
1984-11-01
The principal structural and physicochemical factors determining the stability of protein macromolecules in solution and the characteristics of the structure of the proteins from thermophilic microorganisms are examined. The mechanism of the changes in the thermal stability of proteins and enzymes after the chemical modification of their functional side groups and the experimental data concerning the influence of chemical modification on the thermal stability of proteins are analysed. The dependence of the stabilisation effect and of the changes in the structure of protein macromolecules on the degree of modification and on the nature of the modified groups and the groups introduced into proteins in the course of modification (their charge and hydrophobic properties) is demonstrated. The great practical value of the method of chemical modification for the preparation of stabilised forms of biocatalysts is shown in relation to specific examples. The bibliography includes 178 references.
Motomura, Kenta; Nakamura, Morikazu; Otaki, Joji M.
2013-01-01
Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs. PMID:24688703
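The availability score described in this abstract compares how often a short constituent sequence actually occurs with how often it would be expected by chance. The sketch below illustrates that idea under assumptions the abstract does not confirm: the score is taken as the plain ratio of observed to expected frequency, the expected frequency is taken as the product of the single-residue frequencies in the same data set, and the function name `availability_scores` is hypothetical. The exact formula used by the SCS Package may differ.

```python
from collections import Counter
from math import prod


def availability_scores(sequences, k=3):
    """Return {word: observed_frequency / expected_frequency} for length-k words.

    sequences -- iterable of amino acid strings (one-letter codes)
    k         -- word length (length of the short constituent sequence)
    """
    residue_counts = Counter()
    word_counts = Counter()
    for seq in sequences:
        residue_counts.update(seq)
        # Overlapping windows of length k; sequences shorter than k add nothing.
        word_counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))

    total_residues = sum(residue_counts.values())
    total_words = sum(word_counts.values())
    residue_freq = {aa: n / total_residues for aa, n in residue_counts.items()}

    scores = {}
    for word, count in word_counts.items():
        observed = count / total_words
        # Assumed null model: residues occur independently of their neighbours.
        expected = prod(residue_freq[aa] for aa in word)
        scores[word] = observed / expected
    return scores
```

Under this ratio, a score above 1 marks a word used more often than chance predicts and a score below 1 marks an under-used word; words that never occur are simply absent from the result.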
Modelling dynamics in protein crystal structures by ensemble refinement
Burnley, B Tom; Afonine, Pavel V; Adams, Paul D; Gros, Piet
2012-01-01
Single-structure models derived from X-ray data do not adequately account for the inherent, functionally important dynamics of protein molecules. We generated ensembles of structures by time-averaged refinement, where local molecular vibrations were sampled by molecular-dynamics (MD) simulation whilst global disorder was partitioned into an underlying overall translation–libration–screw (TLS) model. Modeling of 20 protein datasets at 1.1–3.1 Å resolution reduced cross-validated Rfree values by 0.3–4.9%, indicating that ensemble models fit the X-ray data better than single structures. The ensembles revealed that, while most proteins display a well-ordered core, some proteins exhibit a ‘molten core’ likely supporting functionally important dynamics in ligand binding, enzyme activity and protomer assembly. Order–disorder changes in HIV protease indicate a mechanism of entropy compensation for ordering the catalytic residues upon ligand binding by disordering specific core residues. Thus, ensemble refinement extracts dynamical details from the X-ray data that allow a more comprehensive understanding of structure–dynamics–function relationships. DOI: http://dx.doi.org/10.7554/eLife.00311.001 PMID:23251785
Functional Dynamics of PDZ Binding Domains: A Normal-Mode Analysis
De Los Rios, Paolo; Cecconi, Fabio; Pretre, Anna; Dietler, Giovanni; Michielin, Olivier; Piazza, Francesco; Juanico, Brice
2005-01-01
Postsynaptic density-95/disks large/zonula occludens-1 (PDZ) domains are relatively small (80–120 residues) protein binding modules central in the organization of receptor clusters and in the association of cellular proteins. Their main function is to bind C-terminals of selected proteins that are recognized through specific amino acids in their carboxyl end. Binding is associated with a deformation of the PDZ native structure and is responsible for dynamical changes in regions not in direct contact with the target. We investigate how this deformation is related to the harmonic dynamics of the PDZ structure and show that one low-frequency collective normal mode, characterized by the concerted movements of different secondary structures, is involved in the binding process. Our results suggest that even minimal structural changes are responsible for communication between distant regions of the protein, in agreement with recent NMR experiments. Thus, PDZ domains are a very clear example of how collective normal modes are able to characterize the relation between function and dynamics of proteins, and to provide indications on the precursors of binding/unbinding events. PMID:15821164
The neuronal porosome complex in health and disease
Naik, Akshata R; Lewis, Kenneth T
2015-01-01
Cup-shaped secretory portals at the cell plasma membrane called porosomes mediate the precision release of intravesicular material from cells. Membrane-bound secretory vesicles transiently dock and fuse at the base of porosomes facing the cytosol to expel pressurized intravesicular contents from the cell during secretion. The structure, isolation, composition, and functional reconstitution of the neuronal porosome complex have greatly progressed, providing a molecular understanding of its function in health and disease. Neuronal porosomes are 15 nm cup-shaped lipoprotein structures composed of nearly 40 proteins, compared to the 120 nm nuclear pore complex composed of >500 protein molecules. Membrane proteins compose the porosome complex, making it practically impossible to solve its atomic structure. However, atomic force microscopy and small-angle X-ray solution scattering studies have provided three-dimensional structural details of the native neuronal porosome at sub-nanometer resolution, providing insights into the molecular mechanism of its function. The participation of several porosome proteins previously implicated in neurotransmission and neurological disorders further attests to the crosstalk between porosome proteins and their coordinated involvement in the release of neurotransmitter at the synapse. PMID:26264442
Protein domain organisation: adding order.
Kummerfeld, Sarah K; Teichmann, Sarah A
2009-01-29
Domains are the building blocks of proteins. During evolution, they have been duplicated, fused and recombined, to produce proteins with novel structures and functions. Structural and genome-scale studies have shown that pairs or groups of domains observed together in a protein are almost always found in only one N to C terminal order and are the result of a single recombination event that has been propagated by duplication of the multi-domain unit. Previous studies of domain organisation have used graph theory to represent the co-occurrence of domains within proteins. We build on this approach by adding directionality to the graphs and connecting nodes based on their relative order in the protein. Most of the time, the linear order of domains is conserved. However, using the directed graph representation we have identified non-linear features of domain organization that are over-represented in genomes. Recognising these patterns and unravelling how they have arisen may allow us to understand the functional relationships between domains and understand how the protein repertoire has evolved. We identify groups of domains that are not linearly conserved, but instead have been shuffled during evolution so that they occur in multiple different orders. We consider 192 genomes across all three kingdoms of life and use domain and protein annotation to understand their functional significance. To identify these features and assess their statistical significance, we represent the linear order of domains in proteins as a directed graph and apply graph theoretical methods. We describe two higher-order patterns of domain organisation: clusters and bi-directionally associated domain pairs and explore their functional importance and phylogenetic conservation. Taking into account the order of domains, we have derived a novel picture of global protein organization. 
We found that all genomes have a higher than expected degree of clustering and more domain pairs in forward and reverse orientation in different proteins relative to random graphs with identical degree distributions. While these features were statistically over-represented, they are still fairly rare. Looking in detail at the proteins involved, we found strong functional relationships within each cluster. In addition, the domains tended to be involved in protein-protein interaction and are able to function as independent structural units. A particularly striking example was the human Jak-STAT signalling pathway which makes use of a set of domains in a range of orders and orientations to provide nuanced signaling functionality. This illustrated the importance of functional and structural constraints (or lack thereof) on domain organisation.
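The directed-graph representation described in the abstract above can be sketched as follows. This is only an illustration, not the authors' pipeline. It assumes that an edge joins each pair of adjacent domains in N-to-C order, and that a bi-directionally associated pair is one seen in both orders with at least two different proteins involved. The function names and the toy architectures are hypothetical and are not real annotations.

```python
def directed_domain_edges(proteins):
    """Map each directed edge (N-terminal domain, C-terminal domain)
    to the set of proteins in which that adjacent ordering occurs."""
    edges = {}
    for name, domains in proteins.items():
        for left, right in zip(domains, domains[1:]):
            edges.setdefault((left, right), set()).add(name)
    return edges


def bidirectional_pairs(edges):
    """Return domain pairs observed in both orders across different proteins."""
    pairs = set()
    for (a, b), forward in edges.items():
        reverse = edges.get((b, a))
        # Require a reverse edge and at least two distinct proteins overall,
        # so a single protein with an A-B-A arrangement does not count.
        if a != b and reverse and len(forward | reverse) > 1:
            pairs.add(tuple(sorted((a, b))))
    return pairs


# Toy, made-up domain architectures listed from N to C terminus.
toy = {
    "P1": ["SH3", "SH2", "Pkinase"],
    "P2": ["SH2", "SH3"],
}
print(bidirectional_pairs(directed_domain_edges(toy)))
```

Testing whether such pairs are over-represented, as the abstract reports, would additionally require comparison against random graphs with the same degree distribution, which this sketch does not attempt.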
Expression and in vitro functional analyses of recombinant Gam1 protein
Avila, Gustavo A.; Ramirez, Daniel H.; Hildenbrand, Zacariah L.; Jacquez, Pedro; Chiocca, Susanna; Sun, Jianjun; Rosas-Acosta, German; Xiao, Chuan
2014-01-01
Gam1, an early gene product of an avian adenovirus, is essential for viral replication. Gam1 is the first viral protein found to globally inhibit cellular SUMOylation, a critical posttranslational modification that alters the function and cellular localization of proteins. The interaction details at the interface between Gam1 and its cellular targets remain unclear due to the lack of structural information. Although Gam1 has been previously characterized, the purity of the protein was not suitable for structural investigations. In the present study, the gene of Gam1 was cloned and expressed in various bacterial expression systems to obtain pure and soluble recombinant Gam1 protein for in vitro functional and structural studies. While Gam1 was insoluble in most expression systems tested, it became soluble when it was expressed as a fusion protein with trigger factor (TF), a ribosome associated bacterial chaperone, under the control of a cold shock promoter. Careful optimization indicates that both low temperature induction and the chaperone function of TF play critical roles in increasing Gam1 solubility. Soluble Gam1 was purified to homogeneity through sequential chromatography techniques. Monomeric Gam1 was obtained via size exclusion chromatography and analyzed by dynamic light scattering. The SUMOylation inhibitory function of the purified Gam1 was confirmed in an in vitro assay. These results have built the foundation for further structural investigations that will broaden our understanding of Gam1’s roles in viral replication. PMID:25450237
Expression and in vitro functional analyses of recombinant Gam1 protein.
Avila, Gustavo A; Ramirez, Daniel H; Hildenbrand, Zacariah L; Jacquez, Pedro; Chiocca, Susanna; Sun, Jianjun; Rosas-Acosta, German; Xiao, Chuan
2015-01-01
Gam1, an early gene product of an avian adenovirus, is essential for viral replication. Gam1 is the first viral protein found to globally inhibit cellular SUMOylation, a critical posttranslational modification that alters the function and cellular localization of proteins. The interaction details at the interface between Gam1 and its cellular targets remain unclear due to the lack of structural information. Although Gam1 has been previously characterized, the purity of the protein was not suitable for structural investigations. In the present study, the gene of Gam1 was cloned and expressed in various bacterial expression systems to obtain pure and soluble recombinant Gam1 protein for in vitro functional and structural studies. While Gam1 was insoluble in most expression systems tested, it became soluble when it was expressed as a fusion protein with trigger factor (TF), a ribosome associated bacterial chaperone, under the control of a cold shock promoter. Careful optimization indicates that both low temperature induction and the chaperone function of TF play critical roles in increasing Gam1 solubility. Soluble Gam1 was purified to homogeneity through sequential chromatography techniques. Monomeric Gam1 was obtained via size exclusion chromatography and analyzed by dynamic light scattering. The SUMOylation inhibitory function of the purified Gam1 was confirmed in an in vitro assay. These results have built the foundation for further structural investigations that will broaden our understanding of Gam1's roles in viral replication. Copyright © 2014 Elsevier Inc. All rights reserved.
Morales, Yalemi; Olsen, Keith J; Bulcher, Jacqueline M; Johnson, Sean J
2018-01-01
The FRH (frequency-interacting RNA helicase) protein is the Neurospora crassa homolog of yeast Mtr4, an essential RNA helicase that plays a central role in RNA metabolism as an activator of the nuclear RNA exosome. FRH is also a required component of the circadian clock, mediating protein interactions that result in the rhythmic repression of gene expression. Here we show that FRH unwinds RNA substrates in vitro with a kinetic profile similar to Mtr4, indicating that while FRH has acquired additional functionality, its core helicase function remains intact. In contrast with the earlier FRH structures, a new crystal form of FRH results in an ATP binding site that is undisturbed by crystal contacts and adopts a conformation consistent with nucleotide binding and hydrolysis. Strikingly, this new FRH structure adopts an arch domain conformation that is dramatically altered from previous structures. Comparison of the existing FRH structures reveals conserved hinge points that appear to facilitate arch motion. Regions in the arch have been previously shown to mediate a variety of protein-protein interactions critical for RNA surveillance and circadian clock functions. The conformational changes highlighted in the FRH structures provide a platform for investigating the relationship between arch dynamics and Mtr4/FRH function.
Campeotto, Ivan; Zhang, Yong; Mladenov, Miroslav G.; Freemont, Paul S.; Gründling, Angelika
2015-01-01
Signaling nucleotides are integral parts of signal transduction systems allowing bacteria to cope with and rapidly respond to changes in the environment. The Staphylococcus aureus PII-like signal transduction protein PstA was recently identified as a cyclic diadenylate monophosphate (c-di-AMP)-binding protein. Here, we present the crystal structures of the apo- and c-di-AMP-bound PstA protein, which is trimeric in solution as well as in the crystals. The structures combined with detailed bioinformatics analysis revealed that the protein belongs to a new family of proteins with a similar core fold but with distinct features to classical PII proteins, which usually function in nitrogen metabolism pathways in bacteria. The complex structure revealed three identical c-di-AMP-binding sites per trimer with each binding site at a monomer-monomer interface. Although distinctly different from other cyclic-di-nucleotide-binding sites, as the half-binding sites are not symmetrical, the complex structure also highlighted common features for c-di-AMP-binding sites. A comparison between the apo and complex structures revealed a series of conformational changes that result in the ordering of two anti-parallel β-strands that protrude from each monomer and allowed us to propose a mechanism for how the PstA protein functions as a signal transduction protein. PMID:25505271
Overexpression of neurofilament H disrupts normal cell structure and function
NASA Technical Reports Server (NTRS)
Szebenyi, Gyorgyi; Smith, George M.; Li, Ping; Brady, Scott T.
2002-01-01
Studying exogenously expressed tagged proteins in live cells has become a standard technique for evaluating protein distribution and function. Typically, expression levels of experimentally introduced proteins are not regulated, and high levels are often preferred to facilitate detection. However, overexpression of many proteins leads to mislocalization and pathologies. Therefore, for normative studies, moderate levels of expression may be more suitable. To understand better the dynamics of intermediate filament formation, transport, and stability in a healthy, living cell, we inserted neurofilament heavy chain (NFH)-green fluorescent protein (GFP) fusion constructs in adenoviral vectors with tetracycline (tet)-regulated promoters. This system allows for turning on or off the synthesis of NFH-GFP at a selected time, for a defined period, in a dose-dependent manner. We used this inducible system for live cell imaging of changes in filament structure and cell shape, motility, and transport associated with increasing NFH-GFP expression. Cells with low to intermediate levels of NFH-GFP were structurally and functionally similar to neighboring, nonexpressing cells. In contrast, overexpression led to pathological alterations in both filament organization and cell function. Copyright 2002 Wiley-Liss, Inc.
Phylogeny-Based Systematization of Arabidopsis Proteins with Histone H1 Globular Domain
Knizewski, Lukasz; Schmidt, Anja; Ginalski, Krzysztof
2017-01-01
H1 (or linker) histones are basic nuclear proteins that possess an evolutionarily conserved nucleosome-binding globular domain, GH1. They perform critical functions in determining the accessibility of chromatin DNA to trans-acting factors. In most metazoan species studied so far, linker histones are highly heterogenous, with numerous nonallelic variants cooccurring in the same cells. The phylogenetic relationships among these variants as well as their structural and functional properties have been relatively well established. This contrasts markedly with the rather limited knowledge concerning the phylogeny and structural and functional roles of an unusually diverse group of GH1-containing proteins in plants. The dearth of information and the lack of a coherent phylogeny-based nomenclature of these proteins can lead to misunderstandings regarding their identity and possible relationships, thereby hampering plant chromatin research. Based on published data and our in silico and high-throughput analyses, we propose a systematization and coherent nomenclature of GH1-containing proteins of Arabidopsis (Arabidopsis thaliana [L.] Heynh) that will be useful for both the identification and structural and functional characterization of homologous proteins from other plant species. PMID:28298478
Phylogeny-Based Systematization of Arabidopsis Proteins with Histone H1 Globular Domain.
Kotliński, Maciej; Knizewski, Lukasz; Muszewska, Anna; Rutowicz, Kinga; Lirski, Maciej; Schmidt, Anja; Baroux, Célia; Ginalski, Krzysztof; Jerzmanowski, Andrzej
2017-05-01
H1 (or linker) histones are basic nuclear proteins that possess an evolutionarily conserved nucleosome-binding globular domain, GH1. They perform critical functions in determining the accessibility of chromatin DNA to trans-acting factors. In most metazoan species studied so far, linker histones are highly heterogenous, with numerous nonallelic variants cooccurring in the same cells. The phylogenetic relationships among these variants as well as their structural and functional properties have been relatively well established. This contrasts markedly with the rather limited knowledge concerning the phylogeny and structural and functional roles of an unusually diverse group of GH1-containing proteins in plants. The dearth of information and the lack of a coherent phylogeny-based nomenclature of these proteins can lead to misunderstandings regarding their identity and possible relationships, thereby hampering plant chromatin research. Based on published data and our in silico and high-throughput analyses, we propose a systematization and coherent nomenclature of GH1-containing proteins of Arabidopsis (Arabidopsis thaliana [L.] Heynh) that will be useful for both the identification and structural and functional characterization of homologous proteins from other plant species. © 2017 American Society of Plant Biologists. All Rights Reserved.
Fukasawa, Toshiko; Sato, Takaaki
2011-02-28
We highlight the versatile applicability of a structure-factor indirect Fourier transformation (IFT) technique, hereafter called SQ-IFT. The original IFT aims at the pair distance distribution function, p(r), of colloidal particles from small angle scattering of X-rays (SAXS) and neutrons (SANS), allowing the conversion of the experimental form factor, P(q), into a more intuitive real-space spatial autocorrelation function. Instead, SQ-IFT is an interaction potential model-free approach to the 'effective' or 'experimental' structure factor to yield the pair correlation functions (PCFs), g(r), of colloidal dispersions like globular protein solutions for small-angle scattering data as well as the radial distribution functions (RDFs) of molecular liquids in liquid diffraction (LD) experiments. We show that SQ-IFT yields accurate RDFs of liquid H2O and monohydric alcohol reflecting their local intermolecular structures, in which the q-weighted structure function, qH(q), conventionally utilized in many LD studies out of the necessity of performing direct Fourier transformation, is no longer required. We also show that SQ-IFT applied to theoretically calculated structure factors for uncharged and charged colloidal dispersions almost perfectly reproduces g(r) obtained as a solution of the Ornstein-Zernike (OZ) equation. We further demonstrate the relevance of SQ-IFT in its practical applications, using SANS effective structure factors of lysozyme solutions reported in the recent literature, which revealed equilibrium cluster formation due to coexisting long-range electrostatic repulsion and short-range attraction between the proteins. Finally, we present SAXS experiments on human serum albumin (HSA) at different ionic strengths and protein concentrations, in which we discuss the real-space picture of spatial distributions of the proteins via the interaction potential model-free route.
Doshi, Ankita; Sharma, Mrinal; Prabha, C Ratna
2017-06-01
Posttranslational conjugation of ubiquitin to proteins regulates either their function directly or their concentration through ubiquitination-dependent degradation. The high degree of conservation of ubiquitin's sequence implies structural and functional importance of the conserved residues. We evolved the ubiquitin gene of Saccharomyces cerevisiae in vitro to study the significance of conserved residues. The present study investigates the structural changes in the protein resulting from the single mutations UbS20F, UbA46S, UbL50P, UbI61T and their functional consequences in the SUB60 strain of S. cerevisiae. Expression of UbL50P and UbI61T decreased Cdc28 protein kinase, enhanced Fus3 levels, caused dosage-dependent lethality and at sublethal levels produced drastic effects on stress tolerance, protein sorting, protein degradation by the ubiquitin fusion degradation pathway and by lysosomes. UbS20F and UbA46S produced insignificant effects on the cells. All four mutations of ubiquitin were incorporated into polyubiquitin. However, polyubiquitination with K63 linkage decreased significantly in cells expressing UbL50P and UbI61T. Structural studies on UbL50P and UbI61T revealed distorted structures with greatly reduced α-helical and elevated β-sheet contents, while UbS20F and UbA46S showed mild structural alterations. Our results on functional efficacy of ubiquitin in relation to structural integrity may be useful for designing inhibitors to investigate and modulate eukaryotic cellular dynamics. Copyright © 2017 Elsevier B.V. All rights reserved.
González-Díaz, Humberto; Munteanu, Cristian R; Postelnicu, Lucian; Prado-Prado, Francisco; Gestal, Marcos; Pazos, Alejandro
2012-03-01
Lipid-Binding Proteins (LIBPs) or Fatty Acid-Binding Proteins (FABPs) play an important role in many diseases such as different types of cancer, kidney injury, atherosclerosis, diabetes, intestinal ischemia and parasitic infections. Thus, the computational methods that can predict LIBPs based on 3D structure parameters became a goal of major importance for drug-target discovery, vaccine design and biomarker selection. In addition, the Protein Data Bank (PDB) contains 3000+ protein 3D structures with unknown function. This list, as well as new experimental outcomes in proteomics research, is a very interesting source to discover relevant proteins, including LIBPs. However, to the best of our knowledge, there are no general models to predict new LIBPs based on 3D structures. We developed new Quantitative Structure-Activity Relationship (QSAR) models based on 3D electrostatic parameters of 1801 different proteins, including 801 LIBPs. We calculated these electrostatic parameters with the MARCH-INSIDE software and they correspond to the entire protein or to specific protein regions named core, inner, middle, and surface. We used these parameters as inputs to develop a simple Linear Discriminant Analysis (LDA) classifier to discriminate 3D structure of LIBPs from other proteins. We implemented this predictor in the web server named LIBP-Pred, freely available at , along with other important web servers of the Bio-AIMS portal. The users can carry out an automatic retrieval of protein structures from PDB or upload their custom protein structural models from their disk created with LOMETS server. We demonstrated the PDB mining option performing a predictive study of 2000+ proteins with unknown function. Interesting results regarding the discovery of new Cancer Biomarkers in humans or drug targets in parasites have been discussed here in this sense.
Bhattacharyya, Moitrayee; Vishveshwara, Saraswathi
2009-01-01
Background: The genome of a wide variety of prokaryotes contains the luxS gene homologue, which encodes the protein S-ribosylhomocysteinelyase (LuxS). This protein is responsible for the production of the quorum sensing molecule, AI-2 and has been implicated in a variety of functions such as flagellar motility, metabolic regulation, toxin production and even in pathogenicity. A high structural similarity is present in the LuxS structures determined from a few species. In this study, we have modelled the structures from several other species and have investigated their dimer interfaces. We have attempted to correlate the interface features of LuxS with the phenotypic nature of the organisms. Results: The protein structure networks (PSN) are constructed and graph theoretical analysis is performed on the structures obtained from X-ray crystallography and on the modelled ones. The interfaces, which are known to contain the active site, are characterized from the PSNs of these homodimeric proteins. The key features presented by the protein interfaces are investigated for the classification of the proteins in relation to their function. From our analysis, structural interface motifs are identified for each class in our dataset, which showed a distinctly different pattern at the interface of LuxS for the probiotics and some extremophiles. Our analysis also reveals potential sites of mutation and geometric patterns at the interface that were not evident from conventional sequence alignment studies. Conclusion: The structure network approach employed in this study for the analysis of dimeric interfaces in LuxS has brought out certain structural details at the side-chain interaction level, which were elusive from the conventional structure comparison methods. The results from this study provide a better understanding of the relation between the luxS gene and its functional role in the prokaryotes. 
This study also makes it possible to explore the potential direction towards the design of inhibitors of LuxS and thus towards a wide range of antimicrobials. PMID:19243584
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kozbial, Piotr; Xu, Qingping; Chiu, Hsiu-Ju
2009-08-28
To extend the structural coverage of proteins with unknown functions, we targeted a novel protein family (Pfam accession number PF08807, DUF1798) for which we proposed and determined the structures of two representative members. The MW1337R gene of Staphylococcus aureus subsp. aureus Rosenbach (Wood 46) encodes a protein with a molecular weight of 13.8 kDa (residues 1-116) and a calculated isoelectric point of 5.15. The lin2004 gene of the nonspore-forming bacterium Listeria innocua Clip11262 encodes a protein with a molecular weight of 14.6 kDa (residues 1-121) and a calculated isoelectric point of 5.45. MW1337R and lin2004, as well as their homologs, which, so far, have been found only in Bacillus, Staphylococcus, Listeria, and related genera (Geobacillus, Exiguobacterium, and Oceanobacillus), have unknown functions and are annotated as hypothetical proteins. The genomic contexts of MW1337R and lin2004 are similar and conserved in related species. In prokaryotic genomes, most often, functionally interacting proteins are coded by genes, which are colocated in conserved operons. Proteins from the same operon as MW1337R and lin2004 either have unknown functions (i.e., belong to DUF1273, Pfam accession number PF06908) or are similar to ypsB from Bacillus subtilis. The function of ypsB is unclear, although it has a strong similarity to the N-terminal region of DivIVA, which was characterized as a bifunctional protein with distinct roles during vegetative growth and sporulation. In addition, members of the DUF1273 family display distant sequence similarity with the DprA/Smf protein, which acts downstream of the DNA uptake machinery, possibly in conjunction with RecA. The RecA activities in Bacillus subtilis are modulated by RecU Holliday-junction resolvase. In all analyzed cases, the gene coding for RecU is in the vicinity of MW1337R, lin2004, or their orthologs, but on a different operon located in the complementary DNA strand. 
Here, we report the crystal structures of MW1337R and lin2004, which were determined using the semiautomated, high-throughput pipeline of the Joint Center for Structural Genomics (JCSG), part of the National Institute of General Medical Sciences Protein Structure Initiative.
Lee, Juyong; Lee, Jinhyuk; Sasaki, Takeshi N; Sasai, Masaki; Seok, Chaok; Lee, Jooyoung
2011-08-01
Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested novel way of fragment assembly, dynamic fragment assembly (DFA) and conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the full-atom model by CHARMM with the addition of the empirical potential of DFIRE. The relative contributions between various energy terms are optimized using linear programming. The conformational sampling was carried out with CSA algorithm, which can find low energy conformations more efficiently than simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented into CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods. Copyright © 2011 Wiley-Liss, Inc.
An estimated 5% of new protein structures solved today represent a new Pfam family
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mistry, Jaina; Kloppmann, Edda; Rost, Burkhard
2013-11-01
This study uses the Pfam database to show that the sequence redundancy of protein structures deposited in the PDB is increasing. The possible reasons behind this trend are discussed. High-resolution structural knowledge is key to understanding how proteins function at the molecular level. The number of entries in the Protein Data Bank (PDB), the repository of all publicly available protein structures, continues to increase, with more than 8000 structures released in 2012 alone. The authors of this article have studied how structural coverage of the protein-sequence space has changed over time by monitoring the number of Pfam families that acquired their first representative structure each year from 1976 to 2012. Twenty years ago, for every 100 new PDB entries released, an estimated 20 Pfam families acquired their first structure. By 2012, this decreased to only about five families per 100 structures. The reasons behind the slower pace at which previously uncharacterized families are being structurally covered were investigated. It was found that although more than 50% of current Pfam families are still without a structural representative, this set is enriched in families that are small, functionally uncharacterized or rich in problem features such as intrinsically disordered and transmembrane regions. While these are important constraints, the reasons why it may not yet be time to give up the pursuit of a targeted but more comprehensive structural coverage of the protein-sequence space are discussed.
Computational modeling of Repeat1 region of INI1/hSNF5: An evolutionary link with ubiquitin
Bhutoria, Savita
2016-01-01
The structure of a protein can be very informative of its function. However, determining protein structures experimentally can often be very challenging. Computational methods have been used successfully in modeling structures with sufficient accuracy. Here we have used computational tools to predict the structure of an evolutionarily conserved and functionally significant domain of Integrase interactor (INI)1/hSNF5 protein. INI1 is a component of the chromatin remodeling SWI/SNF complex, a tumor suppressor and is involved in many protein‐protein interactions. It belongs to SNF5 family of proteins that contain two conserved repeat (Rpt) domains. Rpt1 domain of INI1 binds to HIV‐1 Integrase, and acts as a dominant negative mutant to inhibit viral replication. Rpt1 domain also interacts with oncogene c‐MYC and modulates its transcriptional activity. We carried out an ab initio modeling of a segment of INI1 protein containing the Rpt1 domain. The structural model suggested the presence of a compact and well defined ββαα topology as core structure in the Rpt1 domain of INI1. This topology in Rpt1 was similar to PFU domain of Phospholipase A2 Activating Protein, PLAA. Interestingly, PFU domain shares similarity with Ubiquitin and has ubiquitin binding activity. Because of the structural similarity between Rpt1 domain of INI1 and PFU domain of PLAA, we propose that Rpt1 domain of INI1 may participate in ubiquitin recognition or binding with ubiquitin or ubiquitin related proteins. This modeling study may shed light on the mode of interactions of Rpt1 domain of INI1 and is likely to facilitate future functional studies of INI1. PMID:27261671
Relationships between residue Voronoi volume and sequence conservation in proteins.
Liu, Jen-Wei; Cheng, Chih-Wen; Lin, Yu-Feng; Chen, Shao-Yu; Hwang, Jenn-Kang; Yen, Shih-Chung
2018-02-01
Functional and biophysical constraints can cause different levels of sequence conservation in proteins. Previously, structural properties, e.g., relative solvent accessibility (RSA) and packing density of the weighted contact number (WCN), have been found to be related to protein sequence conservation (CS). The Voronoi volume has recently been recognized as a new structural property of the local protein structural environment reflecting CS. However, for surface residues, it is sensitive to water molecules surrounding the protein structure. Herein, we present a simple structural determinant termed the relative space of Voronoi volume (RSV); it uses the Voronoi volume and the van der Waals volume of particular residues to quantify the local structural environment. RSV (range, 0-1) is defined as (Voronoi volume - van der Waals volume) / Voronoi volume of the target residue. The concept of RSV describes the extent of available space for every protein residue. RSV and Voronoi profiles with and without water molecules (RSVw, RSV, VOw, and VO) were compared for 554 non-homologous proteins. RSV (without water) showed better Pearson's correlations with CS than did RSVw, VO, or VOw values. The mean correlation coefficient between RSV and CS was 0.51, which is comparable to the correlation between RSA and CS (0.49) and that between WCN and CS (0.56). RSV is a robust structural descriptor with and without water molecules and can quantitatively reflect evolutionary information in a single protein structure. Therefore, it may represent a practical structural determinant to study protein sequence, structure, and function relationships.
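The RSV definition in the abstract above can be written out as a short Python sketch. The function names and the example volumes are illustrative assumptions, not values or code from the study; the Pearson helper only shows how a per-residue RSV profile might be compared with a conservation profile.

```python
import math


def relative_space_of_voronoi_volume(voronoi_volume, vdw_volume):
    """RSV = (Voronoi volume - van der Waals volume) / Voronoi volume."""
    if voronoi_volume <= 0:
        raise ValueError("Voronoi volume must be positive")
    if not 0 <= vdw_volume <= voronoi_volume:
        raise ValueError("van der Waals volume must lie in [0, Voronoi volume]")
    return (voronoi_volume - vdw_volume) / voronoi_volume


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical per-residue volumes (cubic angstroms), for illustration only.
voronoi = [200.0, 180.0, 150.0]
vdw = [150.0, 120.0, 140.0]
rsv_profile = [relative_space_of_voronoi_volume(v, w) for v, w in zip(voronoi, vdw)]
print(rsv_profile)
```

Because the van der Waals volume is bounded by the Voronoi volume in this sketch, the result stays in the 0-1 range stated in the abstract.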
NASA Astrophysics Data System (ADS)
Hilaire, Mary Rose
Proteins possess unique physical and chemical properties that allow them to carry out a wide variety of biological activities and functions. While it is generally understood that a protein's function is dictated by its structure and dynamics, arriving at a molecule-level understanding of the underlying structure-dynamics-function relationship still poses a challenging task in many cases. This is due, at least in part, to the fact that we lack the ability to take snapshots along the reaction coordinate of proteins with sufficient temporal and structural resolution. Therefore, to improve one's ability to acquire site-specific structural and/or environmental information of proteins via either infrared (IR) or fluorescence spectroscopy, the main focus of this thesis is to develop and characterize amino acid-based spectroscopic probes as well as to use such probes to study important biological questions. Specifically, we show that (1) p-cyanophenylalanine and selenomethionine constitute an efficient fluorophore-quencher pair, useful for characterizing protein conformational changes that occur over a short distance; (2) 4-cyanotryptophan is a novel blue fluorescent amino acid, applicable for biological imaging due to its unique photophysical properties; (3) the dielectric constant inside the hydrophobic interior of staphylococcal nuclease is about 10-15, significantly larger than previously assumed; and (4) a single mutation in a short segment of the protein transthyretin (i.e., 110-115) induces formation of amyloid fibrils consisting of both beta- and alpha-sheets, where the latter is a proposed structure in proteins that has never been observed previously.
Normal mode-guided transition pathway generation in proteins
Lee, Byung Ho; Seo, Sangjae; Kim, Min Hyeok; Kim, Youngjin; Jo, Soojin; Choi, Moon-ki; Lee, Hoomin; Choi, Jae Boong
2017-01-01
The biological function of proteins is closely related to their structural motion. For instance, structurally misfolded proteins do not function properly. Although we are able to experimentally obtain structural information on proteins, it is still challenging to capture their dynamics, such as transition processes. Therefore, we need a simulation method to predict the transition pathways of a protein in order to understand and study large functional deformations. Here, we present a new simulation method called normal mode-guided elastic network interpolation (NGENI) that performs normal mode analysis iteratively to predict transition pathways of proteins. To be more specific, NGENI obtains displacement vectors that determine intermediate structures by interpolating the distance between two end-point conformations, similar to a morphing method called elastic network interpolation. However, the displacement vector is regarded as a linear combination of the normal mode vectors of each intermediate structure, in order to enhance the physical sense of the proposed pathways. As a result, we can generate transition pathways that are more reasonable geometrically and thermodynamically. Whether it uses all normal modes or only a subset of the lowest normal modes, NGENI can still generate reasonable pathways for large deformations in proteins. This study shows that global protein transitions are dominated by collective motion, which means that a few lowest normal modes play an important role in this process. NGENI has considerable merit in terms of computational cost because it is possible to generate transition pathways using partial degrees of freedom, while conventional methods are not capable of this. PMID:29020017
Konc, Janez; Cesnik, Tomo; Konc, Joanna Trykowska; Penca, Matej; Janežič, Dušanka
2012-02-27
ProBiS-Database is a searchable repository of precalculated local structural alignments in proteins detected by the ProBiS algorithm in the Protein Data Bank. Identification of functionally important binding regions of the protein is facilitated by structural similarity scores mapped to the query protein structure. PDB structures that have been aligned with a query protein may be rapidly retrieved from the ProBiS-Database, which is thus able to generate hypotheses concerning the roles of uncharacterized proteins. Presented with an uncharacterized protein structure, ProBiS-Database can discern relationships between such a query protein and other better known proteins in the PDB. Fast access and a user-friendly graphical interface promote easy exploration of this database of over 420 million local structural alignments. The ProBiS-Database is updated weekly and is freely available online at http://probis.cmm.ki.si/database.
Protein linguistics - a grammar for modular protein assembly?
Gimona, Mario
2006-01-01
The correspondence between biology and linguistics at the level of sequence and lexical inventories, and of structure and syntax, has fuelled attempts to describe genome structure by the rules of formal linguistics. But how can we define protein linguistic rules? And how could compositional semantics improve our understanding of protein organization and functional plasticity?
Preparation and Characterization of Biofunctionalized Inorganic Substrates.
Dugger, Jason W; Webb, Lauren J
2015-09-29
Integrating the function of biological molecules into traditional inorganic materials and substrates couples biologically relevant function to synthetic devices and generates new materials and capabilities by combining biological and inorganic functions. At this so-called "bio/abio interface," basic biological functions such as ligand binding and catalysis can be co-opted to detect analytes with exceptional sensitivity or to generate useful molecules with chiral specificity under entirely benign reaction conditions. Proteins function in dynamic, complex, and crowded environments (the living cell) and are therefore appropriate for integrating into multistep, multiscale, multimaterial devices such as integrated circuits and heterogeneous catalysts. However, the goal of reproducing the highly specific activities of biomolecules in the perturbed chemical and electrostatic environment at an inorganic interface while maintaining their native conformations is challenging to achieve. Moreover, characterizing protein structure and function at a surface is often difficult, particularly if one wishes to compare the activity of the protein to that of the dilute, aqueous solution phase. Our laboratory has developed a general strategy to address this challenge by taking advantage of the structural and chemical properties of alkanethiol self-assembled monolayers (SAMs) on gold surfaces that are functionalized with covalently tethered peptides. These surface-bound peptides then act as the chemical recognition element for a target protein, generating a biomimetic surface in which protein orientation, structure, density, and function are controlled and variable. Herein we discuss current research and future directions related to generating a chemically tunable biofunctionalization strategy that has potential to successfully incorporate the highly specialized functions of proteins onto inorganic substrates.
Structural Elements Regulating AAA+ Protein Quality Control Machines.
Chang, Chiung-Wen; Lee, Sukyeong; Tsai, Francis T F
2017-01-01
Members of the ATPases Associated with various cellular Activities (AAA+) superfamily participate in essential and diverse cellular pathways in all kingdoms of life by harnessing the energy of ATP binding and hydrolysis to drive their biological functions. Although most AAA+ proteins share a ring-shaped architecture, AAA+ proteins have evolved distinct structural elements that are fine-tuned to their specific functions. A central question in the field is how ATP binding and hydrolysis are coupled to substrate translocation through the central channel of ring-forming AAA+ proteins. In this mini-review, we will discuss structural elements present in AAA+ proteins involved in protein quality control, drawing similarities to their known role in substrate interaction by AAA+ proteins involved in DNA translocation. Elements to be discussed include the pore loop-1, the Inter-Subunit Signaling (ISS) motif, and the Pre-Sensor I insert (PS-I) motif. Lastly, we will summarize our current understanding of the inter-relationship of those structural elements and propose a model for how ATP binding and hydrolysis might be coupled to polypeptide translocation in protein quality control machines.
Pum, Dietmar; Toca-Herrera, Jose Luis; Sleytr, Uwe B.
2013-01-01
Crystalline S(urface)-layers are the most commonly observed cell surface structures in prokaryotic organisms (bacteria and archaea). S-layers are highly porous protein meshworks with unit cell sizes in the range of 3 to 30 nm, and thicknesses of ~10 nm. One of the key features of S-layer proteins is their intrinsic capability to form self-assembled mono- or double layers in solution, and at interfaces. Basic research on S-layer proteins laid foundation to make use of the unique self-assembly properties of native and, in particular, genetically functionalized S-layer protein lattices, in a broad range of applications in the life and non-life sciences. This contribution briefly summarizes the knowledge about structure, genetics, chemistry, morphogenesis, and function of S-layer proteins and pays particular attention to the self-assembly in solution, and at differently functionalized solid supports. PMID:23354479
Rational Protein Engineering Guided by Deep Mutational Scanning
Shin, HyeonSeok; Cho, Byung-Kwan
2015-01-01
The sequence–function relationship in a protein is commonly determined from the three-dimensional protein structure followed by various biochemical experiments. However, with the explosive increase in the number of genome sequences, facilitated by recent advances in sequencing technology, the gap between the protein sequences available and three-dimensional structures is rapidly widening. A recently developed method termed deep mutational scanning explores the functional phenotype of thousands of mutants via massive sequencing. Coupled with a highly efficient screening system, this approach assesses the phenotypic changes caused by substituting each amino acid residue in a protein. Such an informational resource reveals the functional role of each residue, thereby providing sufficient rationale for selecting target residues for protein engineering. Here, we discuss the current applications of deep mutational scanning and consider experimental design. PMID:26404267
Solid state NMR: The essential technology for helical membrane protein structural characterization
Cross, Timothy A.; Ekanayake, Vindana; Paulino, Joana; Wright, Anna
2014-01-01
NMR spectroscopy of helical membrane proteins has been very challenging on multiple fronts. The expression and purification of these proteins while maintaining functionality has consumed countless graduate student hours. Sample preparations have depended on whether solution or solid-state NMR spectroscopy was to be performed; neither has been easy. In recent years it has become increasingly apparent that membrane mimic environments influence the structural result. Indeed, in these recent years we have rediscovered that Nobel laureate Christian Anfinsen did not say that protein structure was exclusively dictated by the amino acid sequence, but rather by the sequence in a given environment (Anfinsen, 1973) [106]. The environment matters: molecular interactions with the membrane environment are significant, and many examples of distorted, non-native membrane protein structures have recently been documented in the literature. However, solid-state NMR structures of helical membrane proteins in proteoliposomes and bilayers are proving to be native structures that permit a high-resolution characterization of their functional states. Indeed, solid-state NMR is uniquely able to characterize helical membrane protein structures in lipid environments without detergents. Recent progress in expression, purification, reconstitution, sample preparation and in the solid-state NMR spectroscopy of both oriented samples and magic angle spinning samples has demonstrated that helical membrane protein structures can be achieved in a timely fashion. Indeed, this is a spectacular opportunity for the NMR community to have a major impact on biomedical research through the solid-state NMR spectroscopy of these proteins. PMID:24412099
Structure-guided Protein Transition Modeling with a Probabilistic Roadmap Algorithm.
Maximova, Tatiana; Plaku, Erion; Shehu, Amarda
2016-07-07
Proteins are macromolecules in perpetual motion, switching between structural states to modulate their function. A detailed characterization of the precise yet complex relationship between protein structure, dynamics, and function requires elucidating transitions between functionally-relevant states. Doing so challenges both wet and dry laboratories, as protein dynamics involves disparate temporal scales. In this paper we present a novel, sampling-based algorithm to compute transition paths. The algorithm exploits two main ideas. First, it leverages known structures to initialize its search and define a reduced conformation space for rapid sampling. This is key to address the insufficient sampling issue suffered by sampling-based algorithms. Second, the algorithm embeds samples in a nearest-neighbor graph where transition paths can be efficiently computed via queries. The algorithm adapts the probabilistic roadmap framework that is popular in robot motion planning. In addition to efficiently computing lowest-cost paths between any given structures, the algorithm allows investigating hypotheses regarding the order of experimentally-known structures in a transition event. This novel contribution is likely to open up new venues of research. Detailed analysis is presented on multiple-basin proteins of relevance to human disease. Multiscaling and the AMBER ff14SB force field are used to obtain energetically-credible paths at atomistic detail.
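The roadmap idea described in the abstract above (embed sampled conformations in a nearest-neighbor graph, then query it for a lowest-cost path) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the authors' implementation: conformations are stand-in 2D points, the edge cost is plain Euclidean distance rather than an energetic cost, and the function names are hypothetical.

```python
import heapq
import math


def build_knn_graph(samples, k):
    """Connect every sample to its k nearest neighbors (undirected edges)."""
    graph = {i: {} for i in range(len(samples))}
    for i, p in enumerate(samples):
        neighbors = sorted(
            (math.dist(p, q), j) for j, q in enumerate(samples) if j != i
        )
        for cost, j in neighbors[:k]:
            graph[i][j] = cost
            graph[j][i] = cost
    return graph


def lowest_cost_path(graph, start, goal):
    """Dijkstra query on the roadmap; returns (path, cost) or (None, inf)."""
    best = {start: 0.0}
    previous = {}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            break
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if goal not in best:
        return None, math.inf
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return path[::-1], best[goal]


# Toy "conformations": indices 0 and 3 play the role of two known end states.
samples = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (1.5, 2.0)]
roadmap = build_knn_graph(samples, k=2)
print(lowest_cost_path(roadmap, 0, 3))
```

In a realistic setting the samples would live in a reduced conformation space and the edge weights would come from an energy model; both are left out here to keep the graph query itself visible.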
Pukáncsik, Mária; Orbán, Ágnes; Nagy, Kinga; Matsuo, Koichi; Gekko, Kunihiko; Maurin, Damien; Hart, Darren; Kézsmárki, István; Vertessy, Beata G.
2016-01-01
A novel uracil-DNA degrading protein factor (termed UDE) was identified in Drosophila melanogaster with no significant structural and functional homology to other uracil-DNA binding or processing factors. Determination of the 3D structure of UDE is expected to provide key information on the molecular mechanism of action of UDE catalysis, as well as on uracil recognition and nuclease action in general. Towards this long-term aim, the random library ESPRIT technology was applied to the novel protein UDE to overcome problems in identifying soluble expressing constructs given the absence of precise information on domain content and arrangement. Nine constructs of UDE were chosen to decipher structural and functional relationships. Vacuum ultraviolet circular dichroism (VUVCD) spectroscopy was performed to define the secondary structure content and location within UDE and its truncated variants. The quantitative analysis demonstrated exclusive α-helical content for the full-length protein, which is preserved in the truncated constructs. Arrangement of α-helical bundles within the truncated protein segments suggested new domain boundaries which differ from the conserved motifs determined by sequence-based alignment of UDE homologues. Here we demonstrate that the combination of ESPRIT and VUVCD spectroscopy provides a new structural description of UDE and confirms that the truncated constructs are useful for further detailed functional studies. PMID:27273007
Shi, Xiaohu; Zhang, Jingfen; He, Zhiquan; Shang, Yi; Xu, Dong
2011-09-01
One of the major challenges in protein tertiary structure prediction is structure quality assessment. In many cases, protein structure prediction tools generate good structural models, but fail to select the best models from a huge number of candidates as the final output. In this study, we developed a sampling-based machine-learning method to rank protein structural models by integrating multiple scores and features. First, features such as predicted secondary structure, solvent accessibility and residue-residue contact information are integrated by two Radial Basis Function (RBF) models trained from different datasets. Then, the two RBF scores and five selected scoring functions developed by others, i.e., Opus-CA, Opus-PSP, DFIRE, RAPDF, and Cheng Score are synthesized by a sampling method. At last, another integrated RBF model ranks the structural models according to the features of sampling distribution. We tested the proposed method by using two different datasets, including the CASP server prediction models of all CASP8 targets and a set of models generated by our in-house software MUFOLD. The test result shows that our method outperforms any individual scoring function on both best model selection, and overall correlation between the predicted ranking and the actual ranking of structural quality.
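To make the idea of ranking candidate models by combining several scoring functions concrete, here is a minimal Python sketch. It deliberately swaps in a much simpler technique, averaging per-score z-scores, in place of the RBF models and sampling step described in the abstract; the score names and the numbers are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one score across all candidate models."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def consensus_rank(models, score_table, lower_is_better):
    """Rank models by the average of their sign-corrected z-scores.

    score_table maps a score name to one value per model; lower_is_better
    maps the same names to True for energy-like scores.
    """
    combined = [0.0] * len(models)
    for name, values in score_table.items():
        sign = -1.0 if lower_is_better[name] else 1.0
        for i, z in enumerate(zscores(values)):
            combined[i] += sign * z / len(score_table)
    order = sorted(range(len(models)), key=lambda i: combined[i], reverse=True)
    return [models[i] for i in order]


# Hypothetical candidates and scores, for illustration only.
models = ["model_1", "model_2", "model_3"]
scores = {
    "energy": [-10.0, -30.0, -20.0],   # lower is better
    "contact_agreement": [0.2, 0.9, 0.5],  # higher is better
}
print(consensus_rank(models, scores, {"energy": True, "contact_agreement": False}))
```

Standardizing before averaging keeps a score with a large numeric range from dominating the combined ranking, which is one reason multi-score integration is usually done on normalized values.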
The Sla2p/HIP1/HIP1R family: similar structure, similar function in endocytosis?
Gottfried, Irit; Ehrlich, Marcelo; Ashery, Uri
2010-02-01
HIP1 (huntingtin interacting protein 1) has two close relatives: HIP1R (HIP1-related) and yeast Sla2p. All three members of the family have a conserved domain structure, suggesting a common function. Over the past decade, a number of studies have characterized these proteins using a combination of biochemical, imaging, structural and genetic techniques. These studies provide valuable information on binding partners, structure and dynamics of HIP1/HIP1R/Sla2p. In general, all suggest a role in CME (clathrin-mediated endocytosis) for the three proteins, though some differences have emerged. In this mini-review we summarize the current views on the roles of these proteins, while emphasizing the unique attributes of each family member.
Cheng, Chi-Yuan; Han, Songi
2013-01-01
Membrane proteins regulate vital cellular processes, including signaling, ion transport, and vesicular trafficking. Obtaining experimental access to their structures, conformational fluctuations, orientations, locations, and hydration in membrane environments, as well as the lipid membrane properties, is critical to understanding their functions. Dynamic nuclear polarization (DNP) of frozen solids can dramatically boost the sensitivity of current solid-state nuclear magnetic resonance tools to enhance access to membrane protein structures in native membrane environments. Overhauser DNP in the solution state can map out the local and site-specific hydration dynamics landscape of membrane proteins and lipid membranes, critically complementing the structural and dynamics information obtained by electron paramagnetic resonance spectroscopy. Here, we provide an overview of how DNP methods in solids and solutions can significantly increase our understanding of membrane protein structures, dynamics, functions, and hydration in complex biological membrane environments.
A DEK Domain-Containing Protein Modulates Chromatin Structure and Function in Arabidopsis
Waidmann, Sascha; Kusenda, Branislav; Mayerhofer, Juliane; Mechtler, Karl; Jonak, Claudia
2014-01-01
Chromatin is a major determinant in the regulation of virtually all DNA-dependent processes. Chromatin architectural proteins interact with nucleosomes to modulate chromatin accessibility and higher-order chromatin structure. The evolutionarily conserved DEK domain-containing protein is implicated in important chromatin-related processes in animals, but little is known about its DNA targets and protein interaction partners. In plants, the role of DEK has remained elusive. In this work, we identified DEK3 as a chromatin-associated protein in Arabidopsis thaliana. DEK3 specifically binds histones H3 and H4. Purification of other proteins associated with nuclear DEK3 also established DNA topoisomerase 1α and proteins of the cohesion complex as in vivo interaction partners. Genome-wide mapping of DEK3 binding sites by chromatin immunoprecipitation followed by deep sequencing revealed enrichment of DEK3 at protein-coding genes throughout the genome. Using DEK3 knockout and overexpressor lines, we show that DEK3 affects nucleosome occupancy and chromatin accessibility and modulates the expression of DEK3 target genes. Furthermore, functional levels of DEK3 are crucial for stress tolerance. Overall, data indicate that DEK3 contributes to modulation of Arabidopsis chromatin structure and function. PMID:25387881
NASA Astrophysics Data System (ADS)
Rauf, Muhammad; Saeed, Nasir A.; Habib, Imran; Ahmed, Moddassir; Shahzad, Khurram; Mansoor, Shahid; Ali, Rashid
2017-02-01
Structure prediction can provide information about the function and active sites of a protein, which helps in designing new functional proteins. H+-pyrophosphatase is a transmembrane protein involved in establishing the proton motive force for active transport of Na+ across the membrane by Na+/H+ antiporters. A novel full-length H+-pyrophosphatase gene was isolated from the halophytic grass Leptochloa fusca using RT-PCR and RACE. The full-length LfVP1 gene sequence of 2292 nucleotides encodes a protein of 764 amino acids. The DNA and protein sequences were characterized using bioinformatics tools. Various important potential sites were predicted by the PROSITE web server. Primary structural analysis showed LfVP1 to be a stable protein, and the grand average of hydropathy (GRAVY) indicated that the LfVP1 protein has good hydrosolubility. Secondary structure analysis showed that the LfVP1 protein sequence contains a significant proportion of alpha helix and random coil. Protein membrane topology suggested the presence of 14 transmembrane domains and a catalytic domain in TM3. The three-dimensional structure predicted from the LfVP1 protein sequence also indicated the presence of 14 transmembrane domains, and a hydrophobicity surface model showed amino acid hydrophobicity. A Ramachandran plot showed that 98% of amino acid residues were predicted in the favored region.
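The GRAVY value mentioned in the abstract above is conventionally computed as the mean Kyte-Doolittle hydropathy over all residues of a sequence. The Python sketch below assumes that convention; the abstract does not state which tool was used, and the example sequence is a made-up placeholder, not LfVP1.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean hydropathy index over the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Placeholder peptide, for illustration only.
print(round(gravy("MKTLLVAGDE"), 3))
```

Under this scale a positive mean is usually read as overall hydrophobic and a negative mean as overall hydrophilic, so the sign of the result is what an interpretation such as "good hydrosolubility" would rest on.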
The poly(C)-binding proteins: a multiplicity of functions and a search for mechanisms.
Makeyev, Aleksandr V; Liebhaber, Stephen A
2002-01-01
The poly(C) binding proteins (PCBPs) are encoded at five dispersed loci in the mouse and human genomes. These proteins, which can be divided into two groups, hnRNPs K/J and the alphaCPs (alphaCP1-4), are linked by a common evolutionary history, a shared triple KH domain configuration, and by their poly(C) binding specificity. Given these conserved characteristics it is remarkable to find a substantial diversity in PCBP functions. The roles of these proteins in mRNA stabilization, translational activation, and translational silencing suggest a complex and diverse set of post-transcriptional control pathways. Their additional putative functions in transcriptional control and as structural components of important DNA-protein complexes further support their remarkable structural and functional versatility. Clearly the identification of additional binding targets and delineation of corresponding control mechanisms and effector pathways will establish highly informative models for further exploration. PMID:12003487
Mode localization in the cooperative dynamics of protein recognition
NASA Astrophysics Data System (ADS)
Copperman, J.; Guenza, M. G.
2016-07-01
The biological function of proteins is encoded in their structure and expressed through the mediation of their dynamics. This paper presents a study on the correlation between local fluctuations, binding, and biological function for two sample proteins, starting from the Langevin Equation for Protein Dynamics (LE4PD). The LE4PD is a microscopic and residue-specific coarse-grained approach to protein dynamics, which starts from the static structural ensemble of a protein and predicts the dynamics analytically. It has been shown to be accurate in its prediction of NMR relaxation experiments and Debye-Waller factors. The LE4PD is solved in a set of diffusive modes which span a vast range of time scales of the protein dynamics, and provides a detailed picture of the mode-dependent localization of the fluctuation as a function of the primary structure of the protein. To investigate the dynamics of protein complexes, the theory is extended here to treat the coarse-grained dynamics of interacting macromolecules. As an example, calculations of the dynamics of monomeric and dimerized HIV protease and the free Insulin Growth Factor II Receptor (IGF2R) domain 11 and its IGF2R:IGF2 complex are presented. Either simulation-derived or experimentally measured NMR conformers are used as input structural ensembles to the theory. The picture that emerges suggests a dynamically heterogeneous protein where biologically active regions provide energetically comparable conformational states that are trapped by a reacting partner, in agreement with the conformation-selection mechanism of binding.