Comparative Protein Structure Modeling Using MODELLER.
Webb, Benjamin; Sali, Andrej
2014-09-08
Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by an accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. Copyright © 2014 John Wiley & Sons, Inc.
Ensemble-based evaluation for protein structure models.
Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke
2016-06-15
Motivation: Comparing protein tertiary structures is a fundamental procedure in structural biology and protein bioinformatics. Structure comparison is important particularly for evaluating computational protein structure models. Most of the model structure evaluation methods perform rigid body superimposition of a structure model to its crystal structure and measure the difference of the corresponding residue or atom positions between them. However, these methods neglect intrinsic flexibility of proteins by treating the native structure as a rigid molecule. Because different parts of proteins have different levels of flexibility, for example, exposed loop regions are usually more flexible than the core region of a protein structure, disagreement of a model to the native needs to be evaluated differently depending on the flexibility of residues in a protein. Results: We propose a score named FlexScore for comparing protein structures that considers flexibility of each residue in the native state of proteins. Flexibility information may be extracted from experiments such as NMR or molecular dynamics simulation. FlexScore considers an ensemble of conformations of a protein described as a multivariate Gaussian distribution of atomic displacements and compares a query computational model with the ensemble. We compare FlexScore with other commonly used structure similarity scores over various examples. FlexScore agrees with experts’ intuitive assessment of computational models and provides information of practical usefulness of models. Availability and implementation: https://bitbucket.org/mjamroz/flexscore Contact: dkihara@purdue.edu Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. PMID:27307633
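As a rough illustration of weighting model-to-native disagreement by residue flexibility, the sketch below derives a per-residue mean position and an isotropic variance from an ensemble that is assumed to be already superimposed, then scores a model by its variance-normalised deviation. This is not the published FlexScore formula, which uses a multivariate Gaussian. The function names, the isotropic simplification, and the variance floor are all assumptions made for the example.

```python
from math import sqrt


def ensemble_stats(ensemble):
    """Per-residue mean position and isotropic variance.

    `ensemble` is a list of conformations; each conformation is a list of
    (x, y, z) tuples, one per residue, assumed to be pre-superimposed.
    """
    n_conf = len(ensemble)
    n_res = len(ensemble[0])
    means, variances = [], []
    for i in range(n_res):
        mean = tuple(sum(conf[i][k] for conf in ensemble) / n_conf
                     for k in range(3))
        var = sum(sum((conf[i][k] - mean[k]) ** 2 for k in range(3))
                  for conf in ensemble) / n_conf
        means.append(mean)
        variances.append(var)
    return means, variances


def flexibility_weighted_deviation(model, means, variances, floor=0.01):
    """Root mean of squared deviations, each divided by the residue variance.

    `floor` (an assumed value) keeps rigid residues from dividing by zero.
    """
    total = 0.0
    for pos, mean, var in zip(model, means, variances):
        sq_dev = sum((pos[k] - mean[k]) ** 2 for k in range(3))
        total += sq_dev / max(var, floor)
    return sqrt(total / len(model))
```

With this weighting, a given displacement in a rigid core residue is penalised more heavily than the same displacement in a flexible loop residue.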
WEBnm@ v2.0: Web server and services for comparing protein flexibility.
Tiwari, Sandhya P; Fuglebakk, Edvin; Hollup, Siv M; Skjærven, Lars; Cragnolini, Tristan; Grindhaug, Svenn H; Tekle, Kidane M; Reuter, Nathalie
2014-12-30
Normal mode analysis (NMA) using elastic network models is a reliable and cost-effective computational method to characterise protein flexibility and, by extension, protein dynamics. Further insight into the dynamics-function relationship can be gained by comparing protein motions between protein homologs and functional classifications. This can be achieved by comparing normal modes obtained from sets of evolutionarily related proteins. We have developed an automated tool for comparative NMA of a set of pre-aligned protein structures. The user can submit a sequence alignment in the FASTA format and the corresponding coordinate files in the Protein Data Bank (PDB) format. The computed normalised squared atomic fluctuations and atomic deformation energies of the submitted structures can be easily compared on graphs provided by the web user interface. The web server provides pairwise comparison of the dynamics of all proteins included in the submitted set using two measures: the Root Mean Squared Inner Product and the Bhattacharyya Coefficient. The Comparative Analysis has been implemented on our web server for NMA, WEBnm@, which also provides recently upgraded functionality for NMA of single protein structures. This includes new visualisations of protein motion, visualisation of inter-residue correlations and the analysis of conformational change using the overlap analysis. In addition, programmatic access to WEBnm@ is now available through a SOAP-based web service. WEBnm@ is available at http://apps.cbu.uib.no/webnma. WEBnm@ v2.0 is an online tool offering unique capability for comparative NMA on multiple protein structures. Along with a convenient web interface, powerful computing resources, and several methods for mode analyses, WEBnm@ facilitates the assessment of protein flexibility within protein families and superfamilies. These analyses can give a good view of how the structures move and how the flexibility is conserved over the different structures.
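The Root Mean Squared Inner Product (RMSIP) named above can be sketched in a few lines. The example assumes two equally sized sets of orthonormal mode vectors defined over the same aligned coordinates. The function names are illustrative and are not part of the WEBnm@ service, and the Bhattacharyya Coefficient is not covered here.

```python
from math import sqrt


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def rmsip(modes_a, modes_b):
    """RMSIP between two sets of n orthonormal modes.

    RMSIP = sqrt( (1/n) * sum_i sum_j (u_i . v_j)^2 )
    It is 1.0 when both sets span the same subspace and 0.0 when the
    subspaces are orthogonal.
    """
    if len(modes_a) != len(modes_b):
        raise ValueError("both mode sets must contain the same number of modes")
    n = len(modes_a)
    total = sum(dot(u, v) ** 2 for u in modes_a for v in modes_b)
    return sqrt(total / n)
```

In practice the mode sets would be the lowest-frequency modes of each structure, restricted to residues that are aligned in both proteins.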
CARd-3D: Carbon Distribution in 3D Structure Program for Globular Proteins
Ekambaram, Rajasekaran; Kannaiyan, Akila; Marimuthu, Vijayasarathy; Swaminathan, Vinobha Chinnaiah; Renganathan, Senthil; Perumal, Ananda Gopu
2014-01-01
The spatial arrangement of carbon in protein structures is analyzed here. In particular, the carbon fractions around individual atoms are compared, on the expectation that they follow the principle of 31.45% carbon around individual atoms. The results reveal that the atoms of globular proteins follow this principle. A comparative study of monomers versus dimers reveals that carbon is better distributed in the dimeric form than in the monomeric form. A similar study of solid versus liquid structures reveals that the liquid (NMR) structure has a better carbon distribution than the corresponding solid (X-ray) structure. The carbon fraction distributions in fiber and toxin proteins are also compared. Fiber proteins follow the principle of carbon fraction distribution. At the same time, they show a broader spectrum of carbon distribution than globular proteins. The toxin protein follows an abnormal carbon fraction distribution. The carbon fraction distribution plays an important role in deciding the structure and shape of proteins, and it is hoped that it will help in understanding protein folding and function. PMID:24748753
Recent developments in structural proteomics for protein structure determination.
Liu, Hsuan-Liang; Hsu, Jyh-Ping
2005-05-01
The major challenges in structural proteomics include identifying all the proteins on the genome-wide scale, determining their structure-function relationships, and outlining the precise three-dimensional structures of the proteins. Protein structures are typically determined by experimental approaches such as X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. However, the knowledge of three-dimensional space by these techniques is still limited. Thus, computational methods such as comparative and de novo approaches and molecular dynamics simulations are intensively used as alternative tools to predict the three-dimensional structures and dynamic behavior of proteins. This review summarizes recent developments in structural proteomics for protein structure determination, including instrumental methods such as X-ray crystallography and NMR spectroscopy, and computational methods such as comparative and de novo structure prediction and molecular dynamics simulations.
Comparative Protein Structure Modeling Using MODELLER
Webb, Benjamin; Sali, Andrej
2016-01-01
Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. PMID:27322406
A new definition and properties of the similarity value between two protein structures.
Saberi Fathi, S M
2016-10-01
Knowledge regarding the 3D structure of a protein provides useful information about the protein's functional properties. Particularly, structural similarity between proteins can be used as a good predictor of functional similarity. One method that uses the 3D geometrical structure of proteins in order to compare them is the similarity value (SV). In this paper, we introduce a new definition of the SV measure for comparing two proteins. To this end, we consider the mass of the protein's atoms and concentrate on the number of protein's atoms to be compared. This defines a new measure, called the weighted similarity value (WSV), adding physical properties to geometrical properties. We also show that our results are in good agreement with the results obtained by TM-SCORE and DALILITE. WSV can be of use in protein classification and in drug discovery.
3D-SURFER 2.0: web platform for real-time search and characterization of protein surfaces.
Xiong, Yi; Esquivel-Rodriguez, Juan; Sael, Lee; Kihara, Daisuke
2014-01-01
The increasing number of uncharacterized protein structures necessitates the development of computational approaches for function annotation using the protein tertiary structures. Protein structure database search is the basis of any structure-based functional elucidation of proteins. 3D-SURFER is a web platform for real-time protein surface comparison of a given protein structure against the entire PDB using 3D Zernike descriptors. It can smoothly navigate the protein structure space in real-time from one query structure to another. A major new feature of Release 2.0 is the ability to compare the protein surface of a single chain, a single domain, or a single complex against databases of protein chains, domains, complexes, or a combination of all three in the latest PDB. Additionally, two types of protein structures can now be compared: all-atom-surface and backbone-atom-surface. The server can also accept a batch job for a large number of database searches. Pockets in protein surfaces can be identified by VisGrid and LIGSITE(csc). The server is available at http://kiharalab.org/3d-surfer/.
Structure-based barcoding of proteins.
Metri, Rahul; Jerath, Gaurav; Kailas, Govind; Gacche, Nitin; Pal, Adityabarna; Ramakrishnan, Vibin
2014-01-01
A reduced representation in the format of a barcode has been developed to provide an overview of the topological nature of a given protein structure from a 3D coordinate file. The molecular structure of a protein coordinate file from the Protein Data Bank is first expressed in terms of an alpha-numero code and further converted to a barcode image. The barcode representation can be used to compare and contrast different proteins based on their structure. The utility of this method has been exemplified by comparing structural barcodes of proteins that belong to the same fold family, and across different folds. In addition to this, we have attempted to provide an illustration of (i) the structural changes often seen in a given protein molecule upon interaction with ligands and (ii) modifications in the overall topology of a given protein during evolution. The program is fully downloadable from the website http://www.iitg.ac.in/probar/. © 2013 The Protein Society.
Gaia: automated quality assessment of protein structure models.
Kota, Pradeep; Ding, Feng; Ramachandran, Srinivas; Dokholyan, Nikolay V
2011-08-15
Increasing use of structural modeling for understanding structure-function relationships in proteins has led to the need to ensure that the protein models being used are of acceptable quality. Quality of a given protein structure can be assessed by comparing various intrinsic structural properties of the protein to those observed in high-resolution protein structures. In this study, we present tools to compare a given structure to high-resolution crystal structures. We assess packing by calculating the total void volume, the percentage of unsatisfied hydrogen bonds, the number of steric clashes and the scaling of the accessible surface area. We assess covalent geometry by determining bond lengths, angles, dihedrals and rotamers. The statistical parameters for the above measures, obtained from high-resolution crystal structures, enable us to provide a quality-score that points to specific areas where a given protein structural model needs improvement. We provide these tools that appraise protein structures in the form of a web server Gaia (http://chiron.dokhlab.org). Gaia evaluates the packing and covalent geometry of a given protein structure and provides quantitative comparison of the given structure to high-resolution crystal structures. Contact: dokh@unc.edu. Supplementary data are available at Bioinformatics online.
Mao, Xiaoying; Hua, Yufei
2012-01-01
In this study, the composition, structure and functional properties of protein concentrate (WPC) and protein isolate (WPI) produced from defatted walnut flour (DFWF) were investigated. The results showed that the composition and structure of walnut protein concentrate (WPC) and walnut protein isolate (WPI) were significantly different. The molecular weight distribution of WPI was uniform, whereas the protein composition of DFWF and WPC was complex, with protein aggregation. H(0) of WPC was significantly higher (p < 0.05) than those of DFWF and WPI, whilst WPI had a higher H(0) than DFWF. The secondary structure of WPI was similar to that of WPC. WPI showed large, flaky, plate-like structures, whereas WPC appeared as a small, flaky and more compact structure. Most functional properties of WPI were better than those of WPC. Compared with soybean protein concentrate and isolate, WPI and WPC showed higher fat absorption capacity (FAC). The emulsifying and foaming properties of WPC and WPI at alkaline pH were comparable with those of soybean protein concentrate and isolate. Walnut protein concentrates and isolates can be considered as potential functional food ingredients.
Bhasi, Ashwini; Philip, Philge; Manikandan, Vinu; Senapathy, Periannan
2009-01-01
We have developed ExDom, a unique database for the comparative analysis of the exon–intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of exon–intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon-intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon–intron structures of alternative transcripts of a gene correlated to variations in the domain architecture of corresponding protein isoforms. These novel analytical features are highly suited for detailed investigations on the exon–intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. ExDom database is freely accessible at: http://66.170.16.154/ExDom/. PMID:18984624
Conservation of protein structure over four billion years.
Ingles-Prieto, Alvaro; Ibarra-Molero, Beatriz; Delgado-Delgado, Asuncion; Perez-Jimenez, Raul; Fernandez, Julio M; Gaucher, Eric A; Sanchez-Ruiz, Jose M; Gavira, Jose A
2013-09-03
Little is known about the evolution of protein structures and the degree of protein structure conservation over planetary time scales. Here, we report the X-ray crystal structures of seven laboratory resurrections of Precambrian thioredoxins dating up to approximately four billion years ago. Despite considerable sequence differences compared with extant enzymes, the ancestral proteins display the canonical thioredoxin fold, whereas only small structural changes have occurred over four billion years. This remarkable degree of structure conservation since a time near the last common ancestor of life supports a punctuated-equilibrium model of structure evolution in which the generation of new folds occurs over comparatively short periods and is followed by long periods of structural stasis. Copyright © 2013 Elsevier Ltd. All rights reserved.
@TOME-2: a new pipeline for comparative modeling of protein-ligand complexes.
Pons, Jean-Luc; Labesse, Gilles
2009-07-01
@TOME 2.0 is a new web pipeline dedicated to protein structure modeling and small ligand docking based on comparative analyses. @TOME 2.0 allows fold recognition, template selection, structural alignment editing, structure comparisons, 3D-model building and evaluation. These tasks are routinely used in sequence analyses for structure prediction. In our pipeline the necessary software is efficiently interconnected in an original manner to accelerate all the processes. Furthermore, we have also connected comparative docking of small ligands that is performed using protein-protein superposition. The input is a simple protein sequence in one-letter code with no comment. The resulting 3D model, protein-ligand complexes and structural alignments can be visualized through dedicated Web interfaces or can be downloaded for further studies. These original features will aid in the functional annotation of proteins and the selection of templates for molecular modeling and virtual screening. Several examples are described to highlight some of the new functionalities provided by this pipeline. The server and its documentation are freely available at http://abcis.cbs.cnrs.fr/AT2/
Zhou, Ren-Bin; Lu, Hui-Meng; Liu, Jie; Shi, Jian-Yu; Zhu, Jing; Lu, Qin-Qin; Yin, Da-Chuan
2016-01-01
Recombinant expression of proteins has become an indispensable tool in modern day research. The large yields of recombinantly expressed proteins accelerate the structural and functional characterization of proteins. Nevertheless, there are reports in the literature that recombinant proteins show some differences in structure and function compared with the native ones. More than 100,000 structures (from both recombinant and native sources) are now publicly available in the Protein Data Bank (PDB) archive, which makes it possible to investigate whether any proteins in the RCSB PDB archive have identical sequences but different structures. In this paper, we present the results of a systematic comparative study of the 3D structures of identical naturally purified versus recombinantly expressed proteins. The structural data and sequence information of the proteins were mined from the RCSB PDB archive. The combinatorial extension (CE), FATCAT-flexible and TM-Align methods were employed to align the protein structures. The root-mean-square distance (RMSD), TM-score, P-value, Z-score, secondary structural elements and hydrogen bonds were used to assess the structure similarity. A thorough analysis of the PDB archive generated 517 pairs of native and recombinant proteins that have identical sequences. No pairs of proteins had the same sequence and a significantly different structural fold, which supports the hypothesis that proteins expressed in a heterologous host usually fold correctly into their native forms. PMID:27517583
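Two of the similarity measures listed above, RMSD and TM-score, can be illustrated with a minimal sketch. It assumes the structures have already been aligned and superimposed, so that the input is simply a list of distances (in Å) between aligned residue pairs. Real tools such as TM-align also search for the superposition that maximises the score, which is omitted here. The function names and the 0.5 Å lower bound on d0 are assumptions of this example.

```python
from math import sqrt


def rmsd(distances):
    """Root-mean-square distance over aligned residue pairs."""
    return sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(distances, target_length):
    """TM-score for a fixed superposition.

    TM = (1/L) * sum_i 1 / (1 + (d_i / d0)^2),
    d0 = 1.24 * (L - 15)^(1/3) - 1.8, with L the target length.
    The sum runs over aligned pairs only, so unaligned residues lower
    the score. d0 is clamped to 0.5 (an assumption for short chains).
    """
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    d0 = max(d0, 0.5)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length
```

Unlike RMSD, the TM-score is normalised by the target length and down-weights large deviations, so a few badly placed residues affect it less.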
Cao, Hu; Lu, Yonggang
2017-01-01
With the rapid growth of known protein 3D structures in number, how to efficiently compare protein structures becomes an essential and challenging problem in computational structural biology. At present, many protein structure alignment methods have been developed. Among all these methods, flexible structure alignment methods are shown to be superior to rigid structure alignment methods in identifying structure similarities between proteins, which have gone through conformational changes. It is also found that the methods based on aligned fragment pairs (AFPs) have a special advantage over other approaches in balancing global structure similarities and local structure similarities. Accordingly, we propose a new flexible protein structure alignment method based on variable-length AFPs. Compared with other methods, the proposed method possesses three main advantages. First, it is based on variable-length AFPs. The length of each AFP is separately determined to maximally represent a local similar structure fragment, which reduces the number of AFPs. Second, it uses local coordinate systems, which simplify the computation at each step of the expansion of AFPs during the AFP identification. Third, it decreases the number of twists by rewarding the situation where nonconsecutive AFPs share the same transformation in the alignment, which is realized by dynamic programming with an improved transition function. The experimental data show that compared with FlexProt, FATCAT, and FlexSnap, the proposed method can achieve comparable results by introducing fewer twists. Meanwhile, it can generate results similar to those of the FATCAT method in much less running time due to the reduced number of AFPs.
Using linear algebra for protein structural comparison and classification.
Gomide, Janaína; Melo-Minardi, Raquel; Dos Santos, Marcos Augusto; Neshich, Goran; Meira, Wagner; Lopes, Júlio César; Santoro, Marcelo
2009-07-01
In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in. PMID:21637532
Kinjo, Akira R; Nakamura, Haruki
2013-01-01
Protein functions are mediated by interactions between proteins and other molecules. One useful approach to analyze protein functions is to compare and classify the structures of interaction interfaces of proteins. Here, we describe the procedures for compiling a database of interface structures and efficiently comparing the interface structures. To do so requires a good understanding of the data structures of the Protein Data Bank (PDB). Therefore, we also provide a detailed account of the PDB exchange dictionary necessary for extracting data that are relevant for analyzing interaction interfaces and secondary structures. We identify recurring structural motifs by classifying similar interface structures, and we define a coarse-grained representation of supersecondary structures (SSS) which represents a sequence of two or three secondary structure elements including their relative orientations as a string of four to seven letters. By examining the correspondence between structural motifs and SSS strings, we show that no SSS string has particularly high propensity to be found in interaction interfaces in general, indicating any SSS can be used as a binding interface. When individual structural motifs are examined, there are some SSS strings that have high propensity for particular groups of structural motifs. In addition, it is shown that while the SSS strings found in particular structural motifs for nonpolymer and protein interfaces are as abundant as in other structural motifs that belong to the same subunit, structural motifs for nucleic acid interfaces exhibit somewhat stronger preference for SSS strings. In regard to protein folds, many motif-specific SSS strings were found across many folds, suggesting that SSS may be a useful description to investigate the universality of ligand binding modes.
MODBASE, a database of annotated comparative protein structure models
Pieper, Ursula; Eswar, Narayanan; Stuart, Ashley C.; Ilyin, Valentin A.; Sali, Andrej
2002-01-01
MODBASE (http://guitar.rockefeller.edu/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on PSI-BLAST, IMPALA and MODELLER. MODBASE uses the MySQL relational database management system for flexible and efficient querying, and the MODVIEW Netscape plugin for viewing and manipulating multiple sequences and structures. It is updated regularly to reflect the growth of the protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different datasets. The largest dataset contains models for domains in 304 517 out of 539 171 unique protein sequences in the complete TrEMBL database (23 March 2001); only models based on significant alignments (PSI-BLAST E-value < 10^-4) and models assessed to have the correct fold are included. Other datasets include models for target selection and structure-based annotation by the New York Structural Genomics Research Consortium, models for prediction of genes in the Drosophila melanogaster genome, models for structure determination of several ribosomal particles and models calculated by the MODWEB comparative modeling web server. PMID:11752309
Alsenaidy, Mohammad A.; Jain, Nishant K.; Kim, Jae H.; Middaugh, C. Russell; Volkin, David B.
2014-01-01
In this review, some of the challenges and opportunities encountered during protein comparability assessments are summarized with an emphasis on developing new analytical approaches to better monitor higher-order protein structures. Several case studies are presented using high throughput biophysical methods to collect protein physical stability data as a function of temperature, agitation, ionic strength and/or solution pH. These large data sets were then used to construct empirical phase diagrams (EPDs), radar charts, and comparative signature diagrams (CSDs) for data visualization and structural comparisons between the different proteins. Protein samples with different sizes, post-translational modifications, and inherent stability are presented: acidic fibroblast growth factor (FGF-1) mutants, different glycoforms of an IgG1 mAb prepared by deglycosylation, as well as comparisons of different formulations of an IgG1 mAb and granulocyte colony stimulating factor (GCSF). Using this approach, differences in structural integrity and conformational stability profiles were detected under stress conditions that could not be resolved by using the same techniques under ambient conditions (i.e., no stress). Thus, an evaluation of conformational stability differences may serve as an effective surrogate to monitor differences in higher-order structure between protein samples. These case studies are discussed in the context of potential utility in protein comparability studies. PMID:24659968
Fan, Ming; Zheng, Bin; Li, Lihua
2015-10-01
Knowledge of the structural class of a given protein is important for understanding its folding patterns. Despite considerable effort, predicting protein structural class solely from protein sequences remains a challenging problem. Feature extraction and classification are the main problems in prediction. In this research, we extended our earlier work on these two aspects. For protein feature extraction, we proposed a scheme that calculates the word frequency and word position from sequences of amino acids, reduced amino acids, and secondary structure. For accurate classification of protein structural class, we developed a novel Multi-Agent Ada-Boost (MA-Ada) method by integrating the features of a Multi-Agent system into the Ada-Boost algorithm. Extensive experiments were conducted to test and compare the proposed method using four low-homology benchmark datasets. The results showed classification accuracies of 88.5%, 96.0%, 88.4%, and 85.5%, respectively, which are much better than those of existing methods. The source code and dataset are available on request.
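The abstract above mentions word-frequency and word-position features without defining them. The following is a minimal sketch of one plausible reading, assuming a "word" is an overlapping k-letter substring and "position" is the mean normalized start index of that word. The function names and the choice of k are hypothetical and are not taken from the paper.

```python
from collections import Counter


def word_frequencies(sequence, k=2):
    """Relative frequency of each overlapping k-letter word in a sequence."""
    words = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    if not words:
        return {}
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


def mean_word_positions(sequence, k=2):
    """Mean normalized start position (0.0 to 1.0) of each k-letter word."""
    n_words = len(sequence) - k + 1
    positions = {}
    for i in range(n_words):
        # Normalize the start index so sequences of different length compare.
        normalized = i / max(n_words - 1, 1)
        positions.setdefault(sequence[i:i + k], []).append(normalized)
    return {word: sum(p) / len(p) for word, p in positions.items()}


if __name__ == "__main__":
    # Toy sequence; the same functions could be applied to a reduced
    # amino acid alphabet or a secondary structure string.
    print(word_frequencies("AAAB"))
    print(mean_word_positions("AAAB"))
```

Feature vectors built this way could then be passed to any boosting classifier; the MA-Ada method itself is not reproduced here.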
Protein Structure Determination using Metagenome sequence data
Ovchinnikov, Sergey; Park, Hahnbeom; Varghese, Neha; Huang, Po-Ssu; Pavlopoulos, Georgios A.; Kim, David E.; Kamisetty, Hetunandan; Kyrpides, Nikos C.; Baker, David
2017-01-01
Despite decades of work by structural biologists, there are still ~5200 protein families with unknown structure outside the range of comparative modeling. We show that Rosetta structure prediction guided by residue-residue contacts inferred from evolutionary information can accurately model proteins that belong to large families, and that metagenome sequence data more than triples the number of protein families with sufficient sequences for accurate modeling. We then integrate metagenome data, contact based structure matching and Rosetta structure calculations to generate models for 614 protein families with currently unknown structures; 206 are membrane proteins and 137 have folds not represented in the PDB. This approach provides the representative models for large protein families originally envisioned as the goal of the protein structure initiative at a fraction of the cost. PMID:28104891
Holm, Liisa; Laakso, Laura M
2016-07-08
The Dali server (http://ekhidna2.biocenter.helsinki.fi/dali) is a network service for comparing protein structures in 3D. In favourable cases, comparing 3D structures may reveal biologically interesting similarities that are not detectable by comparing sequences. The Dali server has been running in various places for over 20 years and is used routinely by crystallographers on newly solved structures. The latest update of the server provides enhanced analytics for the study of sequence and structure conservation. The server performs three types of structure comparisons: (i) Protein Data Bank (PDB) search compares one query structure against those in the PDB and returns a list of similar structures; (ii) pairwise comparison compares one query structure against a list of structures specified by the user; and (iii) all against all structure comparison returns a structural similarity matrix, a dendrogram and a multidimensional scaling projection of a set of structures specified by the user. Structural superimpositions are visualized using the Java-free WebGL viewer PV. The structural alignment view is enhanced by sequence similarity searches against Uniprot. The combined structure-sequence alignment information is compressed to a stack of aligned sequence logos. In the stack, each structure is structurally aligned to the query protein and represented by a sequence logo. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Local-global alignment for finding 3D similarities in protein structures
Zemla, Adam T [Brentwood, CA]
2011-09-20
A method of finding 3D similarities in protein structures of a first molecule and a second molecule. The method comprises providing preselected information regarding the first molecule and the second molecule. Comparing the first molecule and the second molecule using Longest Continuous Segments (LCS) analysis. Comparing the first molecule and the second molecule using Global Distance Test (GDT) analysis. Comparing the first molecule and the second molecule using Local Global Alignment Scoring function (LGA_S) analysis. Verifying constructed alignment and repeating the steps to find the regions of 3D similarities in protein structures.
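As a rough illustration of the Global Distance Test step named above, the sketch below computes a GDT_TS-style score from per-residue distances. It assumes the two structures are already superimposed and that residues are in one-to-one correspondence. The 1, 2, 4 and 8 Å cutoffs are the values conventionally used in CASP-style evaluation and are an assumption here, since the abstract does not state them. A full GDT procedure searches over many superpositions to maximize each fraction, which this sketch does not attempt.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-style score (0 to 100) for one fixed superposition.

    distances: per-residue distances in angstroms between equivalent
    atoms (typically C-alpha) of the two structures.
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    n = len(distances)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= cutoff for d in distances) / n for cutoff in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    # Hypothetical distances for a four-residue toy example.
    print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
```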
Knowledge-based model building of proteins: concepts and examples.
Bajorath, J.; Stenkamp, R.; Aruffo, A.
1993-01-01
We describe how to build protein models from structural templates. Methods to identify structural similarities between proteins in cases of significant, moderate to low, or virtually absent sequence similarity are discussed. The detection and evaluation of structural relationships is emphasized as a central aspect of protein modeling, distinct from the more technical aspects of model building. Computational techniques to generate and complement comparative protein models are also reviewed. Two examples, P-selectin and gp39, are presented to illustrate the derivation of protein model structures and their use in experimental studies. PMID:7505680
Maugini, Elisa; Tronelli, Daniele; Bossa, Francesco; Pascarella, Stefano
2009-04-01
Enzymes from thermophilic and, particularly, from hyperthermophilic organisms are surprisingly stable. Understanding the molecular origin of protein thermostability and thermoactivity has attracted the interest of many scientists, both for the comprehension of the principles of protein structure and for possible biotechnological applications through protein engineering. Comparative studies at sequence and structure levels were aimed at detecting significant differences of structural parameters related to protein stability between thermophilic and hyperthermophilic structures and their mesophilic homologs. Comparative studies were useful in the identification of a few recurrent themes which evolution utilized in different combinations in different protein families. These studies were mostly carried out at the monomer level. However, maintenance of a proper quaternary structure is an essential prerequisite for a functional macromolecule. At the environmental temperatures typically experienced by hyper- and thermophiles, the subunit interactions mediated by the interface must be sufficiently stable. Our analysis was therefore aimed at the identification of the molecular strategies adopted by evolution to enhance interface thermostability of oligomeric enzymes. The variation of several structural properties related to protein stability was tested at the subunit interfaces of thermophilic and hyperthermophilic oligomers. The differences of the interface structural features observed between the hyperthermophilic and thermophilic enzymes were compared with the differences of the same properties calculated from pairwise comparisons of oligomeric mesophilic proteins contained in a reference dataset. The significance of the observed differences of structural properties was measured by a t-test. Ion pairs and hydrogen bonds do not vary significantly, while hydrophobic contact area increases, especially in hyperthermophilic interfaces. Interface compactness also appears to increase in the hyperthermophilic proteins. Variations of amino acid composition at the interfaces reflect the variation of the interface properties.
Pritchard, Caroline; O'Connor, Gavin; Ashcroft, Alison E
2013-08-06
To achieve comparability of measurement results of protein amount of substance content between clinical laboratories, suitable reference materials are required. The impact on measurement comparability of potential differences in the tertiary and quaternary structure of protein reference standards is as yet not well understood. With the use of human growth hormone as a model protein, the potential of ion mobility spectrometry-mass spectrometry as a tool to assess differences in the structure of protein reference materials and their interactions with antibodies has been investigated here.
Template-based structure modeling of protein-protein interactions
Szilagyi, Andras; Zhang, Yang
2014-01-01
The structure of protein-protein complexes can be constructed by using the known structure of other protein complexes as a template. The complex structure templates are generally detected either by homology-based sequence alignments or, given the structure of monomer components, by structure-based comparisons. Critical improvements have been made in recent years by utilizing interface recognition and by recombining monomer and complex template libraries. Encouraging progress has also been witnessed in genome-wide applications of template-based modeling, with modeling accuracy comparable to high-throughput experimental data. Nevertheless, bottlenecks exist due to the incompleteness of the protein-protein complex structure library and the lack of methods for distant homologous template identification and full-length complex structure refinement. PMID:24721449
Modeling Structure and Dynamics of Protein Complexes with SAXS Profiles
Schneidman-Duhovny, Dina; Hammel, Michal
2018-01-01
Small-angle X-ray scattering (SAXS) is an increasingly common and useful technique for structural characterization of molecules in solution. A SAXS experiment determines the scattering intensity of a molecule as a function of spatial frequency, termed SAXS profile. SAXS profiles can be utilized in a variety of molecular modeling applications, such as comparing solution and crystal structures, structural characterization of flexible proteins, assembly of multi-protein complexes, and modeling of missing regions in the high-resolution structure. Here, we describe protocols for modeling atomic structures based on SAXS profiles. The first protocol is for comparing solution and crystal structures including modeling of missing regions and determination of the oligomeric state. The second protocol performs multi-state modeling by finding a set of conformations and their weights that fit the SAXS profile starting from a single-input structure. The third protocol is for protein-protein docking based on the SAXS profile of the complex. We describe the underlying software, followed by demonstrating their application on interleukin 33 (IL33) with its primary receptor ST2 and DNA ligase IV-XRCC4 complex. PMID:29605933
General overview on structure prediction of twilight-zone proteins.
Khor, Bee Yin; Tye, Gee Jun; Lim, Theam Soon; Choong, Yee Siew
2015-09-04
Protein structure prediction from amino acid sequence has been one of the most challenging aspects in computational structural biology despite significant progress in recent years shown by critical assessment of protein structure prediction (CASP) experiments. When experimentally determined structures are unavailable, the predicted structures may serve as starting points to study a protein. If the target protein contains a homologous region, a high-resolution (typically <1.5 Å) model can be built via comparative modelling. However, when confronted with low sequence similarity of the target protein (also known as a twilight-zone protein, for which sequence identity with available templates is less than 30%), the protein structure prediction has to be initiated from scratch. Traditionally, twilight-zone proteins can be predicted via threading or ab initio methods. Based on the current trend, a combination of different methods brings improved success in the prediction of twilight-zone proteins. In this mini review, the methods, progress and challenges for the prediction of twilight-zone proteins are discussed.
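The 30% sequence identity threshold quoted above can be checked on a pairwise alignment. The sketch below is a minimal illustration under one stated assumption: identity is computed over columns where neither sequence has a gap. Other conventions exist (for example, dividing by the length of the shorter sequence), and the abstract does not say which one is meant. The function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned (non-gap) columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


def in_twilight_zone(aligned_a, aligned_b, threshold=30.0):
    """True if target-template identity falls below the threshold (percent)."""
    return percent_identity(aligned_a, aligned_b) < threshold


if __name__ == "__main__":
    # Toy alignment; real input would come from an alignment program.
    target = "ACDE-G"
    template = "ACQEFG"
    print(percent_identity(target, template), in_twilight_zone(target, template))
```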
Yu, Clinton; Huszagh, Alexander; Viner, Rosa; Novitsky, Eric J; Rychnovsky, Scott D; Huang, Lan
2016-10-18
Cross-linking mass spectrometry (XL-MS) represents a recently popularized hybrid methodology for defining protein-protein interactions (PPIs) and analyzing structures of large protein assemblies. In particular, XL-MS strategies have been demonstrated to be effective in elucidating molecular details of PPIs at the peptide resolution, providing a complementary set of structural data that can be utilized to refine existing complex structures or direct de novo modeling of unknown protein structures. To study structural and interaction dynamics of protein complexes, quantitative cross-linking mass spectrometry (QXL-MS) strategies based on isotope-labeled cross-linkers have been developed. Although successful, these approaches are mostly limited to pairwise comparisons. In order to establish a robust workflow enabling comparative analysis of multiple cross-linked samples simultaneously, we have developed a multiplexed QXL-MS strategy, namely, QMIX (Quantitation of Multiplexed, Isobaric-labeled cross (X)-linked peptides) by integrating MS-cleavable cross-linkers with isobaric labeling reagents. This study has established a new analytical platform for quantitative analysis of cross-linked peptides, which can be directly applied for multiplexed comparisons of the conformational dynamics of protein complexes and PPIs at the proteome scale in future studies.
Relationships between residue Voronoi volume and sequence conservation in proteins.
Liu, Jen-Wei; Cheng, Chih-Wen; Lin, Yu-Feng; Chen, Shao-Yu; Hwang, Jenn-Kang; Yen, Shih-Chung
2018-02-01
Functional and biophysical constraints can cause different levels of sequence conservation in proteins. Previously, structural properties, e.g., relative solvent accessibility (RSA) and packing density of the weighted contact number (WCN), have been found to be related to protein sequence conservation (CS). The Voronoi volume has recently been recognized as a new structural property of the local protein structural environment reflecting CS. However, for surface residues, it is sensitive to water molecules surrounding the protein structure. Herein, we present a simple structural determinant termed the relative space of Voronoi volume (RSV); it uses the Voronoi volume and the van der Waals volume of particular residues to quantify the local structural environment. RSV (range, 0-1) is defined as (Voronoi volume-van der Waals volume)/Voronoi volume of the target residue. The concept of RSV describes the extent of available space for every protein residue. RSV and Voronoi profiles with and without water molecules (RSVw, RSV, VOw, and VO) were compared for 554 non-homologous proteins. RSV (without water) showed better Pearson's correlations with CS than did RSVw, VO, or VOw values. The mean correlation coefficient between RSV and CS was 0.51, which is comparable to the correlation between RSA and CS (0.49) and that between WCN and CS (0.56). RSV is a robust structural descriptor with and without water molecules and can quantitatively reflect evolutionary information in a single protein structure. Therefore, it may represent a practical structural determinant to study protein sequence, structure, and function relationships. Copyright © 2017 Elsevier B.V. All rights reserved.
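The RSV definition given above translates directly into code. The sketch below assumes both volumes are supplied in the same units (for example, cubic angstroms) by an upstream Voronoi tessellation tool that is not shown. The function name and the example volumes are hypothetical.

```python
def relative_space_of_voronoi(voronoi_volume, vdw_volume):
    """RSV = (Voronoi volume - van der Waals volume) / Voronoi volume."""
    if voronoi_volume <= 0:
        raise ValueError("Voronoi volume must be positive")
    return (voronoi_volume - vdw_volume) / voronoi_volume


if __name__ == "__main__":
    # Hypothetical per-residue (Voronoi, van der Waals) volumes.
    residue_volumes = [(200.0, 150.0), (120.0, 108.0)]
    for voronoi, vdw in residue_volumes:
        print(relative_space_of_voronoi(voronoi, vdw))
```

An RSV near 0 means the residue nearly fills its Voronoi cell, while a value near 1 means most of the cell is free space.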
Peng, Quanhui; Khan, Nazir A; Wang, Zhisheng; Yu, Peiqiang
2014-01-01
The objectives of the present study were to investigate the nutritive value of camelina seeds (Camelina sativa L. Crantz) in ruminant nutrition and to use molecular spectroscopy as a novel technique to quantify the heat-induced changes in protein molecular structures in relation to protein digestive behavior in the rumen and intestine of dairy cattle. In this study, camelina seeds were used as a model for feed protein. The seeds were kept as raw (control) or heated in an autoclave (moist heating) or in an air-draft oven (dry heating) at 120°C for 60 min. The parameters evaluated were (1) chemical profiles, (2) Cornell Net Protein and Carbohydrate System protein subfractions, (3) nutrient digestibilities and estimated energy values, (4) in situ rumen degradation and intestinal digestibility, and (5) protein molecular structures. Compared with raw seeds, moist heating markedly decreased (52.73 to 20.41%) the content of soluble protein and increased (2.00 to 9.01%) the content of neutral detergent insoluble protein in total crude protein (CP). Subsequently, the rapidly degradable Cornell Net Protein and Carbohydrate System CP fraction markedly decreased (45.06 to 16.69% CP), with a concomitant increase in the intermediately degradable (45.28 to 74.02% CP) and slowly degradable (1.13 to 8.02% CP) fractions, demonstrating a decrease in overall protein degradability in the rumen. The in situ rumen incubation study revealed that moist heating decreased (75.45 to 57.92%) rumen-degradable protein and increased (43.90 to 82.95%) intestinal digestibility of rumen-undegradable protein. The molecular spectroscopy study revealed that moist heating increased the amide I-to-amide II ratio and decreased α-helix and α-helix-to-β-sheet ratio. In contrast, dry heating did not significantly change CP solubility, rumen degradability, intestinal digestibility, and protein molecular structures compared with the raw seeds. 
Our results indicated that, compared with dry heating, moist heating markedly changed protein chemical profiles, protein subfractions, rumen protein degradability, and intestinal digestibility, which were associated with changes in protein molecular structures (amide I-to-amide II ratio and α-helix-to-β-sheet ratio). Moist heating improved the nutritive value and utilization of protein in camelina seeds compared with dry heating. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Mallik, Saurav; Kundu, Sudip
2013-01-01
Here we compare the structural and evolutionary attributes of Thermus thermophilus and Escherichia coli small ribosomal subunits (SSU). Our results indicate that, with few exceptions, thermophilic 16S ribosomal RNA (16S rRNA) is densely packed compared to its mesophilic counterpart at most of the analogous spatial regions. In addition, we have located species-specific cavity clusters (SSCCs) in both species. E. coli SSCCs are more numerous and larger than T. thermophilus SSCCs, which again indicates densely packed thermophilic 16S rRNA. Thermophilic ribosomal proteins (r-proteins) have longer disordered regions than their mesophilic homologs and they experience larger disorder-to-order transitions during SSU-assembly. This is reflected in the predicted higher conformational changes of thermophilic r-proteins compared to their mesophilic homologs during SSU-assembly. This high conformational change of thermophilic r-proteins may help them to associate with the 16S ribosomal RNA with highly complementary interfaces, larger interface areas, and denser molecular contacts, compared to those of the mesophilic r-proteins. Thus, thermophilic protein-rRNA interfaces are more tightly associated with 16S rRNA than their mesophilic homologs. Densely packed 16S rRNA interior and tight protein-rRNA binding of T. thermophilus (compared to those of E. coli) are likely the signatures of its thermal adaptation. We have found a linear correlation between the free energy of protein-RNA interface formation, interface size, and square of conformational changes, which is followed in both prokaryotic and eukaryotic SSU. Disorder is associated with high protein-RNA interface polarity. We have found an evolutionary tendency to maintain higher polarity (and thereby disorder) at protein-rRNA interfaces than in the rest of the protein structures. However, some proteins exhibit exceptions to this general trend. PMID:23940533
Nema, Vijay; Pal, Sudhir Kumar
2013-01-01
This study was conducted to find the best-suited freely available software for protein modelling by taking a few sample proteins. The proteins used ranged from small to large in size and had available crystal structures for the purpose of benchmarking. Key players like Phyre2, Swiss-Model, CPHmodels-3.0, Homer, (PS)2, (PS)(2)-V(2), Modweb were used for the comparison and model generation. The benchmarking process was carried out for four proteins, Icl, InhA, and KatG of Mycobacterium tuberculosis and RpoB of Thermus thermophilus, to find the most suitable software. Parameters compared during analysis gave relatively better values for Phyre2 and Swiss-Model. This comparative study indicated that Phyre2 and Swiss-Model make good models of small and large proteins as compared to the other screened software. The other software was also good but is often not very efficient in providing full-length and properly folded structures.
Aftab, D T; Ballas, L M; Loomis, C R; Hait, W N
1991-11-01
Phenothiazines are known to inhibit the activity of protein kinase C. To identify structural features that determine inhibitory activity against the enzyme, we utilized a semiautomated assay [Anal. Biochem. 187:84-88 (1990)] to compare the potency of greater than 50 phenothiazines and related compounds. Potency was decreased by trifluoro substitution at position 2 on the phenothiazine nucleus and increased by quinoid structures on the nucleus. An alkyl bridge of at least three carbons connecting the terminal amine to the nucleus was required for activity. Primary amines and unsubstituted piperazines were the most potent amino side chains. We selected 7,8-dihydroxychlorpromazine (DHCP) (IC50 = 8.3 microM) and 2-chloro-9-(3-[1-piperazinyl]propylidene)thioxanthene (N751) (IC50 = 14 microM) for further study because of their potency and distinct structural features. Under standard (vesicle) assay conditions, DHCP was noncompetitive with respect to phosphatidylserine and a mixed-type inhibitor with respect to ATP. N751 was competitive with respect to phosphatidylserine and noncompetitive with respect to ATP. Using the mixed micelle assay, DHCP was a competitive inhibitor with respect to both phosphatidylserine and ATP. DHCP was selective for protein kinase C compared with cAMP-dependent protein kinase, calmodulin-dependent protein kinase type II, and casein kinase. N751 was more potent against protein kinase C compared with cAMP-dependent protein kinase and casein kinase but less potent against protein kinase C compared with calmodulin-dependent protein kinase type II. DHCP was analyzed for its ability to inhibit different isoenzymes of protein kinase C, and no significant isozyme selectivity was detected. These data provide important information for the rational design of more potent and selective inhibitors of protein kinase C.
Biological and functional relevance of CASP predictions.
Liu, Tianyun; Ish-Shalom, Shirbi; Torng, Wen; Lafita, Aleix; Bock, Christian; Mort, Matthew; Cooper, David N; Bliven, Spencer; Capitani, Guido; Mooney, Sean D; Altman, Russ B
2018-03-01
Our goal is to answer the question: compared with experimental structures, how useful are predicted models for functional annotation? We assessed the functional utility of predicted models by comparing the performances of a suite of methods for functional characterization on the predictions and the experimental structures. We identified 28 sites in 25 protein targets to perform functional assessment. These 28 sites included nine sites with known ligand binding (holo-sites), nine sites that are expected or suggested by experimental authors for small molecule binding (apo-sites), and ten sites containing important motifs, loops, or key residues with important disease-associated mutations. We evaluated the utility of the predictions by comparing their microenvironments to the experimental structures. Overall structural quality correlates with functional utility. However, the best-ranked predictions (global) may not have the best functional quality (local). Our assessment provides an ability to discriminate between predictions with high structural quality. When assessing ligand-binding sites, most prediction methods have higher performance on apo-sites than holo-sites. Some servers show consistently high performance for certain types of functional sites. Finally, many functional sites are associated with protein-protein interaction. We also analyzed biologically relevant features from the protein assemblies of two targets where the active site spanned the protein-protein interface. For the assembly targets, we find that the features in the models are mainly determined by the choice of template. © 2017 The Authors Proteins: Structure, Function and Bioinformatics Published by Wiley Periodicals, Inc.
Planchard, Noelya; Point, Élodie; Dahmane, Tassadite; Giusti, Fabrice; Renault, Marie; Le Bon, Christel; Durand, Grégory; Milon, Alain; Guittet, Éric; Zoonens, Manuela; Popot, Jean-Luc; Catoire, Laurent J
2014-10-01
Solution-state nuclear magnetic resonance studies of membrane proteins are facilitated by the increased stability that trapping with amphipols confers to most of them as compared to detergent solutions. They have yielded information on the state of folding of the proteins, their areas of contact with the polymer, their dynamics, water accessibility, and the structure of protein-bound ligands. They benefit from the diversification of amphipol chemical structures and the availability of deuterated amphipols. The advantages and constraints of working with amphipols are discussed and compared to those associated with other non-conventional environments, such as bicelles and nanodiscs.
Kato, Koichi; Nakayoshi, Tomoki; Fukuyoshi, Shuichi; Kurimoto, Eiji; Oda, Akifumi
2017-10-12
Although various higher-order protein structure prediction methods have been developed, almost all of them were developed based on the three-dimensional (3D) structure information of known proteins. Here we predicted short protein structures by molecular dynamics (MD) simulations in which only Newton's equations of motion were used and 3D structural information of known proteins was not required. To evaluate the ability of MD simulation to predict protein structures, we calculated seven short test proteins (10-46 residues) in the denatured state and compared their predicted and experimental structures. The predicted structure for Trp-cage (20 residues) was close to the experimental structure after a 200-ns MD simulation. For proteins shorter or longer than Trp-cage, root-mean-square deviation values were larger than those for Trp-cage. However, secondary structures could be reproduced by MD simulations for proteins with 10-34 residues. Simulations by replica exchange MD were performed, but the results were similar to those from normal MD simulations. These results suggest that normal MD simulations can roughly predict short protein structures and that 200-ns simulations are frequently sufficient for estimating the secondary structures of proteins (approximately 20 residues). Structural prediction methods using only fundamental physical laws are useful for investigating non-natural proteins, such as primitive proteins and artificial proteins for peptide-based drug delivery systems.
Scavuzzo-Duggan, Tess R.; Chaves, Arielle M.; Roberts, Alison W.
2015-07-14
Here, a method for rapid in vivo functional analysis of engineered proteins was developed using Physcomitrella patens. A complementation assay was designed for testing structure/function relationships in cellulose synthase (CESA) proteins. The components of the assay include (1) construction of test vectors that drive expression of epitope-tagged PpCESA5 carrying engineered mutations, (2) transformation of a ppcesa5 knockout line that fails to produce gametophores with test and control vectors, (3) scoring the stable transformants for gametophore production, (4) statistical analysis comparing complementation rates for test vectors to positive and negative control vectors, and (5) analysis of transgenic protein expression by Western blotting. The assay distinguished mutations that generate fully functional, nonfunctional, and partially functional proteins. In conclusion, compared with existing methods for in vivo testing of protein function, this complementation assay provides a rapid method for investigating protein structure/function relationships in plants.
Mallik, Saurav; Kundu, Sudip
2015-01-01
Using the available crystal structures of 50S ribosomal subunits from three prokaryotic species: Escherichia coli (mesophilic), Thermus thermophilus (thermophilic), and Haloarcula marismortui (halophilic), we have analyzed different structural features of ribosomal RNAs (rRNAs), proteins, and of their interfaces. We have correlated these structural features with the environmental adaptation strategies of the corresponding species. While dense intra-rRNA packing is observed in the thermophile, loose intra-rRNA packing is observed in the halophile (both compared to the mesophile). Interestingly, protein-rRNA interfaces of both extremophiles are densely packed compared to those of the mesophile. The intersubunit bridge regions are almost devoid of cavities, probably ensuring the proper formation of each bridge (by not allowing any loosely packed region nearby). During rRNA binding, the ribosomal proteins experience some structural transitions. Here, we have analyzed the intrinsically disordered and ordered regions of the ribosomal proteins, which are subjected to such transitions. The intrinsically disordered and disorder-to-order transition sites of the thermophilic and mesophilic ribosomal proteins are simultaneously (i) highly conserved and (ii) slowly evolving compared to the rest of the protein structure. Although high conservation is observed at such sites of halophilic ribosomal proteins, a slow rate of evolution is absent. Such differences between the thermophilic, mesophilic, and halophilic species can be explained by their environmental adaptation strategies. Interestingly, a universal biophysical principle, evident as a linear relationship between the free energy of interface formation, interface area, and structural changes of r-proteins during assembly, is always maintained, irrespective of the environmental conditions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Raymond, Amy; Lovell, Scott; Lorimer, Don
2009-12-01
With the goal of improving yield and success rates of heterologous protein production for structural studies we have developed the database and algorithm software package Gene Composer. This freely available electronic tool facilitates the information-rich design of protein constructs and their engineered synthetic gene sequences, as detailed in the accompanying manuscript. In this report, we compare heterologous protein expression levels from native sequences to that of codon engineered synthetic gene constructs designed by Gene Composer. A test set of proteins including a human kinase (P38α), viral polymerase (HCV NS5B), and bacterial structural protein (FtsZ) were expressed in both E. coli and a cell-free wheat germ translation system. We also compare the protein expression levels in E. coli for a set of 11 different proteins with greatly varied G:C content and codon bias. The results consistently demonstrate that protein yields from codon engineered Gene Composer designs are as good as or better than those achieved from the synonymous native genes. Moreover, structure guided N- and C-terminal deletion constructs designed with the aid of Gene Composer can lead to greater success in gene to structure work as exemplified by the X-ray crystallographic structure determination of FtsZ from Bacillus subtilis. These results validate the Gene Composer algorithms, and suggest that using a combination of synthetic gene and protein construct engineering tools can improve the economics of gene to structure research.
Rysavy, Steven J; Beck, David A C; Daggett, Valerie
2014-11-01
Protein function is intimately linked to protein structure and dynamics, yet experimentally determined structures frequently omit regions within a protein due to indeterminate data, which is often due to protein dynamics. We propose that atomistic molecular dynamics simulations provide a diverse sampling of biologically relevant structures for these missing segments (and beyond) to improve structural modeling and structure prediction. Here we make use of the Dynameomics data warehouse, which contains simulations of representatives of essentially all known protein folds. We developed novel computational methods to efficiently identify, rank and retrieve small peptide structures, or fragments, from this database. We also created a novel data model to analyze and compare large repositories of structural data, such as those contained within the Protein Data Bank and the Dynameomics data warehouse. Our evaluation compares these structural repositories for improving loop predictions and analyzes the utility of our methods and models. Using a standard set of loop structures, containing 510 loops, 30 for each loop length from 4 to 20 residues, we find that the inclusion of Dynameomics structures in fragment-based methods improves the quality of the loop predictions without being dependent on sequence homology. Depending on loop length, ∼25-75% of the best predictions came from the Dynameomics set, resulting in lower main chain root-mean-square deviations for all fragment lengths using the combined fragment library. We also provide specific cases where Dynameomics fragments provide better predictions for NMR loop structures than fragments from crystal structures. Online access to these fragment libraries is available at http://www.dynameomics.org/fragments. © 2014 The Protein Society.
Discriminative structural approaches for enzyme active-site prediction.
Kato, Tsuyoshi; Nagano, Nozomi
2011-02-15
Predicting enzyme active-sites in proteins is an important issue not only for protein sciences but also for a variety of practical applications such as drug design. Because enzyme reaction mechanisms are based on the local structures of enzyme active-sites, various template-based methods that compare local structures in proteins have been developed to date. In comparing such local sites, a simple measurement, RMSD, has been used so far. This paper introduces new machine learning algorithms that refine the similarity/deviation for comparison of local structures. The similarity/deviation is applied to two types of applications, single template analysis and multiple template analysis. In the single template analysis, a single template is used as a query to search proteins for active sites, whereas a protein structure is examined as a query to discover the possible active-sites using a set of templates in the multiple template analysis. This paper experimentally illustrates that the machine learning algorithms effectively improve the similarity/deviation measurements for both the analyses.
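The baseline RMSD measure that the abstract above refers to can be illustrated with a minimal sketch. It assumes the two local sites have already been superposed and that their atoms are listed in matching order; the function name `rmsd` is hypothetical, and the learned similarity/deviation proposed in the paper is not reproduced here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes both sites are already superposed and atom i in coords_a
    corresponds to atom i in coords_b (an assumption of this sketch).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between corresponding atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

In a template search of the kind described, a candidate site would typically be accepted when this value falls below some chosen threshold; the threshold itself is application-specific and is not given in the abstract.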
Hensen, Ulf; Meyer, Tim; Haas, Jürgen; Rex, René; Vriend, Gert; Grubmüller, Helmut
2012-01-01
Proteins are usually described and classified according to amino acid sequence, structure or function. Here, we develop a minimally biased scheme to compare and classify proteins according to their internal mobility patterns. This approach is based on the notion that proteins not only fold into recurring structural motifs but might also be carrying out only a limited set of recurring mobility motifs. The complete set of these patterns, which we tentatively call the dynasome, spans a multi-dimensional space with axes, the dynasome descriptors, characterizing different aspects of protein dynamics. The unique dynamic fingerprint of each protein is represented as a vector in the dynasome space. The difference between any two vectors, consequently, gives a reliable measure of the difference between the corresponding protein dynamics. We characterize the properties of the dynasome by comparing the dynamics fingerprints obtained from molecular dynamics simulations of 112 proteins but our approach is, in principle, not restricted to any specific source of data of protein dynamics. We conclude that: 1. the dynasome consists of a continuum of proteins, rather than well separated classes. 2. For the majority of proteins we observe strong correlations between structure and dynamics. 3. Proteins with similar function carry out similar dynamics, which suggests a new method to improve protein function annotation based on protein dynamics. PMID:22606222
Dawson, Natalie L; Sillitoe, Ian; Lees, Jonathan G; Lam, Su Datt; Orengo, Christine A
2017-01-01
This chapter describes the generation of the data in the CATH-Gene3D online resource and how it can be used to study protein domains and their evolutionary relationships. Methods will be presented for: comparing protein structures, recognizing homologs, predicting domain structures within protein sequences, and subclassifying superfamilies into functionally pure families, together with a guide on using the webpages.
Automatic Classification of Protein Structure Using the Maximum Contact Map Overlap Metric
Andonov, Rumen; Djidjev, Hristo Nikolov; Klau, Gunnar W.; ...
2015-10-09
In this paper, we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that our novel measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric in that space allows one to avoid pairwise comparisons on the entire database and, thus, to significantly accelerate exploring the protein space compared to non-metric spaces. We show on a gold standard superfamily classification benchmark set of 6759 proteins that our exact k-nearest neighbor (k-NN) scheme classifies up to 224 out of 236 queries correctly and, on a larger, extended version of the benchmark with 60,850 additional structures, up to 1361 out of 1369 queries. Our k-NN classification thus provides a promising approach for the automatic classification of protein structures based on flexible contact map overlap alignments.
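The contact map representation underlying this measure can be sketched as follows. This is an illustration only: the 7.5 Å Cα cutoff and the function names `contact_map` and `overlap` are assumptions, and the code scores the overlap for one given alignment rather than searching for the maximum overlap, which is the hard optimization problem the paper addresses.

```python
import math
from itertools import combinations


def contact_map(ca_coords, cutoff=7.5):
    """Return the set of residue index pairs (i, j), i < j, in contact.

    Two residues are treated as in contact when their C-alpha atoms lie
    within `cutoff` angstroms (the cutoff value is an assumption).
    """
    contacts = set()
    for i, j in combinations(range(len(ca_coords)), 2):
        if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
            contacts.add((i, j))
    return contacts


def overlap(contacts_a, contacts_b, alignment):
    """Count contacts of A that map onto contacts of B.

    `alignment` is a dict from residue index in A to residue index in B.
    """
    shared = 0
    for i, j in contacts_a:
        if i in alignment and j in alignment:
            # Normalize the mapped pair to (smaller, larger) index order.
            p, q = sorted((alignment[i], alignment[j]))
            if (p, q) in contacts_b:
                shared += 1
    return shared
```

A distance derived from such an overlap can then drive a k-NN classifier; how the overlap is normalized into a metric is defined in the paper and is not reproduced here.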
Implementation of a Parallel Protein Structure Alignment Service on Cloud
Hung, Che-Lun; Lin, Yaw-Ling
2013-01-01
Protein structure alignment has become an important strategy by which to identify evolutionary relationships between protein sequences. Several alignment tools are currently available for online comparison of protein structures. In this paper, we propose a parallel protein structure alignment service based on the Hadoop distribution framework. This service includes a protein structure alignment algorithm, a refinement algorithm, and a MapReduce programming model. The refinement algorithm refines the result of alignment. To process vast numbers of protein structures in parallel, the alignment and refinement algorithms are implemented using MapReduce. We analyzed and compared the structure alignments produced by different methods using a dataset randomly selected from the PDB database. The experimental results verify that the proposed algorithm refines the resulting alignments more accurately than existing algorithms. Meanwhile, the computational performance of the proposed service is proportional to the number of processors used in our cloud platform. PMID:23671842
Brown, Peter; Pullan, Wayne; Yang, Yuedong; Zhou, Yaoqi
2016-02-01
The three-dimensional tertiary structure of a protein at near-atomic resolution provides insight into its function and evolution. As protein structure determines function, similarity in structure usually implies similarity in function. As such, structure alignment techniques are often useful in the classification of protein function. Given the rapidly growing rate of new, experimentally determined structures being made available from repositories such as the Protein Data Bank, fast and accurate computational structure comparison tools are required. This paper presents SPalignNS, a non-sequential protein structure alignment tool using a novel asymmetrical greedy search technique. The performance of SPalignNS was evaluated against existing sequential and non-sequential structure alignment methods by performing trials with commonly used datasets. The benchmark datasets used to gauge alignment accuracy include (i) 9538 pairwise alignments implied by the HOMSTRAD database of homologous proteins; (ii) a subset of 64 difficult alignments from set (i) that have low structure similarity; (iii) 199 pairwise alignments of proteins with similar structure but different topology; and (iv) a subset of 20 pairwise alignments from the RIPC set. SPalignNS is shown to achieve greater alignment accuracy (lower or comparable root-mean-square distance with increased structure overlap coverage) for all datasets, and the highest agreement with reference alignments from the challenging dataset (iv) above, when compared with both sequentially constrained alignments and other non-sequential alignments. SPalignNS was implemented in C++. The source code, binary executable, and a web server version are freely available at http://sparks-lab.org. Contact: yaoqi.zhou@griffith.edu.au. © The Author 2015. Published by Oxford University Press.
Jahandideh, Samad; Srinivasasainagendra, Vinodh; Zhi, Degui
2012-11-07
RNA-protein interaction plays an important role in various cellular processes, such as protein synthesis, gene regulation, post-transcriptional gene regulation, alternative splicing, and infections by RNA viruses. In this study, using the Gene Ontology Annotated (GOA) and Structural Classification of Proteins (SCOP) databases, an automatic procedure was designed to capture structurally solved RNA-binding protein domains in different subclasses. Subsequently, we applied tuned multi-class SVM (TMCSVM), Random Forest (RF), and multi-class ℓ1/ℓq-regularized logistic regression (MCRLR) for analyzing and classifying RNA-binding protein domains based on a comprehensive set of sequence and structural features, and compared the prediction accuracy of these three state-of-the-art predictor methods. In our results, TMCSVM outperforms the other methods, which suggests its potential as a useful tool for facilitating the multi-class prediction of RNA-binding protein domains. On the other hand, MCRLR, by elucidating the importance of features for their contribution to the predictive accuracy of RNA-binding protein domain subclasses, helps provide some biological insights into the roles of sequences and structures in protein-RNA interactions.
Rescore protein-protein docked ensembles with an interface contact statistics.
Mezei, Mihaly
2017-02-01
The recently developed statistical measure for the type of residue-residue contact at protein complex interfaces, based on a parameter-free definition of contact, has been used to define a contact score that is correlated with the likelihood of correctness of a proposed complex structure. Comparing the proposed contact scores on the native structure and on a set of model structures, the proposed measure was shown to generally favor the native structure, but by itself it was not able to reliably score the native structure as the best. Adjusting the scores of redocking experiments with the contact score showed that the adjusted score was able to move up the ranking of the native-like structure among the proposed complexes when the native-like structure was not ranked the best by the respective program. Tests on docking of unbound proteins compared the contact scores of the complexes with the contact score of the crystal structure, again showing the tendency of the contact score to favor native-like conformations. The possibility of using the contact score to improve the determination of biological dimers in a crystal structure was also explored. Proteins 2017; 85:235-241. © 2016 Wiley Periodicals, Inc.
Modularity of Protein Folds as a Tool for Template-Free Modeling of Structures.
Vallat, Brinda; Madrid-Aliste, Carlos; Fiser, Andras
2015-08-01
Predicting the three-dimensional structure of proteins from their amino acid sequences remains a challenging problem in molecular biology. While the current structural coverage of proteins is almost exclusively provided by template-based techniques, the modeling of the rest of the protein sequences increasingly requires template-free methods. However, template-free modeling methods are much less reliable and are usually applicable only to smaller proteins, leaving much room for improvement. We present here a novel computational method that uses a library of supersecondary structure fragments, known as Smotifs, to model protein structures. The library of Smotifs has saturated over time, providing a theoretical foundation for efficient modeling. The method relies on weak sequence signals from remotely related protein structures to create a library of Smotif fragments specific to the target protein sequence. This Smotif library is exploited in a fragment assembly protocol to sample decoys, which are assessed by a composite scoring function. Since the Smotif fragments are larger than the ones used in other fragment-based methods, the proposed modeling algorithm, SmotifTF, can employ exhaustive sampling during decoy assembly. SmotifTF successfully predicts the overall fold of the target proteins in about 50% of the test cases and performs competitively when compared to other state-of-the-art prediction methods, especially when the sequence signal to remote homologs is diminishing. Smotif-based modeling is complementary to current prediction methods and provides a promising direction in addressing the structure prediction problem, especially when targeting larger proteins for modeling.
Interactive comparison and remediation of collections of macromolecular structures.
Moriarty, Nigel W; Liebschner, Dorothee; Klei, Herbert E; Echols, Nathaniel; Afonine, Pavel V; Headd, Jeffrey J; Poon, Billy K; Adams, Paul D
2018-01-01
Often similar structures need to be compared to reveal local differences throughout the entire model or between related copies within the model. Therefore, a program to compare multiple structures and enable correction of any differences not supported by the density map was written within the Phenix framework (Adams et al., Acta Cryst 2010; D66:213-221). This program, called Structure Comparison, can also be used for structures with multiple copies of the same protein chain in the asymmetric unit, that is, as a result of non-crystallographic symmetry (NCS). Structure Comparison was designed to interface with Coot (Emsley et al., Acta Cryst 2010; D66:486-501) and PyMOL (DeLano, PyMOL 0.99; 2002) to facilitate comparison of large numbers of related structures. Structure Comparison analyzes collections of protein structures using several metrics, such as the rotamer conformation of equivalent residues, displays the results in tabular form and allows superimposed protein chains and density maps to be quickly inspected and edited (via the tools in Coot) for consistency, completeness and correctness. © 2017 The Protein Society.
@TOME-2: a new pipeline for comparative modeling of protein–ligand complexes
Pons, Jean-Luc; Labesse, Gilles
2009-01-01
@TOME 2.0 is a new web pipeline dedicated to protein structure modeling and small ligand docking based on comparative analyses. @TOME 2.0 allows fold recognition, template selection, structural alignment editing, structure comparisons, 3D-model building and evaluation. These tasks are routinely used in sequence analyses for structure prediction. In our pipeline the necessary software is efficiently interconnected in an original manner to accelerate all the processes. Furthermore, we have also connected comparative docking of small ligands that is performed using protein–protein superposition. The input is a simple protein sequence in one-letter code with no comment. The resulting 3D model, protein–ligand complexes and structural alignments can be visualized through dedicated Web interfaces or can be downloaded for further studies. These original features will aid in the functional annotation of proteins and the selection of templates for molecular modeling and virtual screening. Several examples are described to highlight some of the new functionalities provided by this pipeline. The server and its documentation are freely available at http://abcis.cbs.cnrs.fr/AT2/. PMID:19443448
A Stochastic Point Cloud Sampling Method for Multi-Template Protein Comparative Modeling
Li, Jilong; Cheng, Jianlin
2016-01-01
Generating tertiary structural models for a target protein from the known structure of its homologous template proteins and their pairwise sequence alignment is a key step in protein comparative modeling. Here, we developed a new stochastic point cloud sampling method, called MTMG, for multi-template protein model generation. The method first superposes the backbones of template structures, and the Cα atoms of the superposed templates form a point cloud for each position of a target protein, which is represented by a three-dimensional multivariate normal distribution. MTMG stochastically resamples the positions for Cα atoms of the residues whose positions are uncertain from the distribution, and accepts or rejects each new position according to a simulated annealing protocol, which effectively removes atomic clashes commonly encountered in multi-template comparative modeling. We benchmarked MTMG on 1,033 sequence alignments generated for CASP9, CASP10 and CASP11 targets. Using multiple templates with MTMG improves the GDT-TS score and TM-score of structural models by 2.96–6.37% and 2.42–5.19% on the three datasets over using single templates. MTMG’s performance was comparable to Modeller in terms of GDT-TS score, TM-score, and GDT-HA score, while the average RMSD was improved by the new sampling approach. The MTMG software is freely available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/mtmg.html. PMID:27161489
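The accept-or-reject step of a simulated annealing protocol can be sketched with the standard Metropolis criterion. This is an assumption for illustration: the abstract does not state which acceptance rule, energy function, or cooling schedule MTMG uses, and the name `accept_move` is hypothetical.

```python
import math
import random


def accept_move(delta_energy, temperature, rng=random):
    """Metropolis acceptance test for a proposed resampled position.

    delta_energy: energy(new position) - energy(old position); the energy
        function (for example a clash penalty) is left to the caller.
    temperature: current annealing temperature, must be positive.
    rng: object with a random() method returning a float in [0, 1).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta_energy <= 0:
        # Moves that do not raise the energy are always accepted.
        return True
    # Uphill moves are accepted with probability exp(-dE / T).
    return rng.random() < math.exp(-delta_energy / temperature)
```

Lowering `temperature` over successive iterations makes uphill moves progressively rarer, which is what lets such a protocol settle into clash-free conformations.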
Predictive and comparative analysis of Ebolavirus proteins
Cong, Qian; Pei, Jimin; Grishin, Nick V
2015-01-01
Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus. PMID:26158395
Statistical inference of protein structural alignments using information and compression.
Collier, James H; Allison, Lloyd; Lesk, Arthur M; Stuckey, Peter J; Garcia de la Banda, Maria; Konagurthu, Arun S
2017-04-01
Structural molecular biology depends crucially on computational techniques that compare protein three-dimensional structures and generate structural alignments (the assignment of one-to-one correspondences between subsets of amino acids based on atomic coordinates). Despite its importance, the structural alignment problem has not been formulated, much less solved, in a consistent and reliable way. To overcome these difficulties, we present here a statistical framework for the precise inference of structural alignments, built on the Bayesian and information-theoretic principle of Minimum Message Length (MML). The quality of any alignment is measured by its explanatory power: the amount of lossless compression achieved to explain the protein coordinates using that alignment. We have implemented this approach in MMLigner, the first program able to infer statistically significant structural alignments. We also demonstrate the reliability of MMLigner's alignment results when compared with the state of the art. Importantly, MMLigner can also discover different structural alignments of comparable quality, a challenging problem for oligomers and protein complexes. Source code, binaries and an interactive web version are available at http://lcb.infotech.monash.edu.au/mmligner. Contact: arun.konagurthu@monash.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L
2011-06-02
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory; still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
Krissinel, E; Henrick, K
2004-12-01
The present paper describes the SSM algorithm of protein structure comparison in three dimensions, which includes an original procedure of matching graphs built on the protein's secondary-structure elements, followed by an iterative three-dimensional alignment of protein backbone Cα atoms. The SSM results are compared with those obtained from other protein comparison servers, and the advantages and disadvantages of different scores that are used for structure recognition are discussed. A new score, balancing the r.m.s.d. and alignment length Nalign, is proposed. It is found that different servers agree reasonably well on the new score, while showing considerable differences in r.m.s.d. and Nalign.
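A score that balances r.m.s.d. against alignment length can be sketched as below. The abstract does not give the formula, so the functional form used here, Nalign² / ((1 + (r.m.s.d./R0)²) · N1 · N2), and the default scale R0 = 3 Å are assumptions for illustration; the name `q_score` is hypothetical.

```python
def q_score(n_align, rmsd, n1, n2, r0=3.0):
    """Quality score rewarding long alignments and penalizing high r.m.s.d.

    n_align: number of aligned residue pairs.
    rmsd: r.m.s.d. of the aligned C-alpha atoms, in angstroms.
    n1, n2: total residue counts of the two structures.
    r0: distance scale in angstroms (assumed value, not from the abstract).
    """
    if n1 <= 0 or n2 <= 0 or r0 <= 0:
        raise ValueError("n1, n2 and r0 must be positive")
    if not 0 <= n_align <= min(n1, n2):
        raise ValueError("n_align must lie between 0 and min(n1, n2)")
    # Equals 1.0 only for a full-length alignment with zero r.m.s.d.
    return (n_align ** 2) / ((1.0 + (rmsd / r0) ** 2) * n1 * n2)
```

Under this form, a short alignment with very low r.m.s.d. and a long alignment with moderate r.m.s.d. can receive similar scores, which is the trade-off the abstract describes.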
Comparative analyses of putative toxin gene homologs from an Old World viper, Daboia russelii
Krishnan, Neeraja M.
2017-01-01
Availability of snake genome sequences has opened up exciting areas of research on comparative genomics and gene diversity. One of the challenges in studying snake genomes is the acquisition of biological material from live animals, especially from the venomous ones, making the process cumbersome and time-consuming. Here, we report comparative sequence analyses of putative toxin gene homologs from Russell’s viper (Daboia russelii) using whole-genome sequencing data obtained from shed skin. When compared with the major venom proteins in Russell’s viper studied previously, we found 45–100% sequence similarity between the venom proteins and their putative homologs in the skin. Additionally, comparative analyses of 20 putative toxin gene family homologs provided evidence of unique sequence motifs in nerve growth factor (NGF), platelet derived growth factor (PDGF), Kunitz/Bovine pancreatic trypsin inhibitor (Kunitz BPTI), cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) and cysteine-rich secretory protein (CRISP). In those derived proteins, we identified V11 and T35 in the NGF domain; F23 and A29 in the PDGF domain; N69, K2 and A5 in the CAP domain; and Q17 in the CRISP domain to be responsible for differences in the largest pockets across the protein domain structures in crotalines, viperines and elapids from the in silico structure-based analysis. Similarly, residues F10, Y11 and E20 appear to play an important role in the protein structures across the Kunitz protein domain of viperids and elapids. Our study highlights the usefulness of shed skin in obtaining good quality high-molecular weight DNA for comparative genomic studies, and provides evidence towards the unique features and evolution of putative venom gene homologs in vipers. PMID:29230357
Taking structure searches to the next dimension.
Schafferhans, Andrea; Rost, Burkhard
2014-07-08
Structure comparisons are now the first step when a new experimental high-resolution protein structure has been determined. In this issue of Structure, Wiederstein and colleagues describe their latest tool for comparing structures, which gives us the unprecedented power to discover crucial structural connections between whole complexes of proteins in the full structural database in real time. Copyright © 2014 Elsevier Ltd. All rights reserved.
Rysavy, Steven J; Beck, David AC; Daggett, Valerie
2014-01-01
Protein function is intimately linked to protein structure and dynamics, yet experimentally determined structures frequently omit regions within a protein due to indeterminate data, which is often due to protein dynamics. We propose that atomistic molecular dynamics simulations provide a diverse sampling of biologically relevant structures for these missing segments (and beyond) to improve structural modeling and structure prediction. Here we make use of the Dynameomics data warehouse, which contains simulations of representatives of essentially all known protein folds. We developed novel computational methods to efficiently identify, rank and retrieve small peptide structures, or fragments, from this database. We also created a novel data model to analyze and compare large repositories of structural data, such as contained within the Protein Data Bank and the Dynameomics data warehouse. Our evaluation compares these structural repositories for improving loop predictions and analyzes the utility of our methods and models. Using a standard set of loop structures, containing 510 loops, 30 for each loop length from 4 to 20 residues, we find that the inclusion of Dynameomics structures in fragment-based methods improves the quality of the loop predictions without being dependent on sequence homology. Depending on loop length, ∼25–75% of the best predictions came from the Dynameomics set, resulting in lower main chain root-mean-square deviations for all fragment lengths using the combined fragment library. We also provide specific cases where Dynameomics fragments provide better predictions for NMR loop structures than fragments from crystal structures. Online access to these fragment libraries is available at http://www.dynameomics.org/fragments. PMID:25142412
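The loop benchmark above is scored by main-chain root-mean-square deviation. The sketch below shows the basic quantity for two equal-length coordinate lists, and checks the stated benchmark size (30 loops for each length from 4 to 20 residues). It assumes the two coordinate sets are already superimposed and does no fitting. The function name `rmsd` and the toy coordinates are illustrative and not taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) points, assumed to be already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Benchmark size as described: 30 loops per length, lengths 4 through 20.
LOOPS_PER_LENGTH = 30
benchmark_size = sum(LOOPS_PER_LENGTH for _ in range(4, 21))

if __name__ == "__main__":
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(predicted, reference))  # every atom is displaced by 1 unit
    print(benchmark_size)              # 17 lengths x 30 loops
```

In practice a loop prediction is usually superimposed on the reference (for example on the flanking residues) before the deviation is computed; that step is omitted here for brevity.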
In situ structural analysis of the Yersinia enterocolitica injectisome
Kudryashev, Mikhail; Stenta, Marco; Schmelz, Stefan; Amstutz, Marlise; Wiesand, Ulrich; Castaño-Díez, Daniel; Degiacomi, Matteo T; Münnich, Stefan; Bleck, Christopher KE; Kowal, Julia; Diepold, Andreas; Heinz, Dirk W; Dal Peraro, Matteo; Cornelis, Guy R; Stahlberg, Henning
2013-01-01
Injectisomes are multi-protein transmembrane machines allowing pathogenic bacteria to inject effector proteins into eukaryotic host cells, a process called type III secretion. Here we present the first three-dimensional structure of Yersinia enterocolitica and Shigella flexneri injectisomes in situ and the first structural analysis of the Yersinia injectisome. Unexpectedly, basal bodies of injectisomes inside the bacterial cells showed length variations of 20%. The in situ structures of the Y. enterocolitica and S. flexneri injectisomes had similar dimensions and were significantly longer than the isolated structures of related injectisomes. The crystal structure of the inner membrane injectisome component YscD appeared elongated compared to a homologous protein, and molecular dynamics simulations documented its elongation elasticity. The ring-shaped secretin YscC at the outer membrane was stretched by 30–40% in situ, compared to its isolated liposome-embedded conformation. We suggest that elasticity is critical for some two-membrane spanning protein complexes to cope with variations in the intermembrane distance. DOI: http://dx.doi.org/10.7554/eLife.00792.001 PMID:23908767
(PS)2: protein structure prediction server version 3.0.
Huang, Tsun-Tsao; Hwang, Jenn-Kang; Chen, Chu-Huang; Chu, Chih-Sheng; Lee, Chi-Wen; Chen, Chih-Chieh
2015-07-01
Protein complexes are involved in many biological processes. Examining coupling between subunits of a complex would be useful to understand the molecular basis of protein function. Here, our updated (PS)(2) web server predicts the three-dimensional structures of protein complexes based on comparative modeling; furthermore, this server examines the coupling between subunits of the predicted complex by combining structural and evolutionary considerations. The predicted complex structure could be indicated and visualized by Java-based 3D graphics viewers and the structural and evolutionary profiles are shown and compared chain-by-chain. For each subunit, considerations with or without the packing contribution of other subunits cause the differences in similarities between structural and evolutionary profiles, and these differences imply which form, complex or monomeric, is preferred in the biological condition for the subunit. We believe that the (PS)(2) server would be a useful tool for biologists who are interested not only in the structures of protein complexes but also in the coupling between subunits of the complexes. The (PS)(2) is freely available at http://ps2v3.life.nctu.edu.tw/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Protein domain assignment from the recurrence of locally similar structures
Tai, Chin-Hsien; Sam, Vichetra; Gibrat, Jean-Francois; Garnier, Jean; Munson, Peter J.
2010-01-01
Domains are basic units of protein structure and essential for exploring protein fold space and structure evolution. With the structural genomics initiative, the number of protein structures in the Protein Databank (PDB) is increasing dramatically and domain assignments need to be done automatically. Most existing structural domain assignment programs define domains using the compactness of the domains and/or the number and strength of intra-domain versus inter-domain contacts. Here we present a different approach based on the recurrence of locally similar structural pieces (LSSPs) found by one-against-all structure comparisons with a dataset of 6,373 protein chains from the PDB. Residues of the query protein are clustered using LSSPs via three different procedures to define domains. This approach gives results that are comparable to several existing programs that use geometrical and other structural information explicitly. Remarkably, most of the proteins that contribute the LSSPs defining a domain do not themselves contain the domain of interest. This study shows that domains can be defined by a collection of relatively small locally similar structural pieces containing, on average, four secondary structure elements. In addition, it indicates that domains are indeed made of recurrent small structural pieces that are used to build protein structures of many different folds as suggested by recent studies. PMID:21287617
Bryksa, Brian C; Grahame, Douglas A; Yada, Rickey Y
2017-05-01
The present study characterized the aspartic protease saposin-like domains of four plant species, Solanum tuberosum (potato), Hordeum vulgare L. (barley), Cynara cardunculus L. (cardoon; artichoke thistle) and Arabidopsis thaliana, in terms of bilayer disruption and fusion, and structure pH-dependence. Comparison of the recombinant saposin-like domains revealed that each induced leakage of bilayer vesicles composed of a simple phospholipid mixture with relative rates Arabidopsis>barley>cardoon>potato. When compared for leakage of bilayer composed of a vacuole-like phospholipid mixture, leakage was approximately five times higher for potato saposin-like domain compared to the others. In terms of fusogenic activity, distinctions between particle size profiles were noted among the four proteins, particularly for potato saposin-like domain. Bilayer fusion assays in reducing conditions resulted in altered fusion profiles except in the case of cardoon saposin-like domain which was virtually unchanged. Secondary structure profiles were similar across all four proteins under different pH conditions, although cardoon saposin-like domain appeared to have higher overall helix structure. Furthermore, increases in Trp emission upon protein-bilayer interactions suggested that protein structure rearrangements equilibrated with half-times ranging from 52 to 120s, with cardoon saposin-like domain significantly slower than the other three species. Overall, the present findings serve as a foundation for future studies seeking to delineate protein structural features and motifs in protein-bilayer interactions based upon variability in plant aspartic protease saposin-like domain structures. Copyright © 2017 Elsevier B.V. All rights reserved.
Nema, Vijay; Pal, Sudhir Kumar
2013-01-01
Aim: This study was conducted to find the best-suited freely available software for modelling of proteins by taking a few sample proteins. The proteins used ranged from small to large in size, with available crystal structures for the purpose of benchmarking. Key players like Phyre2, Swiss-Model, CPHmodels-3.0, Homer, (PS)2, (PS)2-V2, Modweb were used for the comparison and model generation. Results: The benchmarking process was done for four proteins, Icl, InhA, and KatG of Mycobacterium tuberculosis and RpoB of Thermus thermophilus, to find the most suitable software. Parameters compared during analysis gave relatively better values for Phyre2 and Swiss-Model. Conclusion: This comparative study showed that Phyre2 and Swiss-Model make good models of small and large proteins as compared to other screened software. Other software was also good but is often not very efficient in providing full-length and properly folded structure. PMID:24023424
[A structural protein study of the influenza A (H1N1) virus by polyacrylamide gel electrophoresis].
Pérez Guevara, M T; Savón Valdés, C; Rivas Arjona, M; Goyenechea Hernández, A
1992-01-01
Influenza is an acute respiratory disease typically appearing as an epidemic. Three immunological types of the influenza virus are known: A, B and C. Continually, antigen changes occur, especially in type A. Therefore, a comparative study was carried out on 4 influenza A(H1N1) virus strains in relation to protein structure (surface antigens), by using polyacrylamide gel electrophoresis by the modified Laemmli method. The objective was to compare the structural proteins of the A/Havana/1292/78 (H1N1) national strain with the proteins of 3 international pattern strains. In all the cases, 6 bands were detected by densitometry. In the 4 strains studied the most abundant protein was M. Great differences between the Cuban strain and the 3 international patterns were not seen.
Mukherjee, Koel; Pandey, Dev Mani; Vidyarthi, Ambarish Saran
2015-02-06
Gaining access to sequence and structure information of telomere binding proteins helps in understanding the essential biological processes involved in conserved sequence-specific interaction between DNA and the proteins. Rice telomere binding protein (RTBP1) and Nicotiana glutinosa telomere repeat binding factor (NgTRF1) are helix-turn-helix motif type proteins that play a role in telomeric DNA protection and length regulation. Both proteins share the same type of domain, but until now there has been very little communication on the in silico studies of these complete proteins. Here we intend to do a comparative study between the two proteins through modeling of the complete proteins, physicochemical characterization, MD simulation and DNA-protein docking. I-TASSER and CLC protein work bench were used to find out the protein 3D structure as well as the different parameters to characterize the proteins. MD simulation was completed with the GROMOS force field of GROMACS for a 10 ns time stretch. The simulated 3D structures were docked with template DNA (3D DNA modeled through 3D-DART) of the TTTAGGG conserved sequence motif using the HADDOCK web server. Digging up all the facts about the proteins, it was revealed that around 120 amino acids in the tail part showed a good sequence similarity between the proteins. Molecular modeling, sequence characterization and secondary structure prediction also indicate the similarity between the proteins' structure and sequence. The result of MD simulation highlights the RMSD, RMSF, Rg, PCA and energy plots, which also convey a similar type of motional behavior between them. The best complex formation for both proteins in the docking result also points to the first interaction site, which is mainly the helix3 region of the DNA binding domain. The overall computational analysis reveals that RTBP1 and NgTRF1 proteins display a good amount of similarity in their physicochemical properties, structure, dynamics and binding mode.
Mukherjee, Koel; Pandey, Dev Mani; Vidyarthi, Ambarish Saran
2015-09-01
Gaining access to sequence and structure information of telomere-binding proteins helps in understanding the essential biological processes involved in conserved sequence-specific interaction between DNA and the proteins. Rice telomere-binding protein (RTBP1) and Nicotiana glutinosa telomere repeat binding factor (NgTRF1) are helix-turn-helix motif type proteins that play a role in telomeric DNA protection and length regulation. Both proteins share the same type of domain, but until now there has been very little communication on the in silico studies of these complete proteins. Here we intend to do a comparative study between the two proteins through modeling of the complete proteins, physicochemical characterization, MD simulation and DNA-protein docking. I-TASSER and CLC protein work bench were used to find out the protein 3D structure as well as the different parameters to characterize the proteins. MD simulation was completed with the GROMOS force field of GROMACS for a 10 ns time stretch. The simulated 3D structures were docked with template DNA (3D DNA modeled through 3D-DART) of the TTTAGGG conserved sequence motif using the HADDOCK Web server. By digging up all the facts about the proteins, it was revealed that around 120 amino acids in the tail part showed a good sequence similarity between the proteins. Molecular modeling, sequence characterization and secondary structure prediction also indicate the similarity between the proteins' structure and sequence. The result of MD simulation highlights the RMSD, RMSF, Rg, PCA and energy plots, which also convey a similar type of motional behavior between them. The best complex formation for both proteins in the docking result also points to the first interaction site, which is mainly the helix3 region of the DNA-binding domain. The overall computational analysis reveals that RTBP1 and NgTRF1 proteins display a good amount of similarity in their physicochemical properties, structure, dynamics and binding mode.
Lorenzo, J Ramiro; Alonso, Leonardo G; Sánchez, Ignacio E
2015-01-01
Asparagine residues in proteins undergo spontaneous deamidation, a post-translational modification that may act as a molecular clock for the regulation of protein function and turnover. Asparagine deamidation is modulated by protein local sequence, secondary structure and hydrogen bonding. We present NGOME, an algorithm able to predict non-enzymatic deamidation of internal asparagine residues in proteins in the absence of structural data, using sequence-based predictions of secondary structure and intrinsic disorder. Compared to previous algorithms, NGOME does not require three-dimensional structures yet yields better predictions than available sequence-only methods. Four case studies of specific proteins show how NGOME may help the user identify deamidation-prone asparagine residues, often related to protein gain of function, protein degradation or protein misfolding in pathological processes. A fifth case study applies NGOME at a proteomic scale and unveils a correlation between asparagine deamidation and protein degradation in yeast. NGOME is freely available as a webserver at the National EMBnet node Argentina, URL: http://www.embnet.qb.fcen.uba.ar/ in the subpage "Protein and nucleic acid structure and sequence analysis".
Minireview: DNA Replication in Plant Mitochondria
Cupp, John D.; Nielsen, Brent L.
2014-01-01
Higher plant mitochondrial genomes exhibit much greater structural complexity as compared to most other organisms. Unlike well-characterized metazoan mitochondrial DNA (mtDNA) replication, an understanding of the mechanism(s) and proteins involved in plant mtDNA replication remains unclear. Several plant mtDNA replication proteins, including DNA polymerases, DNA primase/helicase, and accessory proteins have been identified. Mitochondrial dynamics, genome structure, and the complexity of dual-targeted and dual-function proteins that provide at least partial redundancy suggest that plants have a unique model for maintaining and replicating mtDNA when compared to the replication mechanism utilized by most metazoan organisms. PMID:24681310
Dos Santos Vasconcelos, Crhisllane Rafaele; de Lima Campos, Túlio; Rezende, Antonio Mauro
2018-03-06
Systematic analysis of a parasite interactome is a key approach to understand different biological processes. It makes possible to elucidate disease mechanisms, to predict protein functions and to select promising targets for drug development. Currently, several approaches for protein interaction prediction for non-model species incorporate only small fractions of the entire proteomes and their interactions. Based on this perspective, this study presents an integration of computational methodologies, protein network predictions and comparative analysis of the protozoan species Leishmania braziliensis and Leishmania infantum. These parasites cause Leishmaniasis, a worldwide distributed and neglected disease, with limited treatment options using currently available drugs. The predicted interactions were obtained from a meta-approach, applying rigid body docking tests and template-based docking on protein structures predicted by different comparative modeling techniques. In addition, we trained a machine-learning algorithm (Gradient Boosting) using docking information performed on a curated set of positive and negative protein interaction data. Our final model obtained an AUC = 0.88, with recall = 0.69, specificity = 0.88 and precision = 0.83. Using this approach, it was possible to confidently predict 681 protein structures and 6198 protein interactions for L. braziliensis, and 708 protein structures and 7391 protein interactions for L. infantum. The predicted networks were integrated to protein interaction data already available, analyzed using several topological features and used to classify proteins as essential for network stability. The present study allowed to demonstrate the importance of integrating different methodologies of interaction prediction to increase the coverage of the protein interaction of the studied protocols, besides it made available protein structures and interactions not previously reported.
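The abstract above reports recall, specificity and precision for the final classifier. As a reminder of how these three figures relate to a confusion matrix, here is a minimal sketch using the standard definitions. The function name `classification_metrics` and the counts in the usage example are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (recall, specificity, precision) from confusion-matrix counts.

    tp / fn : interacting pairs predicted as interacting / non-interacting
    tn / fp : non-interacting pairs predicted as non-interacting / interacting
    """
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each denominator must be non-zero")
    recall = tp / (tp + fn)        # fraction of true interactions recovered
    specificity = tn / (tn + fp)   # fraction of true non-interactions rejected
    precision = tp / (tp + fp)     # fraction of predicted interactions that are real
    return recall, specificity, precision


if __name__ == "__main__":
    # Hypothetical counts, for illustration only
    recall, specificity, precision = classification_metrics(tp=8, fp=1, tn=9, fn=2)
    print(recall, specificity, precision)
```

Recall and specificity depend only on how each true class is split, whereas precision also depends on the ratio of positives to negatives in the evaluation set, which is why all three are usually reported together.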
NASA Astrophysics Data System (ADS)
Santos, Marlus Alves Dos; Teixeira, Francesco Brugnera; Moreira, Heline Hellen Teixeira; Rodrigues, Adele Aud; Machado, Fabrício Castro; Clemente, Tatiana Mordente; Brigido, Paula Cristina; Silva, Rebecca Tavares E.; Purcino, Cecílio; Gomes, Rafael Gonçalves Barbosa; Bahia, Diana; Mortara, Renato Arruda; Munte, Claudia Elisabeth; Horjales, Eduardo; da Silva, Claudio Vieira
2014-03-01
Structural studies of proteins normally require large quantities of pure material that can only be obtained through heterologous expression systems and recombinant technique. In these procedures, large amounts of expressed protein are often found in the insoluble fraction, making protein purification from the soluble fraction inefficient, laborious, and costly. Usually, protein refolding is avoided due to a lack of experimental assays that can validate correct folding and that can compare the conformational population to that of the soluble fraction. Herein, we propose a validation method using simple and rapid 1D 1H nuclear magnetic resonance (NMR) spectra that can efficiently compare protein samples, including individual information of the environment of each proton in the structure.
Jian, Jhih-Wei; Elumalai, Pavadai; Pitti, Thejkiran; Wu, Chih Yuan; Tsai, Keng-Chang; Chang, Jeng-Yih; Peng, Hung-Pin; Yang, An-Suei
2016-01-01
Predicting ligand binding sites (LBSs) on protein structures, which are obtained either from experimental or computational methods, is a useful first step in functional annotation or structure-based drug design for the protein structures. In this work, the structure-based machine learning algorithm ISMBLab-LIG was developed to predict LBSs on protein surfaces with input attributes derived from the three-dimensional probability density maps of interacting atoms, which were reconstructed on the query protein surfaces and were relatively insensitive to local conformational variations of the tentative ligand binding sites. The prediction accuracy of the ISMBLab-LIG predictors is comparable to that of the best LBS predictors benchmarked on several well-established testing datasets. More importantly, the ISMBLab-LIG algorithm has substantial tolerance to the prediction uncertainties of computationally derived protein structure models. As such, the method is particularly useful for predicting LBSs not only on experimental protein structures without known LBS templates in the database but also on computationally predicted model protein structures with structural uncertainties in the tentative ligand binding sites. PMID:27513851
Núñez-Vivanco, Gabriel; Valdés-Jiménez, Alejandro; Besoaín, Felipe; Reyes-Parada, Miguel
2016-01-01
Since the structure of proteins is more conserved than the sequence, the identification of conserved three-dimensional (3D) patterns among a set of proteins, can be important for protein function prediction, protein clustering, drug discovery and the establishment of evolutionary relationships. Thus, several computational applications to identify, describe and compare 3D patterns (or motifs) have been developed. Often, these tools consider a 3D pattern as that described by the residues surrounding co-crystallized/docked ligands available from X-ray crystal structures or homology models. Nevertheless, many of the protein structures stored in public databases do not provide information about the location and characteristics of ligand binding sites and/or other important 3D patterns such as allosteric sites, enzyme-cofactor interaction motifs, etc. This makes necessary the development of new ligand-independent methods to search and compare 3D patterns in all available protein structures. Here we introduce Geomfinder, an intuitive, flexible, alignment-free and ligand-independent web server for detailed estimation of similarities between all pairs of 3D patterns detected in any two given protein structures. We used around 1100 protein structures to form pairs of proteins which were assessed with Geomfinder. In these analyses each protein was considered in only one pair (e.g. in a subset of 100 different proteins, 50 pairs of proteins can be defined). 
Thus: (a) Geomfinder detected identical pairs of 3D patterns in a series of monoamine oxidase-B structures, which corresponded to the effectively similar ligand binding sites at these proteins; (b) we identified structural similarities among pairs of protein structures which are targets of compounds such as acarbose, benzamidine, adenosine triphosphate and pyridoxal phosphate; these similar 3D patterns are not detected using sequence-based methods; (c) the detailed evaluation of three specific cases showed the versatility of Geomfinder, which was able to discriminate between similar and different 3D patterns related to binding sites of common substrates in a range of diverse proteins. Geomfinder allows detecting similar 3D patterns between any two pair of protein structures, regardless of the divergency among their amino acids sequences. Although the software is not intended for simultaneous multiple comparisons in a large number of proteins, it can be particularly useful in cases such as the structure-based design of multitarget drugs, where a detailed analysis of 3D patterns similarities between a few selected protein targets is essential.
NASA Astrophysics Data System (ADS)
Xu, Xianjin; Yan, Chengfei; Zou, Xiaoqin
2017-08-01
The growing number of protein-ligand complex structures, particularly the structures of proteins co-bound with different ligands, in the Protein Data Bank helps us tackle two major challenges in molecular docking studies: the protein flexibility and the scoring function. Here, we introduced a systematic strategy by using the information embedded in the known protein-ligand complex structures to improve both binding mode and binding affinity predictions. Specifically, a ligand similarity calculation method was employed to search a receptor structure with a bound ligand sharing high similarity with the query ligand for the docking use. The strategy was applied to the two datasets (HSP90 and MAP4K4) in recent D3R Grand Challenge 2015. In addition, for the HSP90 dataset, a system-specific scoring function (ITScore2_hsp90) was generated by recalibrating our statistical potential-based scoring function (ITScore2) using the known protein-ligand complex structures and the statistical mechanics-based iterative method. For the HSP90 dataset, better performances were achieved for both binding mode and binding affinity predictions comparing with the original ITScore2 and with ensemble docking. For the MAP4K4 dataset, although there were only eight known protein-ligand complex structures, our docking strategy achieved a comparable performance with ensemble docking. Our method for receptor conformational selection and iterative method for the development of system-specific statistical potential-based scoring functions can be easily applied to other protein targets that have a number of protein-ligand complex structures available to improve predictions on binding.
Parmodel: a web server for automated comparative modeling of proteins.
Uchôa, Hugo Brandão; Jorge, Guilherme Eberhart; Freitas Da Silveira, Nelson José; Camera, João Carlos; Canduri, Fernanda; De Azevedo, Walter Filgueira
2004-12-24
Parmodel is a web server for automated comparative modeling and evaluation of protein structures. The aim of this tool is to help inexperienced users to perform modeling, assessment, visualization, and optimization of protein models as well as crystallographers to evaluate structures solved experimentally. It is subdivided into four modules: Parmodel Modeling, Parmodel Assessment, Parmodel Visualization, and Parmodel Optimization. The main module is Parmodel Modeling, which allows the building of several models for the same protein in a reduced time, through the distribution of modeling processes on a Beowulf cluster. Parmodel automates and integrates the main software packages used in comparative modeling, such as MODELLER, Whatcheck, Procheck, Raster3D, Molscript, and Gromacs. This web server is freely accessible at .
Monte Carlo replica-exchange based ensemble docking of protein conformations.
Zhang, Zhe; Ehmann, Uwe; Zacharias, Martin
2017-05-01
A replica-exchange Monte Carlo (REMC) ensemble docking approach has been developed that allows efficient exploration of protein-protein docking geometries. In addition to Monte Carlo steps in translation and orientation of binding partners, possible conformational changes upon binding are included based on Monte Carlo selection of protein conformations stored as ordered pregenerated conformational ensembles. The conformational ensembles of each binding partner protein were generated by three different approaches starting from the unbound partner protein structure with a range spanning a root mean square deviation of 1-2.5 Å with respect to the unbound structure. Because MC sampling is performed to select appropriate partner conformations on the fly the approach is not limited by the number of conformations in the ensemble compared to ensemble docking of each conformer pair in ensemble cross docking. Although only a fraction of generated conformers was in closer agreement with the bound structure the REMC ensemble docking approach achieved improved docking results compared to REMC docking with only the unbound partner structures or using docking energy minimization methods. The approach has significant potential for further improvement in combination with more realistic structural ensembles and better docking scoring functions. Proteins 2017; 85:924-937. © 2016 Wiley Periodicals, Inc. © 2017 Wiley Periodicals, Inc.
PLI: a web-based tool for the comparison of protein-ligand interactions observed on PDB structures.
Gallina, Anna Maria; Bisignano, Paola; Bergamino, Maurizio; Bordo, Domenico
2013-02-01
A large fraction of the entries contained in the Protein Data Bank describe proteins in complex with low molecular weight molecules such as physiological compounds or synthetic drugs. In many cases, the same molecule is found in distinct protein-ligand complexes. There is an increasing interest in Medicinal Chemistry in comparing protein binding sites to get insight on interactions that modulate the binding specificity, as this structural information can be correlated with other experimental data of biochemical or physiological nature and may help in rational drug design. The web service protein-ligand interaction presented here provides a tool to analyse and compare the binding pockets of homologous proteins in complex with a selected ligand. The information is deduced from protein-ligand complexes present in the Protein Data Bank and stored in the underlying database. Freely accessible at http://bioinformatics.istge.it/pli/.
Predicting the helix packing of globular proteins by self-correcting distance geometry.
Mumenthaler, C; Braun, W
1995-05-01
A new self-correcting distance geometry method for predicting the three-dimensional structure of small globular proteins was assessed with a test set of 8 helical proteins. With the knowledge of the amino acid sequence and the helical segments, our completely automated method calculated the correct backbone topology of six proteins. The accuracy of the predicted structures ranged from 2.3 Å to 3.1 Å for the helical segments compared to the experimentally determined structures. For two proteins, the predicted constraints were not restrictive enough to yield a conclusive prediction. The method can be applied to all small globular proteins, provided the secondary structure is known from NMR analysis or can be predicted with high reliability.
Pakdaman, Yasaman; Sanchez-Guixé, Monica; Kleppe, Rune; Erdal, Sigrid; Bustad, Helene J.; Bjørkhaug, Lise; Haugarvoll, Kristoffer; Tzoulis, Charalampos; Heimdal, Ketil; Knappskog, Per M.; Johansson, Stefan
2017-01-01
Spinocerebellar ataxia, autosomal recessive 16 (SCAR16) is caused by biallelic mutations in the STIP1 homology and U-box containing protein 1 (STUB1) gene encoding the ubiquitin E3 ligase and dimeric co-chaperone C-terminus of Hsc70-interacting protein (CHIP). It has been proposed that the disease mechanism is related to CHIP’s impaired E3 ubiquitin ligase properties and/or interaction with its chaperones. However, there is limited knowledge on how these mutations affect the stability, folding, and protein structure of CHIP itself. To gain further insight, six previously reported pathogenic STUB1 variants (E28K, N65S, K145Q, M211I, S236T, and T246M) were expressed as recombinant proteins and studied using limited proteolysis, size-exclusion chromatography (SEC), and circular dichroism (CD). Our results reveal that N65S shows increased CHIP dimerization, higher levels of α-helical content, and decreased degradation rate compared with wild-type (WT) CHIP. By contrast, T246M demonstrates a strong tendency for aggregation, a more flexible protein structure, decreased levels of α-helical structures, and increased degradation rate compared with WT CHIP. E28K, K145Q, M211I, and S236T also show defects in structural properties compared with WT CHIP, although less profound than those observed for N65S and T246M. In conclusion, our results illustrate that some STUB1 mutations known to cause recessive SCAR16 have a profound impact on the protein structure, stability, and ability of CHIP to dimerize in vitro. These results add to the growing understanding of the mechanisms behind the disorder. PMID:28396517
Pakdaman, Yasaman; Sanchez-Guixé, Monica; Kleppe, Rune; Erdal, Sigrid; Bustad, Helene J; Bjørkhaug, Lise; Haugarvoll, Kristoffer; Tzoulis, Charalampos; Heimdal, Ketil; Knappskog, Per M; Johansson, Stefan; Aukrust, Ingvild
2017-04-30
Spinocerebellar ataxia, autosomal recessive 16 (SCAR16) is caused by biallelic mutations in the STIP1 homology and U-box containing protein 1 ( STUB1 ) gene encoding the ubiquitin E3 ligase and dimeric co-chaperone C-terminus of Hsc70-interacting protein (CHIP). It has been proposed that the disease mechanism is related to CHIP's impaired E3 ubiquitin ligase properties and/or interaction with its chaperones. However, there is limited knowledge on how these mutations affect the stability, folding, and protein structure of CHIP itself. To gain further insight, six previously reported pathogenic STUB1 variants (E28K, N65S, K145Q, M211I, S236T, and T246M) were expressed as recombinant proteins and studied using limited proteolysis, size-exclusion chromatography (SEC), and circular dichroism (CD). Our results reveal that N65S shows increased CHIP dimerization, higher levels of α-helical content, and decreased degradation rate compared with wild-type (WT) CHIP. By contrast, T246M demonstrates a strong tendency for aggregation, a more flexible protein structure, decreased levels of α-helical structures, and increased degradation rate compared with WT CHIP. E28K, K145Q, M211I, and S236T also show defects on structural properties compared with WT CHIP, although less profound than what observed for N65S and T246M. In conclusion, our results illustrate that some STUB1 mutations known to cause recessive SCAR16 have a profound impact on the protein structure, stability, and ability of CHIP to dimerize in vitro. These results add to the growing understanding on the mechanisms behind the disorder. © 2017 The Author(s).
StructAlign, a Program for Alignment of Structures of DNA-Protein Complexes.
Popov, Ya V; Galitsyna, A A; Alexeevski, A V; Karyagina, A S; Spirin, S A
2015-11-01
Comparative analysis of structures of complexes of homologous proteins with DNA is important in the analysis of DNA-protein recognition. Alignment is a necessary stage of the analysis. An alignment is a matching of amino acid residues and nucleotides of one complex to residues and nucleotides of the other. Currently, there are no programs available for aligning structures of DNA-protein complexes. We present the program StructAlign, which should fill this gap. The program inputs a pair of complexes of DNA double helix with proteins and outputs an alignment of DNA chains corresponding to the best spatial fit of the protein chains.
Preservation of protein clefts in comparative models.
Piedra, David; Lois, Sergi; de la Cruz, Xavier
2008-01-16
Comparative, or homology, modelling of protein structures is the most widely used prediction method when the target protein has homologues of known structure. Given that the quality of a model may vary greatly, several studies have been devoted to identifying the factors that influence modelling results. These studies usually consider the protein as a whole, and only a few provide a separate discussion of the behaviour of biologically relevant features of the protein. Given the value of the latter for many applications, here we extended previous work by analysing the preservation of native protein clefts in homology models. We chose to examine clefts because of their role in protein function/structure, as they are usually the locus of protein-protein interactions, host the enzymes' active site, or, in the case of protein domains, can also be the locus of domain-domain interactions that lead to the structure of the whole protein. We studied how the largest cleft of a protein varies in comparative models. To this end, we analysed a set of 53507 homology models that cover the whole sequence identity range, with a special emphasis on medium and low similarities. More precisely we examined how cleft quality - measured using six complementary parameters related to both global shape and local atomic environment, depends on the sequence identity between target and template proteins. In addition to this general analysis, we also explored the impact of a number of factors on cleft quality, and found that the relationship between quality and sequence identity varies depending on cleft rank amongst the set of protein clefts (when ordered according to size), and number of aligned residues. We have examined cleft quality in homology models at a range of seq.id. levels. Our results provide a detailed view of how quality is affected by distinct parameters and thus may help the user of comparative modelling to determine the final quality and applicability of his/her cleft models. 
In addition, the large variability in model quality that we observed within each sequence bin, with good models present even at low sequence identities (between 20% and 30%), indicates that properly developed identification methods could be used to recover good cleft models in this sequence range.
NASA Astrophysics Data System (ADS)
Siddaramaiah, Manjunath; Satyamoorthy, Kapaettu; Rao, Bola Sadashiva Satish; Roy, Suparna; Chandra, Subhash; Mahato, Krishna Kishore
2017-03-01
In the present study an attempt has been made to interrogate the bulk secondary structures of some selected proteins (BSA, HSA, lysozyme, trypsin and ribonuclease A) under urea and GnHCl denaturation using laser-induced autofluorescence. The proteins were treated with different concentrations of urea (3 M, 6 M, 9 M) and GnHCl (2 M, 4 M, 6 M) and the corresponding steady-state autofluorescence spectra were recorded at 281 nm pulsed laser excitation. The recorded fluorescence spectra of the proteins were then interpreted based on the existing PDB structures of the proteins and the Trp solvent accessibility (calculated using "Scratch protein predictor" at 30% threshold). Further, the influence of rigidity and conformation of the indole ring (caused by protein secondary structures) on the intrinsic fluorescence properties of proteins was also evaluated using fluorescence of ANS-HSA complexes, CD spectroscopy, and trypsin digestion experiments. The outcomes clearly demonstrated that GnHCl preferentially disrupts helices as compared to β-sheets, whereas urea was found to be more effective in disrupting β-sheets as compared to helices. Conversely, the proteins that showed a detectable change in intrinsic fluorescence at lower concentrations of GnHCl were rich in helices, whereas the proteins that showed a detectable change in intrinsic fluorescence at lower concentrations of urea were rich in β-sheets. Since high concentrations of GnHCl and urea interfere with secondary structure analysis by circular dichroism spectrometry, the present method of analyzing secondary structures using laser-induced autofluorescence will be highly advantageous over existing tools for the same purpose.
Iida, Satoko; Kobiyama, Atsushi; Ogata, Takehiko; Murakami, Akio
2008-01-01
Plastid-encoded genes of the dinoflagellates are rapidly evolving and most divergent. The importance of unusually accumulated mutations for the structure of the PSII core proteins and for photosynthetic function was examined in the dinoflagellates Symbiodinium sp. and Alexandrium tamarense. Full-length cDNA sequences of psbA (D1 protein) and psbD (D2 protein) were obtained and compared with those of other oxygen-evolving photoautotrophs. Twenty-three amino acid positions (7%) for the D1 protein and 34 positions (10%) for the D2 protein were mutated in the dinoflagellates, although the amino acid residues at these positions were conserved in cyanobacteria, the other algae, and plants. Many mutations tended to be located in the N-terminus and the D-E interhelical loop of the D1 protein and in helix B of the D2 protein, while the remaining regions were well conserved. The different structural properties in these mutated regions were supported by hydropathy profiles. The chlorophyll fluorescence kinetics of the dinoflagellates was compared with that of Synechocystis sp. PCC6803 in relation to the altered protein structure.
Rigid-Docking Approaches to Explore Protein-Protein Interaction Space.
Matsuzaki, Yuri; Uchikoga, Nobuyuki; Ohue, Masahito; Akiyama, Yutaka
Protein-protein interactions play core roles in living cells, especially in the regulatory systems. As information on proteins has rapidly accumulated on publicly available databases, much effort has been made to obtain a better picture of protein-protein interaction networks using protein tertiary structure data. Predicting relevant interacting partners from their tertiary structure is a challenging task and computer science methods have the potential to assist with this. Protein-protein rigid docking has been utilized by several projects, docking-based approaches having the advantages that they can suggest binding poses of predicted binding partners which would help in understanding the interaction mechanisms and that comparing docking results of both non-binders and binders can lead to understanding the specificity of protein-protein interactions from structural viewpoints. In this review we focus on explaining current computational prediction methods to predict pairwise direct protein-protein interactions that form protein complexes.
Identifying DNA-binding proteins using structural motifs and the electrostatic potential
Shanahan, Hugh P.; Garcia, Mario A.; Jones, Susan; Thornton, Janet M.
2004-01-01
Robust methods to detect DNA-binding proteins from structures of unknown function are important for structural biology. This paper describes a method for identifying such proteins that (i) have a solvent accessible structural motif necessary for DNA-binding and (ii) have a positive electrostatic potential in the binding region. We focus on three structural motifs: helix–turn–helix (HTH), helix–hairpin–helix (HhH) and helix–loop–helix (HLH). We find that the combination of these variables detects 78% of proteins with an HTH motif, which is a substantial improvement over previous work based purely on structural templates and is comparable to more complex methods of identifying DNA-binding proteins. Similar true positive fractions are achieved for the HhH and HLH motifs. We see evidence of wide evolutionary diversity for DNA-binding proteins with an HTH motif, and much smaller diversity for those with an HhH or HLH motif. PMID:15356290
Basu, Sohini; Sen, Srikanta
2013-02-25
Both structure and dynamics are known to be important for the activity of a protein. A fundamental question is whether a thermophilic protein and its mesophilic homologue exhibit similar dynamics at their respective optimal growth temperatures. We have addressed this question by performing molecular dynamics (MD) simulations of a natural mesophilic-thermophilic homologue pair at their respective optimal growth temperatures to compare their structural, dynamical, and solvent properties. The MD simulations were done in explicit aqueous solvent under periodic boundary and constant pressure and temperature (CPT) conditions and continued for 10.0 ns using the same protocol for the two proteins, except for the temperatures. The trajectories were analyzed to compare the properties of the two proteins. Results indicated that the dynamical behaviors of the two proteins at the respective optimal growth temperatures were remarkably similar. For the common residues in the thermophilic protein, the rms fluctuations tend to be slightly higher than those in the mesophilic counterpart. Lindemann parameter values indicated that only a few residues exhibited solid-like dynamics while the protein as a whole appeared as a molten globule in each case. Interestingly, the water-water interaction was found to be strikingly similar in spite of the difference in temperatures, while the protein-water interaction was significantly different in the two simulations.
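The per-residue rms fluctuation compared in this abstract is, in its usual definition, the root-mean-square deviation of each atom from its time-averaged position over the trajectory. The following is a minimal sketch of that calculation, assuming the frames have already been superimposed on a common reference and are stored as plain coordinate lists; the function name `rmsf` and the data layout are illustrative assumptions, not taken from the study.

```python
import math

def rmsf(trajectory):
    """Root-mean-square fluctuation per atom.

    trajectory: list of frames; each frame is a list of (x, y, z)
    tuples, one per atom (hypothetical layout, frames pre-aligned).
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for a in range(n_atoms):
        # Time-averaged position of atom a.
        mean = [sum(frame[a][d] for frame in trajectory) / n_frames
                for d in range(3)]
        # Mean squared displacement from that average.
        msd = sum(
            sum((frame[a][d] - mean[d]) ** 2 for d in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result
```

Comparing two homologues residue by residue would additionally require a sequence or structure alignment to define the "common residues"; that step is not shown here.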
Fast protein tertiary structure retrieval based on global surface shape similarity.
Sael, Lee; Li, Bin; La, David; Fang, Yi; Ramani, Karthik; Rustamov, Raif; Kihara, Daisuke
2008-09-01
Characterization and identification of similar tertiary structure of proteins provides rich information for investigating function and evolution. The importance of structure similarity searches is increasing as structure databases continue to expand, partly due to the structural genomics projects. A crucial drawback of conventional protein structure comparison methods, which compare structures by their main-chain orientation or the spatial arrangement of secondary structure, is that a database search is too slow to be done in real-time. Here we introduce a global surface shape representation by three-dimensional (3D) Zernike descriptors, which represent a protein structure compactly as a series expansion of 3D functions. With this simplified representation, a search against a few thousand structures takes less than a minute. To investigate the agreement between the surface representation defined by the 3D Zernike descriptor and the conventional main-chain based representation, a benchmark was performed against a protein classification generated by the combinatorial extension algorithm. Despite the different representation, the 3D Zernike descriptor retrieved proteins of the same conformation defined by combinatorial extension in 89.6% of the cases within the top five closest structures. The real-time protein structure search by 3D Zernike descriptor will open up new possibilities for large-scale global and local protein surface shape comparison. © 2008 Wiley-Liss, Inc.
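Once each structure is reduced to a fixed-length descriptor vector, a database search becomes a nearest-neighbour ranking over those vectors, which is why it can run quickly. The sketch below illustrates only that retrieval step; it assumes the descriptors are already computed and uses Euclidean distance as the comparison metric, which is an assumption here rather than something stated in the abstract. The names `euclidean` and `top_k` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query, database, k=5):
    """Return the names of the k database entries closest to the query.

    database: dict mapping a structure name to its descriptor vector
    (precomputed elsewhere; computing the descriptors is not shown).
    """
    ranked = sorted(database, key=lambda name: euclidean(query, database[name]))
    return ranked[:k]
```

With k=5 this mirrors the "top five closest structures" criterion used in the benchmark described above.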
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zemla, A; Lang, D; Kostova, T
2010-11-29
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory - still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.
Feng, Yingang
2017-01-01
The use of NMR methods to determine the three-dimensional structures of carbohydrates and glycoproteins is still challenging, in part because of the lack of standard protocols. In order to increase the convenience of structure determination, the topology and parameter files for carbohydrates in the program Crystallography & NMR System (CNS) were investigated and new files were developed to be compatible with the standard simulated annealing protocols for proteins and nucleic acids. Recalculating the published structures of protein-carbohydrate complexes and glycosylated proteins demonstrates that the results are comparable to the published structures which employed more complex procedures for structure calculation. Integrating the new carbohydrate parameters into the standard structure calculation protocol will facilitate three-dimensional structural study of carbohydrates and glycosylated proteins by NMR spectroscopy. PMID:29232406
Representing and comparing protein structures as paths in three-dimensional space
Zhi, Degui; Krishna, S Sri; Cao, Haibo; Pevzner, Pavel; Godzik, Adam
2006-01-01
Background: Most existing formulations of protein structure comparison are based on detailed atomic level descriptions of protein structures and bypass potential insights that arise from a higher-level abstraction. Results: We propose a structure comparison approach based on a simplified representation of a protein that describes its three-dimensional path by local curvature along the generalized backbone of the polypeptide. We have implemented a dynamic programming procedure that aligns curvatures of proteins by optimizing a defined sum turning angle deviation measure. Conclusion: Although our procedure does not directly optimize global structural similarity as measured by RMSD, our benchmarking results indicate that it can recover surprisingly well the structural similarity defined by structure classification databases and traditional structure alignment programs. In addition, our program can recognize similarities between structures with extensive conformation changes that are beyond the ability of traditional structure alignment programs. We demonstrate the applications of our procedure to several contexts of structure comparison. An implementation of our procedure, CURVE, is available as a public webserver. PMID:17052359
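The idea of aligning two backbones by their local turning angles can be illustrated with a small dynamic programming sketch. This is not the CURVE implementation: the angle definition (angle between successive direction vectors of a point trace), the absolute-difference cost, the linear gap penalty and its value, and the function names `turning_angles` and `align_cost` are all assumptions chosen for illustration.

```python
import math

def turning_angles(points):
    """Angle in radians between successive direction vectors of a 3D path.

    points: list of (x, y, z) tuples; consecutive points must be distinct.
    """
    angles = []
    for p, q, r in zip(points, points[1:], points[2:]):
        u = [b - a for a, b in zip(p, q)]
        v = [b - a for a, b in zip(q, r)]
        dot = sum(x * y for x, y in zip(u, v))
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        # Clamp to guard against rounding slightly outside [-1, 1].
        cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
        angles.append(math.acos(cos_angle))
    return angles

def align_cost(a, b, gap=0.5):
    """Global alignment cost of two angle sequences.

    Minimizes the summed absolute angle deviation of matched positions
    plus a linear gap penalty (assumed scoring scheme).
    """
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + abs(a[i - 1] - b[j - 1]),  # match
                dp[i - 1][j] + gap,                            # gap in b
                dp[i][j - 1] + gap,                            # gap in a
            )
    return dp[n][m]
```

Because the turning angles are invariant to rotation and translation, no superposition is needed before alignment, which is consistent with the abstract's point that the procedure does not optimize RMSD directly.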
NASA Astrophysics Data System (ADS)
Yu, Peiqiang; Jonker, Arjan; Gruber, Margaret
2009-09-01
To date there has been very little application of synchrotron radiation-based Fourier transform infrared microspectroscopy (SRFTIRM) to the study of molecular structures in plant forage in relation to livestock digestive behavior and nutrient availability. Protein inherent structure, among other factors such as protein matrix, affects nutritive quality, fermentation and degradation behavior in both humans and animals. The relative percentage of protein secondary structure influences protein value. A high percentage of β-sheets usually reduces the access of gastrointestinal digestive enzymes to the protein. Reduced accessibility results in poor digestibility and, as a result, low protein value. The objective of this study was to use SRFTIRM to compare protein molecular structure of alfalfa plant tissues transformed with the maize Lc regulatory gene with non-transgenic alfalfa protein within cellular and subcellular dimensions and to quantify protein inherent structure profiles using Gaussian and Lorentzian methods of multi-component peak modeling. Protein molecular structure revealed by this method included α-helices, β-sheets and other structures such as β-turns and random coils. Hierarchical cluster analysis and principal component analysis of the synchrotron data, as well as accurate spectral analysis based on curve fitting, showed that transgenic alfalfa contained a relatively lower (P < 0.05) percentage of the model-fitted α-helices (29 vs. 34) and model-fitted β-sheets (22 vs. 27) and a higher (P < 0.05) percentage of other model-fitted structures (49 vs. 39). Transgenic alfalfa protein displayed no difference (P > 0.05) in the ratio of α-helices to β-sheets (average: 1.4) and higher (P < 0.05) ratios of α-helices to others (0.7 vs. 0.9) and β-sheets to others (0.5 vs. 0.8) than the non-transgenic alfalfa protein. The transgenic protein structures also exhibited no difference (P > 0.05) in the vibrational intensity of protein amide I (average of 24) and amide II areas (average of 10) and their ratio (average of 2.4) compared with non-transgenic alfalfa. Cluster analysis and principal component analysis showed no significant differences between the two genotypes in the broad molecular fingerprint region, amides I and II regions, and the carbohydrate molecular region, indicating they are highly related to each other. The results suggest that transgenic Lc-alfalfa leaves contain similar proteins to non-transgenic alfalfa (because amide I and II intensities were identical), but a subtle difference in protein molecular structure after freeze drying. Further study is needed to understand the relationship between these structural profiles and biological features such as protein nutrient availability, protein bypass and digestive behavior of livestock fed with this type of forage.
Structural anatomy of telomere OB proteins.
Horvath, Martin P
2011-10-01
Telomere DNA-binding proteins protect the ends of chromosomes in eukaryotes. A subset of these proteins are constructed with one or more OB folds and bind with G+T-rich single-stranded DNA found at the extreme termini. The resulting DNA-OB protein complex interacts with other telomere components to coordinate critical telomere functions of DNA protection and DNA synthesis. While the first crystal and NMR structures readily explained protection of telomere ends, the picture of how single-stranded DNA becomes available to serve as primer and template for synthesis of new telomere DNA is only recently coming into focus. New structures of telomere OB fold proteins alongside insights from genetic and biochemical experiments have made significant contributions towards understanding how protein-binding OB proteins collaborate with DNA-binding OB proteins to recruit telomerase and DNA polymerase for telomere homeostasis. This review surveys telomere OB protein structures alongside highly comparable structures derived from replication protein A (RPA) components, with the goal of providing a molecular context for understanding telomere OB protein evolution and mechanism of action in protection and synthesis of telomere DNA. PMID:21950380
Purification and protein composition of endogenous rat viruses.
Hlubinová, K; Prachar, J; Vrbenská, A; Matoska, J; Simkovic, D
1984-01-01
Endogenous retroviruses are, in the majority of cases, not the cause of any neoplasia except under laboratory conditions. Because they might contribute to the evolution of pathogenic retroviruses, they deserve more attention. In this paper we introduce some approaches to the purification of rat endogenous retroviruses to a degree of purity that enabled satisfactory SDS-PAGE analysis of their structural proteins. The purities of samples obtained by the usual purification methods, long-term isopycnic centrifugation at a high gravity force, and velocity centrifugation are compared. The protein profile of rat endogenous virus in SDS-PAGE is compared with those of other retroviruses. For the first time, evidence was obtained for a striking similarity between the electrophoretic protein profiles of rat endogenous virus WERC and feline leukemia virus. The major structural proteins of rat endogenous retrovirus and feline leukemia virus cannot be distinguished even when resolution long gradient PAGE was employed. The agreement of the electrophoretic mobilities of the major structural proteins in SDS-PAGE can indicate the relatedness of retroviruses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Domingo Meza-Aguilar, J.; Laboratorio de Patogenicidad Bacteriana, Unidad de Hemato Oncología e Investigación, Hospital Infantil de México Federico Gómez 06720, D.F.; Fromme, Petra
Highlights: • X-ray crystal structure of the passenger domain of Plasmid encoded toxin at 2.3 Å. • Structural differences between Pet passenger domain and EspP protein are described. • High flexibility of the C-terminal beta helix is structurally assigned. - Abstract: Autotransporters (ATs) represent a superfamily of proteins produced by a variety of pathogenic bacteria, which include the pathogenic groups of Escherichia coli (E. coli) associated with gastrointestinal and urinary tract infections. We present the first X-ray structure, at 2.3 Å resolution, of the passenger domain from the Plasmid-encoded toxin (Pet), a 100 kDa protein which is a cause of acute diarrhea in both developing and industrialized countries. Pet is a cytoskeleton-altering toxin that induces loss of actin stress fibers. While Pet (pdb code: 4OM9) shows only a sequence identity of 50% compared to the closest related protein sequence, extracellular serine protease plasmid (EspP), the structural features of both proteins are conserved. A closer structural look reveals that Pet contains a β-pleated sheet at the sequence region of residues 181–190, whereas the corresponding structural domain in EspP consists of a coiled loop. Second, the Pet passenger domain features a more pronounced beta sheet between residues 135 and 143 compared to the structure of EspP.
Purely Structural Protein Scoring Functions Using Support Vector Machine and Ensemble Learning.
Mirzaei, Shokoufeh; Sidi, Tomer; Keasar, Chen; Crivelli, Silvia
2016-08-24
The function of a protein is determined by its structure, which creates a need for efficient methods of protein structure determination to advance scientific and medical research. Because current experimental structure determination methods carry a high price tag, computational predictions are highly desirable. Given a protein sequence, computational methods produce numerous 3D structures known as decoys. However, selection of the best quality decoys is challenging as the end users can handle only a few. Therefore, scoring functions are central to decoy selection. They combine measurable features into a single-number indicator of decoy quality. Unfortunately, current scoring functions do not consistently select the best decoys. Machine learning techniques offer great potential to improve decoy scoring. This paper presents two machine-learning based scoring functions to predict the quality of protein structures, i.e., the similarity between the predicted structure and the experimental one without knowing the latter. We use different metrics to compare these scoring functions against three state-of-the-art scores. This is a first attempt at comparing different scoring functions using the same non-redundant dataset for training and testing and the same features. The results show that adding informative features may be more significant than the method used.
Nakamura, Akira; Ohtsuka, Jun; Kashiwagi, Tatsuki; Numoto, Nobutaka; Hirota, Noriyuki; Ode, Takahiro; Okada, Hidehiko; Nagata, Koji; Kiyohara, Motosuke; Suzuki, Ei-Ichiro; Kita, Akiko; Wada, Hitoshi; Tanokura, Masaru
2016-02-26
Precise protein structure determination provides significant information for life science research, although high-quality crystals are not easily obtained. We developed a system for producing high-quality protein crystals with high throughput. Using this system, gravity-controlled crystallization is made possible by a magnetic microgravity environment. In addition, in-situ and real-time observation and time-lapse imaging of crystal growth are feasible for over 200 solution samples independently. In this paper, we also report results of crystallization experiments for two protein samples. Crystals grown in the system exhibited magnetic orientation and showed higher and more homogeneous quality compared with the control crystals. The structural analysis reveals that making use of the magnetic microgravity during the crystallization process helps us to build a well-refined protein structure model, which has no significant structural differences with a control structure. Therefore, the system contributes to improvement in efficiency of structural analysis for "difficult" proteins, such as membrane proteins and supermolecular complexes.
Understand protein functions by comparing the similarity of local structural environments.
Chen, Jiawen; Xie, Zhong-Ru; Wu, Yinghao
2017-02-01
The three-dimensional structures of proteins play an essential role in regulating binding between proteins and their partners, offering a direct relationship between structures and functions of proteins. It is widely accepted that the function of a protein can be determined if its structure is similar to other proteins whose functions are known. However, it is also observed that proteins with similar global structures do not necessarily correspond to the same function, while proteins with very different folds can share similar functions. This indicates that function similarity originates from the local structural information of proteins instead of their global shapes. We assume that proteins with similar local environments prefer binding to similar types of molecular targets. To test this assumption, we designed a new structural indicator to define the similarity of local environment between residues in different proteins. This indicator was further used to calculate the probability that a given residue binds to a specific type of structural neighbors, including DNA, RNA, small molecules and proteins. After applying the method to a large-scale non-redundant database of proteins, we show that the positive signal of binding probability calculated from the local structural indicator is statistically meaningful. In summary, our studies suggested that the local environment of residues in a protein is a good indicator to recognize specific binding partners of the protein. The new method could be a potential addition to a suite of existing template-based approaches for protein function prediction. Copyright © 2016 Elsevier B.V. All rights reserved.
Predicting residue-wise contact orders in proteins by support vector regression.
Song, Jiangning; Burrage, Kevin
2006-10-03
The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered as a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences.
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
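The two figures of merit quoted in this abstract, the Pearson correlation coefficient (CC) and the root mean square error (RMSE) between predicted and observed RWCO values, can be sketched as follows. This is only an illustration of the metrics, not the authors' SVR pipeline; the function names are hypothetical and the inputs are assumed to be equal-length lists of per-residue values.

```python
import math

def pearson_cc(predicted, observed):
    """Pearson correlation coefficient between two equal-length value lists."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_p * var_o)

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(predicted))
```

A higher CC and a lower RMSE both indicate better agreement, which is how the encoding schemes in the abstract are ranked.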
Meza-Aguilar, J. Domingo; Fromme, Petra; Torres-Larios, Alfredo; Mendoza-Hernández, Guillermo; Hernandez-Chiñas, Ulises; Monteros, Roberto A. Arreguin-Espinosa de los; Campos, Carlos A. Eslava; Fromme, Raimund
2014-01-01
Autotransporters (ATs) represent a superfamily of proteins produced by a variety of pathogenic bacteria, which include the pathogenic groups of Escherichia coli (E. coli) associated with gastrointestinal and urinary tract infections. We present the first X-ray structure, at 2.3 Å resolution, of the passenger domain from the Plasmid-encoded toxin (Pet), a 100 kDa protein that is a cause of acute diarrhea in both developing and industrialized countries. Pet is a cytoskeleton-altering toxin that induces loss of actin stress fibers. While Pet (pdb code: 4OM9) shows a sequence identity of only 50% to the most closely related protein, extracellular serine protease plasmid (EspP), the structural features of both proteins are conserved. A closer structural look reveals that Pet contains a β-pleated sheet at residues 181-190, whereas the corresponding region in EspP consists of a coiled loop. Second, the Pet passenger domain features a more pronounced beta sheet between residues 135-143 compared to the structure of EspP. PMID:24530907
Deciphering Cryptic Binding Sites on Proteins by Mixed-Solvent Molecular Dynamics.
Kimura, S Roy; Hu, Hai Peng; Ruvinsky, Anatoly M; Sherman, Woody; Favia, Angelo D
2017-06-26
In recent years, molecular dynamics simulations of proteins in explicit mixed solvents have been applied to various problems in protein biophysics and drug discovery, including protein folding, protein surface characterization, fragment screening, allostery, and druggability assessment. In this study, we perform a systematic study on how mixtures of organic solvent probes in water can reveal cryptic ligand binding pockets that are not evident in crystal structures of apo proteins. We examine a diverse set of eight PDB proteins that show pocket opening induced by ligand binding and investigate whether solvent MD simulations on the apo structures can induce the binding site observed in the holo structures. The cosolvent simulations were found to induce conformational changes on the protein surface, which were characterized and compared with the holo structures. Analyses of the biological systems, choice of probes and concentrations, druggability of the resulting induced pockets, and application to drug discovery are discussed here.
Protein Interaction Profile Sequencing (PIP-seq).
Foley, Shawn W; Gregory, Brian D
2016-10-10
Every eukaryotic RNA transcript undergoes extensive post-transcriptional processing from the moment of transcription up through degradation. This regulation is performed by a distinct cohort of RNA-binding proteins which recognize their target transcript by both its primary sequence and secondary structure. Here, we describe protein interaction profile sequencing (PIP-seq), a technique that uses ribonuclease-based footprinting followed by high-throughput sequencing to globally assess both protein-bound RNA sequences and RNA secondary structure. PIP-seq utilizes single- and double-stranded RNA-specific nucleases in the absence of proteins to infer RNA secondary structure. These libraries are also compared to samples that undergo nuclease digestion in the presence of proteins in order to find enriched protein-bound sequences. Combined, these four libraries provide a comprehensive, transcriptome-wide view of RNA secondary structure and RNA protein interaction sites from a single experimental technique. © 2016 by John Wiley & Sons, Inc.
Tan, Yen Hock; Huang, He; Kihara, Daisuke
2006-08-15
Aligning distantly related protein sequences is a long-standing problem in bioinformatics, and a key for successful protein structure prediction. Its importance has been increasing recently in the context of structural genomics projects because more and more experimentally solved structures are available as templates for protein structure modeling. Toward this end, recent structure prediction methods employ profile-profile alignments, and various ways of aligning two profiles have been developed. More fundamentally, a better amino acid similarity matrix can improve a profile itself, thereby resulting in more accurate profile-profile alignments. Here we have developed novel amino acid similarity matrices from knowledge-based amino acid contact potentials. Contact potentials are used because the contact propensity to the other amino acids would be one of the most conserved features of each position of a protein structure. The derived amino acid similarity matrices are tested on benchmark alignments at three different levels, namely, the family, the superfamily, and the fold level. Compared to BLOSUM45 and the other existing matrices, the contact potential-based matrices perform comparably in the family level alignments, but clearly outperform them in the fold level alignments. The contact potential-based matrices perform even better when suboptimal alignments are considered. Comparing the matrices themselves with each other revealed that the contact potential-based matrices are very different from BLOSUM45 and the other matrices, indicating that they are located in a different basin in the amino acid similarity matrix space.
Protein 3D Structure and Electron Microscopy Map Retrieval Using 3D-SURFER2.0 and EM-SURFER.
Han, Xusi; Wei, Qing; Kihara, Daisuke
2017-12-08
With the rapid growth in the number of solved protein structures stored in the Protein Data Bank (PDB) and the Electron Microscopy Data Bank (EMDB), it is essential to develop tools to perform real-time structure similarity searches against the entire structure database. Since conventional structure alignment methods need to sample different orientations of proteins in the three-dimensional space, they are time consuming and unsuitable for rapid, real-time database searches. To this end, we have developed 3D-SURFER and EM-SURFER, which utilize 3D Zernike descriptors (3DZD) to conduct high-throughput protein structure comparison, visualization, and analysis. Taking an atomic structure or an electron microscopy map of a protein or a protein complex as input, the 3DZD of a query protein is computed and compared with the 3DZD of all other proteins in PDB or EMDB. In addition, local geometrical characteristics of a query protein can be analyzed using VisGrid and LIGSITE CSC in 3D-SURFER. This article describes how to use 3D-SURFER and EM-SURFER to carry out protein surface shape similarity searches, local geometric feature analysis, and interpretation of the search results. © 2017 by John Wiley & Sons, Inc.
Integrating protein structural dynamics and evolutionary analysis with Bio3D.
Skjærven, Lars; Yao, Xin-Qiu; Scarabelli, Guido; Grant, Barry J
2014-12-10
Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution. Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case. The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform independent R package under a GPL2 license from http://thegrantlab.org/bio3d/ .
Analysis of Structural Features Contributing to Weak Affinities of Ubiquitin/Protein Interactions.
Cohen, Ariel; Rosenthal, Eran; Shifman, Julia M
2017-11-10
Ubiquitin is a small protein that enables one of the most common post-translational modifications, where the whole ubiquitin molecule is attached to various target proteins, forming mono- or polyubiquitin conjugations. As a prototypical multispecific protein, ubiquitin interacts non-covalently with a variety of proteins in the cell, including ubiquitin-modifying enzymes and ubiquitin receptors that recognize signals from ubiquitin-conjugated substrates. To enable recognition of multiple targets and to support fast dissociation from the ubiquitin modifying enzymes, ubiquitin/protein interactions are characterized with low affinities, frequently in the higher μM and lower mM range. To determine how structure encodes low binding affinity of ubiquitin/protein complexes, we analyzed structures of more than a hundred such complexes compiled in the Ubiquitin Structural Relational Database. We calculated various structure-based features of ubiquitin/protein binding interfaces and compared them to the same features of general protein-protein interactions (PPIs) with various functions and generally higher affinities. Our analysis shows that ubiquitin/protein binding interfaces on average do not differ in size and shape complementarity from interfaces of higher-affinity PPIs. However, they contain fewer favorable hydrogen bonds and more unfavorable hydrophobic/charge interactions. We further analyzed how binding interfaces change upon affinity maturation of ubiquitin toward its target proteins. We demonstrate that while different features are improved in different experiments, the majority of the evolved complexes exhibit better shape complementarity and hydrogen bond pattern compared to wild-type complexes. Our analysis helps to understand how low-affinity PPIs have evolved and how they could be converted into high-affinity PPIs. Copyright © 2017 Elsevier Ltd. All rights reserved.
Systems biology of the structural proteome.
Brunk, Elizabeth; Mih, Nathan; Monk, Jonathan; Zhang, Zhen; O'Brien, Edward J; Bliven, Spencer E; Chen, Ke; Chang, Roger L; Bourne, Philip E; Palsson, Bernhard O
2016-03-11
The success of genome-scale models (GEMs) can be attributed to the high-quality, bottom-up reconstructions of metabolic, protein synthesis, and transcriptional regulatory networks on an organism-specific basis. Such reconstructions are biochemically, genetically, and genomically structured knowledge bases that can be converted into a mathematical format to enable a myriad of computational biological studies. In recent years, genome-scale reconstructions have been extended to include protein structural information, which has opened up new vistas in systems biology research and empowered applications in structural systems biology and systems pharmacology. Here, we present the generation, application, and dissemination of genome-scale models with protein structures (GEM-PRO) for Escherichia coli and Thermotoga maritima. We show the utility of integrating molecular scale analyses with systems biology approaches by discussing several comparative analyses on the temperature dependence of growth, the distribution of protein fold families, substrate specificity, and characteristic features of whole cell proteomes. Finally, to aid in the grand challenge of big data to knowledge, we provide several explicit tutorials of how protein-related information can be linked to genome-scale models in a public GitHub repository ( https://github.com/SBRG/GEMPro/tree/master/GEMPro_recon/). Translating genome-scale, protein-related information to structured data in the format of a GEM provides a direct mapping of gene to gene-product to protein structure to biochemical reaction to network states to phenotypic function. Integration of molecular-level details of individual proteins, such as their physical, chemical, and structural properties, further expands the description of biochemical network-level properties, and can ultimately influence how to model and predict whole cell phenotypes as well as perform comparative systems biology approaches to study differences between organisms. 
GEM-PRO offers insight into the physical embodiment of an organism's genotype, and its use in this comparative framework enables exploration of adaptive strategies for these organisms, opening the door to many new lines of research. With these provided tools, tutorials, and background, the reader will be in a position to run GEM-PRO for their own purposes.
Measuring and comparing structural fluctuation patterns in large protein datasets.
Fuglebakk, Edvin; Echave, Julián; Reuter, Nathalie
2012-10-01
The function of a protein depends not only on its structure but also on its dynamics. This is at the basis of a large body of experimental and theoretical work on protein dynamics. Further insight into the dynamics-function relationship can be gained by studying the evolutionary divergence of protein motions. To investigate this, we need appropriate comparative dynamics methods. The most used dynamical similarity score is the correlation between the root mean square fluctuations (RMSF) of aligned residues. Despite its usefulness, RMSF is in general less evolutionarily conserved than the native structure. A fundamental issue is whether RMSF is not as conserved as structure because dynamics is less conserved or because RMSF is not the best property to use to study its conservation. We performed a systematic assessment of several scores that quantify the (dis)similarity between protein fluctuation patterns. We show that the best scores perform as well as or better than structural dissimilarity, as assessed by their consistency with the SCOP classification. We conclude that to uncover the full extent of the evolutionary conservation of protein fluctuation patterns, it is important to measure the directions of fluctuations and their correlations between sites. Contact: Nathalie.Reuter@mbi.uib.no. Supplementary data are available at Bioinformatics Online.
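The baseline score named in this abstract, the correlation between the RMSF profiles of aligned residues, can be sketched as below. This is a minimal illustration under stated assumptions: the frames are already superimposed, each residue is represented by one (x, y, z) point, and the two profiles are already aligned residue by residue. The function names are hypothetical, and the direction-aware scores that the authors recommend are not shown.

```python
import math

def rmsf(frames):
    """Per-residue root mean square fluctuation.

    frames[t][i] is the (x, y, z) position of residue i in frame t;
    all frames are assumed to be superimposed beforehand.
    """
    n_frames = len(frames)
    n_res = len(frames[0])
    profile = []
    for i in range(n_res):
        # Mean position of residue i over all frames.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement from that mean position.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        profile.append(math.sqrt(msd))
    return profile

def profile_correlation(a, b):
    """Pearson correlation between two aligned, non-constant RMSF profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

Because RMSF keeps only the magnitude of each residue's fluctuation, two proteins can score highly here while moving in different directions, which is the limitation the abstract points to.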
Wolf, Maxim Y; Wolf, Yuri I; Koonin, Eugene V
2008-01-01
Background: Proteins show a broad range of evolutionary rates. Understanding the factors that are responsible for the characteristic rate of evolution of a given protein arguably is one of the major goals of evolutionary biology. A long-standing general assumption used to be that the evolution rate is, primarily, determined by the specific functional constraints that affect the given protein. These constraints were traditionally thought to depend both on the specific features of the protein's structure and its biological role. The advent of systems biology brought about new types of data, such as expression level and protein-protein interactions, and unexpectedly, a variety of correlations between protein evolution rate and these variables have been observed. The strongest connections by far were repeatedly seen between protein sequence evolution rate and the expression level of the respective gene. It has been hypothesized that this link is due to the selection for the robustness of the protein structure to mistranslation-induced misfolding that is particularly important for highly expressed proteins and is the dominant determinant of the sequence evolution rate. Results: This work is an attempt to assess the relative contributions of protein domain structure and function, on the one hand, and expression level on the other hand, to the rate of sequence evolution. To this end, we performed a genome-wide analysis of the effect of the fusion of a pair of domains in multidomain proteins on the difference in the domain-specific evolutionary rates. The mistranslation-induced misfolding hypothesis would predict that, within multidomain proteins, fused domains, on average, should evolve at substantially closer rates than the same domains in different proteins because, within a multidomain protein, all domains are translated at the same rate.
We performed a comprehensive comparison of the evolutionary rates of mammalian and plant protein domains that are either joined in multidomain proteins or contained in distinct proteins. Substantial homogenization of evolutionary rates in multidomain proteins was, indeed, observed in both animals and plants, although highly significant differences between domain-specific rates remained. The contributions of the translation rate, as determined by the effect of the fusion of a pair of domains within a multidomain protein, and intrinsic, domain-specific structural-functional constraints appear to be comparable in magnitude. Conclusion: Fusion of domains in a multidomain protein results in substantial homogenization of the domain-specific evolutionary rates but significant differences between domain-specific evolution rates remain. Thus, the rate of translation and intrinsic structural-functional constraints both exert sizable and comparable effects on sequence evolution. Reviewers: This article was reviewed by Sergei Maslov, Dennis Vitkup, Claus Wilke (nominated by Orly Alter), and Allan Drummond (nominated by Joel Bader). For the full reviews, please go to the Reviewers' Reports section. PMID:18840284
Chira, Camelia; Horvath, Dragos; Dumitrescu, D
2011-07-30
Proteins are complex structures made of amino acids having a fundamental role in the correct functioning of living cells. The structure of a protein is the result of the protein folding process. However, the general principles that govern the folding of natural proteins into a native structure are unknown. The problem of predicting a protein structure with minimum energy starting from the unfolded amino acid sequence is a highly complex and important task in molecular and computational biology. Protein structure prediction has important applications in fields such as drug design and disease prediction. The protein structure prediction problem is NP-hard even in simplified lattice protein models. An evolutionary model based on hill-climbing genetic operators is proposed for protein structure prediction in the hydrophobic-polar (HP) model. Problem-specific search operators are implemented and applied using a steepest-ascent hill-climbing approach. Furthermore, the proposed model enforces an explicit diversification stage during the evolution in order to avoid local optima. The main features of the resulting evolutionary algorithm, the hill-climbing mechanism and the diversification strategy, are evaluated in a set of numerical experiments for the protein structure prediction problem to assess their impact on the efficiency of the search process. Furthermore, the emerging consolidated model is compared to relevant algorithms from the literature for a set of difficult bidimensional instances from lattice protein models. The results obtained by the proposed algorithm are promising and competitive with those of related methods.
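The objective that any search in the HP model minimizes can be sketched as follows. This is a minimal illustration of the standard two-dimensional square-lattice HP energy (minus one for every pair of H residues that are lattice neighbours but not chain neighbours), not the authors' evolutionary algorithm. The move alphabet U/D/L/R and the function names are assumptions made for this example.

```python
# Unit steps on the square lattice for each move letter (illustrative encoding).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def fold(moves):
    """Turn a move string into lattice coordinates, starting at the origin."""
    coords = [(0, 0)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords

def hp_energy(sequence, moves):
    """Energy of an HP sequence folded along `moves`.

    Returns None when the walk is not self-avoiding (an invalid conformation).
    """
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")
    coords = fold(moves)
    if len(set(coords)) != len(coords):
        return None
    index_at = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that every lattice pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

A hill-climbing operator of the kind described in the abstract would repeatedly change the move string and keep the change only when `hp_energy` decreases.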
Swetha, Rayapadi G.
2014-01-01
The T118M mutation in the PMP22 gene is associated with Charcot-Marie-Tooth disease type 1A (CMT1A). CMT1A is a form of Charcot-Marie-Tooth disease, the most common inherited disorder of the peripheral nervous system. Mutations in CMT-related disorders are seen to increase the stability of the protein, resulting in the diseased state. We performed SNP analysis for all the nsSNPs of PMP22 protein and carried out molecular dynamics simulation for the T118M mutation to compare the stability difference between the wild type protein structure and the mutant protein structure. The mutation T118M resulted in an overall increase in the stability of the mutant protein. The superimposed structure shows marked structural variation between the wild type and the mutant protein structures. PMID:25400662
Lubin, Johnathan W; Rao, Timsi; Mandell, Edward K; Wuttke, Deborah S; Lundblad, Victoria
2013-03-01
Mutations that confer the loss of a single biochemical property (separation-of-function mutations) can often uncover a previously unknown role for a protein in a particular biological process. However, most mutations are identified based on loss-of-function phenotypes, which cannot differentiate between separation-of-function alleles vs. mutations that encode unstable/unfolded proteins. An alternative approach is to use overexpression dominant-negative (ODN) phenotypes to identify mutant proteins that disrupt function in an otherwise wild-type strain when overexpressed. This is based on the assumption that such mutant proteins retain an overall structure that is comparable to that of the wild-type protein and are able to compete with the endogenous protein (Herskowitz 1987). To test this, the in vivo phenotypes of mutations in the Est3 telomerase subunit from Saccharomyces cerevisiae were compared with the in vitro secondary structure of these mutant proteins as analyzed by circular-dichroism spectroscopy, which demonstrates that ODN is a more sensitive assessment of protein stability than the commonly used method of monitoring protein levels from extracts. Reverse mutagenesis of EST3, which targeted different categories of amino acids, also showed that mutating highly conserved charged residues to the oppositely charged amino acid had an increased likelihood of generating a severely defective est3(-) mutation, which nevertheless encoded a structurally stable protein. These results suggest that charge-swap mutagenesis directed at a limited subset of highly conserved charged residues, combined with ODN screening to eliminate partially unfolded proteins, may provide a widely applicable and efficient strategy for generating separation-of-function mutations.
Identify High-Quality Protein Structural Models by Enhanced K-Means.
Wu, Hongjie; Li, Haiou; Jiang, Min; Chen, Cheng; Lv, Qiang; Wu, Chuang
2017-01-01
Background. One critical issue in protein three-dimensional structure prediction using either ab initio or comparative modeling involves identification of high-quality protein structural models from generated decoys. Currently, clustering algorithms are widely used to identify near-native models; however, their performance is dependent upon different conformational decoys, and, for some algorithms, the accuracy declines when the decoy population increases. Results. Here, we proposed two enhanced K-means clustering algorithms capable of robustly identifying high-quality protein structural models. The first one employs the clustering algorithm SPICKER to determine the initial centroids for basic K-means clustering (SK-means), whereas the other employs squared distance to optimize the initial centroids (K-means++). Our results showed that SK-means and K-means++ were more robust as compared with SPICKER alone, detecting 33 (59%) and 42 (75%) of 56 targets, respectively, with template modeling scores better than or equal to those of SPICKER. Conclusions. We observed that the classic K-means algorithm showed a similar performance to that of SPICKER, which is a widely used algorithm for protein-structure identification. Both SK-means and K-means++ demonstrated substantial improvements relative to results from SPICKER and classical K-means.
Identify High-Quality Protein Structural Models by Enhanced K-Means
Li, Haiou; Chen, Cheng; Lv, Qiang; Wu, Chuang
2017-01-01
Background. One critical issue in protein three-dimensional structure prediction using either ab initio or comparative modeling involves identification of high-quality protein structural models from generated decoys. Currently, clustering algorithms are widely used to identify near-native models; however, their performance is dependent upon different conformational decoys, and, for some algorithms, the accuracy declines when the decoy population increases. Results. Here, we proposed two enhanced K-means clustering algorithms capable of robustly identifying high-quality protein structural models. The first one employs the clustering algorithm SPICKER to determine the initial centroids for basic K-means clustering (SK-means), whereas the other employs squared distance to optimize the initial centroids (K-means++). Our results showed that SK-means and K-means++ were more robust as compared with SPICKER alone, detecting 33 (59%) and 42 (75%) of 56 targets, respectively, with template modeling scores better than or equal to those of SPICKER. Conclusions. We observed that the classic K-means algorithm showed a similar performance to that of SPICKER, which is a widely used algorithm for protein-structure identification. Both SK-means and K-means++ demonstrated substantial improvements relative to results from SPICKER and classical K-means. PMID:28421198
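The squared-distance seeding that the abstract refers to as K-means++ can be sketched as below. This follows the generic K-means++ initialization (each new centroid is drawn with probability proportional to its squared distance from the nearest centroid already chosen) and is not taken from the authors' code. It assumes each decoy has been reduced to a numeric feature vector and uses Euclidean distance, whereas a decoy-clustering tool would more likely use a structural distance such as RMSD. The function names are hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_init(points, k, rng=None):
    """Choose k initial centroids from `points` by K-means++ seeding."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # First centroid: uniformly at random.
    centers = [rng.choice(points)]
    # nearest[i] = squared distance from points[i] to its closest centroid.
    nearest = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(nearest) == 0:
            # Every point coincides with a chosen centroid; fall back to uniform.
            new_center = rng.choice(points)
        else:
            # Far-away points are proportionally more likely to be picked.
            new_center = rng.choices(points, weights=nearest, k=1)[0]
        centers.append(new_center)
        nearest = [
            min(d, squared_distance(p, new_center))
            for p, d in zip(points, nearest)
        ]
    return centers
```

The centroids returned here would then be handed to an ordinary K-means loop; spreading the seeds out in this way is what makes the result less sensitive to the starting point than uniform random seeding.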
Bandyopadhyay, Deepak; Huan, Jun; Prins, Jan; Snoeyink, Jack; Wang, Wei; Tropsha, Alexander
2009-11-01
Protein function prediction is one of the central problems in computational biology. We present a novel automated protein structure-based function prediction method using libraries of local residue packing patterns that are common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph where residue vertices (residue name used as a vertex label) are connected by geometrical proximity edges. The approach employs two steps. First, it uses a fast subgraph mining algorithm to find all occurrences of family-specific labeled subgraphs for all well characterized protein structural and functional families. Second, it queries a new structure for occurrences of a set of motifs characteristic of a known family, using a graph index to speed up Ullmann's subgraph isomorphism algorithm. The confidence of function inference from structure depends on the number of family-specific motifs found in the query structure compared with their distribution in a large non-redundant database of proteins. This method can assign a new structure to a specific functional family in cases where sequence alignments, sequence patterns, structural superposition and active site templates fail to provide accurate annotation.
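The graph representation described in this abstract (residue-labeled vertices joined by proximity edges) might be sketched as follows. This is an illustration only: it assumes one representative coordinate per residue and a simple distance cutoff, whereas the abstract does not say how proximity is defined. The function name and the 8.0 Å default are hypothetical.

```python
import math


def residue_graph(residues, cutoff=8.0):
    """Build a labeled proximity graph from per-residue coordinates.

    residues: list of (residue_name, (x, y, z)) tuples, one point per residue.
    cutoff:   hypothetical distance threshold in angstroms (an assumption,
              not a value taken from the paper).
    Returns (labels, edges), where edges is a set of index pairs (i, j), i < j.
    """
    labels = [name for name, _ in residues]
    edges = set()
    for i in range(len(residues)):
        for j in range(i + 1, len(residues)):
            # Connect two residues when their points lie within the cutoff.
            if math.dist(residues[i][1], residues[j][1]) <= cutoff:
                edges.add((i, j))
    return labels, edges
```

A subgraph miner or an isomorphism search would then operate on `labels` and `edges`; the all-pairs loop is quadratic and is kept only for clarity.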
Classification of protein quaternary structure by functional domain composition
Yu, Xiaojing; Wang, Chuan; Li, Yixue
2006-01-01
Background The number and the arrangement of subunits that form a protein are referred to as quaternary structure. Quaternary structure is an important protein attribute that is closely related to its function. Proteins with quaternary structure are called oligomeric proteins. Oligomeric proteins are involved in various biological processes, such as metabolism, signal transduction, and chromosome replication. Thus, it is highly desirable to develop some computational methods to automatically classify the quaternary structure of proteins from their sequences. Results To explore this problem, we adopted an approach based on the functional domain composition of proteins. Every protein was represented by a vector calculated from the domains in the PFAM database. The nearest neighbor algorithm (NNA) was used for classifying the quaternary structure of proteins from this information. The jackknife cross-validation test was performed on the non-redundant protein dataset in which the sequence identity was less than 25%. The overall success rate obtained is 75.17%. Additionally, to demonstrate the effectiveness of this method, we predicted the proteins in an independent dataset and achieved an overall success rate of 84.11%. Conclusion Compared with the amino acid composition method and Blast, the results indicate that the domain composition approach may be a more effective and promising high-throughput method in dealing with this complicated problem in bioinformatics. PMID:16584572
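The combination of a nearest neighbor classifier with a jackknife (leave-one-out) test can be sketched in a few lines. The sketch assumes each protein is a numeric domain-composition vector and uses cosine similarity to find the nearest neighbor; the abstract does not state which similarity measure the authors used, so that choice and all names below are assumptions.

```python
def cosine_similarity(u, v):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_neighbor_label(query, examples):
    """Return the label of the example most similar to the query vector."""
    best = max(examples, key=lambda ex: cosine_similarity(query, ex[0]))
    return best[1]


def jackknife_accuracy(dataset):
    """Leave-one-out success rate over a list of (vector, label) pairs."""
    correct = 0
    for i, (vec, label) in enumerate(dataset):
        # Hold out protein i and classify it from all the others.
        rest = dataset[:i] + dataset[i + 1:]
        if nearest_neighbor_label(vec, rest) == label:
            correct += 1
    return correct / len(dataset)
```

Each protein is predicted once from all remaining proteins, and the fraction predicted correctly corresponds to the kind of overall success rate the abstract reports.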
Predicting nucleic acid binding interfaces from structural models of proteins
Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael
2011-01-01
The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach, we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared to patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. PMID:22086767
2010-01-01
Background Trypanosoma cruzi is the etiological agent of Chagas' disease, an endemic infection that causes thousands of deaths every year in Latin America. Therapeutic options remain inefficient, demanding the search for new drugs and/or new molecular targets. Such efforts can focus on proteins that are specific to the parasite, but analogous enzymes and enzymes with a three-dimensional (3D) structure sufficiently different from the corresponding host proteins may represent equally interesting targets. In order to find these targets we used the workflows MHOLline and AnEnΠ obtaining 3D models from homologous, analogous and specific proteins of Trypanosoma cruzi versus Homo sapiens. Results We applied genome wide comparative modelling techniques to obtain 3D models for 3,286 predicted proteins of T. cruzi. In combination with comparative genome analysis to Homo sapiens, we were able to identify a subset of 397 enzyme sequences, of which 356 are homologous, 3 analogous and 38 specific to the parasite. Conclusions In this work, we present a set of 397 enzyme models of T. cruzi that can constitute potential structure-based drug targets to be investigated for the development of new strategies to fight Chagas' disease. The strategies presented here support the concept of structural analysis in conjunction with protein functional analysis as an interesting computational methodology to detect potential targets for structure-based rational drug design. For example, 2,4-dienoyl-CoA reductase (EC 1.3.1.34) and triacylglycerol lipase (EC 3.1.1.3), classified as analogous proteins in relation to H. sapiens enzymes, were identified as new potential molecular targets. PMID:21034488
Capriles, Priscila V S Z; Guimarães, Ana C R; Otto, Thomas D; Miranda, Antonio B; Dardenne, Laurent E; Degrave, Wim M
2010-10-29
Trypanosoma cruzi is the etiological agent of Chagas' disease, an endemic infection that causes thousands of deaths every year in Latin America. Therapeutic options remain inefficient, demanding the search for new drugs and/or new molecular targets. Such efforts can focus on proteins that are specific to the parasite, but analogous enzymes and enzymes with a three-dimensional (3D) structure sufficiently different from the corresponding host proteins may represent equally interesting targets. In order to find these targets we used the workflows MHOLline and AnEnΠ obtaining 3D models from homologous, analogous and specific proteins of Trypanosoma cruzi versus Homo sapiens. We applied genome wide comparative modelling techniques to obtain 3D models for 3,286 predicted proteins of T. cruzi. In combination with comparative genome analysis to Homo sapiens, we were able to identify a subset of 397 enzyme sequences, of which 356 are homologous, 3 analogous and 38 specific to the parasite. In this work, we present a set of 397 enzyme models of T. cruzi that can constitute potential structure-based drug targets to be investigated for the development of new strategies to fight Chagas' disease. The strategies presented here support the concept of structural analysis in conjunction with protein functional analysis as an interesting computational methodology to detect potential targets for structure-based rational drug design. For example, 2,4-dienoyl-CoA reductase (EC 1.3.1.34) and triacylglycerol lipase (EC 3.1.1.3), classified as analogous proteins in relation to H. sapiens enzymes, were identified as new potential molecular targets.
Pairwise amino acid secondary structural propensities
NASA Astrophysics Data System (ADS)
Chemmama, Ilan E.; Chapagain, Prem P.; Gerstman, Bernard S.
2015-04-01
We investigate the propensities for amino acids to form a specific secondary structure when they are paired with other amino acids. Our investigations use molecular dynamics (MD) computer simulations, and we compare the results to those from the Protein Data Bank (PDB). Proper comparison requires weighting of the MD results in a manner consistent with the relative frequency of appearance in the PDB of each possible pair of amino acids. We find that the propensity for an amino acid to assume a secondary structure varies dramatically depending on the amino acid that is before or after it in the primary sequence. This cooperative effect means that when selecting amino acids to facilitate the formation of a secondary structure in peptide engineering experiments, the adjacent amino acids must be considered. We also examine the preference for a secondary structure in bacterial proteins and compare the results to those of human proteins.
NASA Astrophysics Data System (ADS)
Bordner, Andrew J.; Zorman, Barry; Abagyan, Ruben
2011-10-01
Membrane proteins comprise a significant fraction of the proteomes of sequenced organisms and are the targets of approximately half of marketed drugs. However, in spite of their prevalence and biomedical importance, relatively few experimental structures are available due to technical challenges. Computational simulations can potentially address this deficit by providing structural models of membrane proteins. Solvation within the spatially heterogeneous membrane/solvent environment provides a major component of the energetics driving protein folding and association within the membrane. We have developed an implicit solvation model for membranes that is both computationally efficient and accurate enough to enable molecular mechanics predictions for the folding and association of peptides within the membrane. We derived the new atomic solvation model parameters using an unbiased fitting procedure to experimental data and have applied it to diverse problems in order to test its accuracy and to gain insight into membrane protein folding. First, we predicted the positions and orientations of peptides and complexes within the lipid bilayer and compared the simulation results with solid-state NMR structures. Additionally, we performed folding simulations for a series of host-guest peptides with varying propensities to form alpha helices in a hydrophobic environment and compared the structures with experimental measurements. We were also able to successfully predict the structures of amphipathic peptides as well as the structures for dimeric complexes of short hexapeptides that have experimentally characterized propensities to form beta sheets within the membrane. Finally, we compared calculated relative transfer energies with data from experiments measuring the effects of mutations on the free energies of translocon-mediated insertion of proteins into lipid bilayers and of combined folding and membrane insertion of a beta barrel protein.
Feyzi, Samira; Varidi, Mehdi; Zare, Fatemeh; Varidi, Mohammad Javad
2018-03-01
Because drying can denature proteins, different drying methods could alter the functional properties of proteins as well as their structure. This study therefore focused on the effect of different drying methods on the amino acid content, thermal and functional properties, and protein structure of fenugreek protein isolate. Freeze and spray drying methods resulted in comparable protein solubility, dynamic surface and interfacial tensions, and foaming and emulsifying properties, except for emulsion stability. Vacuum oven drying promoted emulsion stability, surface hydrophobicity and viscosity of fenugreek protein isolate at the expense of its protein solubility. The vacuum oven process caused a higher level of Maillard reaction, followed by the spray drying process, which was confirmed by the lower lysine content, lower lightness and greater browning intensity. ΔH of fenugreek protein isolates was higher than that of soy protein isolate, which confirmed the presence of more ordered structures. Also, the bands attributed to the α-helix structures in the FTIR spectrum were in the shorter wave number region for freeze and spray dried fenugreek protein isolates, indicating a greater likelihood of such structures. This research suggests that any drying method must be conducted under gentle conditions in order to sustain the native structure of proteins and promote their functionalities. © 2017 Society of Chemical Industry.
BAYESIAN PROTEIN STRUCTURE ALIGNMENT.
Rodriguez, Abel; Schmidler, Scott C
The analysis of the three-dimensional structure of proteins is an important topic in molecular biochemistry. Structure plays a critical role in defining the function of proteins and is more strongly conserved than amino acid sequence over evolutionary timescales. A key challenge is the identification and evaluation of structural similarity between proteins; such analysis can aid in understanding the role of newly discovered proteins and help elucidate evolutionary relationships between organisms. Computational biologists have developed many clever algorithmic techniques for comparing protein structures, however, all are based on heuristic optimization criteria, making statistical interpretation somewhat difficult. Here we present a fully probabilistic framework for pairwise structural alignment of proteins. Our approach has several advantages, including the ability to capture alignment uncertainty and to estimate key "gap" parameters which critically affect the quality of the alignment. We show that several existing alignment methods arise as maximum a posteriori estimates under specific choices of prior distributions and error models. Our probabilistic framework is also easily extended to incorporate additional information, which we demonstrate by including primary sequence information to generate simultaneous sequence-structure alignments that can resolve ambiguities obtained using structure alone. This combined model also provides a natural approach for the difficult task of estimating evolutionary distance based on structural alignments. The model is illustrated by comparison with well-established methods on several challenging protein alignment examples.
Life in the fast lane for protein crystallization and X-ray crystallography
NASA Technical Reports Server (NTRS)
Pusey, Marc L.; Liu, Zhi-Jie; Tempel, Wolfram; Praissman, Jeremy; Lin, Dawei; Wang, Bi-Cheng; Gavira, Jose A.; Ng, Joseph D.
2005-01-01
The common goal for structural genomic centers and consortiums is to decipher as quickly as possible the three-dimensional structures for a multitude of recombinant proteins derived from known genomic sequences. Since X-ray crystallography is the foremost method to acquire atomic resolution for macromolecules, the limiting step is obtaining protein crystals that can be useful for structure determination. High-throughput methods have been developed in recent years to clone, express, purify, crystallize and determine the three-dimensional structure of a protein gene product rapidly using automated devices, commercialized kits and consolidated protocols. However, the average number of protein structures obtained for most structural genomic groups has been very low compared to the total number of proteins purified. As more entire genomic sequences are obtained for different organisms from the three kingdoms of life, only the proteins that can be crystallized and whose structures can be obtained easily are studied. Consequently, an astonishing number of genomic proteins remain unexamined. In the era of high-throughput processes, traditional methods in molecular biology, protein chemistry and crystallization are eclipsed by automation and pipeline practices. The necessity for high-rate production of protein crystals and structures has prevented the usage of more intellectual strategies and creative approaches in experimental executions. Fundamental principles and personal experiences in protein chemistry and crystallization are minimally exploited only to obtain "low-hanging fruit" protein structures. We review the practical aspects of today's high-throughput manipulations and discuss the challenges in fast-paced protein crystallization and tools for crystallography. Structural genomic pipelines can be improved with information gained from low-throughput tactics that may help us reach the higher-bearing fruits. 
Examples of recent developments in this area are reported from the efforts of the Southeast Collaboratory for Structural Genomics (SECSG).
Life in the Fast Lane for Protein Crystallization and X-Ray Crystallography
NASA Technical Reports Server (NTRS)
Pusey, Marc L.; Liu, Zhi-Jie; Tempel, Wolfram; Praissman, Jeremy; Lin, Dawei; Wang, Bi-Cheng; Gavira, Jose A.; Ng, Joseph D.
2004-01-01
The common goal for structural genomic centers and consortiums is to decipher as quickly as possible the three-dimensional structures for a multitude of recombinant proteins derived from known genomic sequences. Since X-ray crystallography is the foremost method to acquire atomic resolution for macromolecules, the limiting step is obtaining protein crystals that can be useful for structure determination. High-throughput methods have been developed in recent years to clone, express, purify, crystallize and determine the three-dimensional structure of a protein gene product rapidly using automated devices, commercialized kits and consolidated protocols. However, the average number of protein structures obtained for most structural genomic groups has been very low compared to the total number of proteins purified. As more entire genomic sequences are obtained for different organisms from the three kingdoms of life, only the proteins that can be crystallized and whose structures can be obtained easily are studied. Consequently, an astonishing number of genomic proteins remain unexamined. In the era of high-throughput processes, traditional methods in molecular biology, protein chemistry and crystallization are eclipsed by automation and pipeline practices. The necessity for high-rate production of protein crystals and structures has prevented the usage of more intellectual strategies and creative approaches in experimental executions. Fundamental principles and personal experiences in protein chemistry and crystallization are minimally exploited only to obtain "low-hanging fruit" protein structures. We review the practical aspects of today's high-throughput manipulations and discuss the challenges in fast-paced protein crystallization and tools for crystallography. Structural genomic pipelines can be improved with information gained from low-throughput tactics that may help us reach the higher-bearing fruits. 
Examples of recent developments in this area are reported from the efforts of the Southeast Collaboratory for Structural Genomics (SECSG).
Kojetin, Douglas J.; McLaughlin, Patrick D.; Thompson, Richele J.; Dubnau, David; Prepiak, Peter; Rance, Mark; Cavanagh, John
2009-01-01
Summary The AAA+ superfamily protein ClpC is a key regulator of cell development in Bacillus subtilis. As part of a large oligomeric complex, ClpC controls an array of cellular processes by recognizing, unfolding, and providing misfolded and aggregated proteins as substrates for the ClpP peptidase. ClpC is unique compared to other HSP100/Clp proteins, as it requires an adaptor protein for all fundamental activities. The NMR solution structure of the N-terminal repeat domain of ClpC (N-ClpCR) comprises two structural repeats of a four-helix motif. NMR experiments used to map the MecA adaptor protein interaction surface of N-ClpCR reveal that regions involved in the interaction possess conformational flexibility, as well as conformational exchange on the μs-ms time-scale. The electrostatic surface of N-ClpCR differs substantially compared to the N-domain of Escherichia coli ClpA and ClpB, suggesting that the electrostatic surface characteristics of HSP100/Clp N-domains may play a role in adaptor protein and substrate interaction specificity, and perhaps contribute to the unique adaptor protein requirement of ClpC. PMID:19361434
A Particle Swarm Optimization-Based Approach with Local Search for Predicting Protein Folding.
Yang, Cheng-Hong; Lin, Yu-Shiun; Chuang, Li-Yeh; Chang, Hsueh-Wei
2017-10-01
The hydrophobic-polar (HP) model is commonly used for predicting protein folding structures and hydrophobic interactions. This study developed a particle swarm optimization (PSO)-based algorithm combined with local search algorithms; specifically, the high exploration PSO (HEPSO) algorithm (which can execute global search processes) was combined with three local search algorithms (hill-climbing algorithm, greedy algorithm, and Tabu table), yielding the proposed HE-L-PSO algorithm. By using 20 known protein structures, we evaluated the performance of the HE-L-PSO algorithm in predicting protein folding in the HP model. The proposed HE-L-PSO algorithm exhibited favorable performance in predicting both short and long amino acid sequences with high reproducibility and stability, compared with seven reported algorithms. The HE-L-PSO algorithm yielded optimal solutions for all predicted protein folding structures. All HE-L-PSO-predicted protein folding structures possessed a hydrophobic core that is similar to normal protein folding.
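The objective that any HP-model search optimizes can be made concrete with a short energy function. The sketch assumes the common 2D square-lattice convention, in which each pair of hydrophobic (H) residues that are lattice neighbors but not chain neighbors contributes -1; the abstract does not say which lattice the HE-L-PSO study used, and the function name is hypothetical.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D square-lattice HP conformation.

    sequence: string of 'H' and 'P', one letter per residue.
    coords:   list of (x, y) integer lattice positions, in chain order.
    Each H-H lattice contact between residues that are not adjacent
    in the chain contributes -1 (an assumed, commonly used convention).
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    position = {xy: i for i, xy in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so every contact is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy
```

A global or local search would call this function on each candidate conformation and keep the lowest value. The sketch checks self-avoidance but not that consecutive residues occupy adjacent lattice sites.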
Fast large-scale clustering of protein structures using Gauss integrals.
Harder, Tim; Borg, Mikael; Boomsma, Wouter; Røgen, Peter; Hamelryck, Thomas
2012-02-15
Clustering protein structures is an important task in structural bioinformatics. De novo structure prediction, for example, often involves a clustering step for finding the best prediction. Other applications include assigning proteins to fold families and analyzing molecular dynamics trajectories. We present Pleiades, a novel approach to clustering protein structures with a rigorous mathematical underpinning. The method approximates clustering based on the root mean square deviation by first mapping structures to Gauss integral vectors--which were introduced by Røgen and co-workers--and subsequently performing K-means clustering. Compared to current methods, Pleiades dramatically improves on the time needed to perform clustering, and can cluster a significantly larger number of structures, while providing state-of-the-art results. The number of low energy structures generated in a typical folding study, which is in the order of 50,000 structures, can be clustered within seconds to minutes.
Ghouzam, Yassine; Postic, Guillaume; Guerin, Pierre-Edouard; de Brevern, Alexandre G.; Gelly, Jean-Christophe
2016-01-01
Protein structure prediction based on comparative modeling is the most efficient way to produce structural models when it can be performed. ORION is a dedicated webserver based on a new strategy that performs this task. The identification by ORION of suitable templates is performed using an original profile-profile approach that combines sequence and structure evolution information. Structure evolution information is encoded into profiles using structural features, such as solvent accessibility and local conformation —with Protein Blocks—, which give an accurate description of the local protein structure. ORION has recently been improved, increasing by 5% the quality of its results. The ORION web server accepts a single protein sequence as input and searches homologous protein structures within minutes. Various databases such as PDB, SCOP and HOMSTRAD can be mined to find an appropriate structural template. For the modeling step, a protein 3D structure can be directly obtained from the selected template by MODELLER and displayed with global and local quality model estimation measures. The sequence and the predicted structure of 4 examples from the CAMEO server and a recent CASP11 target from the ‘Hard’ category (T0818-D1) are shown as pertinent examples. Our web server is accessible at http://www.dsimb.inserm.fr/ORION/. PMID:27319297
Ghouzam, Yassine; Postic, Guillaume; Guerin, Pierre-Edouard; de Brevern, Alexandre G; Gelly, Jean-Christophe
2016-06-20
Protein structure prediction based on comparative modeling is the most efficient way to produce structural models when it can be performed. ORION is a dedicated webserver based on a new strategy that performs this task. The identification by ORION of suitable templates is performed using an original profile-profile approach that combines sequence and structure evolution information. Structure evolution information is encoded into profiles using structural features, such as solvent accessibility and local conformation -with Protein Blocks-, which give an accurate description of the local protein structure. ORION has recently been improved, increasing by 5% the quality of its results. The ORION web server accepts a single protein sequence as input and searches homologous protein structures within minutes. Various databases such as PDB, SCOP and HOMSTRAD can be mined to find an appropriate structural template. For the modeling step, a protein 3D structure can be directly obtained from the selected template by MODELLER and displayed with global and local quality model estimation measures. The sequence and the predicted structure of 4 examples from the CAMEO server and a recent CASP11 target from the 'Hard' category (T0818-D1) are shown as pertinent examples. Our web server is accessible at http://www.dsimb.inserm.fr/ORION/.
Rostamian, Mosayeb; Mousavy, Seyed Jafar; Ebrahimi, Firouz; Ghadami, Seyyed Abolghasem; Sheibani, Nader; Minaei, Mohammad Ebrahim; Arefpour Torabi, Mohammad Ali
2012-01-01
Recently, botulinum neurotoxin (BoNT)-derived recombinant proteins have been suggested as potential botulism vaccines. Here, concentrating on BoNT type E (BoNT/E), we studied two of these binding domain-based recombinant proteins: a multivalent chimer protein, which is composed of BoNT serotypes A, B and E binding subdomains, and a monovalent recombinant protein, which contains 93 amino acid residues from recombinant C-terminal heavy chain of BoNT/E (rBoNT/E-HCC). Both proteins have an identical region (48 aa) that contains one of the most important BoNT/E epitopes (YLTHMRD sequence). The efficiency of the recombinant proteins in antibody production, their structural differences, and their BoNT/E-epitope location were compared by using ELISA, circular dichroism, computational modeling, and hydrophobicity predictions. Immunological studies indicated that the antibody yield against rBoNT/E-HCC was higher than that against the chimer protein. Cross ELISA confirmed that the antibodies against the chimer protein recognized rBoNT/E-HCC more efficiently. However, both antibody groups (anti-chimer and anti-rBoNT/E-HCC antibodies) were able to recognize other proteins. Structural studies with circular dichroism showed that chimer proteins have slightly more secondary structures than rBoNT/E-HCC. The immunological results suggested that the above-mentioned identical region in rBoNT/E-HCC is more exposed. Circular dichroism, computational protein modeling and hydrophobicity predictions indicated a more exposed location for the identical region in rBoNT/E-HCC than in the chimer protein, which is strongly in agreement with immunological results.
Kadumuri, Rajashekar Varma; Vadrevu, Ramakrishna
2017-10-01
Due to their crucial role in function, folding, and stability, protein loops are being targeted for grafting/designing to create novel or alter existing functionality and improve stability and foldability. With a view to facilitating thorough analysis and effective search options for extracting and comparing loops for sequence and structural compatibility, we developed LoopX, a comprehensively compiled library of sequence and conformational features of ∼700,000 loops from protein structures. The database, equipped with a graphical user interface, is empowered with diverse query tools and search algorithms, with various rendering options to visualize the sequence- and structural-level information along with hydrogen bonding patterns and backbone φ, ψ dihedral angles of both the target and candidate loops. Two new features, (i) conservation of the polar/nonpolar environment and (ii) conservation of sequence and conformation of specific residues within the loops, have also been incorporated in the search and retrieval of compatible loops for a chosen target loop. Thus, the LoopX server not only serves as a database and visualization tool for sequence and structural analysis of protein loops but also aids in extracting and comparing candidate loops for a given target loop based on user-defined search options.
An Evolution-Based Approach to De Novo Protein Design and Case Study on Mycobacterium tuberculosis
Brender, Jeffrey R.; Czajka, Jeff; Marsh, David; Gray, Felicia; Cierpicki, Tomasz; Zhang, Yang
2013-01-01
Computational protein design is a reverse procedure of protein folding and structure prediction, where constructing structures from evolutionarily related proteins has been demonstrated to be the most reliable method for protein 3-dimensional structure prediction. Following this spirit, we developed a novel method to design new protein sequences based on evolutionarily related protein families. For a given target structure, a set of proteins having similar fold are identified from the PDB library by structural alignments. A structural profile is then constructed from the protein templates and used to guide the conformational search of amino acid sequence space, where physicochemical packing is accommodated by single-sequence based solvation, torsion angle, and secondary structure predictions. The method was tested on a computational folding experiment based on a large set of 87 protein structures covering different fold classes, which showed that the evolution-based design significantly enhances the foldability and biological functionality of the designed sequences compared to the traditional physics-based force field methods. Without using homologous proteins, the designed sequences can be folded with an average root-mean-square-deviation of 2.1 Å to the target. As a case study, the method is extended to redesign all 243 structurally resolved proteins in the pathogenic bacteria Mycobacterium tuberculosis, which is the second leading cause of death from infectious disease. On a smaller scale, five sequences were randomly selected from the design pool and subjected to experimental validation. The results showed that all the designed proteins are soluble with distinct secondary structure and three have well ordered tertiary structure, as demonstrated by circular dichroism and NMR spectroscopy. 
Together, these results demonstrate a new avenue in computational protein design that uses knowledge of evolutionary conservation from protein structural families to engineer new protein molecules of improved fold stability and biological functionality. PMID:24204234
Evolution of a protein folding nucleus.
Xia, Xue; Longo, Liam M; Sutherland, Mason A; Blaber, Michael
2016-07-01
The folding nucleus (FN) is a cryptic element within protein primary structure that enables an efficient folding pathway and is the postulated heritable element in the evolution of protein architecture; however, almost nothing is known regarding how the FN structurally changes as complex protein architecture evolves from simpler peptide motifs. We report characterization of the FN of a designed purely symmetric β-trefoil protein by ϕ-value analysis. We compare the structure and folding properties of key foldable intermediates along the evolutionary trajectory of the β-trefoil. The results show structural acquisition of the FN during gene fusion events, incorporating novel turn structure created by gene fusion. Furthermore, the FN is adjusted by circular permutation in response to destabilizing functional mutation. FN plasticity by way of circular permutation is made possible by the intrinsic C3 cyclic symmetry of the β-trefoil architecture, identifying a possible selective advantage that helps explain the prevalence of cyclic structural symmetry in the proteome. © 2015 The Protein Society.
Marti, Alessandra; Bock, Jayne E; Pagani, Maria Ambrogina; Ismail, Baraem; Seetharaman, Koushik
2016-03-01
The high protein and fiber content of intermediate wheatgrass (IWG) - together with its interesting agronomic traits and environment-related benefits - make this perennial crop attractive also for human consumption. Structural characteristics of the proteins in IWG/hard wheat flour (HWF) doughs (at IWG:HWF ratios of 0:100, 50:50, 75:25 and 100:0) - including aggregate formation, thiol availability, and secondary structure changes during dough mixing - were investigated. Proteins in IWG-doughs had higher solubility and thiol content - as a function of IWG content - suggesting that the protein network was mostly based on non-covalent interactions. While 50% IWG-enrichment gave an increase in random structures, enrichment at ⩾75% resulted in a decrease in β-sheets with an increase in random structures, indicating a decrease in structural order. The observed differences in protein molecular configuration and interactions in HWF compared to IWG doughs necessitate further investigation to establish their impact on the quality of IWG-enriched bread. Copyright © 2015 Elsevier Ltd. All rights reserved.
CALCOM: a software for calculating the center of mass of proteins.
Costantini, Susan; Paladino, Antonella; Facchiano, Angelo M
2008-02-09
The center of mass of a protein is an artificial point useful for detecting important and simple features of protein structure, shape and association. CALCOM is a software which calculates the center of mass of a protein, starting from PDB protein structure files. In the case of protein complexes and of protein-small ligand complexes, the position of protein residues or of ligand atoms with respect to each protein subunit can be evaluated, as well as the distance between the centers of mass of the protein subunits, in order to compare different conformations and evaluate the relative motion of subunits. The service is available at the URL: http://bioinformatica.isa.cnr.it/CALCOM/.
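The calculation described in this abstract can be illustrated with a minimal sketch. It is not CALCOM's implementation: it assumes fixed-column PDB `ATOM` records with the element symbol in columns 77-78, uses a small table of approximate atomic masses, and the helper names (`center_of_mass`, `make_atom_line`) are hypothetical.

```python
import math

# Approximate atomic masses in daltons for elements common in proteins
# (an assumption for illustration; atoms of other elements are skipped).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007,
               "O": 15.999, "S": 32.06, "P": 30.974}


def center_of_mass(pdb_lines, chain=None):
    """Mass-weighted mean position of ATOM records, optionally for one chain."""
    total = sx = sy = sz = 0.0
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if chain is not None and line[21] != chain:  # chain ID is column 22
            continue
        mass = ATOMIC_MASS.get(line[76:78].strip().upper())
        if mass is None:
            continue
        # x, y, z occupy columns 31-38, 39-46 and 47-54 of the record.
        x, y, z = float(line[30:38]), float(line[38:46]), float(line[46:54])
        total += mass
        sx, sy, sz = sx + mass * x, sy + mass * y, sz + mass * z
    if total == 0.0:
        raise ValueError("no usable ATOM records found")
    return (sx / total, sy / total, sz / total)


def make_atom_line(serial, name, chain, x, y, z, element):
    """Hypothetical helper that builds a fixed-column ATOM record for demos."""
    return (f"ATOM  {serial:5d} {name:<4s} ALA {chain}{1:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.00:6.2f}{0.00:6.2f}          {element:>2s}")


if __name__ == "__main__":
    lines = [make_atom_line(1, "CA", "A", 0.0, 0.0, 0.0, "C"),
             make_atom_line(2, "CA", "B", 2.0, 0.0, 0.0, "C")]
    com_a = center_of_mass(lines, chain="A")
    com_b = center_of_mass(lines, chain="B")
    # Distance between the centers of mass of two subunits.
    print(math.dist(com_a, com_b))
```

Comparing the inter-subunit distance across two conformations of the same complex gives a simple measure of relative subunit motion, which is the use case the abstract mentions.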
XAS Characterization of the Zn Site of Non-structural Protein 3 (NS3) from Hepatitis C Virus
NASA Astrophysics Data System (ADS)
Ascone, I.; Nobili, G.; Benfatto, M.; Congiu-Castellano, A.
2007-02-01
XANES spectra of non-structural protein 3 (NS3) have been calculated using 4 Zn coordination models from three crystallographic structures in the Protein Data Bank (PDB): 1DY9, subunit B, 1CU1 subunit A and B, and 1JXP subunit B. Results indicate that XANES is an appropriate tool to distinguish among them. Experimental XANES spectra have been simulated refining crystallographic data. The model obtained by XAS is compared with the PDB models.
Johnson, Derrick E.; Xue, Bin; Sickmeier, Megan D.; Meng, Jingwei; Cortese, Marc S.; Oldfield, Christopher J.; Le Gall, Tanguy; Dunker, A. Keith; Uversky, Vladimir N.
2012-01-01
The identification of intrinsically disordered proteins (IDPs) among the targets that fail to form satisfactory crystal structures in the Protein Structure Initiative represents a key to reducing the costs and time for determining three-dimensional structures of proteins. To help in this endeavor, several Protein Structure Initiative Centers were asked to send samples of both crystallizable proteins and proteins that failed to crystallize. The abundance of intrinsic disorder in these proteins was evaluated via computational analysis using Predictors of Natural Disordered Regions (PONDR®) and the potential cleavage sites and corresponding fragments were determined. Then, the target proteins were analyzed for intrinsic disorder by their resistance to limited proteolysis. The rates of tryptic digestion of sample target proteins were compared to those of lysozyme/myoglobin, apo-myoglobin and α-casein as standards of ordered, partially disordered and completely disordered proteins, respectively. At the next stage, the protein samples were subjected to both far-UV and near-UV circular dichroism (CD) analysis. For most of the samples, a good agreement between CD data, predictions of disorder and the rates of limited tryptic digestion was established. Further experimentation is being performed on a smaller subset of these samples in order to obtain more detailed information on the ordered/disordered nature of the proteins. PMID:22651963
Probing binding hot spots at protein-RNA recognition sites.
Barik, Amita; Nithin, Chandran; Karampudi, Naga Bhushana Rao; Mukherjee, Sunandan; Bahadur, Ranjit Prasad
2016-01-29
We use evolutionary conservation derived from structure alignment of polypeptide sequences along with structural and physicochemical attributes of protein-RNA interfaces to probe the binding hot spots at protein-RNA recognition sites. We find that the degree of conservation varies across the RNA binding proteins; some evolve rapidly compared to others. Additionally, irrespective of the structural class of the complexes, residues at the RNA binding sites are evolutionarily better conserved than those at the solvent exposed surfaces. For recognitions involving duplex RNA, residues interacting with the major groove are better conserved than those interacting with the minor groove. We identify multi-interface residues participating simultaneously in protein-protein and protein-RNA interfaces in complexes where more than one polypeptide is involved in RNA recognition, and show that they are better conserved compared to any other RNA binding residues. We find that the residues at water preservation sites are better conserved than those at hydrated or at dehydrated sites. Finally, we develop a Random Forests model using structural and physicochemical attributes for predicting binding hot spots. The model accurately predicts 80% of the instances of experimental ΔΔG values in a particular class, and provides a stepping-stone towards the engineering of protein-RNA recognition sites with desired affinity. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Zhou, Peng; Wang, Congcong; Tian, Feifei; Ren, Yanrong; Yang, Chao; Huang, Jian
2013-01-01
Quantitative structure-activity relationship (QSAR), a regression modeling methodology that quantitatively establishes a statistical correlation between structural features and apparent behavior for a series of congeneric molecules, has been widely used to evaluate the activity, toxicity and properties of various small-molecule compounds such as drugs, toxicants and surfactants. However, it is surprising that such a useful technique has had only very limited application to biomacromolecules, although solved 3D atom-resolution structures of proteins, nucleic acids and their complexes have accumulated rapidly in the past decades. Here, we present a proof-of-concept paradigm for the modeling, prediction and interpretation of the binding affinity of 144 sequence-nonredundant, structure-available and affinity-known protein complexes (Kastritis et al. Protein Sci 20:482-491, 2011) using a biomacromolecular QSAR (BioQSAR) scheme. We demonstrate that the modeling performance and predictive power of BioQSAR are comparable to or even better than those of traditional knowledge-based strategies, mechanism-type methods and empirical scoring algorithms, while BioQSAR possesses certain additional features compared to the traditional methods, such as adaptability, interpretability, deep-validation and high-efficiency. The BioQSAR scheme could be readily modified to infer the biological behavior and functions of other biomacromolecules, if their X-ray crystal structures, NMR conformation assemblies or computationally modeled structures are available.
Ivanov, Stefan M; Cawley, Andrew; Huber, Roland G; Bond, Peter J; Warwicker, Jim
2017-01-01
An improved knowledge of protein-protein interactions is essential for better understanding of metabolic and signaling networks, and cellular function. Progress tends to be based on structure determination and predictions using known structures, along with computational methods based on evolutionary information or detailed atomistic descriptions. We hypothesized that for the case of interactions across a common interface, between proteins from a pair of paralogue families or within a family of paralogues, a relatively simple interface description could distinguish between binding and non-binding pairs. Using binding data for several systems, and large-scale comparative modeling based on known template complex structures, it is found that charge-charge interactions (for groups bearing net charge) are generally a better discriminant than buried non-polar surface. This is particularly the case for paralogue families that are less divergent, with more reliable comparative modeling. We suggest that electrostatic interactions are major determinants of specificity in such systems, an observation that could be used to predict binding partners. PMID:29016650
Jowitt, Thomas A; Murdoch, Alan D; Baldock, Clair; Berry, Richard; Day, Joanna M; Hardingham, Timothy E
2010-01-01
Structural investigation of proteins containing large stretches of sequences without predicted secondary structure is the focus of much increased attention. Here, we have produced an unglycosylated 30 kDa peptide from the chondroitin sulphate (CS)-attachment region of human aggrecan (CS-peptide), which was predicted to be intrinsically disordered and compared its structure with the adjacent aggrecan G3 domain. Biophysical analyses, including analytical ultracentrifugation, light scattering, and circular dichroism showed that the CS-peptide had an elongated and stiffened conformation in contrast to the globular G3 domain. The results suggested that it contained significant secondary structure, which was sensitive to urea, and we propose that the CS-peptide forms an elongated wormlike molecule based on a dynamic range of energetically equivalent secondary structures stabilized by hydrogen bonds. The dimensions of the structure predicted from small-angle X-ray scattering analysis were compatible with EM images of fully glycosylated aggrecan and a partly glycosylated aggrecan CS2-G3 construct. The semiordered structure identified in CS-peptide was not predicted by common structural algorithms and identified a potentially distinct class of semiordered structure within sequences currently identified as disordered. Sequence comparisons suggested some evidence for comparable structures in proteins encoded by other genes (PRG4, MUC5B, and CBP). The function of these semiordered sequences may serve to spatially position attached folded modules and/or to present polypeptides for modification, such as glycosylation, and to provide templates for the multiple pleiotropic interactions proposed for disordered proteins. Proteins 2010. © 2010 Wiley-Liss, Inc. PMID:20806220
DOE Office of Scientific and Technical Information (OSTI.GOV)
Biswas, Shyamasri; Buhrman, Greg; Gagnon, Keith
2012-07-11
Box C/D ribonucleoproteins (RNP) guide the 2'-O-methylation of targeted nucleotides in archaeal and eukaryotic rRNAs. The archaeal L7Ae and eukaryotic 15.5kD box C/D RNP core protein homologues initiate RNP assembly by recognizing kink-turn (K-turn) motifs. The crystal structure of the 15.5kD core protein from the primitive eukaryote Giardia lamblia is described here to a resolution of 1.8 Å. The Giardia 15.5kD protein exhibits the typical α-β-α sandwich fold exhibited by both archaeal L7Ae and eukaryotic 15.5kD proteins. Characteristic of eukaryotic homologues, the Giardia 15.5kD protein binds the K-turn motif but not the variant K-loop motif. The highly conserved residues of loop 9, critical for RNA binding, also exhibit conformations similar to those of the human 15.5kD protein when bound to the K-turn motif. However, comparative sequence analysis indicated a distinct evolutionary position between Archaea and Eukarya. Indeed, assessment of the Giardia 15.5kD protein in denaturing experiments demonstrated an intermediate stability in protein structure when compared with that of the eukaryotic mouse 15.5kD and archaeal Methanocaldococcus jannaschii L7Ae proteins. Most notable was the ability of the Giardia 15.5kD protein to assemble in vitro a catalytically active chimeric box C/D RNP utilizing the archaeal M. jannaschii Nop56/58 and fibrillarin core proteins. In contrast, a catalytically competent chimeric RNP could not be assembled using the mouse 15.5kD protein. Collectively, these analyses suggest that the G. lamblia 15.5kD protein occupies a unique position in the evolution of this box C/D RNP core protein retaining structural and functional features characteristic of both archaeal L7Ae and higher eukaryotic 15.5kD homologues.
NegGOA: negative GO annotations selection using ontology structure.
Fu, Guangyuan; Wang, Jun; Yang, Bo; Yu, Guoxian
2016-10-01
Predicting the biological functions of proteins is one of the key challenges in the post-genomic era. Computational models have demonstrated the utility of applying machine learning methods to predict protein function. Most prediction methods explicitly require a set of negative examples, that is, proteins that are known not to carry out a particular function. However, Gene Ontology (GO) almost always only provides the knowledge that proteins carry out a particular function, and functional annotations of proteins are incomplete. GO structurally organizes tens of thousands of GO terms, and a protein is annotated with several (or dozens) of these terms. For these reasons, the negative examples of a protein can greatly help distinguish true positive examples of the protein from such a large candidate GO space. In this paper, we present a novel approach (called NegGOA) to select negative examples. Specifically, NegGOA takes advantage of the ontology structure, available annotations and potentiality of additional annotations of a protein to choose negative examples of the protein. We compare NegGOA with other negative example selection algorithms and find that NegGOA produces far fewer false negatives than they do. We incorporate the selected negative examples into an efficient function prediction model to predict the functions of proteins in Yeast, Human, Mouse and Fly. NegGOA also demonstrates improved accuracy over these competing algorithms across various evaluation metrics. In addition, NegGOA suffers less from incomplete annotations of proteins than these competing methods. The Matlab and R codes are available at https://sites.google.com/site/guoxian85/neggoa. Contact: gxyu@swu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
2015-11-19
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. 
This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
A generalized analysis of hydrophobic and loop clusters within globular protein sequences
Eudes, Richard; Le Tuan, Khanh; Delettré, Jean; Mornon, Jean-Paul; Callebaut, Isabelle
2007-01-01
Background: Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need for user expertise. In order to help the analysis of HCA plots, we report here the structural preferences of hydrophobic cluster species, which are frequently encountered in globular domains of proteins. These species are characterized only by their hydrophobic/non-hydrophobic dichotomy. This analysis has been extended to loop-forming clusters, using an appropriate loop alphabet. Results: The structural behavior of hydrophobic cluster species, which are typical of protein globular domains, was investigated within banks of experimental structures, considered at different levels of sequence redundancy. The 294 most frequent hydrophobic cluster species were analyzed with regard to their association with the different secondary structures (frequencies of association with secondary structures and secondary structure propensities). Hydrophobic cluster species are predominantly associated with regular secondary structures, and a large part (60 %) reveals preferences for α-helices or β-strands. Moreover, the analysis of the hydrophobic cluster amino acid composition generally allows for finer prediction of the regular secondary structure associated with the considered cluster within a cluster species. We also investigated the behavior of loop-forming clusters, using a "PGDNS" alphabet. These loop clusters do not overlap with hydrophobic clusters and are highly associated with coils. Finally, the structural information contained in the hydrophobic structural words, as deduced from experimental structures, was compared to the PSI-PRED predictions, revealing that β-strands and especially α-helices are generally over-predicted within the limits of typical β and α hydrophobic clusters.
Conclusion: The dictionary of hydrophobic clusters described here can help the HCA user to interpret and compare the HCA plots of globular protein sequences, and provides an original fundamental insight into the structural bricks of protein folds. Moreover, the novel loop cluster analysis brings additional information for secondary structure prediction on the whole sequence through a generalized cluster analysis (GCA), and not only on regular secondary structures. Such information lays the foundations for developing a new and original tool for secondary structure prediction. PMID:17210072
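The idea of describing a cluster only by its hydrophobic/non-hydrophobic pattern can be sketched in a few lines. The rules below are assumptions chosen for illustration and are not taken from this abstract: the hydrophobic alphabet is taken to be V, I, L, F, M, Y, W, and a cluster is taken to end at a proline or after more than three consecutive non-hydrophobic residues. The function name `hydrophobic_clusters` is hypothetical.

```python
# Assumed hydrophobic alphabet (illustrative, not stated in the abstract).
HYDROPHOBIC = set("VILFMYW")


def hydrophobic_clusters(sequence, max_gap=3):
    """Return (start_index, binary_code) for each hydrophobic cluster.

    The binary code spans the first to the last hydrophobic residue of the
    cluster: 1 marks a hydrophobic residue, 0 a non-hydrophobic one.
    """
    seq = sequence.upper()
    clusters = []
    start = last = None
    proline_seen = False  # a proline since the last hydrophobic residue

    def emit(first, final):
        code = "".join("1" if seq[k] in HYDROPHOBIC else "0"
                       for k in range(first, final + 1))
        clusters.append((first, code))

    for i, residue in enumerate(seq):
        if residue in HYDROPHOBIC:
            # Close the open cluster if a breaker or a long gap intervened.
            if start is not None and (proline_seen or i - last - 1 > max_gap):
                emit(start, last)
                start = None
            if start is None:
                start = i
            last = i
            proline_seen = False
        elif residue == "P":
            proline_seen = True
    if start is not None:
        emit(start, last)
    return clusters


if __name__ == "__main__":
    for start, code in hydrophobic_clusters("AVLAAIAAAAAFPW"):
        print(start, code)
```

Grouping clusters by their binary code yields "species" in the sense used above, which can then be tallied against observed secondary structure in a set of known structures.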
Optimization of protein-protein docking for predicting Fc-protein interactions.
Agostino, Mark; Mancera, Ricardo L; Ramsland, Paul A; Fernández-Recio, Juan
2016-11-01
The antibody crystallizable fragment (Fc) is recognized by effector proteins as part of the immune system. Pathogens produce proteins that bind Fc in order to subvert or evade the immune response. The structural characterization of the determinants of Fc-protein association is essential to improve our understanding of the immune system at the molecular level and to develop new therapeutic agents. Furthermore, Fc-binding peptides and proteins are frequently used to purify therapeutic antibodies. Although several structures of Fc-protein complexes are available, numerous others have not yet been determined. Protein-protein docking could be used to investigate Fc-protein complexes; however, improved approaches are necessary to efficiently model such cases. In this study, a docking-based structural bioinformatics approach is developed for predicting the structures of Fc-protein complexes. Based on the available set of X-ray structures of Fc-protein complexes, three regions of the Fc, loosely corresponding to three turns within the structure, were defined as containing the essential features for protein recognition and used as restraints to filter the initial docking search. Rescoring the filtered poses with an optimal scoring strategy provided a success rate of approximately 80% of the test cases examined within the top ranked 20 poses, compared to approximately 20% by the initial unrestrained docking. The developed docking protocol provides a significant improvement over the initial unrestrained docking and will be valuable for predicting the structures of currently undetermined Fc-protein complexes, as well as in the design of peptides and proteins that target Fc. Copyright © 2016 John Wiley & Sons, Ltd.
Lee, Hui Sun; Im, Wonpil
2016-04-01
Molecular recognition by proteins mostly occurs in a local region on the protein surface. Thus, an efficient computational method for accurate characterization of protein local structural conservation is necessary to better understand biology and drug design. We present a novel local structure alignment tool, G-LoSA. G-LoSA aligns protein local structures in a sequence order independent way and provides a GA-score, a chemical feature-based and size-independent structure similarity score. Our benchmark validation shows the robust performance of G-LoSA on local structures of diverse sizes and characteristics, demonstrating its universal applicability to local structure-centric comparative biology studies. In particular, G-LoSA is highly effective in detecting conserved local regions on the entire surface of a given protein. In addition, the applications of G-LoSA to identifying template ligands and predicting ligand and protein binding sites illustrate its strong potential for computer-aided drug design. We hope that G-LoSA can be a useful computational method for exploring interesting biological problems through large-scale comparison of protein local structures and facilitating drug discovery research and development. G-LoSA is freely available to academic users at http://im.compbio.ku.edu/GLoSA/. © 2016 The Protein Society.
NASA Astrophysics Data System (ADS)
Rossi, Barbara; Giarola, Marco; Mariotto, Gino; Ambrosi, Emmanuele; Monaco, Hugo L.
2010-05-01
Protein SOUL is a new member of the recently discovered putative heme-binding protein family called SOUL/HEBP and, to date, no structural information exists for this protein. Here, micro-Raman spectroscopy is used to study the vibrational properties of single crystals obtained from recombinant protein SOUL by means of two different optimization routes. This spectroscopic approach offers the valuable advantage of the in-situ collection of experimental data from protein crystals, placed onto a hanging-drop plate, under the same conditions used to grow the crystals. By focusing on the regions of the amide I and III bands, some characteristic secondary structure features have been recognized. Moreover, some side-chain marker bands were observed in the Raman spectra of SOUL crystals, and the unambiguous assignment of these peaks was inferred by comparing the experimental Raman spectra of pure amino acids with their Raman intensities computed using quantum chemical calculations. Our comparative analysis provides a deeper understanding of the side-chain environments and of the interactions involving these specific amino acids in the two different SOUL crystals.
Kundu, Sangeeta; Roy, Debjani
2012-09-01
Comparative molecular dynamics simulations of Ca²⁺ dependent psychrophilic type II antifreeze protein (AFP) from herring (Clupea harengus) (hAFP) and Ca²⁺ independent type II antifreeze protein from long snout poacher (Brachyopsis rostratus) (lpAFP) have been performed for 10 ns each at five different temperatures. We investigated whether the Ca²⁺ dependent protein obtains any advantage in nature over the independent one. To this end the dynamic properties of these two proteins have been compared in terms of secondary structure content, molecular flexibility, solvent accessibility, intramolecular hydrogen bonds and protein-solvent interactions. At 298 and 373 K the flexibility of the Ca²⁺ independent molecule is higher, which indicates that Ca²⁺ could contribute to stabilizing the structure. The thermal unfolding pathways of the two proteins have also been monitored. The rate of unfolding is similar up to 373 K; beyond that, hAFP shows faster unfolding than lpAFP. The essential subspaces explored by the simulations of hAFP and lpAFP at different temperatures are significantly different, as revealed by principal component analysis. Our results may help in understanding the role of Ca²⁺ for hAFP to express antifreeze activity. Furthermore, our study may also help in elucidating the molecular basis of the thermostability of two structurally similar proteins that perform the same function in different ways, one in the presence of Ca²⁺ and the other in its absence. Copyright © 2012 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gao, Jianzhao; Wu, Zhonghua; Hu, Gang
Selection of proper targets for X-ray crystallography will benefit the biological research community immensely. Several computational models were proposed to predict the propensity of successful protein production and diffraction-quality crystallization from protein sequences. We reviewed a comprehensive collection of 22 such predictors that were developed in the last decade. We found that almost all of these models are easily accessible as webservers and/or standalone software and we demonstrated that some of them are widely used by the research community. We empirically evaluated and compared the predictive performance of seven representative methods. The analysis suggests that these methods produce quite accurate propensities for the diffraction-quality crystallization. We also summarized results of the first study of the relation between these predictive propensities and the resolution of the crystallizable proteins. We found that the propensities predicted by several methods are significantly higher for proteins that have high resolution structures compared to those with low resolution structures. Moreover, we tested a new meta-predictor, MetaXXC, which averages the propensities generated by the three most accurate predictors of the diffraction-quality crystallization. MetaXXC generates putative values of resolution that have modest levels of correlation with the experimental resolutions and it offers the lowest mean absolute error when compared to the seven considered methods. We conclude that protein sequences can be used to fairly accurately predict whether their corresponding protein structures can be solved using X-ray crystallography. Moreover, we also ascertain that sequences can be used to predict reasonably well the resolution of the resulting protein crystals.
Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T
1993-01-01
Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conserved module. PMID:8497043
The crystal structure of human GDP-L-fucose synthase.
Zhou, Huan; Sun, Lihua; Li, Jian; Xu, Chunyan; Yu, Feng; Liu, Yahui; Ji, Chaoneng; He, Jianhua
2013-09-01
Human GDP-l-fucose synthase, also known as FX protein, synthesizes GDP-l-fucose from its substrate GDP-4-keto-6-deoxy-d-mannose. The reaction involves epimerization at both C-3 and C-5 followed by an NADPH-dependent reduction of the carbonyl at C-4. In this paper, the first crystal structure of human FX protein was determined at 2.37 Å resolution. The asymmetric unit of the crystal structure contains four molecules which form two homodimers. Each molecule consists of two domains, a Rossmann-fold NADPH-binding motif and a carboxyl terminal domain. The overall structures of the human and Escherichia coli GDP-l-fucose synthases show four major differences. Four loops in the structure of human FX protein correspond to two α-helices and two β-sheets in that of the E. coli enzyme. In addition, seven of the amino acid residues that bind NADPH differ between human FX protein and the E. coli enzyme. The structure of human FX reveals the key catalytic residues and could be useful for the design of drugs for the treatment of inflammation, auto-immune diseases, and possibly certain types of cancer.
Kavianpour, Hamidreza; Vasighi, Mahdi
2017-02-01
Knowledge of the cellular attributes of proteins plays an important role in pharmacy, medical science and molecular biology. These attributes are closely correlated with the function and three-dimensional structure of proteins. Knowledge of protein structural class is used by various methods for better understanding of protein functionality and folding patterns. Computational methods and intelligent systems can play an important role in performing structural classification of proteins. Most protein sequences are stored in databanks as character strings, and a numerical representation is essential for applying machine learning methods. In this work, a binary representation of protein sequences is introduced based on reduced amino acid alphabets according to the surrounding hydrophobicity index. Many important features which are hidden in these long binary sequences can be clearly displayed through their cellular automata images. The features extracted from these images are used to build a classification model with a support vector machine. Compared with previous studies on several benchmark datasets, the promising classification rates obtained by tenfold cross-validation imply that the current approach can help in revealing some inherent features deeply hidden in protein sequences and improve the quality of predicting protein structural class.
Modeling the Structure of Helical Assemblies with Experimental Constraints in Rosetta.
André, Ingemar
2018-01-01
Determining high-resolution structures of proteins with helical symmetry can be challenging due to limitations in experimental data. In such instances, structure-based protein simulations driven by experimental data can provide a valuable approach for building models of helical assemblies. This chapter describes how the Rosetta macromolecular package can be used to model homomeric protein assemblies with helical symmetry in a range of modeling scenarios including energy refinement, symmetrical docking, comparative modeling, and de novo structure prediction. Data-guided structure modeling of helical assemblies with experimental information from electron density, X-ray fiber diffraction, solid-state NMR, and chemical cross-linking mass spectrometry is also described.
Yang, Jian-Yi; Peng, Zhen-Ling; Yu, Zu-Guo; Zhang, Rui-Jie; Anh, Vo; Wang, Desheng
2009-04-21
In this paper, we intend to predict protein structural classes (alpha, beta, alpha+beta, or alpha/beta) for low-homology data sets. Two widely used data sets were employed, 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins), with sequence homology of 40% and 25%, respectively. We propose to decompose the chaos game representation of proteins into two kinds of time series. Then, a novel and powerful nonlinear analysis technique, recurrence quantification analysis (RQA), is applied to analyze these time series. For a given protein sequence, a total of 16 characteristic parameters can be calculated with RQA, which are treated as the feature representation of the protein sequence. Based on this feature representation, the structural class of each protein is predicted with Fisher's linear discriminant algorithm. The jackknife test is used to test and compare our method with other existing methods. The overall accuracies with the step-by-step procedure are 65.8% and 64.2% for the 1189 and 25PDB data sets, respectively. With the widely used one-against-others procedure, we compare our method with five other existing methods. In particular, the overall accuracies of our method are 6.3% and 4.1% higher for the two data sets, respectively. Furthermore, only 16 parameters are used in our method, fewer than are used by other methods. This suggests that the current method may play a complementary role to the existing methods and is promising for the prediction of protein structural classes.
Computational modeling of membrane proteins
Leman, Julia Koehler; Ulmschneider, Martin B.; Gray, Jeffrey J.
2014-01-01
The determination of membrane protein (MP) structures has always trailed that of soluble proteins due to difficulties in their overexpression, reconstitution into membrane mimetics, and subsequent structure determination. The percentage of MP structures in the protein databank (PDB) has been at a constant 1-2% for the last decade. In contrast, over half of all drugs target MPs, only highlighting how little we understand about drug-specific effects in the human body. To reduce this gap, researchers have attempted to predict structural features of MPs even before the first structure was experimentally elucidated. In this review, we present current computational methods to predict MP structure, starting with secondary structure prediction, prediction of trans-membrane spans, and topology. Even though these methods generate reliable predictions, challenges such as predicting kinks or precise beginnings and ends of secondary structure elements are still waiting to be addressed. We describe recent developments in the prediction of 3D structures of both α-helical MPs as well as β-barrels using comparative modeling techniques, de novo methods, and molecular dynamics (MD) simulations. The increase of MP structures has (1) facilitated comparative modeling due to availability of more and better templates, and (2) improved the statistics for knowledge-based scoring functions. Moreover, de novo methods have benefitted from the use of correlated mutations as restraints. Finally, we outline current advances that will likely shape the field in the forthcoming decade. PMID:25355688
Watching proteins function with picosecond X-ray crystallography and molecular dynamics simulations.
NASA Astrophysics Data System (ADS)
Anfinrud, Philip
2006-03-01
Time-resolved electron density maps of myoglobin, a ligand-binding heme protein, have been stitched together into movies that unveil with < 2-Å spatial resolution and 150-ps time-resolution the correlated protein motions that accompany and/or mediate ligand migration within the hydrophobic interior of a protein. A joint analysis of all-atom molecular dynamics (MD) calculations and picosecond time-resolved X-ray structures provides single-molecule insights into mechanisms of protein function. Ensemble-averaged MD simulations of the L29F mutant of myoglobin following ligand dissociation reproduce the direction, amplitude, and timescales of crystallographically-determined structural changes. This close agreement with experiments at comparable resolution in space and time validates the individual MD trajectories, which identify and structurally characterize a conformational switch that directs dissociated ligands to one of two nearby protein cavities. This unique combination of simulation and experiment unveils functional protein motions and illustrates at an atomic level relationships among protein structure, dynamics, and function. In collaboration with Friedrich Schotte and Gerhard Hummer, NIH.
Predicting nucleic acid binding interfaces from structural models of proteins.
Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael
2012-02-01
The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acids binding interface on structural models of proteins. Employing a combined patch approach we show that patches extracted from an ensemble of models better predicts the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Copyright © 2011 Wiley Periodicals, Inc.
DROIDS 1.20: A GUI-Based Pipeline for GPU-Accelerated Comparative Protein Dynamics.
Babbitt, Gregory A; Mortensen, Jamie S; Coppola, Erin E; Adams, Lily E; Liao, Justin K
2018-03-13
Traditional informatics in comparative genomics work only with static representations of biomolecules (i.e., sequence and structure), thereby ignoring the molecular dynamics (MD) of proteins that define function in the cell. A comparative approach applied to MD would connect this very short timescale process, defined in femtoseconds, to one of the longest in the universe: molecular evolution measured in millions of years. Here, we leverage advances in graphics-processing-unit-accelerated MD simulation software to develop a comparative method of MD analysis and visualization that can be applied to any two homologous Protein Data Bank structures. Our open-source pipeline, DROIDS (Detecting Relative Outlier Impacts in Dynamic Simulations), works in conjunction with existing molecular modeling software to convert any Linux gaming personal computer into a "comparative computational microscope" for observing the biophysical effects of mutations and other chemical changes in proteins. DROIDS implements structural alignment and Benjamini-Hochberg-corrected Kolmogorov-Smirnov statistics to compare nanosecond-scale atom bond fluctuations on the protein backbone, color mapping the significant differences identified in protein MD with single-amino-acid resolution. DROIDS is simple to use, incorporating graphical user interface control for Amber16 MD simulations, cpptraj analysis, and the final statistical and visual representations in R graphics and UCSF Chimera. We demonstrate that DROIDS can be utilized to visually investigate molecular evolution and disease-related functional changes in MD due to genetic mutation and epigenetic modification. DROIDS can also be used to potentially investigate binding interactions of pharmaceuticals, toxins, or other biomolecules in a functional evolutionary context as well. Copyright © 2018 Biophysical Society. Published by Elsevier Inc. All rights reserved.
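The abstract above mentions Benjamini-Hochberg-corrected Kolmogorov-Smirnov statistics. As a rough illustration of the correction step only, the sketch below applies the standard Benjamini-Hochberg step-up procedure to a list of p-values, for example one per residue. The function name `benjamini_hochberg` and the default `alpha` are assumptions made for this example; this is not the DROIDS code, which the abstract says relies on R and other tools.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the test is significant
    after Benjamini-Hochberg control of the false discovery rate."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Every hypothesis ranked at or below max_rank is rejected.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical per-residue p-values, purely for illustration.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(example))
```

Because the rule is step-up, a p-value that misses its own threshold can still be declared significant when a larger p-value further down the ranking meets its threshold.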
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ali, Ananya; Ghosh, Semanti; Bagchi, Angshuman
Protein-Protein Interactions (PPIs) are crucial in most of the biological processes and PPI dysfunctions are known to be associated with the onsets of various diseases. One such disease is auto-immune disease. Auto-immune diseases are among the less studied groups of diseases with very high mortality rates. Thus, we tried to correlate the appearances of mutations with their probable biochemical basis of the molecular mechanisms leading to the onset of the disease phenotypes. We compared the effects of the Single Amino Acid Variants (SAVs) in the wild type and mutated proteins to identify any structural deformities that might lead to altered PPIs leading ultimately to disease onset. For this we used Relative Solvent Accessibility (RSA) as a spatial parameter to compare the structural perturbation in mutated and wild type proteins. We observed that the mutations were capable of increasing intra-chain PPIs whereas inter-chain PPIs would remain mostly unaltered. This might lead to more intra-molecular friction causing a deleterious alteration of protein's normal function. A Lyapunov exponent analysis, using the altered RSA values due to polymorphic and disease causing mutations, revealed polymorphic mutations have a positive mean value for the Lyapunov exponent while disease causing mutations have a negative mean value. Thus, local spatial stochasticity has been lost due to disease causing mutations, indicating a loss of structural fluidity. The amino acid conversion plot also showed a clear tendency of altered surface patch residue conversion propensity than polymorphic conversions. So far, this is the first report that compares the effects of different kinds of mutations (disease and non-disease causing polymorphic mutations) in the onset of autoimmune diseases. - Highlights: • Protein-Protein Interaction. • Changes in Relative Solvent Accessibility (RSA). • Amino acid conversion matrix. • Polymorphic mutations. • Disease causing mutations.
A Generative Angular Model of Protein Structure Evolution
Golden, Michael; García-Portugués, Eduardo; Sørensen, Michael; Mardia, Kanti V.; Hamelryck, Thomas; Hein, Jotun
2017-01-01
Abstract Recently described stochastic models of protein evolution have demonstrated that the inclusion of structural information in addition to amino acid sequences leads to a more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modeled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modeling both “smooth” conformational changes and “catastrophic” conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is comparatively more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence–structure evolutionary motif present in a large number of homologous protein pairs. The generative nature of our model enables us to evaluate its validity and its ability to simulate aspects of protein evolution conditioned on an amino acid sequence, a related amino acid sequence, a related structure or any combination thereof. PMID:28453724
Protein based Block Copolymers
Rabotyagova, Olena S.; Cebe, Peggy; Kaplan, David L.
2011-01-01
Advances in genetic engineering have led to the synthesis of protein-based block copolymers with control of chemistry and molecular weight, resulting in unique physical and biological properties. The benefits from incorporating peptide blocks into copolymer designs arise from the fundamental properties of proteins to adopt ordered conformations and to undergo self-assembly, providing control over structure formation at various length scales when compared to conventional block copolymers. This review covers the synthesis, structure, assembly, properties, and applications of protein-based block copolymers. PMID:21235251
Amezcua, Carlos A; Szabo, Christina M
2013-06-01
In this work, we applied nuclear magnetic resonance (NMR) spectroscopy to rapidly assess higher order structure (HOS) comparability in protein samples. Using a variation of the NMR fingerprinting approach described by Panjwani et al. [2010. J Pharm Sci 99(8):3334-3342], three nonglycosylated proteins spanning a molecular weight range of 6.5-67 kDa were analyzed. A simple statistical method termed easy comparability of HOS by NMR (ECHOS-NMR) was developed. In this method, HOS similarity between two samples is measured via the correlation coefficient derived from linear regression analysis of binned NMR spectra. Applications of this method include HOS comparability assessment during new product development, manufacturing process changes, supplier changes, next-generation products, and the development of biosimilars to name just a few. We foresee ECHOS-NMR becoming a routine technique applied to comparability exercises used to complement data from other analytical techniques. Copyright © 2013 Wiley Periodicals, Inc.
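The similarity measure described above, a correlation coefficient from linear regression of binned NMR spectra, can be sketched in a few lines. The binning scheme (summing a fixed number of consecutive points), the bin size, and the function names below are assumptions made for illustration; the abstract does not give these details, so this should not be read as the ECHOS-NMR implementation.

```python
import math


def bin_spectrum(intensities, bin_size):
    """Sum consecutive intensities into bins of bin_size points.
    A trailing partial bin is kept as-is."""
    return [sum(intensities[i:i + bin_size])
            for i in range(0, len(intensities), bin_size)]


def pearson_r(x, y):
    """Pearson correlation coefficient, which equals the correlation
    coefficient of a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Note: undefined (division by zero) if either binned spectrum is constant.
    return sxy / math.sqrt(sxx * syy)


def spectral_similarity(spec_a, spec_b, bin_size=4):
    """Correlation between two spectra sampled on the same axis, after binning."""
    if len(spec_a) != len(spec_b):
        raise ValueError("spectra must have the same number of points")
    return pearson_r(bin_spectrum(spec_a, bin_size),
                     bin_spectrum(spec_b, bin_size))


if __name__ == "__main__":
    # Hypothetical intensity traces, purely for illustration.
    reference = [0, 1, 5, 2, 0, 0, 3, 8, 3, 0, 1, 0]
    scaled = [2 * v for v in reference]  # same shape, different concentration
    print(spectral_similarity(reference, scaled))
```

A correlation coefficient is insensitive to overall intensity scaling, which is why two samples that differ only in concentration would score close to 1 in this sketch.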
Multi-Conformer Ensemble Docking to Difficult Protein Targets
Ellingson, Sally R.; Miao, Yinglong; Baudry, Jerome; ...
2014-09-08
We investigate large-scale ensemble docking using five proteins from the Directory of Useful Decoys (DUD, dud.docking.org) for which docking to crystal structures has proven difficult. Molecular dynamics trajectories are produced for each protein and an ensemble of representative conformational structures extracted from the trajectories. Docking calculations are performed on these selected simulation structures and ensemble-based enrichment factors compared with those obtained using docking in crystal structures of the same protein targets or random selection of compounds. We also found simulation-derived snapshots with improved enrichment factors that increased the chemical diversity of docking hits for four of the five selected proteins. A combination of all the docking results obtained from molecular dynamics simulation followed by selection of top-ranking compounds appears to be an effective strategy for increasing the number and diversity of hits when using docking to screen large libraries of chemicals against difficult protein targets.
Construction of ontology augmented networks for protein complex prediction.
Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian
2013-01-01
Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources make it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.
Theodoridou, Katerina; Yu, Peiqiang
2013-06-12
Protein quality relies not only on total protein but also on protein inherent structures. The most commonly occurring protein secondary structures (α-helix and β-sheet) may influence protein quality, nutrient utilization, and digestive behavior. The objectives of this study were to reveal the protein molecular structures of canola meal (yellow and brown) and presscake as affected by the heat-processing methods and to investigate the relationship between structure changes and protein rumen degradation kinetics, estimated protein intestinal digestibility, degraded protein balance, and metabolizable protein. Heat-processing conditions resulted in a higher value for α-helix and β-sheet for brown canola presscake compared to brown canola meal. The multivariate molecular spectral analyses (PCA, CLA) showed that there were significant molecular structural differences in the protein amide I and II fingerprint region (ca. 1700-1480 cm⁻¹) between the brown canola meal and presscake. The in situ degradation parameters, amide I and II, and α-helix to β-sheet ratio (R_a_β) were positively correlated with the degradable fraction and the degradation rate. Modeling results showed that α-helix was positively correlated with the truly absorbed rumen synthesized microbial protein in the small intestine when using both the Dutch DVE/OEB system and the NRC-2001 model. Concerning the protein profiles, R_a_β was a better predictor for crude protein (79%) and for neutral detergent insoluble crude protein (68%). In conclusion, ATR-FT/IR molecular spectroscopy may be used to rapidly characterize feed structures at the molecular level and also as a potential predictor of feed functionality, digestive behavior, and nutrient utilization of canola feed.
Experimental Protein Structure Verification by Scoring with a Single, Unassigned NMR Spectrum.
Courtney, Joseph M; Ye, Qing; Nesbitt, Anna E; Tang, Ming; Tuttle, Marcus D; Watt, Eric D; Nuzzio, Kristin M; Sperling, Lindsay J; Comellas, Gemma; Peterson, Joseph R; Morrissey, James H; Rienstra, Chad M
2015-10-06
Standard methods for de novo protein structure determination by nuclear magnetic resonance (NMR) require time-consuming data collection and interpretation efforts. Here we present a qualitatively distinct and novel approach, called Comparative, Objective Measurement of Protein Architectures by Scoring Shifts (COMPASS), which identifies the best structures from a set of structural models by numerical comparison with a single, unassigned 2D ¹³C-¹³C NMR spectrum containing backbone and side-chain aliphatic signals. COMPASS does not require resonance assignments. It is particularly well suited for interpretation of magic-angle spinning solid-state NMR spectra, but also applicable to solution NMR spectra. We demonstrate COMPASS with experimental data from four proteins (GB1, ubiquitin, DsbA, and the extracellular domain of human tissue factor) and with reconstructed spectra from 11 additional proteins. For all these proteins, with molecular mass up to 25 kDa, COMPASS distinguished the correct fold, most often within 1.5 Å root-mean-square deviation of the reference structure. Copyright © 2015 Elsevier Ltd. All rights reserved.
Ali, Ananya; Ghosh, Semanti; Bagchi, Angshuman
2017-02-26
Protein-Protein Interactions (PPIs) are crucial in most of the biological processes and PPI dysfunctions are known to be associated with the onsets of various diseases. One such disease is auto-immune disease. Auto-immune diseases are among the less studied groups of diseases with very high mortality rates. Thus, we tried to correlate the appearances of mutations with their probable biochemical basis of the molecular mechanisms leading to the onset of the disease phenotypes. We compared the effects of the Single Amino Acid Variants (SAVs) in the wild type and mutated proteins to identify any structural deformities that might lead to altered PPIs leading ultimately to disease onset. For this we used Relative Solvent Accessibility (RSA) as a spatial parameter to compare the structural perturbation in mutated and wild type proteins. We observed that the mutations were capable of increasing intra-chain PPIs whereas inter-chain PPIs would remain mostly unaltered. This might lead to more intra-molecular friction causing a deleterious alteration of protein's normal function. A Lyapunov exponent analysis, using the altered RSA values due to polymorphic and disease causing mutations, revealed polymorphic mutations have a positive mean value for the Lyapunov exponent while disease causing mutations have a negative mean value. Thus, local spatial stochasticity has been lost due to disease causing mutations, indicating a loss of structural fluidity. The amino acid conversion plot also showed a clear tendency of altered surface patch residue conversion propensity than polymorphic conversions. So far, this is the first report that compares the effects of different kinds of mutations (disease and non-disease causing polymorphic mutations) in the onset of autoimmune diseases. Copyright © 2017 Elsevier Inc. All rights reserved.
Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions.
Najibi, Seyed Morteza; Maadooliat, Mehdi; Zhou, Lan; Huang, Jianhua Z; Gao, Xin
2017-01-01
Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric spline which is more efficient compared to existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.
NASA Astrophysics Data System (ADS)
Canino, Lawrence S.; Shen, Tongye; McCammon, J. Andrew
2002-12-01
We extend the self-consistent pair contact probability method to the evaluation of the partition function for a protein complex at thermodynamic equilibrium. Specifically, we adapt the method for multichain models and introduce a parametrization for amino acid-specific pairwise interactions. This method is similar to the Gaussian network model but allows for the adjusting of the strengths of native state contacts. The method is first validated on a high resolution x-ray crystal structure of bovine Pancreatic Phospholipase A2 by comparing calculated B-factors with reported values. We then examine binding-induced changes in flexibility in protein-protein complexes, comparing computed results with those obtained from x-ray crystal structures and molecular dynamics simulations. In particular, we focus on the mouse acetylcholinesterase:fasciculin II and the human α-thrombin:thrombomodulin complexes.
Walia, Rasna R; Caragea, Cornelia; Lewis, Benjamin A; Towfic, Fadi; Terribilini, Michael; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant
2012-05-10
RNA molecules play diverse functional and structural roles in cells. They function as messengers for transferring genetic information from DNA to proteins, as the primary genetic material in many viruses, as catalysts (ribozymes) important for protein synthesis and RNA processing, and as essential and ubiquitous regulators of gene expression in living organisms. Many of these functions depend on precisely orchestrated interactions between RNA molecules and specific proteins in cells. Understanding the molecular mechanisms by which proteins recognize and bind RNA is essential for comprehending the functional implications of these interactions, but the recognition 'code' that mediates interactions between proteins and RNA is not yet understood. Success in deciphering this code would dramatically impact the development of new therapeutic strategies for intervening in devastating diseases such as AIDS and cancer. Because of the high cost of experimental determination of protein-RNA interfaces, there is an increasing reliance on statistical machine learning methods for training predictors of RNA-binding residues in proteins. However, because of differences in the choice of datasets, performance measures, and data representations used, it has been difficult to obtain an accurate assessment of the current state of the art in protein-RNA interface prediction. We provide a review of published approaches for predicting RNA-binding residues in proteins and a systematic comparison and critical assessment of protein-RNA interface residue predictors trained using these approaches on three carefully curated non-redundant datasets. We directly compare two widely used machine learning algorithms (Naïve Bayes (NB) and Support Vector Machine (SVM)) using three different data representations in which features are encoded using either sequence- or structure-based windows. 
Our results show that (i) Sequence-based classifiers that use a position-specific scoring matrix (PSSM)-based representation (PSSMSeq) outperform those that use an amino acid identity based representation (IDSeq) or a smoothed PSSM (SmoPSSMSeq); (ii) Structure-based classifiers that use smoothed PSSM representation (SmoPSSMStr) outperform those that use PSSM (PSSMStr) as well as sequence identity based representation (IDStr). PSSMSeq classifiers, when tested on an independent test set of 44 proteins, achieve performance that is comparable to that of three state-of-the-art structure-based predictors (including those that exploit geometric features) in terms of Matthews Correlation Coefficient (MCC), although the structure-based methods achieve substantially higher Specificity (albeit at the expense of Sensitivity) compared to sequence-based methods. We also find that the expected performance of the classifiers on a residue level can be markedly different from that on a protein level. Our experiments show that the classifiers trained on three different non-redundant protein-RNA interface datasets achieve comparable cross-validation performance. However, we find that the results are significantly affected by differences in the distance threshold used to define interface residues. Our results demonstrate that protein-RNA interface residue predictors that use a PSSM-based encoding of sequence windows outperform classifiers that use other encodings of sequence windows. While structure-based methods that exploit geometric features can yield significant increases in the Specificity of protein-RNA interface residue predictions, such increases are offset by decreases in Sensitivity. 
These results underscore the importance of comparing alternative methods using rigorous statistical procedures, multiple performance measures, and datasets that are constructed based on several alternative definitions of interface residues and redundancy cutoffs as well as including evaluations on independent test sets into the comparisons.
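The performance measures named in the abstract above (Matthews Correlation Coefficient, Sensitivity, Specificity) can be computed from a per-residue confusion matrix. The sketch below uses the common textbook definitions, with Specificity taken as TN/(TN+FP); some interface-prediction studies instead use "specificity" for TP/(TP+FP), so the definition used in the reviewed paper should be checked before comparing numbers. Function names and the example labels are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = interface residue)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Fraction of true interface residues that were predicted."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-interface residues correctly left unpredicted
    (textbook definition; an assumption about what the paper reports)."""
    return tn / (tn + fp)


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical labels for eight residues of one protein.
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    pred = [1, 1, 0, 1, 0, 0, 0, 0]
    tp, tn, fp, fn = confusion_counts(truth, pred)
    print(sensitivity(tp, fn), specificity(tn, fp), mcc(tp, tn, fp, fn))
```

Pooling counts over all residues and averaging per-protein scores generally give different values, which is consistent with the abstract's remark that residue-level and protein-level performance can differ markedly.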
Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures
Manolakos, Elias S.
2015-01-01
Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub. PMID:26605332
Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures.
Sharma, Anuj; Manolakos, Elias S
2015-01-01
Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub.
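The speedup and efficiency figures quoted above are related by the usual definition of parallel efficiency, speedup divided by the number of processing elements. The short sketch below checks that arithmetic under the assumption that all 48 SCC cores are counted; the paper may count cores differently (for example, excluding a coordinating core), and the function name is hypothetical.

```python
def parallel_efficiency(speedup, n_cores):
    """Parallel efficiency: achieved speedup divided by the ideal
    (linear) speedup, which equals the number of cores."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return speedup / n_cores


if __name__ == "__main__":
    # Figures from the abstract: speedup of 42 on the 48-core SCC.
    eff = parallel_efficiency(42, 48)
    print(eff)            # 0.875
    print(round(eff, 1))  # 0.9, matching the reported efficiency
```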
Geometrical comparison of two protein structures using Wigner-D functions.
Saberi Fathi, S M; White, Diana T; Tuszynski, Jack A
2014-10-01
In this article, we develop a quantitative comparison method for two arbitrary protein structures. This method uses a root-mean-square deviation characterization and employs a series expansion of the protein's shape function in terms of the Wigner-D functions to define a new criterion, which is called a "similarity value." We further demonstrate that the expansion coefficients for the shape function obtained with the help of the Wigner-D functions correspond to structure factors. Our method addresses the common problem of comparing two proteins with different numbers of atoms. We illustrate it with a worked example. © 2014 Wiley Periodicals, Inc.
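For orientation, the textbook root-mean-square deviation between two coordinate sets can be sketched as follows. This is not the authors' Wigner-D method: it is only plain RMSD without superposition, under the assumption that both structures have the same number of atoms in matching order. The explicit length check illustrates why structures with different atom counts are a problem for plain RMSD. The function name and coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Plain RMSD between two equally sized lists of (x, y, z) tuples.

    No superposition is performed; atoms are paired by index.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("plain RMSD needs the same number of atoms in both sets")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Two toy "structures": the second is the first shifted by 1 along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(a, b))  # 1.0
```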
GOSSIP: a method for fast and accurate global alignment of protein structures.
Kifer, I; Nussinov, R; Wolfson, H J
2011-04-01
The database of known protein structures (PDB) is increasing rapidly. This results in a growing need for methods that can cope with the vast amount of structural data. To analyze the accumulating data, it is important to have a fast tool for identifying similar structures and clustering them by structural resemblance. Several excellent tools have been developed for the comparison of protein structures. These usually address the task of local structure alignment, an important yet computationally intensive problem due to its complexity. It is difficult to use such tools for comparing a large number of structures to each other in a reasonable time. Here we present GOSSIP, a novel method for a global all-against-all alignment of any set of protein structures. The method detects similarities between structures down to a certain cutoff (a parameter of the program), hence allowing it to detect similar structures at a much higher speed than local structure alignment methods. GOSSIP compares many structures in times which are several orders of magnitude faster than well-known available structure alignment servers, and it is also faster than a database scanning method. We evaluate GOSSIP both on a dataset of short structural fragments and on two large sequence-diverse structural benchmarks. Our conclusions are that for a threshold of 0.6 and above, the speed of GOSSIP is obtained with no compromise of the accuracy of the alignments or of the number of detected global similarities. A server, as well as an executable for download, are available at http://bioinfo3d.cs.tau.ac.il/gossip/.
Leite, Wellington C; Galvão, Carolina W; Saab, Sérgio C; Iulek, Jorge; Etto, Rafael M; Steffens, Maria B R; Chitteni-Pattu, Sindhu; Stanage, Tyler; Keck, James L; Cox, Michael M
2016-01-01
The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three-dimensional structure of HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, that are similar to homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins is also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. Our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament.
Heinz, Eva; Lithgow, Trevor
2014-01-01
Members of the Omp85/TpsB protein superfamily are ubiquitously distributed in Gram-negative bacteria, and function in protein translocation (e.g., FhaC) or the assembly of outer membrane proteins (e.g., BamA). Several recent findings are suggestive of a further level of variation in the superfamily, including the identification of the novel membrane protein assembly factor TamA and protein translocase PlpD. To investigate the diversity and the causal evolutionary events, we undertook a comprehensive comparative sequence analysis of the Omp85/TpsB proteins. A total of 10 protein subfamilies were apparent, distinguished in their domain structure and sequence signatures. In addition to the proteins FhaC, BamA, and TamA, for which structural and functional information is available, are families of proteins with so far undescribed domain architectures linked to the Omp85 β-barrel domain. This study brings a classification structure to a dynamic protein superfamily of high interest given its essential function for Gram-negative bacteria as well as its diverse domain architecture, and we discuss several scenarios of putative functions of these so far undescribed proteins. PMID:25101071
VoroMQA: Assessment of protein structure quality using interatomic contact areas.
Olechnovič, Kliment; Venclovas, Česlovas
2017-06-01
In the absence of experimentally determined protein structure many biological questions can be addressed using computational structural models. However, the utility of protein structural models depends on their quality. Therefore, the estimation of the quality of predicted structures is an important problem. One of the approaches to this problem is the use of knowledge-based statistical potentials. Such methods typically rely on the statistics of distances and angles of residue-residue or atom-atom interactions collected from experimentally determined structures. Here, we present VoroMQA (Voronoi tessellation-based Model Quality Assessment), a new method for the estimation of protein structure quality. Our method combines the idea of statistical potentials with the use of interatomic contact areas instead of distances. Contact areas, derived using Voronoi tessellation of protein structure, are used to describe and seamlessly integrate both explicit interactions between protein atoms and implicit interactions of protein atoms with solvent. VoroMQA produces scores at atomic, residue, and global levels, all in the fixed range from 0 to 1. The method was tested on the CASP data and compared to several other single-model quality assessment methods. VoroMQA showed strong performance in the recognition of the native structure and in the structural model selection tests, thus demonstrating the efficacy of interatomic contact areas in estimating protein structure quality. The software implementation of VoroMQA is freely available as a standalone application and as a web server at http://bioinformatics.lt/software/voromqa. Proteins 2017; 85:1131-1145. © 2017 Wiley Periodicals, Inc.
Designing and benchmarking the MULTICOM protein structure prediction system
2013-01-01
Background: Predicting protein structure from sequence is one of the most significant and challenging problems in bioinformatics. Numerous bioinformatics techniques and tools have been developed to tackle almost every aspect of protein structure prediction ranging from structural feature prediction, template identification and query-template alignment to structure sampling, model quality assessment, and model refinement. How to synergistically select, integrate and improve the strengths of the complementary techniques at each prediction stage and build a high-performance system is becoming a critical issue for constructing a successful, competitive protein structure predictor. Results: Over the past several years, we have constructed a standalone protein structure prediction system MULTICOM that combines multiple sources of information and complementary methods at all five stages of the protein structure prediction process including template identification, template combination, model generation, model assessment, and model refinement. The system was blindly tested during the ninth Critical Assessment of Techniques for Protein Structure Prediction (CASP9) in 2010 and yielded very good performance. In addition to studying the overall performance on the CASP9 benchmark, we thoroughly investigated the performance and contributions of each component at each stage of prediction. Conclusions: Our comprehensive and comparative study not only provides useful and practical insights about how to select, improve, and integrate complementary methods to build a cutting-edge protein structure prediction system but also identifies a few new sources of information that may help improve the design of a protein structure prediction system. Several components used in the MULTICOM system are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/. PMID:23442819
Estimating structure quality trends in the Protein Data Bank by equivalent resolution.
Bagaria, Anurag; Jaravine, Victor; Güntert, Peter
2013-10-01
The quality of protein structures obtained by different experimental and ab-initio calculation methods varies considerably. The methods have been evolving over time by improving both experimental designs and computational techniques, and since the primary aim of these developments is the procurement of reliable and high-quality data, better techniques resulted on average in an evolution toward higher quality structures in the Protein Data Bank (PDB). Each method leaves a specific quantitative and qualitative "trace" in the PDB entry. Certain information relevant to one method (e.g. dynamics for NMR) may be lacking for another method. Furthermore, some standard measures of quality for one method cannot be calculated for other experimental methods, e.g. crystal resolution or NMR bundle RMSD. Consequently, structures are classified in the PDB by the method used. Here we introduce a method to estimate a measure of equivalent X-ray resolution (e-resolution), expressed in units of Å, to assess the quality of any type of monomeric, single-chain protein structure, irrespective of the experimental structure determination method. We showed and compared the trends in the quality of structures in the Protein Data Bank over the last two decades for five different experimental techniques, excluding theoretical structure predictions. We observed that as new methods are introduced, they undergo a rapid method development evolution: within several years the e-resolution score becomes similar for structures obtained from the five methods and they improve from initially poor performance to acceptable quality, comparable with previously established methods, the performance of which is essentially stable. Copyright © 2013 Elsevier Ltd. All rights reserved.
Xin, Hangshu; Zhang, Xuewei; Yu, Peiqiang
2013-01-01
This study was conducted to compare: (1) protein chemical characteristics, including the amide I and II region, as well as protein secondary structure; and (2) carbohydrate internal structure and functional groups spectral intensities between the frost damaged wheat and normal wheat using synchrotron radiation-based Fourier transform infrared microspectroscopy (SR-FTIRM). Fingerprint regions of specific interest in our study involved protein and carbohydrate functional group band assignments, including protein amide I and II (ca. 1774–1475 cm⁻¹), structural carbohydrates (SCHO, ca. 1498–1176 cm⁻¹), cellulosic compounds (CELC, ca. 1295–1176 cm⁻¹), total carbohydrates (CHO, ca. 1191–906 cm⁻¹) and non-structural carbohydrates (NSCHO, ca. 954–809 cm⁻¹). The results showed that frost did cause variations in spectral profiles in wheat grains. Compared with healthy wheat grains, frost damaged wheat had significantly lower (p < 0.05) spectral intensities in height and area ratios of amide I to II and almost all the spectral parameters of carbohydrate-related functional groups, including SCHO, CHO and NSCHO. Furthermore, the height ratio of protein amide I to the third peak of CHO and the area ratios of protein amide (amide I + II) to carbohydrate compounds (CHO and SCHO) were also changed (p < 0.05) in damaged wheat grains. It was concluded that the SR-FTIR microspectroscopic technique was able to examine inherent molecular structure features at an ultra-spatial resolution (10 × 10 μm) between different wheat grain samples. The structural characterization of wheat was influenced by climate conditions, such as frost damage, and these structural variations might be a major reason for the decreases in nutritive values, nutrients availability and milling and baking quality in wheat grains. PMID:23949633
Biological and functional relevance of CASP predictions
Liu, Tianyun; Ish‐Shalom, Shirbi; Torng, Wen; Lafita, Aleix; Bock, Christian; Mort, Matthew; Cooper, David N; Bliven, Spencer; Capitani, Guido; Mooney, Sean D.
2017-01-01
Our goal is to answer the question: compared with experimental structures, how useful are predicted models for functional annotation? We assessed the functional utility of predicted models by comparing the performances of a suite of methods for functional characterization on the predictions and the experimental structures. We identified 28 sites in 25 protein targets to perform functional assessment. These 28 sites included nine sites with known ligand binding (holo‐sites), nine sites that are expected or suggested by experimental authors for small molecule binding (apo‐sites), and ten sites containing important motifs, loops, or key residues with important disease‐associated mutations. We evaluated the utility of the predictions by comparing their microenvironments to the experimental structures. Overall structural quality correlates with functional utility. However, the best‐ranked predictions (global) may not have the best functional quality (local). Our assessment provides an ability to discriminate between predictions with high structural quality. When assessing ligand‐binding sites, most prediction methods have higher performance on apo‐sites than holo‐sites. Some servers show consistently high performance for certain types of functional sites. Finally, many functional sites are associated with protein‐protein interaction. We also analyzed biologically relevant features from the protein assemblies of two targets where the active site spanned the protein‐protein interface. For the assembly targets, we find that the features in the models are mainly determined by the choice of template. PMID:28975675
Restricted N-glycan Conformational Space in the PDB and Its Implication in Glycan Structure Modeling
Jo, Sunhwan; Lee, Hui Sun; Skolnick, Jeffrey; Im, Wonpil
2013-01-01
Understanding glycan structure and dynamics is central to understanding protein-carbohydrate recognition and its role in protein-protein interactions. Given the difficulties in obtaining the glycan's crystal structure in glycoconjugates due to its flexibility and heterogeneity, computational modeling could play an important role in providing glycosylated protein structure models. To address if glycan structures available in the PDB can be used as templates or fragments for glycan modeling, we present a survey of the N-glycan structures of 35 different sequences in the PDB. Our statistical analysis shows that the N-glycan structures found on homologous glycoproteins are significantly conserved compared to the random background, suggesting that N-glycan chains can be confidently modeled with template glycan structures whose parent glycoproteins share sequence similarity. On the other hand, N-glycan structures found on non-homologous glycoproteins do not show significant global structural similarity. Nonetheless, the internal substructures of these N-glycans, particularly, the substructures that are closer to the protein, show significantly similar structures, suggesting that such substructures can be used as fragments in glycan modeling. Increased interactions with protein might be responsible for the restricted conformational space of N-glycan chains. Our results suggest that structure prediction/modeling of N-glycans of glycoconjugates using structure database could be effective and different modeling approaches would be needed depending on the availability of template structures. PMID:23516343
Structural analysis of a set of proteins resulting from a bacterial genomics project.
Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R
2005-09-01
The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB. Copyright 2005 Wiley-Liss, Inc.
Aggregation of alpha-synuclein by a coarse-grained Monte Carlo simulation
NASA Astrophysics Data System (ADS)
Farmer, Barry; Pandey, Ras
Alpha-synuclein (ASN), an intrinsic protein abundant in neurons, is believed to be a major cause of neurodegenerative diseases (e.g. Alzheimer's and Parkinson's disease). Abnormal aggregation of ASN leads to Lewy bodies with specific morphologies. We investigate the self-organizing structures in a crowded environment of ASN proteins by a coarse-grained Monte Carlo simulation. ASN is a chain of 140 residues. Structural detail of the residues is neglected, but their specificity is captured via unique knowledge-based residue-residue interactions. Large-scale simulations are performed to analyze a number of local and global physical quantities (e.g. mobility profile, contact map, radius of gyration, structure factor) as a function of temperature and protein concentration. The trend in multi-scale structural variations of the protein in a crowded environment is compared with that of a free protein chain.
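One of the global quantities named in the abstract above, the radius of gyration, is straightforward to compute for a coarse-grained chain in which each residue is a single bead. The following is a minimal sketch assuming equal bead masses and the standard definition (root-mean-square distance of the beads from their centroid); it is not taken from the authors' simulation code, and the function name and toy coordinates are illustrative.

```python
import math


def radius_of_gyration(beads):
    """Radius of gyration of equal-mass beads given as (x, y, z) tuples."""
    if not beads:
        raise ValueError("need at least one bead")
    n = len(beads)
    # Centroid of the chain (equal masses assumed).
    cx = sum(x for x, _, _ in beads) / n
    cy = sum(y for _, y, _ in beads) / n
    cz = sum(z for _, _, z in beads) / n
    # Mean squared distance of the beads from the centroid.
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in beads
    ) / n
    return math.sqrt(mean_sq)


# Toy example: a straight three-bead chain with unit spacing along x.
chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
rg = radius_of_gyration(chain)
print(rg)
```

For the three-bead example the squared distances from the centroid are 1, 0 and 1, so the result equals the square root of 2/3. In a simulation such a function would be evaluated per chain and averaged over sampled configurations.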
Membrane-spanning α-helical barrels as tractable protein-design targets.
Niitsu, Ai; Heal, Jack W; Fauland, Kerstin; Thomson, Andrew R; Woolfson, Derek N
2017-08-05
The rational (de novo) design of membrane-spanning proteins lags behind that for water-soluble globular proteins. This is due to gaps in our knowledge of membrane-protein structure, and experimental difficulties in studying such proteins compared to water-soluble counterparts. One limiting factor is the small number of experimentally determined three-dimensional structures for transmembrane proteins. By contrast, many tens of thousands of globular protein structures provide a rich source of 'scaffolds' for protein design, and the means to garner sequence-to-structure relationships to guide the design process. The α-helical coiled coil is a protein-structure element found in both globular and membrane proteins, where it cements a variety of helix-helix interactions and helical bundles. Our deep understanding of coiled coils has enabled a large number of successful de novo designs. For one class, the α-helical barrels (that is, symmetric bundles of five or more helices with central accessible channels), there are both water-soluble and membrane-spanning examples. Recent computational designs of water-soluble α-helical barrels with five to seven helices have advanced the design field considerably. Here we identify and classify analogous and more complicated membrane-spanning α-helical barrels from the Protein Data Bank. These provide tantalizing but tractable targets for protein engineering and de novo protein design. This article is part of the themed issue 'Membrane pores: from structure and assembly, to medicine and technology'. © 2017 The Author(s).
Characterizing monoclonal antibody structure by carbodiimide/GEE footprinting
Kaur, Parminder; Tomechko, Sara; Kiselar, Janna; Shi, Wuxian; Deperalta, Galahad; Wecksler, Aaron T; Gokulrangan, Giridharan; Ling, Victor; Chance, Mark R
2014-01-01
Amino acid-specific covalent labeling is well suited to probe protein structure and macromolecular interactions, especially for macromolecules and their complexes that are difficult to examine by alternative means, due to size, complexity, or instability. Here we present a detailed account of carbodiimide-based covalent labeling (with GEE tagging) applied to a glycosylated monoclonal antibody therapeutic, which represents an important class of biologic drugs. Characterization of such proteins and their antigen complexes is essential to development of new biologic-based medicines. In this study, the experiments were optimized to preserve the structural integrity of the protein, and experimental conditions were varied and replicated to establish the reproducibility and precision of the technique. Homology-based models were generated and used to compare the solvent accessibility of the labeled residues, which include D, E, and the C-terminus, against the experimental surface accessibility data in order to understand the accuracy of the approach in providing an unbiased assessment of structure. Data from the protein were also compared to reactivity measures of several model peptides to explain sequence or structure-based variations in reactivity. The results highlight several advantages of this approach. These include: the ease of use at the bench top, the linearity of the dose response plots at high levels of labeling (indicating that the label does not significantly perturb the structure of the protein), the high reproducibility of replicate experiments (<2 % variation in modification extent), the similar reactivity of the 3 target probe residues (as suggested by analysis of model peptides), and the overall positive and significant correlation of reactivity and solvent accessible surface area (the latter values predicted by the homology modeling). 
Attenuation of reactivity, in otherwise solvent accessible probes, is documented as arising from the effects of positive charge or bond formation between adjacent amine and carboxyl groups, the latter accompanied by observed water loss. The results are also compared with data from hydroxyl radical-mediated oxidative footprinting on the same protein, showing that complementary information is gained from the 2 approaches, although the number of target residues in carbodiimide/GEE labeling is fewer. Overall, this approach is an accurate and precise method for assessing protein structure of biologic drugs. PMID:25484052
An Unusual Hydrophobic Core Confers Extreme Flexibility to HEAT Repeat Proteins
Kappel, Christian; Zachariae, Ulrich; Dölker, Nicole; Grubmüller, Helmut
2010-01-01
Alpha-solenoid proteins are suggested to constitute highly flexible macromolecules, whose structural variability and large surface area is instrumental in many important protein-protein binding processes. By equilibrium and nonequilibrium molecular dynamics simulations, we show that importin-β, an archetypical α-solenoid, displays unprecedentedly large and fully reversible elasticity. Our stretching molecular dynamics simulations reveal full elasticity over up to twofold end-to-end extensions compared to its bound state. Despite the absence of any long-range intramolecular contacts, the protein can return to its equilibrium structure to within 3 Å backbone RMSD after the release of mechanical stress. We find that this extreme degree of flexibility is based on an unusually flexible hydrophobic core that differs substantially from that of structurally similar but more rigid globular proteins. In that respect, the core of importin-β resembles molten globules. The elastic behavior is dominated by nonpolar interactions between HEAT repeats, combined with conformational entropic effects. Our results suggest that α-solenoid structures such as importin-β may bridge the molecular gap between completely structured and intrinsically disordered proteins. PMID:20816072
Gupta, Sayan; Feng, Jun; Chance, Mark; Ralston, Corie
2016-01-01
Synchrotron X-ray Footprinting is a powerful in situ hydroxyl radical labeling method for analysis of protein structure, interactions, folding and conformation change in solution. In this method, water is ionized by high flux density broad band synchrotron X-rays to produce a steady-state concentration of hydroxyl radicals, which then react with solvent accessible side-chains. The resulting stable modification products are analyzed by liquid chromatography coupled to mass spectrometry. A comparative reactivity rate between known and unknown states of a protein provides local as well as global information on structural changes, which is then used to develop structural models for protein function and dynamics. In this review we describe the XF-MS method, its unique capabilities and its recent technical advances at the Advanced Light Source. We provide a comparison of other hydroxyl radical and mass spectrometry based methods with XF-MS. We also discuss some of the latest developments in its usage for studying bound water, transmembrane proteins and photosynthetic protein components, and the synergy of the method with other synchrotron based structural biology methods.
Can natural proteins designed with 'inverted' peptide sequences adopt native-like protein folds?
Sridhar, Settu; Guruprasad, Kunchur
2014-01-01
We have carried out a systematic computational analysis on a representative dataset of proteins of known three-dimensional structure, in order to evaluate whether it would be possible to 'swap' certain short peptide sequences in naturally occurring proteins with their corresponding 'inverted' peptides and generate 'artificial' proteins that are predicted to retain a native-like protein fold. The analysis of 3,967 representative proteins from the Protein Data Bank revealed 102,677 unique identical inverted peptide sequence pairs that vary in sequence length between 5-12 and 18 amino acid residues. Our analysis illustrates with examples that such 'artificial' proteins may be generated by identifying peptides with 'similar structural environment' and by using comparative protein modeling and validation studies. Our analysis suggests that natural proteins may be tolerant to accommodating such peptides.
3D Complex: A Structural Classification of Protein Complexes
Levy, Emmanuel D; Pereira-Leal, Jose B; Chothia, Cyrus; Teichmann, Sarah A
2006-01-01
Most of the proteins in a cell assemble into complexes to carry out their function. It is therefore crucial to understand the physicochemical properties as well as the evolution of interactions between proteins. The Protein Data Bank represents an important source of information for such studies, because more than half of the structures are homo- or heteromeric protein complexes. Here we propose the first hierarchical classification of whole protein complexes of known 3-D structure, based on representing their fundamental structural features as a graph. This classification provides the first overview of all the complexes in the Protein Data Bank and allows nonredundant sets to be derived at different levels of detail. This reveals that between one-half and two-thirds of known structures are multimeric, depending on the level of redundancy accepted. We also analyse the structures in terms of the topological arrangement of their subunits and find that they form a small number of arrangements compared with all theoretically possible ones. This is because most complexes contain four subunits or less, and the large majority are homomeric. In addition, there is a strong tendency for symmetry in complexes, even for heteromeric complexes. Finally, through comparison of Biological Units in the Protein Data Bank with the Protein Quaternary Structure database, we identified many possible errors in quaternary structure assignments. Our classification, available as a database and Web server at http://www.3Dcomplex.org, will be a starting point for future work aimed at understanding the structure and evolution of protein complexes. PMID:17112313
Finding Correlation between Protein Protein Interaction Modules Using Semantic Web Techniques
NASA Astrophysics Data System (ADS)
Kargar, Mehdi; Moaven, Shahrouz; Abolhassani, Hassan
Many complex networks, such as social networks and computer networks, show modular structures, where edges between nodes are much denser within modules than between modules. It is strongly believed that cellular networks are also modular, reflecting the relative independence and coherence of different functional units in a cell. In this paper we use a human-curated dataset and consider each module in the PPI network as an ontology. Using techniques from ontology alignment, we compare each pair of modules in the network. We want to see whether there is a correlation between the structures of the modules or whether they have totally different structures. Our results show that there is no correlation between proteins in a protein-protein interaction network.
Functional classification of protein structures by local structure matching in graph representation.
Mills, Caitlyn L; Garg, Rohan; Lee, Joslynn S; Tian, Liang; Suciu, Alexandru; Cooperman, Gene; Beuning, Penny J; Ondrechen, Mary Jo
2018-03-31
As a result of high-throughput protein structure initiatives, over 14,400 protein structures have been solved by structural genomics (SG) centers and participating research groups. While the totality of SG data represents a tremendous contribution to genomics and structural biology, reliable functional information for these proteins is generally lacking. Better functional predictions for SG proteins will add substantial value to the structural information already obtained. Our method described herein, Graph Representation of Active Sites for Prediction of Function (GRASP-Func), predicts quickly and accurately the biochemical function of proteins by representing residues at the predicted local active site as graphs rather than in Cartesian coordinates. We compare the GRASP-Func method to our previously reported method, structurally aligned local sites of activity (SALSA), using the ribulose phosphate binding barrel (RPBB), 6-hairpin glycosidase (6-HG), and Concanavalin A-like Lectins/Glucanase (CAL/G) superfamilies as test cases. In each of the superfamilies, SALSA and the much faster method GRASP-Func yield similar correct classification of previously characterized proteins, providing a validated benchmark for the new method. In addition, we analyzed SG proteins using our SALSA and GRASP-Func methods to predict function. Forty-one SG proteins in the RPBB superfamily, nine SG proteins in the 6-HG superfamily, and one SG protein in the CAL/G superfamily were successfully classified into one of the functional families in their respective superfamily by both methods. This improved, faster, validated computational method can yield more reliable predictions of function that can be used for a wide variety of applications by the community. © 2018 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
Dewhurst, Henry M.; Choudhury, Shilpa; Torres, Matthew P.
2015-01-01
Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)—a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits—conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit–N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. PMID:26070665
Exhaustive comparison and classification of ligand-binding surfaces in proteins
Murakami, Yoichi; Kinoshita, Kengo; Kinjo, Akira R; Nakamura, Haruki
2013-01-01
Many proteins function by interacting with other small molecules (ligands). Identification of ligand-binding sites (LBS) in proteins can therefore help to infer their molecular functions. A comprehensive comparison among local structures of LBSs was previously performed, in order to understand their relationships and to classify their structural motifs. However, a similar exhaustive comparison among local surfaces of LBSs (patches) has never been performed, due to computational complexity. To enhance our understanding of LBSs, it is worth performing such comparisons among patches and classifying them based on similarities of their surface configurations and electrostatic potentials. In this study, we first developed a rapid method to compare two patches. We then clustered patches corresponding to the same PDB chemical component identifier for a ligand, and selected a representative patch from each cluster. We subsequently compared the representative patches exhaustively and clustered them using a similarity score, PatSim. Finally, the resultant PatSim scores were compared with similarities of atomic structures of the LBSs and those of the ligand-binding protein sequences and functions. Consequently, we classified the patches into ∼2000 well-characterized clusters. We found that about 63% of these clusters are used in identical protein folds, although about 25% of the clusters are conserved in distantly related proteins and even in proteins with cross-fold similarity. Furthermore, we showed that patches with higher PatSim scores have the potential to be involved in similar biological processes. PMID:23934772
NASA Astrophysics Data System (ADS)
Shtykova, E. V.; Bogacheva, E. N.; Dadinova, L. A.; Jeffries, C. M.; Fedorova, N. V.; Golovko, A. O.; Baratova, L. A.; Batishchev, O. V.
2017-11-01
A complex structural analysis of nuclear export protein NS2 (NEP) of influenza virus A has been performed using bioinformatics predictive methods and small-angle X-ray scattering data. The behavior of NEP molecules in a solution (their aggregation, oligomerization, and dissociation, depending on the buffer composition) has been investigated. It was shown that stable associates are formed even in a conventional aqueous salt solution at physiological pH value. For the first time we have managed to get NEP dimers in solution, to analyze their structure, and to compare the models obtained using the method of the molecular tectonics with the spatial protein structure predicted by us using the bioinformatics methods. The results of the study provide a new insight into the structural features of nuclear export protein NS2 (NEP) of the influenza virus A, which is very important for viral infection development.
Landscape of Pleiotropic Proteins Causing Human Disease: Structural and System Biology Insights.
Ittisoponpisan, Sirawit; Alhuzimi, Eman; Sternberg, Michael J E; David, Alessia
2017-03-01
Pleiotropy is the phenomenon by which the same gene can result in multiple phenotypes. Pleiotropic proteins are emerging as important contributors to rare and common disorders. Nevertheless, little is known about the mechanisms underlying pleiotropy and the characteristics of pleiotropic proteins. We analyzed disease-causing proteins reported in UniProt and observed that 12% are pleiotropic (variants in the same protein cause more than one disease). Pleiotropic proteins were enriched in deleterious and rare variants, but not in common variants. Pleiotropic proteins were more likely to be involved in the pathogenesis of neoplasms, neurological, and circulatory diseases and congenital malformations, whereas non-pleiotropic proteins were more likely to be involved in endocrine and metabolic disorders. Pleiotropic proteins were more essential and had a higher number of interacting partners compared with non-pleiotropic proteins. Significantly more pleiotropic than non-pleiotropic proteins contained at least one long intrinsically disordered region (P < 0.001). Deleterious variants occurring in structurally disordered regions were more commonly found in pleiotropic, rather than non-pleiotropic proteins. In conclusion, pleiotropic proteins are an important contributor to human disease. They represent a biologically different class of proteins compared with non-pleiotropic proteins and a better understanding of their characteristics and genetic variants can greatly aid in the interpretation of genetic studies and drug design. © 2016 WILEY PERIODICALS, INC.
Jahangeer, S; Rodbell, M
1993-10-01
We have compared the sedimentation rates on sucrose gradients of the heterotrimeric GTP-binding regulatory (G) proteins Gs, G(o), Gi, and Gq extracted from rat brain synaptoneurosomes with Lubrol and digitonin. The individual alpha and beta subunits were monitored with specific antisera. In all cases, both subunits cosedimented, indicating that the subunits are likely complexed as heterotrimers. When extracted with Lubrol all of the G proteins sedimented with rates of about 4.5 S (consistent with heterotrimers) whereas digitonin extracted 60% of the G proteins with peaks at 11 S; 40% pelleted as larger structures. Digitonin-extracted Gi was cross-linked by p-phenylenedimaleimide, yielding structures too large to enter polyacrylamide gels. No cross-linking of Lubrol-extracted Gi occurred. Treatment of the membranes with guanosine 5'-[gamma-thio]triphosphate and Mg2+ yielded digitonin-extracted structures with peak sedimentation values of 8.5 S--i.e., comparable to that of purified G(o) in digitonin and considerably larger than the Lubrol-extracted 2S structures representing the separated alpha and beta gamma subunits formed by the actions of guanosine 5'-[gamma-thio]triphosphate. It is concluded that the multimeric structures of G proteins in brain membranes are at least partially preserved in digitonin and that activation of these structures in membranes yields monomers of G proteins rather than the disaggregated products (alpha and beta gamma complexes) observed in Lubrol. It is proposed that hormones and GTP affect the dynamic interplay between multimeric G proteins and receptors in a fashion analogous to the actions of ATP on the dynamic interactions between myosin and actin filaments. Signal transduction is mediated by activated monomers released from the multimers during the activation process. PMID:8415607
Alu'datt, Muhammad H; Gammoh, Sana; Rababah, Taha; Almomani, Mohammed; Alhamad, Mohammad N; Ereifej, Khalil; Almajwal, Ali; Tahat, Asma; Hussein, Neveen M; Nasser, Sura Abou
2018-02-01
This investigation was performed to assess the effects of sonication on the structure of protein, extractability of phenolics, and biological properties of isolated proteins and protein co-precipitates prepared from brewers' spent grain and soybean flour. Scanning electron micrographs revealed that the sonicated protein isolates and co-precipitates had different microstructures with fewer aggregates and smaller particles down to the nanometer scale compared to non-sonicated samples. However, the levels of free and bound phenolics extracted from non-sonicated protein isolates and protein co-precipitates increased compared to sonicated samples. The bound phenolics extracted after acid hydrolysis of sonicated protein co-precipitates showed improved ACE inhibitory activity and diminished antioxidant potency compared to non-sonicated samples. However, the free phenolics extracted from sonicated protein co-precipitates showed decreased ACE inhibitory activity and increased antioxidant activities compared to non-sonicated samples. The free and bound phenolics extracted from sonicated protein co-precipitates showed increased alpha-amylase inhibitory activity compared to non-sonicated samples. Copyright © 2017 Elsevier Ltd. All rights reserved.
Devassy, Jessay G; Wojcik, Jennifer L; Ibrahim, Naser H M; Zahradka, Peter; Taylor, Carla G; Aukema, Harold M
2017-02-01
Questions remain regarding the potential negative effects of dietary high protein (HP) on kidney health, particularly in the context of obesity in which the risk for renal disease is already increased. To examine whether some of the variability in HP effects on kidney health may be due to source of protein, obese fa/fa Zucker rats were given HP (35% of energy from protein) diets containing either casein, soy protein, or a mixed source of animal and plant proteins for 12 weeks. Control lean and obese rats were given diets containing casein at normal protein (15% of energy from protein) levels. Body weight and blood pressure were measured, and markers of renal structural changes, damage, and function were assessed. Obesity alone resulted in mild renal changes, as evidenced by higher kidney weights, proteinuria, and glomerular volumes. In obese rats, increasing the protein level using the single, but not mixed, protein sources resulted in higher renal fibrosis compared with the lean rats. The mixed-protein HP group also had lower levels of serum monocyte chemoattractant protein-1, even though this diet further increased kidney and glomerular size. Soy and mixed-protein HP diets also resulted in a small number of damaged glomeruli, while soy compared with mixed-protein HP diet delayed the increase in blood pressure over time. Since obesity itself confers added risk of renal disease, an HP diet from mixed-protein sources that enables weight loss but has fewer risks to renal health may be advantageous.
Zhou, Carol L Ecale
2015-01-01
In order to better define regions of similarity among related protein structures, it is useful to identify the residue-residue correspondences among proteins. Few codes exist for constructing a one-to-many multiple sequence alignment derived from a set of structure or sequence alignments, and a need was evident for creating such a tool for combining pairwise structure alignments that would allow for insertion of gaps in the reference structure. This report describes a new Python code, CombAlign, which takes as input a set of pairwise sequence alignments (which may be structure based) and generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA). The use and utility of CombAlign was demonstrated by generating gapped MSSAs using sets of pairwise structure-based sequence alignments between structure models of the matrix protein (VP40) and pre-small/secreted glycoprotein (sGP) of Reston Ebolavirus and the corresponding proteins of several other filoviruses. The gapped MSSAs revealed structure-based residue-residue correspondences, which enabled identification of structurally similar versus differing regions in the Reston proteins compared to each of the other corresponding proteins. CombAlign is a new Python code that generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA) given a set of pairwise sequence alignments (which may be structure based). CombAlign has utility in assisting the user in distinguishing structurally conserved versus divergent regions on a reference protein structure relative to other closely related proteins. CombAlign was developed in Python 2.6, and the source code is available for download from the GitHub code repository.
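The merging step described in the abstract above can be illustrated with a minimal Python sketch. This is not the CombAlign source code; the function name `combine_pairwise`, the input format (pairs of equal-length gapped strings that share one reference), and the left-justified placement of insertions are all assumptions made for illustration. The idea is to reserve, before each reference position, as many gap columns as the longest insertion any pairwise alignment places there.

```python
def combine_pairwise(pairs, gap="-"):
    """Merge pairwise alignments that share one reference into a
    one-to-many gapped alignment (hypothetical helper, not CombAlign itself).

    pairs: list of (ref_aligned, other_aligned) strings of equal length.
    Returns [reference_row, row_for_pair_0, row_for_pair_1, ...].
    """
    ref = pairs[0][0].replace(gap, "")
    n = len(ref)
    parsed = []
    for ref_aln, other_aln in pairs:
        if ref_aln.replace(gap, "") != ref:
            raise ValueError("all pairs must use the same reference sequence")
        inserts = [""] * (n + 1)   # residues inserted before ref position i
        matched = [gap] * n        # residue aligned to ref position i
        pos = 0
        for r, o in zip(ref_aln, other_aln):
            if r == gap:           # insertion relative to the reference
                inserts[pos] += o
            else:                  # match, mismatch, or deletion
                matched[pos] = o
                pos += 1
        parsed.append((inserts, matched))

    # Widest insertion seen before each reference position (and at the tail).
    widths = [max(len(ins[i]) for ins, _ in parsed) for i in range(n + 1)]

    ref_row = "".join(gap * widths[i] + ref[i] for i in range(n)) + gap * widths[n]
    rows = [ref_row]
    for inserts, matched in parsed:
        row = "".join(inserts[i].ljust(widths[i], gap) + matched[i] for i in range(n))
        rows.append(row + inserts[n].ljust(widths[n], gap))
    return rows


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    for row in combine_pairwise([("AC-DE", "ACXDE"), ("ACDE", "A-DE")]):
        print(row)
```

Every output row has the same length, so residue-residue correspondences to the reference can be read column by column.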
The TIM Barrel Architecture Facilitated the Early Evolution of Protein-Mediated Metabolism.
Goldman, Aaron David; Beatty, Joshua T; Landweber, Laura F
2016-01-01
The triosephosphate isomerase (TIM) barrel protein fold is a structurally repetitive architecture that is present in approximately 10% of all enzymes. It is generally assumed that this ubiquity in modern proteomes reflects an essential historical role in early protein-mediated metabolism. Here, we provide quantitative and comparative analyses to support several hypotheses about the early importance of the TIM barrel architecture. An information theoretical analysis of protein structures supports the hypothesis that the TIM barrel architecture could arise more easily by duplication and recombination compared to other mixed α/β structures. We show that TIM barrel enzymes corresponding to the most taxonomically broad superfamilies also have the broadest range of functions, often aided by metal and nucleotide-derived cofactors that are thought to reflect an earlier stage of metabolic evolution. By comparison to other putatively ancient protein architectures, we find that the functional diversity of TIM barrel proteins cannot be explained simply by their antiquity. Instead, the breadth of TIM barrel functions can be explained, in part, by the incorporation of a broad range of cofactors, a trend that does not appear to be shared by proteins in general. These results support the hypothesis that the simple and functionally general TIM barrel architecture may have arisen early in the evolution of protein biosynthesis and provided an ideal scaffold to facilitate the metabolic transition from ribozymes, peptides, and geochemical catalysts to modern protein enzymes.
Membrane proteins bind lipids selectively to modulate their structure and function.
Laganowsky, Arthur; Reading, Eamonn; Allison, Timothy M; Ulmschneider, Martin B; Degiacomi, Matteo T; Baldwin, Andrew J; Robinson, Carol V
2014-06-05
Previous studies have established that the folding, structure and function of membrane proteins are influenced by their lipid environments and that lipids can bind to specific sites, for example, in potassium channels. Fundamental questions remain however regarding the extent of membrane protein selectivity towards lipids. Here we report a mass spectrometry approach designed to determine the selectivity of lipid binding to membrane protein complexes. We investigate the mechanosensitive channel of large conductance (MscL) from Mycobacterium tuberculosis and aquaporin Z (AqpZ) and the ammonia channel (AmtB) from Escherichia coli, using ion mobility mass spectrometry (IM-MS), which reports gas-phase collision cross-sections. We demonstrate that folded conformations of membrane protein complexes can exist in the gas phase. By resolving lipid-bound states, we then rank bound lipids on the basis of their ability to resist gas phase unfolding and thereby stabilize membrane protein structure. Lipids bind non-selectively and with high avidity to MscL, all imparting comparable stability; however, the highest-ranking lipid is phosphatidylinositol phosphate, in line with its proposed functional role in mechanosensation. AqpZ is also stabilized by many lipids, with cardiolipin imparting the most significant resistance to unfolding. Subsequently, through functional assays we show that cardiolipin modulates AqpZ function. Similar experiments identify AmtB as being highly selective for phosphatidylglycerol, prompting us to obtain an X-ray structure in this lipid membrane-like environment. The 2.3 Å resolution structure, when compared with others obtained without lipid bound, reveals distinct conformational changes that re-position AmtB residues to interact with the lipid bilayer. 
Our results demonstrate that resistance to unfolding correlates with specific lipid-binding events, enabling a distinction to be made between lipids that merely bind from those that modulate membrane protein structure and/or function. We anticipate that these findings will be important not only for defining the selectivity of membrane proteins towards lipids, but also for understanding the role of lipids in modulating protein function or drug binding.
Protein structure refinement using a quantum mechanics-based chemical shielding predictor.
Bratholm, Lars A; Jensen, Jan H
2017-03-01
The accurate prediction of protein chemical shifts using a quantum mechanics (QM)-based method has been the subject of intense research for more than 20 years but so far empirical methods for chemical shift prediction have proven more accurate. In this paper we show that a QM-based predictor of a protein backbone and CB chemical shifts (ProCS15, PeerJ , 2016, 3, e1344) is of comparable accuracy to empirical chemical shift predictors after chemical shift-based structural refinement that removes small structural errors. We present a method by which quantum chemistry based predictions of isotropic chemical shielding values (ProCS15) can be used to refine protein structures using Markov Chain Monte Carlo (MCMC) simulations, relating the chemical shielding values to the experimental chemical shifts probabilistically. Two kinds of MCMC structural refinement simulations were performed using force field geometry optimized X-ray structures as starting points: simulated annealing of the starting structure and constant temperature MCMC simulation followed by simulated annealing of a representative ensemble structure. Annealing of the CHARMM structure changes the CA-RMSD by an average of 0.4 Å but lowers the chemical shift RMSD by 1.0 and 0.7 ppm for CA and N. Conformational averaging has a relatively small effect (0.1-0.2 ppm) on the overall agreement with carbon chemical shifts but lowers the error for nitrogen chemical shifts by 0.4 ppm. If an amino acid specific offset is included the ProCS15 predicted chemical shifts have RMSD values relative to experiments that are comparable to popular empirical chemical shift predictors. The annealed representative ensemble structures differ in CA-RMSD relative to the initial structures by an average of 2.0 Å, with >2.0 Å difference for six proteins. 
In four of the cases, the largest structural differences arise in structurally flexible regions of the protein as determined by NMR, and in the remaining two cases, the large structural change may be due to force field deficiencies. The overall accuracy of the empirical methods is slightly improved by annealing the CHARMM structure with ProCS15, which may suggest that the minor structural changes introduced by ProCS15-based annealing improve the accuracy of the protein structures. Having established that QM-based chemical shift prediction can deliver the same accuracy as empirical shift predictors, we hope this can help increase the accuracy of related approaches such as QM/MM or linear scaling approaches, or of interpreting protein structural dynamics from QM-derived chemical shifts.
X-ray scattering data and structural genomics
NASA Astrophysics Data System (ADS)
Doniach, Sebastian
2003-03-01
High throughput structural genomics has the ambitious goal of determining the structure of all, or a very large number of protein folds using the high-resolution techniques of protein crystallography and NMR. However, the program is facing significant bottlenecks in reaching this goal, which include problems of protein expression and crystallization. In this talk, some preliminary results on how the low-resolution technique of small-angle X-ray solution scattering (SAXS) can help ameliorate some of these bottlenecks will be presented. One of the most significant bottlenecks arises from the difficulty of crystallizing integral membrane proteins, where only a handful of structures are available compared to thousands of structures for soluble proteins. By 3-dimensional reconstruction from SAXS data, the size and shape of detergent-solubilized integral membrane proteins can be characterized. This information can then be used to classify membrane proteins which constitute some 25% of all genomes. SAXS may also be used to study the dependence of interparticle interference scattering on solvent conditions so that regions of the protein solution phase diagram which favor crystallization can be elucidated. As a further application, SAXS may be used to provide physical constraints on computational methods for protein structure prediction based on primary sequence information. This in turn can help in identifying structural homologs of a given protein, which can then give clues to its function. D. Walther, F. Cohen and S. Doniach. "Reconstruction of low resolution three-dimensional density maps from one-dimensional small angle x-ray scattering data for biomolecules." J. Appl. Cryst. 33(2):350-363 (2000). Protein structure prediction constrained by solution X-ray scattering data and structural homology identification Zheng WJ, Doniach S JOURNAL OF MOLECULAR BIOLOGY , v. 316(#1) pp. 173-187 FEB 8, 2002
PDBStat: a universal restraint converter and restraint analysis software package for protein NMR.
Tejero, Roberto; Snyder, David; Mao, Binchen; Aramini, James M; Montelione, Gaetano T
2013-08-01
The heterogeneous array of software tools used in the process of protein NMR structure determination presents organizational challenges in the structure determination and validation processes, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including accurate use of data in different data formats required by software carrying out similar tasks, continue to confound the efforts of novices and experts alike. These important issues need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting between protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, as well as multiple methods for analysis of the consistency of atomic positions indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software, and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter in allowing the use of multiple structure generation methods with the same restraint data for consensus analysis of protein NMR structures and the underlying restraint data. PMID:23897031
Budowski-Tal, Inbal; Nov, Yuval; Kolodny, Rachel
2010-02-23
Fast identification of protein structures that are similar to a specified query structure in the entire Protein Data Bank (PDB) is fundamental in structure and function prediction. We present FragBag: An ultrafast and accurate method for comparing protein structures. We describe a protein structure by the collection of its overlapping short contiguous backbone segments, and discretize this set using a library of fragments. Then, we succinctly represent the protein as a "bags-of-fragments"-a vector that counts the number of occurrences of each fragment-and measure the similarity between two structures by the similarity between their vectors. Our representation has two additional benefits: (i) it can be used to construct an inverted index, for implementing a fast structural search engine of the entire PDB, and (ii) one can specify a structure as a collection of substructures, without combining them into a single structure; this is valuable for structure prediction, when there are reliable predictions only of parts of the protein. We use receiver operating characteristic curve analysis to quantify the success of FragBag in identifying neighbor candidate sets in a dataset of over 2,900 structures. The gold standard is the set of neighbors found by six state of the art structural aligners. Our best FragBag library finds more accurate candidate sets than the three other filter methods: The SGM, PRIDE, and a method by Zotenko et al. More interestingly, FragBag performs on a par with the computationally expensive, yet highly trusted structural aligners STRUCTAL and CE.
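The bag-of-fragments representation described in the abstract above can be sketched in a few lines of Python. This is an illustrative toy, not the FragBag implementation: the abstract does not state how a backbone segment is matched to a library fragment, so the sketch assumes a superposition-free comparison of internal distances, and it assumes cosine similarity between count vectors. All function names and the two-fragment library are hypothetical.

```python
import math
from collections import Counter


def internal_distances(segment):
    """Pairwise distances between all points of one backbone segment."""
    return [math.dist(segment[i], segment[j])
            for i in range(len(segment))
            for j in range(i + 1, len(segment))]


def nearest_fragment(segment, library):
    """Index of the library fragment with the most similar internal distances
    (assumed matching criterion, chosen to avoid structural superposition)."""
    seg_d = internal_distances(segment)

    def mismatch(index):
        frag_d = internal_distances(library[index])
        return sum((a - b) ** 2 for a, b in zip(seg_d, frag_d))

    return min(range(len(library)), key=mismatch)


def bag_of_fragments(coords, library):
    """Vector counting how often each library fragment occurs along the chain,
    using overlapping contiguous segments of the library's fragment length."""
    k = len(library[0])
    counts = Counter(nearest_fragment(coords[i:i + k], library)
                     for i in range(len(coords) - k + 1))
    return [counts.get(i, 0) for i in range(len(library))]


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Toy library: a straight and a bent three-point fragment (invented values).
    library = [[(0, 0, 0), (1, 0, 0), (2, 0, 0)],
               [(0, 0, 0), (1, 0, 0), (1, 1, 0)]]
    straight = [(i, 0, 0) for i in range(5)]
    bent = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (2, 2, 0)]
    u = bag_of_fragments(straight, library)
    v = bag_of_fragments(bent, library)
    print(u, v, cosine_similarity(u, v))
```

Because each structure is reduced to a fixed-length vector, comparing two structures costs a single vector operation, which is what makes this kind of representation suitable as a fast filter ahead of a full structural aligner.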
Quality assessment of protein model-structures based on structural and functional similarities.
Konopka, Bogumil M; Nebel, Jean-Christophe; Kotulska, Malgorzata
2012-09-21
Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. A gap between number of protein structures deposited in the World Wide Protein Data Bank and the number of sequenced proteins constantly broadens. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model it from a sequence or partial structural information, e.g. contact maps. Consequently, biologists have the ability to generate automatically a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. GOBA--Gene Ontology-Based Assessment is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. The validation shows that the method is able to discriminate between good and bad model-structures. The best of tested GOBA scores achieved 0.74 and 0.8 as a mean Pearson correlation to the observed quality of models in our CASP8 and CASP9-based validation sets. 
GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models.
Lee, Juyong; Lee, Jinhyuk; Sasaki, Takeshi N; Sasai, Masaki; Seok, Chaok; Lee, Jooyoung
2011-08-01
Ab initio protein structure prediction is a challenging problem that requires both an accurate energetic representation of a protein structure and an efficient conformational sampling method for successful protein modeling. In this article, we present an ab initio structure prediction method which combines a recently suggested novel way of fragment assembly, dynamic fragment assembly (DFA) and conformational space annealing (CSA) algorithm. In DFA, model structures are scored by continuous functions constructed based on short- and long-range structural restraint information from a fragment library. Here, DFA is represented by the full-atom model by CHARMM with the addition of the empirical potential of DFIRE. The relative contributions between various energy terms are optimized using linear programming. The conformational sampling was carried out with CSA algorithm, which can find low energy conformations more efficiently than simulated annealing used in the existing DFA study. The newly introduced DFA energy function and CSA sampling algorithm are implemented into CHARMM. Test results on 30 small single-domain proteins and 13 template-free modeling targets of the 8th Critical Assessment of protein Structure Prediction show that the current method provides comparable and complementary prediction results to existing top methods. Copyright © 2011 Wiley-Liss, Inc.
Lentes, K U; Mathieu, E; Bischoff, R; Rasmussen, U B; Pavirani, A
1993-01-01
Current methods for comparative analyses of protein sequences are 1D-alignments of amino acid sequences based on the maximization of amino acid identity (homology) and the prediction of secondary structure elements. This method has a major drawback once the amino acid identity drops below 20-25%, since maximization of a homology score does not take into account any structural information. A new technique called Hydrophobic Cluster Analysis (HCA) has been developed by Lemesle-Varloot et al. (Biochimie 72, 555-574, 1990). This consists of comparing several sequences simultaneously and combining homology detection with secondary structure analysis. HCA is primarily based on the detection and comparison of structural segments constituting the hydrophobic core of globular protein domains, with or without transmembrane domains. We have applied HCA to the analysis of different families of G-protein coupled receptors, such as catecholamine receptors as well as peptide hormone receptors. Utilizing HCA, the thrombin receptor, a new and as yet unique member of the family of G-protein coupled receptors, can be clearly classified as being closely related to the family of neuropeptide receptors rather than to the catecholamine receptors, for which the shape of the hydrophobic clusters and the length of their third cytoplasmic loop are very different. Furthermore, the potential of HCA to predict relationships between new putative and already characterized members of this family of receptors will be presented.
From protein sequence to dynamics and disorder with DynaMine.
Cilia, Elisa; Pancsa, Rita; Tompa, Peter; Lenaerts, Tom; Vranken, Wim F
2013-01-01
Protein function and dynamics are closely related; however, accurate dynamics information is difficult to obtain. Here based on a carefully assembled data set derived from experimental data for proteins in solution, we quantify backbone dynamics properties on the amino-acid level and develop DynaMine--a fast, high-quality predictor of protein backbone dynamics. DynaMine uses only protein sequence information as input and shows great potential in distinguishing regions of different structural organization, such as folded domains, disordered linkers, molten globules and pre-structured binding motifs of different sizes. It also identifies disordered regions within proteins with an accuracy comparable to the most sophisticated existing predictors, without depending on prior disorder knowledge or three-dimensional structural information. DynaMine provides molecular biologists with an important new method that grasps the dynamical characteristics of any protein of interest, as we show here for human p53 and E1A from human adenovirus 5.
Impact of Protein-Metal Ion Interactions on the Crystallization of Silk Fibroin Protein
NASA Astrophysics Data System (ADS)
Hu, Xiao; Lu, Qiang; Kaplan, David; Cebe, Peggy
2009-03-01
Proteins can easily form bonds with a variety of metal ions, which provides many unique biological functions for the protein structures, and therefore controls the overall structural transformation of proteins. We use advanced thermal analysis methods such as temperature modulated differential scanning calorimetry (TMDSC) and quasi-isothermal TMDSC, combined with Fourier transform infrared spectroscopy and scanning electron microscopy, to investigate the protein-metallic ion interactions in Bombyx mori silk fibroin proteins. Silk samples were mixed with different metal ions (Ca^2+, K^+, Mg^2+, Na^+, Cu^2+, Mn^2+) at different mass ratios, and compared with the physical conditions in the silkworm gland. Results show that all metallic ions can directly affect the crystallization behavior and glass transition of silk fibroin. However, different ions tend to have different structural impact, including their role as plasticizer or anti-plasticizer. Detailed studies reveal important information allowing us to better understand the natural silk spinning and crystallization process.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system
DOE Office of Scientific and Technical Information (OSTI.GOV)
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
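For readers unfamiliar with position weight matrices, the following Python sketch shows one common way to build a log-odds PWM from aligned binding sites and to score a candidate site with it. It is not the database's own procedure: the count-based construction, the pseudocount, the uniform background of 0.25, and the example sites are all assumptions made for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned DNA sites.

    Each column maps a base to log2(observed frequency / background).
    The pseudocount keeps unseen bases from producing log(0).
    """
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score_site(pwm, site):
    """Sum the per-position log-odds scores for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, site))


# Hypothetical aligned binding sites, for illustration only.
example_sites = ["ACGT", "ACGA", "ACGT"]
example_pwm = build_pwm(example_sites)
consensus_score = score_site(example_pwm, "ACGT")
mismatch_score = score_site(example_pwm, "TTTT")
```

Under this additive model each position contributes independently to the score, which is why a PWM can be related to structure at the level of single basepairs.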
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
2015-11-19
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
Patent protection for structural genomics-related inventions.
Vinarov, Sara D
2003-01-01
Recently there have been some important developments with respect to the patentability of inventions in the field of structural genomics. The leaders of the European Patent Office (EPO), Japan Patent Office (JPO) and the United States Patent Office (USPTO) came together for a trilateral meeting to conduct a comparative study on protein 3-dimensional (3-D) structure related claims in an effort to come to a mutual understanding about the examination of such inventions. The three patent offices were presented with eight different cases: 1) 3-D structural data of a protein per se; 2) computer-readable storage medium encoded with structural data of a protein; 3) protein defined by its tertiary structure; 4) crystals of known proteins; 5) binding pockets and protein domains; 6) and 7) are both directed to in silico screening methods directed to a specific protein; and 8) pharmacophores. The preliminary conclusions reached at the trilateral meeting provide clarity regarding the types of inventions that may be patentable given a specific set of scientific facts in a patent application. Therefore, the guidance provided by this study will help inventors, attorneys and other patent practitioners who file for patent protection on structural genomics-based inventions both here and abroad comply with the patentability requirements of each office.
Mapping of ligand-binding cavities in proteins.
Andersson, C David; Chen, Brian Y; Linusson, Anna
2010-05-01
The complex interactions between proteins and small organic molecules (ligands) are intensively studied because they play key roles in biological processes and drug activities. Here, we present a novel approach to characterize and map the ligand-binding cavities of proteins without direct geometric comparison of structures, based on Principal Component Analysis of cavity properties (related mainly to size, polarity, and charge). This approach can provide valuable information on the similarities and dissimilarities of binding cavities due to mutations, between-species differences and flexibility upon ligand-binding. The presented results show that information on ligand-binding cavity variations can complement information on protein similarity obtained from sequence comparisons. The predictive aspect of the method is exemplified by successful predictions of serine proteases that were not included in the model construction. The presented strategy to compare ligand-binding cavities of related and unrelated proteins has many potential applications within protein and medicinal chemistry, for example in the characterization and mapping of "orphan structures", selection of protein structures for docking studies in structure-based design, and identification of proteins for selectivity screens in drug design programs. © 2009 Wiley-Liss, Inc.
Effects of heat on meat proteins - Implications on structure and quality of meat products.
Tornberg, E
2005-07-01
Globular and fibrous proteins are compared with regard to structural behaviour on heating, where the former expand and the latter contract. The meat protein composition and structure is briefly described. The behaviour of the different meat proteins on heating is discussed. Most of the sarcoplasmic proteins aggregate between 40 and 60°C, but for some of them the coagulation can extend up to 90°C. For myofibrillar proteins in solution unfolding starts at 30-32°C, followed by protein-protein association at 36-40°C and subsequent gelation at 45-50°C (conc. > 0.5% by weight). At temperatures between 53 and 63°C the collagen denaturation occurs, followed by collagen fibre shrinkage. If the collagen fibres are not stabilised by heat-resistant intermolecular bonds, they dissolve and form gelatine on further heating. The structural changes on cooking in whole meat and comminuted meat products, and the alterations in water-holding and texture of the meat product that they lead to, are then discussed.
Adsorption of GA module onto graphene and graphene oxide: A molecular dynamics simulation study
NASA Astrophysics Data System (ADS)
Chen, Junlang; Wang, Xiaogang; Dai, Chaoqing; Chen, Shude; Tu, Yusong
2014-08-01
Using all-atom molecular dynamics (MD) simulation, we have investigated the adsorption of protein GA module (GA53) onto graphene oxide (GO), compared with similar adsorption onto pristine graphene (PG). We find that: (1) the protein GA53 can be easily and firmly adsorbed onto the surface of GO and PG, but the binding sites are not specific; the main difference is that the secondary structure of GA53 can be well preserved in the protein-GO system, while GA53 partially loses its secondary structure after being adsorbed on PG. (2) In the protein-GO system, hydroxyl and epoxy groups increase the distance between protein and GO, which weakens their vdW interactions; meanwhile, hydrogen bonds and electrostatic interactions enhance their binding affinity. In the protein-PG system, strong vdW interactions between residues of GA53 and PG destroy its secondary structure. (3) π-π stacking interactions still exist between aromatic residues and the basal planes of both GO and PG. Our results suggest that, in comparison with PG, GO presents better biocompatibility, preserving protein secondary structure while simultaneously adsorbing protein.
A scoring function based on solvation thermodynamics for protein structure prediction
Du, Shiqiao; Harano, Yuichi; Kinoshita, Masahiro; Sakurai, Minoru
2012-01-01
We predict protein structure using our recently developed free energy function for describing protein stability, which is focused on solvation thermodynamics. The function is combined with the current most reliable sampling methods, i.e., fragment assembly (FA) and comparative modeling (CM). The prediction is tested using 11 small proteins for which high-resolution crystal structures are available. For 8 of these proteins, sequence similarities are found in the database, and the prediction is performed with CM. Fairly accurate models with average Cα root mean square deviation (RMSD) ∼ 2.0 Å are successfully obtained for all cases. For the rest of the target proteins, we perform the prediction following FA protocols. For 2 cases, we obtain predicted models with an RMSD ∼ 3.0 Å as the best-scored structures. For the other case, the RMSD remains larger than 7 Å. For all the 11 target proteins, our scoring function identifies the experimentally determined native structure as the best structure. Starting from the predicted structure, replica exchange molecular dynamics is performed to further refine the structures. However, we are unable to improve its RMSD toward the experimental structure. The exhaustive sampling by coarse-grained normal mode analysis around the native structures reveals that our function has a linear correlation with RMSDs < 3.0 Å. These results suggest that the function is quite reliable for the protein structure prediction while the sampling method remains one of the major limiting factors in it. The aspects through which the methodology could further be improved are discussed. PMID:27493529
Wang, Edina; Chinni, Suresh; Bhore, Subhash Janardhan
2014-01-01
Background: The fatty-acid profile of a vegetable oil determines its properties and nutritional value. Palm-oil obtained from the African oil-palm [Elaeis guineensis Jacq. (Tenera)] contains 44% palmitic acid (C16:0), but palm-oil obtained from the American oil-palm [Elaeis oleifera] contains only 25% C16:0. In part, the β-ketoacyl-[ACP] synthase II (KASII) [EC: 2.3.1.179] protein is responsible for the high level of C16:0 in palm-oil derived from the African oil-palm. To understand more about E. guineensis KASII (EgKASII) and E. oleifera KASII (EoKASII) proteins, it is essential to know their structures. Hence, this study was undertaken. Objective: The objective of this study was to predict the three-dimensional (3D) structure of EgKASII and EoKASII proteins using molecular modelling tools. Materials and Methods: The amino-acid sequences for KASII proteins were retrieved from the protein database of National Center for Biotechnology Information (NCBI), USA. The 3D structures were predicted for both proteins using homology modelling and ab-initio approaches to protein structure prediction. Molecular dynamics (MD) simulation was performed to refine the predicted structures. The predicted structure models were evaluated and root mean square deviation (RMSD) and root mean square fluctuation (RMSF) values were calculated. Results: The homology modelling showed that EgKASII and EoKASII proteins are 78% and 74% similar to Streptococcus pneumoniae KASII and Brucella melitensis KASII, respectively. The EgKASII and EoKASII structures predicted using the ab-initio approach show 6% and 9% deviation from the structures predicted by homology modelling, respectively. The structure refinement and validation confirmed that the predicted structures are accurate. Conclusion: The 3D structures for EgKASII and EoKASII proteins were predicted. However, further research is essential to understand the interaction of EgKASII and EoKASII proteins with their substrates.
PMID:24748752
Wang, Edina; Chinni, Suresh; Bhore, Subhash Janardhan
2014-01-01
The fatty-acid profile of a vegetable oil determines its properties and nutritional value. Palm-oil obtained from the African oil-palm [Elaeis guineensis Jacq. (Tenera)] contains 44% palmitic acid (C16:0), but palm-oil obtained from the American oil-palm [Elaeis oleifera] contains only 25% C16:0. In part, the β-ketoacyl-[ACP] synthase II (KASII) [EC: 2.3.1.179] protein is responsible for the high level of C16:0 in palm-oil derived from the African oil-palm. To understand more about E. guineensis KASII (EgKASII) and E. oleifera KASII (EoKASII) proteins, it is essential to know their structures. Hence, this study was undertaken. The objective of this study was to predict the three-dimensional (3D) structure of EgKASII and EoKASII proteins using molecular modelling tools. The amino-acid sequences for KASII proteins were retrieved from the protein database of National Center for Biotechnology Information (NCBI), USA. The 3D structures were predicted for both proteins using homology modelling and ab-initio approaches to protein structure prediction. Molecular dynamics (MD) simulation was performed to refine the predicted structures. The predicted structure models were evaluated and root mean square deviation (RMSD) and root mean square fluctuation (RMSF) values were calculated. The homology modelling showed that EgKASII and EoKASII proteins are 78% and 74% similar to Streptococcus pneumoniae KASII and Brucella melitensis KASII, respectively. The EgKASII and EoKASII structures predicted using the ab-initio approach show 6% and 9% deviation from the structures predicted by homology modelling, respectively. The structure refinement and validation confirmed that the predicted structures are accurate. The 3D structures for EgKASII and EoKASII proteins were predicted. However, further research is essential to understand the interaction of EgKASII and EoKASII proteins with their substrates.
Wen, Meiling; Jin, Ya; Manabe, Takashi; Chen, Shumin; Tan, Wen
2017-12-01
MS identification has long been used for PAGE-separated protein bands, but global and systematic quantitation utilizing MS after PAGE has remained rare and has not been reported for native PAGE. Here we report a new method combining native PAGE, whole-gel slicing and quantitative LC-MS/MS, aiming at comparative analysis of not only abundance, but also structures and interactions of proteins. A pair of human plasma and serum samples were used as test samples and separated on a native PAGE gel. Six lanes of each sample were cut, each lane was further sliced into thirty-five 1.1 mm × 1.1 mm squares and all the squares were subjected to standardized procedures of in-gel digestion and quantitative LC-MS/MS. The results comprised 958 data rows that each contained abundance values of a protein detected in one square in eleven gel lanes (one plasma lane excluded). The data were evaluated to have satisfactory reproducibility of assignment and quantitation. In total, 315 proteins were assigned, with each protein assigned in 1-28 squares. The abundance distributions in the plasma and serum gel lanes were reconstructed for each protein, named "native MS-electropherograms". Comparison of the electropherograms revealed significant plasma-versus-serum differences for 33 proteins in 87 squares (fold difference > 2 or < 0.5, p < 0.05). Many of the differences matched accumulated knowledge on protein interactions and proteolysis involved in blood coagulation, complement and wound healing processes. We expect this method to be useful for providing more comprehensive information in comparative proteomic analysis, on both quantities and structures/interactions. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
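The significance criterion quoted in this abstract (fold difference > 2 or < 0.5, p < 0.05) can be written as a small filter. The sketch below is illustrative only: the function name, the plasma/serum orientation of the ratio, and the handling of zero abundance are assumptions, and the p-value is taken as given from whatever statistical test the authors applied.

```python
def is_significant_difference(plasma_abundance, serum_abundance, p_value,
                              fold_cutoff=2.0, alpha=0.05):
    """Apply a two-sided fold-difference cutoff plus a p-value cutoff.

    Returns True when plasma/serum is above fold_cutoff or below
    1/fold_cutoff, and p_value is below alpha. A zero or negative serum
    abundance is treated as non-comparable here (an assumption).
    """
    if serum_abundance <= 0:
        return False
    fold = plasma_abundance / serum_abundance
    large_change = fold > fold_cutoff or fold < 1.0 / fold_cutoff
    return large_change and p_value < alpha


# Hypothetical (plasma, serum, p) triples for individual gel squares.
squares = [(10.0, 4.0, 0.01), (10.0, 6.0, 0.01), (1.0, 4.0, 0.01), (10.0, 4.0, 0.2)]
flags = [is_significant_difference(p, s, pv) for p, s, pv in squares]
```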
Network based approaches reveal clustering in protein point patterns
NASA Astrophysics Data System (ADS)
Parker, Joshua; Barr, Valarie; Aldridge, Joshua; Samelson, Lawrence E.; Losert, Wolfgang
2014-03-01
Recent advances in super-resolution imaging have allowed for the sub-diffraction measurement of the spatial location of proteins on the surfaces of T-cells. The challenge is to connect these complex point patterns to the internal processes and interactions, both protein-protein and protein-membrane. We begin analyzing these patterns by forming a geometric network amongst the proteins and looking at network measures, such as the degree distribution. This allows us to compare experimentally observed patterns to models. Specifically, we find that the experimental patterns differ from heterogeneous Poisson processes, highlighting an internal clustering structure. Further work will be to compare our results to simulated protein-protein interactions to determine clustering mechanisms.
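One simple way to form a geometric network from a point pattern is to connect every pair of points closer than a chosen distance and then tabulate the degrees. The Python sketch below assumes this fixed-radius rule and uses made-up 2-D coordinates; the abstract does not state which network construction or distance threshold the authors used.

```python
import math
from collections import Counter


def geometric_network_degrees(points, radius):
    """Return the degree of each point in a fixed-radius geometric graph.

    Two points are linked when their Euclidean distance is <= radius.
    """
    degrees = [0] * len(points)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                degrees[i] += 1
                degrees[j] += 1
    return degrees


def degree_distribution(degrees):
    """Map each observed degree to the number of points having it."""
    return dict(sorted(Counter(degrees).items()))


# Hypothetical localizations: a small cluster plus one isolated point.
localizations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
degrees = geometric_network_degrees(localizations, radius=1.0)
distribution = degree_distribution(degrees)
```

Comparing such a distribution with the one obtained from points simulated under a Poisson model is one way to test for clustering, in the spirit of the comparison described above.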
Zhou, Wengang; Dickerson, Julie A
2012-01-01
Knowledge of protein subcellular locations can help decipher a protein's biological function. This work proposes three new features: one sequence-based, Hybrid Amino Acid Pair (HAAP), and two structure-based, Secondary Structural Element Composition (SSEC) and solvent accessibility state frequency. A multi-class Support Vector Machine is developed to predict the locations. Testing on two established data sets yields better prediction accuracies than the best available systems. Comparisons with existing methods show comparable results to ESLPred2. When StruLocPred is applied to the entire Arabidopsis proteome, over 77% of proteins with known locations match the prediction results. An implementation of this system is at http://wgzhou.ece.iastate.edu/StruLocPred/.
Ogawa, Seiji; Watanabe, Toshihide; Moriyuki, Kazumi; Goto, Yoshikazu; Yamane, Shinsaku; Watanabe, Akio; Tsuboi, Kazuma; Kinoshita, Atsushi; Okada, Takuya; Takeda, Hiroyuki; Tani, Kousuke; Maruyama, Toru
2016-05-15
The modification of the novel G protein-biased EP2 agonist 1 has been investigated to improve its G protein activity and develop a better understanding of its structure-functional selectivity relationship (SFSR). The optimization of the substituents on the phenyl ring of 1, followed by the inversion of the hydroxyl group on the cyclopentane moiety led to compound 9, which showed a 100-fold increase in its G protein activity compared with 1 without any increase in β-arrestin recruitment. Furthermore, SFSR studies revealed that the combination of meta and para substituents on the phenyl moiety was crucial to the functional selectivity. Copyright © 2016 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leite, Wellington C.; Galvão, Carolina W.; Saab, Sérgio C.
The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three dimensional structure of HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, that are similar to homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins are also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. In conclusion, our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament.
Galvão, Carolina W.; Saab, Sérgio C.; Iulek, Jorge; Etto, Rafael M.; Steffens, Maria B. R.; Chitteni-Pattu, Sindhu; Stanage, Tyler; Keck, James L.; Cox, Michael M.
2016-01-01
The bacterial RecA protein plays a role in the complex system of DNA damage repair. Here, we report the functional and structural characterization of the Herbaspirillum seropedicae RecA protein (HsRecA). HsRecA protein is more efficient at displacing SSB protein from ssDNA than Escherichia coli RecA protein. HsRecA also promotes DNA strand exchange more efficiently. The three dimensional structure of HsRecA-ADP/ATP complex has been solved to 1.7 Å resolution. HsRecA protein contains a small N-terminal domain, a central core ATPase domain and a large C-terminal domain, that are similar to homologous bacterial RecA proteins. Comparative structural analysis showed that the N-terminal polymerization motif of archaeal and eukaryotic RecA family proteins are also present in bacterial RecAs. Reconstruction of electrostatic potential from the hexameric structure of HsRecA-ADP/ATP revealed a high positive charge along the inner side, where ssDNA is bound inside the filament. The properties of this surface may explain the greater capacity of HsRecA protein to bind ssDNA, forming a contiguous nucleoprotein filament, displace SSB and promote DNA exchange relative to EcRecA. Our functional and structural analyses provide insight into the molecular mechanisms of polymerization of bacterial RecA as a helical nucleoprotein filament. PMID:27447485
GRID: a high-resolution protein structure refinement algorithm.
Chitsaz, Mohsen; Mayo, Stephen L
2013-03-05
The energy-based refinement of protein structures generated by fold prediction algorithms to atomic-level accuracy remains a major challenge in structural biology. Energy-based refinement is mainly dependent on two components: (1) sufficiently accurate force fields, and (2) efficient conformational space search algorithms. Focusing on the latter, we developed a high-resolution refinement algorithm called GRID. It takes a three-dimensional protein structure as input and, using an all-atom force field, attempts to improve the energy of the structure by systematically perturbing backbone dihedrals and side-chain rotamer conformations. We compare GRID to Backrub, a stochastic algorithm that has been shown to predict a significant fraction of the conformational changes that occur with point mutations. We applied GRID and Backrub to 10 high-resolution (≤ 2.8 Å) crystal structures from the Protein Data Bank and measured the energy improvements obtained and the computation times required to achieve them. GRID resulted in energy improvements that were significantly better than those attained by Backrub while expending about the same amount of computational resources. GRID resulted in relaxed structures that had slightly higher backbone RMSDs compared to Backrub relative to the starting crystal structures. The average RMSD was 0.25 ± 0.02 Å for GRID versus 0.14 ± 0.04 Å for Backrub. These relatively minor deviations indicate that both algorithms generate structures that retain their original topologies, as expected given the nature of the algorithms. Copyright © 2012 Wiley Periodicals, Inc.
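The backbone RMSD values reported in this abstract compare each relaxed structure with its starting crystal structure. A minimal Python sketch of the RMSD calculation follows; it assumes the two coordinate lists are already in the same reference frame and matched atom by atom, so no superposition step is included, and the coordinates shown are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched coordinate lists.

    Each list holds (x, y, z) tuples in the same atom order. No optimal
    superposition is performed (assumed unnecessary here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical backbone atoms before and after refinement.
starting = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
refined = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
deviation = rmsd(starting, refined)
```

When structures come from independent sources rather than from perturbing one starting model, an optimal superposition (for example the Kabsch algorithm) would normally precede this calculation.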
Kingsley, Laura J.; Lill, Markus A.
2014-01-01
Computational prediction of ligand entry and egress paths in proteins has become an emerging topic in computational biology and has proven useful in fields such as protein engineering and drug design. Geometric tunnel prediction programs, such as Caver3.0 and MolAxis, are computationally efficient methods to identify potential ligand entry and egress routes in proteins. Although many geometric tunnel programs are designed to accommodate a single input structure, the increasingly recognized importance of protein flexibility in tunnel formation and behavior has led to the more widespread use of protein ensembles in tunnel prediction. However, there has not yet been an attempt to directly investigate the influence of ensemble size and composition on geometric tunnel prediction. In this study, we compared tunnels found in a single crystal structure to ensembles of various sizes generated using different methods on both the apo and holo forms of cytochrome P450 enzymes CYP119, CYP2C9, and CYP3A4. Several protein structure clustering methods were tested in an attempt to generate smaller ensembles that were capable of reproducing the data from larger ensembles. Ultimately, we found that by including members from both the apo and holo data sets, we could produce ensembles containing less than 15 members that were comparable to apo or holo ensembles containing over 100 members. Furthermore, we found that, in the absence of either apo or holo crystal structure data, pseudo-apo or –holo ensembles (e.g. adding ligand to apo protein throughout MD simulations) could be used to resemble the structural ensembles of the corresponding apo and holo ensembles, respectively. Our findings not only further highlight the importance of including protein flexibility in geometric tunnel prediction, but also suggest that smaller ensembles can be as capable as larger ensembles at capturing many of the protein motions important for tunnel prediction at a lower computational cost. PMID:24956479
Experimental Protein Structure Verification by Scoring with a Single, Unassigned NMR Spectrum
Courtney, Joseph M.; Ye, Qing; Nesbitt, Anna E.; Tang, Ming; Tuttle, Marcus D.; Watt, Eric D.; Nuzzio, Kristin M.; Sperling, Lindsay J.; Comellas, Gemma; Peterson, Joseph R.; Morrissey, James H.; Rienstra, Chad M.
2016-01-01
Standard methods for de novo protein structure determination by nuclear magnetic resonance (NMR) require time-consuming data collection and interpretation efforts. Here we present a qualitatively distinct and novel approach, called Comparative, Objective Measurement of Protein Architectures by Scoring Shifts (COMPASS), which identifies the best structures from a set of structural models by numerical comparison with a single, unassigned 2D 13C-13C NMR spectrum containing backbone and side-chain aliphatic signals. COMPASS does not require resonance assignments. It is particularly well suited for interpretation of magic-angle spinning solid-state NMR spectra, but also applicable to solution NMR spectra. We demonstrate COMPASS with experimental data from four proteins—GB1, ubiquitin, DsbA, and the extracellular domain of human tissue factor—and with reconstructed spectra from 11 additional proteins. For all these proteins, with molecular mass up to 25 kDa, COMPASS distinguished the correct fold, most often within 1.5 Å root-mean-square deviation of the reference structure. PMID:26365800
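Ranking structural models against an unassigned peak list can be sketched as a distance between two point sets, since no peak-to-atom assignment is needed. The example below uses a modified Hausdorff distance between predicted and observed 2D peaks. This metric, the function names, and the peak values are assumptions made for illustration, and the abstract does not state which scoring function COMPASS uses.

```python
import math

def directed_mean_distance(a, b):
    """Mean distance from each peak in `a` to its nearest peak in `b`."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)

def peak_set_distance(a, b):
    """Modified Hausdorff distance between two peak lists (lower is better)."""
    return max(directed_mean_distance(a, b), directed_mean_distance(b, a))

def rank_models(experimental, predicted_by_model):
    """Return model names sorted from best to worst agreement."""
    return sorted(
        predicted_by_model,
        key=lambda name: peak_set_distance(experimental, predicted_by_model[name]),
    )

if __name__ == "__main__":
    # Hypothetical 13C-13C peak positions in ppm, not real data.
    experimental = [(50.0, 30.0), (60.0, 40.0)]
    models = {
        "model_a": [(50.0, 30.0), (60.0, 40.0)],
        "model_b": [(52.0, 30.0), (60.0, 43.0)],
    }
    print(rank_models(experimental, models))
```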
ModeRNA: a tool for comparative modeling of RNA 3D structure
Rother, Magdalena; Rother, Kristian; Puton, Tomasz; Bujnicki, Janusz M.
2011-01-01
RNA is a large group of functionally important biomacromolecules. In striking analogy to proteins, the function of RNA depends on its structure and dynamics, which in turn is encoded in the linear sequence. However, while there are numerous methods for computational prediction of protein three-dimensional (3D) structure from sequence, with comparative modeling being the most reliable approach, there are very few such methods for RNA. Here, we present ModeRNA, a software tool for comparative modeling of RNA 3D structures. As an input, ModeRNA requires a 3D structure of a template RNA molecule, and a sequence alignment between the target to be modeled and the template. It must be emphasized that a good alignment is required for successful modeling, and for large and complex RNA molecules the development of a good alignment usually requires manual adjustments of the input data based on previous expertise of the respective RNA family. ModeRNA can model post-transcriptional modifications, a functionally important feature analogous to post-translational modifications in proteins. ModeRNA can also model DNA structures or use them as templates. It is equipped with many functions for merging fragments of different nucleic acid structures into a single model and analyzing their geometry. Windows and UNIX implementations of ModeRNA with comprehensive documentation and a tutorial are freely available. PMID:21300639
Kumar, Avishek; Butler, Brandon M.; Kumar, Sudhir; Ozkan, S. Banu
2016-01-01
Summary: Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict if they are benign or not. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nsSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light onto disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine. PMID:26684487
Kraner, Max E; Müller, Carmen; Sonnewald, Uwe
2017-11-01
In plants, intercellular communication and exchange are highly dependent on cell wall bridging structures between adhering cells, so-called plasmodesmata (PD). In our previous genetic screen for PD-deficient Arabidopsis mutants, we described choline transporter-like 1 (CHER1) being important for PD genesis and maturation. Leaves of cher1 mutant plants have up to 10 times less PD, which do not develop to complex structures. Here we utilize the T-DNA insertion mutant cher1-4 and report a deep comparative proteomic workflow for the identification of cell-wall-embedded PD-associated proteins. Analyzing triplicates of cell-wall-enriched fractions in depth by fractionation and quantitative high-resolution mass spectrometry, we compared > 5000 proteins obtained from fully developed leaves. Comparative data analysis and subsequent filtering generated a list of 61 proteins being significantly more abundant in Col-0. This list was enriched for previously described PD-associated proteins. To validate PD association of so far uncharacterized proteins, subcellular localization analyses were carried out by confocal laser-scanning microscopy. This study confirmed the association of PD for three out of four selected candidates, indicating that the comparative approach indeed allowed identification of so far undescribed PD-associated proteins. Performing comparative cell wall proteomics of Nicotiana benthamiana tissue, we observed an increase in abundance of these three selected candidates during sink to source transition. Taken together, our comparative proteomic approach revealed a valuable data set of potential PD-associated proteins, which can be used as a resource to unravel the molecular composition of complex PD and to investigate their function in cell-to-cell communication. © 2017 The Authors. The Plant Journal published by John Wiley & Sons Ltd and Society for Experimental Biology.
Building a Better Fragment Library for De Novo Protein Structure Prediction
de Oliveira, Saulo H. P.; Shi, Jiye; Deane, Charlotte M.
2015-01-01
Fragment-based approaches are the current standard for de novo protein structure prediction. These approaches rely on accurate and reliable fragment libraries to generate good structural models. In this work, we describe a novel method for structure fragment library generation and its application in fragment-based de novo protein structure prediction. The importance of correct testing procedures in assessing the quality of fragment libraries is demonstrated, in particular the exclusion of homologs to the target from the libraries to correctly simulate a de novo protein structure prediction scenario, something which surprisingly is not always done. We demonstrate that fragments presenting different predominant predicted secondary structures should be treated differently during the fragment library generation step and that exhaustive and random search strategies should both be used. This information was used to develop a novel method, Flib. On a validation set of 41 structurally diverse proteins, Flib libraries present both a higher precision and coverage than two of the state-of-the-art methods, NNMake and HHFrag. Flib also achieves better precision and coverage on the set of 275 protein domains used in the two previous experiments of the Critical Assessment of Structure Prediction (CASP9 and CASP10). We compared Flib libraries against NNMake libraries in a structure prediction context. Of the 13 cases in which a correct answer was generated, Flib models were more accurate than NNMake models for 10. Flib is available for download at: http://www.stats.ox.ac.uk/research/proteins/resources. PMID:25901595
Objective identification of residue ranges for the superposition of protein structures
2011-01-01
Background: The automation of objectively selecting amino acid residue ranges for structure superpositions is important for meaningful and consistent protein structure analyses. So far there is no widely-used standard for choosing these residue ranges for experimentally determined protein structures, where the manual selection of residue ranges or the use of suboptimal criteria remain commonplace. Results: We present an automated and objective method for finding amino acid residue ranges for the superposition and analysis of protein structures, in particular for structure bundles resulting from NMR structure calculations. The method is implemented in an algorithm, CYRANGE, that yields, without protein-specific parameter adjustment, appropriate residue ranges in most commonly occurring situations, including low-precision structure bundles, multi-domain proteins, symmetric multimers, and protein complexes. Residue ranges are chosen to comprise as many residues of a protein domain as possible, such that increasing their number further would lead to a steep rise in the RMSD value. Residue ranges are determined by first clustering residues into domains based on the distance variance matrix, and then refining for each domain the initial choice of residues by excluding residues one by one until the relative decrease of the RMSD value becomes insignificant. A penalty for the opening of gaps favours contiguous residue ranges in order to obtain a result that is as simple as possible, but not simpler. Results are given for a set of 37 proteins and compared with those of commonly used protein structure validation packages. We also provide residue ranges for 6351 NMR structures in the Protein Data Bank. Conclusions: The CYRANGE method is capable of automatically determining residue ranges for the superposition of protein structure bundles for a large variety of protein structures. The method correctly identifies ordered regions. Global structure superpositions based on the CYRANGE residue ranges allow a clear presentation of the structure, and unnecessary small gaps within the selected ranges are absent. In the majority of cases, the residue ranges from CYRANGE contain fewer gaps and cover considerably larger parts of the sequence than those from other methods without significantly increasing the RMSD values. CYRANGE thus provides an objective and automatic method for standardizing the choice of residue ranges for the superposition of protein structures. PMID:21592348
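The one-by-one exclusion step described in this abstract can be sketched as a greedy loop. The version below is a deliberate simplification and not the CYRANGE implementation: it assumes precomputed per-residue mean squared deviations (`sq_dev`, a hypothetical input), omits the domain clustering and the gap penalty, and uses an arbitrary threshold `min_rel_drop` for what counts as an insignificant decrease.

```python
import math

def select_ordered_residues(sq_dev, min_rel_drop=0.05, min_size=3):
    """Greedy residue exclusion on per-residue squared deviations.

    Starting from all residues, repeatedly drop the residue with the
    largest deviation, and stop once dropping it would lower the RMSD
    of the selected set by less than `min_rel_drop` (relative decrease).
    Returns the indices of the residues that remain.
    """
    def rmsd(indices):
        return math.sqrt(sum(sq_dev[i] for i in indices) / len(indices))

    selected = list(range(len(sq_dev)))
    current = rmsd(selected)
    while len(selected) > min_size and current > 0.0:
        worst = max(selected, key=lambda i: sq_dev[i])
        trial = [i for i in selected if i != worst]
        new = rmsd(trial)
        if (current - new) / current < min_rel_drop:
            break  # further exclusion no longer pays off
        selected, current = trial, new
    return selected

if __name__ == "__main__":
    # Toy values in Å^2: residue 4 mimics a disordered loop position.
    print(select_ordered_residues([0.1, 0.1, 0.1, 0.1, 9.0, 0.1]))
```

In this toy input, only the single high-deviation residue is excluded, because removing any further residue leaves the RMSD essentially unchanged.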
Kaur, Parminder; Kiselar, Janna; Yang, Sichun; Chance, Mark R.
2015-01-01
Hydroxyl radical footprinting based MS for protein structure assessment has the goal of understanding ligand induced conformational changes and macromolecular interactions, for example, protein tertiary and quaternary structure, but the structural resolution provided by typical peptide-level quantification is limiting. In this work, we present experimental strategies using tandem-MS fragmentation to increase the spatial resolution of the technique to the single residue level to provide a high precision tool for molecular biophysics research. Overall, in this study we demonstrated an eightfold increase in structural resolution compared with peptide level assessments. In addition, to provide a quantitative analysis of residue based solvent accessibility and protein topography as a basis for high-resolution structure prediction, we illustrate strategies of data transformation using the relative reactivity of side chains as a normalization strategy and predict side-chain surface area from the footprinting data. We tested the methods by examination of Ca2+-calmodulin, showing highly significant correlations between surface area and side-chain contact predictions for individual side chains and the crystal structure. Tandem ion based hydroxyl radical footprinting-MS provides quantitative high-resolution protein topology information in solution that can fill existing gaps in structure determination for large proteins and macromolecular complexes. PMID:25687570
P-proteins in Arabidopsis are heteromeric structures involved in rapid sieve tube sealing.
Jekat, Stephan B; Ernst, Antonia M; von Bohl, Andreas; Zielonka, Sascia; Twyman, Richard M; Noll, Gundula A; Prüfer, Dirk
2013-01-01
Structural phloem proteins (P-proteins) are characteristic components of the sieve elements in all dicotyledonous and many monocotyledonous angiosperms. Tobacco P-proteins were recently confirmed to be encoded by the widespread sieve element occlusion (SEO) gene family, and tobacco SEO proteins were shown to be directly involved in sieve tube sealing thus preventing the loss of photosynthate. Analysis of the two Arabidopsis SEO proteins (AtSEOa and AtSEOb) indicated that the corresponding P-protein subunits do not act in a redundant manner. However, there are still pending questions regarding the interaction properties and specific functions of AtSEOa and AtSEOb as well as the general function of structural P-proteins in Arabidopsis. In this study, we characterized the Arabidopsis P-proteins in more detail. We used in planta bimolecular fluorescence complementation assays to confirm the predicted heteromeric interactions between AtSEOa and AtSEOb. Arabidopsis mutants depleted for one or both AtSEO proteins lacked the typical P-protein structures normally found in sieve elements, underlining the identity of AtSEO proteins as P-proteins and furthermore providing the means to determine the role of Arabidopsis P-proteins in sieve tube sealing. We therefore developed an assay based on phloem exudation. Mutants with reduced AtSEO expression levels lost twice as much photosynthate following injury as comparable wild-type plants, confirming that Arabidopsis P-proteins are indeed involved in sieve tube sealing.
Pavankumar, Asalapuram R; Kayathri, Rajarathinam; Murugan, Natarajan A; Zhang, Qiong; Srivastava, Vaibhav; Okoli, Chuka; Bulone, Vincent; Rajarao, Gunaratna K; Ågren, Hans
2014-01-01
Many proteins exist in dimeric and other oligomeric forms to gain stability and functional advantages. In this study, the dimerization property of a coagulant protein (MO2.1) from Moringa oleifera seeds was addressed through laboratory experiments, protein-protein docking studies and binding free energy calculations. The structure of MO2.1 was predicted by homology modelling, while binding free energy and residues-distance profile analyses provided insight into the energetics and structural factors for dimer formation. Since the coagulation activities of the monomeric and dimeric forms of MO2.1 were comparable, it was concluded that oligomerization does not affect the biological activity of the protein.
CORAL: aligning conserved core regions across domain families.
Fong, Jessica H; Marchler-Bauer, Aron
2009-08-01
Homologous protein families share highly conserved sequence and structure regions that are frequent targets for comparative analysis of related proteins and families. Many protein families, such as the curated domain families in the Conserved Domain Database (CDD), exhibit similar structural cores. To improve accuracy in aligning such protein families, we propose a profile-profile method CORAL that aligns individual core regions as gap-free units. CORAL computes optimal local alignment of two profiles with heuristics to preserve continuity within core regions. We benchmarked its performance on curated domains in CDD, which have pre-defined core regions, against COMPASS, HHalign and PSI-BLAST, using structure superpositions and comprehensive curator-optimized alignments as standards of truth. CORAL improves alignment accuracy on core regions over general profile methods, returning a balanced score of 0.57 for over 80% of all domain families in CDD, compared with the highest balanced score of 0.45 from other methods. Further, CORAL provides E-values to aid in detecting homologous protein families and, by respecting block boundaries, produces alignments with improved 'readability' that facilitate manual refinement. CORAL will be included in future versions of the NCBI Cn3D/CDTree software, which can be downloaded at http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml. Supplementary data are available at Bioinformatics online.
Robustness of atomistic Gō models in predicting native-like folding intermediates
NASA Astrophysics Data System (ADS)
Estácio, S. G.; Fernandes, C. S.; Krobath, H.; Faísca, P. F. N.; Shakhnovich, E. I.
2012-08-01
Gō models are exceedingly popular tools in computer simulations of protein folding. These models are native-centric, i.e., they are directly constructed from the protein's native structure. Therefore, it is important to understand to what extent the atomistic details of the native structure dictate the folding behavior exhibited by Gō models. Here we address this challenge by performing exhaustive discrete molecular dynamics simulations of a Gō potential combined with a full atomistic protein representation. In particular, we investigate the robustness of this particular type of Gō model in predicting the existence of intermediate states in protein folding. We focus on the N47G mutational form of the Spc-SH3 folding domain (x-ray structure) and compare its folding pathway with that of alternative native structures produced in silico. Our methodological strategy comprises equilibrium folding simulations, structural clustering, and principal component analysis.
The helical structure of DNA facilitates binding
NASA Astrophysics Data System (ADS)
Berg, Otto G.; Mahmutovic, Anel; Marklund, Emil; Elf, Johan
2016-09-01
The helical structure of DNA imposes constraints on the rate of diffusion-limited protein binding. Here we solve the reaction-diffusion equations for DNA-like geometries and extend with simulations when necessary. We find that the helical structure can make binding to the DNA more than twice as fast compared to a case where DNA would be reactive only along one side. We also find that this rate advantage remains when the contributions from steric constraints and rotational diffusion of the DNA-binding protein are included. Furthermore, we find that the association rate is insensitive to changes in the steric constraints on the DNA in the helix geometry, while it is much more dependent on the steric constraints on the DNA-binding protein. We conclude that the helical structure of DNA facilitates the nonspecific binding of transcription factors and structural DNA-binding proteins in general.
Schaeffer, E; Sninsky, J J
1984-01-01
Proteins that are related evolutionarily may have diverged at the level of primary amino acid sequence while maintaining similar secondary structures. Computer analysis has been used to compare the open reading frames of the hepatitis B virus to those of the woodchuck hepatitis virus at the level of amino acid sequence, and to predict the relative hydrophilic character and the secondary structure of putative polypeptides. Similarity is seen at the levels of relative hydrophilicity and secondary structure, in the absence of sequence homology. These data reinforce the proposal that these open reading frames encode viral proteins. Computer analysis of this type can be more generally used to establish structural similarities between proteins that do not share obvious sequence homology as well as to assess whether an open reading frame is fortuitous or codes for a protein. PMID:6585835
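A sliding-window profile is a common way to compare the hydrophilic or hydrophobic character of two sequences position by position. The sketch below uses the Kyte-Doolittle hydropathy scale, in which positive values are hydrophobic and negative values are hydrophilic. Both the scale and the window length are assumptions made here for illustration, since the abstract does not name the method that was used.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=7):
    """Average hydropathy over every window of `window` residues.

    Returns one value per window start position, so the result has
    len(sequence) - window + 1 entries.
    """
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

if __name__ == "__main__":
    # Toy sequence: a hydrophobic stretch followed by a charged stretch.
    print(hydropathy_profile("IIIKKK", window=3))
```

Profiles computed this way for two proteins can be overlaid to look for shared hydrophobic and hydrophilic segments even when the residues themselves differ.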
Computational analysis of human and mouse CREB3L4 Protein
Velpula, Kiran Kumar; Rehman, Azeem Abdul; Chigurupati, Soumya; Sanam, Ramadevi; Inampudi, Krishna Kishore; Akila, Chandra Sekhar
2012-01-01
CREB3L4 is a member of the CREB/ATF transcription factor family, characterized by their regulation of gene expression through the cAMP-responsive element. Previous studies identified this protein in mice and humans. Whereas CREB3L4 in mice (referred to as Tisp40) is found in the testes and functions in spermatogenesis, human CREB3L4 is primarily detected in the prostate and has been implicated in cancer. We conducted computational analyses to compare the structural homology between murine Tisp40α and human CREB3L4. Our results reveal that the primary and secondary structures of the two proteins are highly similar. Additionally, predicted helical transmembrane structure reveals that the proteins likely have similar structure and function. This study offers preliminary findings that support the translation of mouse Tisp40α findings into human models, based on structural homology. PMID:22829733
SiteBinder: an improved approach for comparing multiple protein structural motifs.
Sehnal, David; Vařeková, Radka Svobodová; Huber, Heinrich J; Geidl, Stanislav; Ionescu, Crina-Maria; Wimmerová, Michaela; Koča, Jaroslav
2012-02-27
There is a paramount need to develop new techniques and tools that will extract as much information as possible from the ever growing repository of protein 3D structures. We report here on the development of a software tool for the multiple superimposition of large sets of protein structural motifs. Our superimposition methodology performs a systematic search for the atom pairing that provides the best fit. During this search, the RMSD values for all chemically relevant pairings are calculated by quaternion algebra. The number of evaluated pairings is markedly decreased by using PDB annotations for atoms. This approach guarantees that the best fit will be found and can be applied even when sequence similarity is low or does not exist at all. We have implemented this methodology in the Web application SiteBinder, which is able to process up to thousands of protein structural motifs in a very short time, and which provides an intuitive and user-friendly interface. Our benchmarking analysis has shown the robustness, efficiency, and versatility of our methodology and its implementation by the successful superimposition of 1000 experimentally determined structures for each of 32 eukaryotic linear motifs. We also demonstrate the applicability of SiteBinder using three case studies. We first compared the structures of 61 PA-IIL sugar binding sites containing nine different sugars, and we found that the sugar binding sites of PA-IIL and its mutants have a conserved structure despite their binding different sugars. We then superimposed over 300 zinc finger central motifs and revealed that the molecular structure in the vicinity of the Zn atom is highly conserved. Finally, we superimposed 12 BH3 domains from pro-apoptotic proteins. Our findings come to support the hypothesis that there is a structural basis for the functional segregation of BH3-only proteins into activators and enablers.
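For a fixed atom pairing, the minimal RMSD after optimal superposition can be obtained from a quaternion formulation without building the rotation explicitly. The sketch below follows Horn's closed-form method and relies on NumPy. It is an illustration of the general technique, not SiteBinder's code, and the function name `quaternion_rmsd` is hypothetical.

```python
import numpy as np

def quaternion_rmsd(x, y):
    """Minimal RMSD between paired coordinate sets (n x 3) after superposition.

    Both sets are centered, the 4x4 symmetric matrix of Horn's quaternion
    method is built from their cross-covariance, and its largest eigenvalue
    gives the best achievable sum of dot products under a proper rotation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    s = x.T @ y  # 3x3 cross-covariance, s[a, b] = sum_i x[i, a] * y[i, b]
    sxx, sxy, sxz = s[0]
    syx, syy, syz = s[1]
    szx, szy, szz = s[2]
    n = np.array([
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ])
    largest = np.linalg.eigvalsh(n)[-1]  # eigenvalues come in ascending order
    msd = (np.sum(x * x) + np.sum(y * y) - 2.0 * largest) / len(x)
    return float(np.sqrt(max(msd, 0.0)))  # clamp tiny negative round-off

if __name__ == "__main__":
    a = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0], [0.0, 1.0, 2.0]])
    rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    b = a @ rot_z.T + np.array([3.0, -2.0, 5.0])  # rotated and shifted copy
    print(quaternion_rmsd(a, b))  # close to zero
```

Because only an eigenvalue is needed per candidate pairing, this form is convenient when many pairings have to be scored in a systematic search.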
Structural and evolutionary analysis of Leishmania Alba proteins.
da Costa, Kauê Santana; Galúcio, João Marcos Pereira; Leonardo, Elvis Santos; Cardoso, Guelber; Leal, Élcio; Conde, Guilherme; Lameira, Jerônimo
2017-10-01
The Alba superfamily proteins share a common RNA-binding domain. These proteins participate in a variety of regulatory pathways by controlling developmental gene expression. They also interact with ribosomal subunits, translation factors, and other RNA-binding proteins. The Leishmania infantum genome encodes two Alba-domain proteins, LiAlba1 and LiAlba3. In this work, we used homology modeling, protein-protein docking, and molecular dynamics (MD) simulations to explore the details of the Alba1-Alba3-RNA complex from Leishmania infantum at the molecular level. In addition, we compared the structure of LiAlba3 with the human ribonuclease P component, Rpp20. We also mapped the ligand-binding residues on the Alba3 surface to analyze its druggability and performed mutational analyses in Alba3 using alanine scanning to identify residues involved in its function and structural stability. These results suggest that the RGG-box motif of LiAlba1 is important for protein function and stability. Finally, we discuss the function of Alba proteins in the context of pathogen adaptation to host cells. The data provided herein will facilitate further translational research regarding Alba structure and function. Copyright © 2017 Elsevier B.V. All rights reserved.
Zhang, Yinfeng; Sun, Caijun; Feng, Liqiang; Xiao, Lijun; Chen, Ling
2012-04-01
Accessory and regulatory proteins (nonstructural proteins) have received increasing attention as components in novel HIV/SIV vaccine design. However, the complicated interactions between nonstructural proteins and structural proteins remain poorly understood, especially their effects on immunogenicity. In this study, the immunogenicity of structural proteins in the presence and absence of nonstructural proteins was compared. First, a series of recombinant plasmids and adenoviral vectors carrying various SIVmac239 nonstructural and structural genes was constructed. Then mice were primed with DNA plasmids and boosted with corresponding Ad5 vectors of different combinations, and the resulting immune responses were measured. Our results demonstrated that when the individual Gag, Pol, or Env gene products were coimmunized with the whole repertoire of nonstructural proteins, the Gag-specific CD8(+) T response was greatly enhanced, while the Env- and Pol-specific CD8(+) T responses were significantly reduced. The same pattern was not observed in CD4(+) T cell responses. Antibody responses against both the Gag and Env proteins were elicited more effectively when these structural antigens were immunized together with nonstructural antigens. These findings may provide helpful insights into the development of novel HIV/SIV vaccines.
2018-01-01
Membrane proteins perform a host of vital cellular functions. Deciphering the molecular mechanisms whereby they fulfill these functions requires detailed biophysical and structural investigations. Detergents have proven pivotal to extract the protein from its native surroundings. Yet, they provide a milieu that departs significantly from that of the biological membrane, to the extent that the structure, the dynamics, and the interactions of membrane proteins in detergents may considerably vary, as compared to the native environment. Understanding the impact of detergents on membrane proteins is, therefore, crucial to assess the biological relevance of results obtained in detergents. Here, we review the strengths and weaknesses of alkyl phosphocholines (or foscholines), the most widely used detergent in solution-NMR studies of membrane proteins. While this class of detergents is often successful for membrane protein solubilization, a growing list of examples points to destabilizing and denaturing properties, in particular for α-helical membrane proteins. Our comprehensive analysis stresses the importance of stringent controls when working with this class of detergents and when analyzing the structure and dynamics of membrane proteins in alkyl phosphocholine detergents. PMID:29488756
Protein-protein binding before and after photo-modification of albumin
NASA Astrophysics Data System (ADS)
Rozinek, Sarah C.; Glickman, Randolph D.; Thomas, Robert J.; Brancaleon, Lorenzo
2016-03-01
Bioeffects of directed-optical-energy encompass a wide range of applications. One aspect of photochemical interactions involves irradiating a photosensitizer with visible light in order to induce protein unfolding and consequent changes in function. In the past, irradiation of several dye-protein combinations has revealed effects on protein structure. Beta lactoglobulin, human serum albumin (HSA) and tubulin have all been photo-modified with meso-tetrakis(4-sulfonatophenyl)porphyrin (TSPP) bound, but only in the case of tubulin has binding caused a verified loss of biological function (loss of ability to form microtubules) as a result of this light-induced structural change. The current work questions whether the photo-induced structural changes that occur in HSA are sufficient to disable its biological function of binding to osteonectin. The albumin-binding protein, osteonectin, is about half the molecular weight of HSA, so the two proteins and their bound product can be separated and quantified by size exclusion high performance liquid chromatography. TSPP was first bound to HSA and irradiated, photo-modifying the structure of HSA. Then native HSA and photo-modified HSA (both with TSPP bound) were compared, to assess loss in HSA's innate binding ability as a result of light-induced structure modification.
Khvostichenko, Daria S.; Schieferstein, Jeremy M.; Pawate, Ashtamurthy S.; ...
2014-08-21
Crystallization from lipidic mesophase matrices is a promising route to diffraction-quality crystals and structures of membrane proteins. The microfluidic approach reported here eliminates two bottlenecks of the standard mesophase-based crystallization protocols: (i) manual preparation of viscous mesophases and (ii) manual harvesting of often small and fragile protein crystals. In the approach reported here, protein-loaded mesophases are formulated in an X-ray transparent microfluidic chip using only 60 nL of the protein solution per crystallization trial. The X-ray transparency of the chip enables diffraction data collection from multiple crystals residing in microfluidic wells, eliminating the normally required manual harvesting and mounting of individual crystals. In addition, we validated our approach by on-chip crystallization of photosynthetic reaction center, a membrane protein from Rhodobacter sphaeroides, followed by solving its structure to a resolution of 2.5 Å using X-ray diffraction data collected on-chip under ambient conditions. A moderate conformational change in hydrophilic chains of the protein was observed when comparing the on-chip, room temperature structure with known structures for which data were acquired under cryogenic conditions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Khvostichenko, Daria S.; Schieferstein, Jeremy M.; Pawate, Ashtamurthy S.
2014-10-01
Crystallization from lipidic mesophase matrices is a promising route to diffraction-quality crystals and structures of membrane proteins. The microfluidic approach reported here eliminates two bottlenecks of the standard mesophase-based crystallization protocols: (i) manual preparation of viscous mesophases and (ii) manual harvesting of often small and fragile protein crystals. In the approach reported here, protein-loaded mesophases are formulated in an X-ray transparent microfluidic chip using only 60 nL of the protein solution per crystallization trial. The X-ray transparency of the chip enables diffraction data collection from multiple crystals residing in microfluidic wells, eliminating the normally required manual harvesting and mounting of individual crystals. We validated our approach by on-chip crystallization of photosynthetic reaction center, a membrane protein from Rhodobacter sphaeroides, followed by solving its structure to a resolution of 2.5 Å using X-ray diffraction data collected on-chip under ambient conditions. A moderate conformational change in hydrophilic chains of the protein was observed when comparing the on-chip, room temperature structure with known structures for which data were acquired under cryogenic conditions.
Composite Structural Motifs of Binding Sites for Delineating Biological Functions of Proteins
Kinjo, Akira R.; Nakamura, Haruki
2012-01-01
Most biological processes are described as a series of interactions between proteins and other molecules, and interactions are in turn described in terms of atomic structures. To annotate protein functions as sets of interaction states at atomic resolution, and thereby to better understand the relation between protein interactions and biological functions, we conducted exhaustive all-against-all atomic structure comparisons of all known binding sites for ligands including small molecules, proteins and nucleic acids, and identified recurring elementary motifs. By integrating the elementary motifs associated with each subunit, we defined composite motifs that represent context-dependent combinations of elementary motifs. It is demonstrated that function similarity can be better inferred from composite motif similarity compared to the similarity of protein sequences or of individual binding sites. By integrating the composite motifs associated with each protein function, we define meta-composite motifs each of which is regarded as a time-independent diagrammatic representation of a biological process. It is shown that meta-composite motifs provide richer annotations of biological processes than sequence clusters. The present results serve as a basis for bridging atomic structures to higher-order biological phenomena by classification and integration of binding site structures. PMID:22347478
Devine, Paul W A; Fisher, Henry C; Calabrese, Antonio N; Whelan, Fiona; Higazi, Daniel R; Potts, Jennifer R; Lowe, David C; Radford, Sheena E; Ashcroft, Alison E
2017-09-01
Collision cross-section (CCS) measurements obtained from ion mobility spectrometry-mass spectrometry (IMS-MS) analyses often provide useful information concerning a protein's size and shape and can be complemented by modeling procedures. However, there have been some concerns about the extent to which certain proteins maintain a native-like conformation during the gas-phase analysis, especially proteins with dynamic or extended regions. Here we have measured the CCSs of a range of biomolecules including non-globular proteins and RNAs of different sequence, size, and stability. Using traveling wave IMS-MS, we show that for the proteins studied, the measured CCS deviates significantly from predicted CCS values based upon currently available structures. The results presented indicate that these proteins collapse to different extents, depending on their elongated structures, upon transition into the gas phase. Comparing two RNAs of similar mass but different solution structures, we show that these biomolecules may also be susceptible to gas-phase compaction. Together, the results suggest that caution is needed when predicting structural models based on CCS data for RNAs as well as proteins with non-globular folds.
Predicting Real-Valued Protein Residue Fluctuation Using FlexPred.
Peterson, Lenna; Jamroz, Michal; Kolinski, Andrzej; Kihara, Daisuke
2017-01-01
The conventional view of a protein structure as static provides only a limited picture. There is increasing evidence that protein dynamics are often vital to protein function including interaction with partners such as other proteins, nucleic acids, and small molecules. Considering flexibility is also important in applications such as computational protein docking and protein design. While residue flexibility is partially indicated by experimental measures such as the B-factor from X-ray crystallography and ensemble fluctuation from nuclear magnetic resonance (NMR) spectroscopy as well as computational molecular dynamics (MD) simulation, these techniques are resource-intensive. In this chapter, we describe the web server and stand-alone version of FlexPred, which rapidly predicts absolute per-residue fluctuation from a three-dimensional protein structure. On a set of 592 nonredundant structures, comparing the fluctuations predicted by FlexPred to the observed fluctuations in MD simulations showed an average correlation coefficient of 0.669 and an average root mean square error of 1.07 Å. FlexPred is available at http://kiharalab.org/flexPred/ .
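The two agreement measures quoted in the abstract above (correlation coefficient and root mean square error between predicted and observed per-residue fluctuations) can be sketched as follows. This is a minimal illustration, not FlexPred's own code; the function names and the sample values are hypothetical, and the correlation is assumed to be the Pearson coefficient.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical per-residue fluctuations in Å (illustrative values only).
predicted = [0.8, 1.2, 2.5, 1.9, 0.7]
observed = [0.9, 1.0, 3.1, 1.6, 0.8]
print(pearson(predicted, observed), rmse(predicted, observed))
```

Averaging these two numbers over a set of structures would give summary figures of the kind reported in the abstract.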
A phylogenetic analysis of normal modes evolution in enzymes and its relationship to enzyme function
Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A.
2012-01-01
Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that the functional evolution can be inferred from the changes in the protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced Cα representation of the protein structure while enzymatic function is described by Enzyme Commission (EC) numbers. Similarity of the binding pocket dynamics at each branch of the protein family’s phylogeny was analyzed in two ways: 1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and 2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the alpha-amylase, D-isomer specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal modes analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. PMID:22651983
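As a rough illustration of the "normal mode overlap" mentioned in the abstract above, the sketch below uses one common definition: the absolute value of the normalized dot product of two mode displacement vectors. The abstract does not state the exact formula used by the authors, so this definition and the function name are assumptions.

```python
import math


def mode_overlap(u, v):
    """Overlap of two mode vectors: |u . v| / (|u| |v|), in the range [0, 1].

    u and v are flat sequences of displacement components (e.g. 3N values
    for N C-alpha atoms). The sign is dropped because a mode and its
    negative describe the same vibration.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return abs(dot) / (norm_u * norm_v)


# Two hypothetical modes that differ only in sign and scale overlap fully.
print(mode_overlap([1.0, 0.0, 0.0], [-2.0, 0.0, 0.0]))
```

An overlap near 1 indicates that the two proteins move along nearly the same direction in that mode; an overlap near 0 indicates unrelated motions.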
Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A
2012-09-21
Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that functional evolution can be inferred from the changes in protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced C(α) representation of the protein structure while enzymatic function is described by Enzyme Commission numbers. Similarity of the binding pocket dynamics at each branch of the protein family's phylogeny was analyzed in two ways: (1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and (2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the α-amylase, D-isomer-specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal mode analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. Copyright © 2012 Elsevier Ltd. All rights reserved.
Synthetic Biology of Proteins: Tuning GFPs Folding and Stability with Fluoroproline
Steiner, Thomas; Hess, Petra; Bae, Jae Hyun; Wiltschi, Birgit; Moroder, Luis; Budisa, Nediljko
2008-01-01
Background: Proline residues affect protein folding and stability via cis/trans isomerization of peptide bonds and by the Cγ-exo or -endo puckering of their pyrrolidine rings. Peptide bond conformation as well as puckering propensity can be manipulated by proper choice of ring substituents, e.g. Cγ-fluorination. Synthetic chemistry has routinely exploited ring-substituted proline analogs in order to change, modulate or control folding and stability of peptides. Methodology/Principal Findings: In order to transmit this synthetic strategy to complex proteins, the ten proline residues of enhanced green fluorescent protein (EGFP) were globally replaced by (4R)- and (4S)-fluoroprolines (FPro). By this approach, we expected to affect the cis/trans peptidyl-proline bond isomerization and pyrrolidine ring puckering, which are responsible for the slow folding of this protein. Expression of both protein variants occurred at levels comparable to the parent protein, but the (4R)-FPro-EGFP resulted in irreversibly unfolded inclusion bodies, whereas the (4S)-FPro-EGFP led to a soluble fluorescent protein. Upon thermal denaturation, refolding of this variant occurs at significantly higher rates than the parent EGFP. Comparative inspection of the X-ray structures of EGFP and (4S)-FPro-EGFP allowed us to correlate the significantly improved refolding with the Cγ-endo puckering of the pyrrolidine rings, which is favored by 4S-fluorination, and to a lesser extent with the cis/trans isomerization of the prolines. Conclusions/Significance: We discovered that the folding rates and stability of GFP are affected to a lesser extent by cis/trans isomerization of the proline bonds than by the puckering of pyrrolidine rings. In the Cγ-endo conformation the fluorine atoms are positioned in the structural context of the GFP such that a network of favorable local interactions is established. 
From these results the combined use of synthetic amino acids along with detailed structural knowledge and existing protein engineering methods can be envisioned as a promising strategy for the design of complex tailor-made proteins and even cellular structures of superior properties compared to the native forms. PMID:18301757
Transmembrane helix prediction: a comparative evaluation and analysis.
Cuthbertson, Jonathan M; Doyle, Declan A; Sansom, Mark S P
2005-06-01
The prediction of transmembrane (TM) helices plays an important role in the study of membrane proteins, given the relatively small number (approximately 0.5% of the PDB) of high-resolution structures for such proteins. We used two datasets (one redundant and one non-redundant) of high-resolution structures of membrane proteins to evaluate and analyse TM helix prediction. The redundant (non-redundant) dataset contains structures of 434 (268) TM helices, from 112 (73) polypeptide chains. Of the 434 helices in the dataset, 20 may be classified as 'half-TM' as they are too short to span a lipid bilayer. We compared 13 TM helix prediction methods, evaluating each method using per segment, per residue and termini scores. Four methods consistently performed well: SPLIT4, TMHMM2, HMMTOP2 and TMAP. However, even the best methods were in error by, on average, about two turns of helix at the TM helix termini. The best and worst case predictions for individual proteins were analysed. In particular, the performance of the various methods and of a consensus prediction method were compared for a number of proteins (e.g. SecY, ClC, KvAP) containing half-TM helices. The difficulties of predicting half-TM helices suggest that current prediction methods successfully embody the two-stage model of membrane protein folding, but do not accommodate a third stage in which, e.g., short helices and re-entrant loops fold within a bundle of stable TM helices.
Crystal structure of Bacillus subtilis YdaF protein: a putative ribosomal N-acetyltransferase.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brunzelle, J. S.; Wu, R.; Korolev, S. V.
2004-12-01
Comparative sequence analysis suggests that the ydaF gene encodes a protein (YdaF) that functions as an N-acetyltransferase, more specifically, a ribosomal N-acetyltransferase. Sequence analysis using basic local alignment search tool (BLAST) suggests that YdaF belongs to a large family of proteins (199 proteins found in 88 unique species of bacteria, archaea, and eukaryotes). YdaF also belongs to the COG1670, which includes the Escherichia coli RimL protein that is known to acetylate ribosomal protein L12. N-acetylation (NAT) has been found in all kingdoms. NAT enzymes catalyze the transfer of an acetyl group from acetyl-CoA (AcCoA) to a primary amino group. For example, NATs can acetylate the N-terminal α-amino group, the ε-amino group of lysine residues, aminoglycoside antibiotics, spermine/spermidine, or arylalkylamines such as serotonin. The crystal structure of the alleged ribosomal NAT protein, YdaF, from Bacillus subtilis presented here was determined as a part of the Midwest Center for Structural Genomics. The structure maintains the conserved tertiary structure of other known NATs and a high sequence similarity in the presumed AcCoA binding pocket in spite of a very low overall level of sequence identity to other NATs of known structure.
2016-01-01
Molecular recognition by protein mostly occurs in a local region on the protein surface. Thus, an efficient computational method for accurate characterization of protein local structural conservation is necessary to better understand biology and drug design. We present a novel local structure alignment tool, G‐LoSA. G‐LoSA aligns protein local structures in a sequence order independent way and provides a GA‐score, a chemical feature‐based and size‐independent structure similarity score. Our benchmark validation shows the robust performance of G‐LoSA to the local structures of diverse sizes and characteristics, demonstrating its universal applicability to local structure‐centric comparative biology studies. In particular, G‐LoSA is highly effective in detecting conserved local regions on the entire surface of a given protein. In addition, the applications of G‐LoSA to identifying template ligands and predicting ligand and protein binding sites illustrate its strong potential for computer‐aided drug design. We hope that G‐LoSA can be a useful computational method for exploring interesting biological problems through large‐scale comparison of protein local structures and facilitating drug discovery research and development. G‐LoSA is freely available to academic users at http://im.compbio.ku.edu/GLoSA/. PMID:26813336
Building proteins from C alpha coordinates using the dihedral probability grid Monte Carlo method.
Mathiowetz, A. M.; Goddard, W. A.
1995-01-01
Dihedral probability grid Monte Carlo (DPG-MC) is a general-purpose method of conformational sampling that can be applied to many problems in peptide and protein modeling. Here we present the DPG-MC method and apply it to predicting complete protein structures from C alpha coordinates. This is useful in such endeavors as homology modeling, protein structure prediction from lattice simulations, or fitting protein structures to X-ray crystallographic data. It also serves as an example of how DPG-MC can be applied to systems with geometric constraints. The conformational propensities for individual residues are used to guide conformational searches as the protein is built from the amino-terminus to the carboxyl-terminus. Results for a number of proteins show that both the backbone and side chain can be accurately modeled using DPG-MC. Backbone atoms are generally predicted with RMS errors of about 0.5 Å (compared to X-ray crystal structure coordinates) and all atoms are predicted to an RMS error of 1.7 Å or better. PMID:7549885
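The RMS errors quoted in the abstract above compare predicted atom positions with crystal structure coordinates. A minimal sketch of that calculation is given below, assuming the two coordinate sets list the same atoms in the same order and are already in a common frame (no superposition is performed here); the function name is hypothetical and this is not the DPG-MC code.

```python
import math


def rms_error(predicted, reference):
    """RMS deviation in the input units (e.g. Å) between matched atoms.

    Each argument is a sequence of (x, y, z) tuples; atom i in `predicted`
    is assumed to correspond to atom i in `reference`.
    """
    if len(predicted) != len(reference):
        raise ValueError("coordinate sets must contain the same number of atoms")
    total = 0.0
    for (xp, yp, zp), (xr, yr, zr) in zip(predicted, reference):
        total += (xp - xr) ** 2 + (yp - yr) ** 2 + (zp - zr) ** 2
    return math.sqrt(total / len(predicted))


# Two hypothetical atoms, each displaced by exactly 1 Å from the reference.
print(rms_error([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
                [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]))
```

Restricting the atom lists to backbone atoms or using all atoms would correspond to the two figures reported separately in the abstract.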
Poppe, Leszek; Jordan, John B; Rogers, Gary; Schnier, Paul D
2015-06-02
An important aspect in the analytical characterization of protein therapeutics is the comprehensive characterization of higher order structure (HOS). Nuclear magnetic resonance (NMR) is arguably the most sensitive method for fingerprinting HOS of a protein in solution. Traditionally, ¹H-¹⁵N or ¹H-¹³C correlation spectra are used as a "structural fingerprint" of HOS. Here, we demonstrate that protein fingerprint by line shape enhancement (PROFILE), a 1D ¹H NMR spectroscopy fingerprinting approach, is superior to traditional two-dimensional methods using monoclonal antibody samples and a heavily glycosylated protein therapeutic (Epoetin Alfa). PROFILE generates a high resolution structural fingerprint of a therapeutic protein in a fraction of the time required for a 2D NMR experiment. The cross-correlation analysis of PROFILE spectra allows one to distinguish contributions from HOS vs protein heterogeneity, which is difficult to accomplish by 2D NMR. We demonstrate that the major analytical limitation of two-dimensional methods is poor selectivity, which renders these approaches problematic for the purpose of fingerprinting large biological macromolecules.
De Jaco, Antonella; Comoletti, Davide; Dubi, Noga; Camp, Shelley; Taylor, Palmer
2016-01-01
The α/β hydrolase fold family is perhaps the largest group of proteins presenting significant structural homology with divergent functions, ranging from catalytic hydrolysis to heterophilic cell adhesive interactions to chaperones in hormone production. All the proteins of the family share a common three-dimensional core structure containing the α/β-hydrolase fold domain that is crucial for proper protein function. Several mutations associated with congenital diseases or disorders have been reported in conserved residues within the α/β-hydrolase fold domain of cholinesterase-like proteins, neuroligins, butyrylcholinesterase and thyroglobulin. These mutations are known to disrupt the architecture of the common structural domain either globally or locally. Characterization of the natural mutations affecting the α/β-hydrolase fold domain in these proteins has shown that they mainly impair processing and trafficking along the secretory pathway causing retention of the mutant protein in the endoplasmic reticulum. Studying the processing of α/β-hydrolase fold mutant proteins should uncover new functions for this domain, that in some cases require structural integrity for both export of the protein from the ER and for facilitating subunit dimerization. A comparative study of homologous mutations in proteins that are closely related family members, along with the definition of new three-dimensional crystal structures, will identify critical residues for the assembly of the α/β-hydrolase fold. PMID:21933121
Unexpected features of the dark proteome.
Perdigão, Nelson; Heinrich, Julian; Stolte, Christian; Sabir, Kenneth S; Buckley, Michael J; Tabor, Bruce; Signal, Beth; Gloss, Brian S; Hammang, Christopher J; Rost, Burkhard; Schafferhans, Andrea; O'Donoghue, Seán I
2015-12-29
We surveyed the "dark" proteome, that is, regions of proteins never observed by experimental structure determination and inaccessible to homology modeling. For 546,000 Swiss-Prot proteins, we found that 44-54% of the proteome in eukaryotes and viruses was dark, compared with only ∼14% in archaea and bacteria. Surprisingly, most of the dark proteome could not be accounted for by conventional explanations, such as intrinsic disorder or transmembrane regions. Nearly half of the dark proteome comprised dark proteins, in which the entire sequence lacked similarity to any known structure. Dark proteins fulfill a wide variety of functions, but a subset showed distinct and largely unexpected features, such as association with secretion, specific tissues, the endoplasmic reticulum, disulfide bonding, and proteolytic cleavage. Dark proteins also had short sequence length, low evolutionary reuse, and few known interactions with other proteins. These results suggest new research directions in structural and computational biology.
An object programming based environment for protein secondary structure prediction.
Giacomini, M; Ruggiero, C; Sacile, R
1996-01-01
The most frequently used methods for protein secondary structure prediction are empirical statistical methods and rule based methods. A consensus system based on object-oriented programming is presented, which integrates the two approaches with the aim of improving the prediction quality. This system uses an object-oriented knowledge representation based on the concepts of conformation, residue and protein, where the conformation class is the basis, the residue class derives from it and the protein class derives from the residue class. The system has been tested with satisfactory results on several proteins of the Brookhaven Protein Data Bank. Its results have been compared with the results of the most widely used prediction methods, and they show a higher prediction capability and greater stability. Moreover, the system itself provides an index of the reliability of its current prediction. This system can also be regarded as a basis structure for programs of this kind.
KFC Server: interactive forecasting of protein interaction hot spots.
Darnell, Steven J; LeGault, Laura; Mitchell, Julie C
2008-07-01
The KFC Server is a web-based implementation of the KFC (Knowledge-based FADE and Contacts) model, a machine learning approach for the prediction of binding hot spots, or the subset of residues that account for most of a protein interface's binding free energy. The server facilitates the automated analysis of a user submitted protein-protein or protein-DNA interface and the visualization of its hot spot predictions. For each residue in the interface, the KFC Server characterizes its local structural environment, compares that environment to the environments of experimentally determined hot spots and predicts if the interface residue is a hot spot. After the computational analysis, the user can visualize the results using an interactive job viewer able to quickly highlight predicted hot spots and surrounding structural features within the protein structure. The KFC Server is accessible at http://kfc.mitchell-lab.org.
Unexpected features of the dark proteome
Perdigão, Nelson; Heinrich, Julian; Stolte, Christian; Sabir, Kenneth S.; Buckley, Michael J.; Tabor, Bruce; Signal, Beth; Gloss, Brian S.; Hammang, Christopher J.; Rost, Burkhard; Schafferhans, Andrea
2015-01-01
We surveyed the “dark” proteome–that is, regions of proteins never observed by experimental structure determination and inaccessible to homology modeling. For 546,000 Swiss-Prot proteins, we found that 44–54% of the proteome in eukaryotes and viruses was dark, compared with only ∼14% in archaea and bacteria. Surprisingly, most of the dark proteome could not be accounted for by conventional explanations, such as intrinsic disorder or transmembrane regions. Nearly half of the dark proteome comprised dark proteins, in which the entire sequence lacked similarity to any known structure. Dark proteins fulfill a wide variety of functions, but a subset showed distinct and largely unexpected features, such as association with secretion, specific tissues, the endoplasmic reticulum, disulfide bonding, and proteolytic cleavage. Dark proteins also had short sequence length, low evolutionary reuse, and few known interactions with other proteins. These results suggest new research directions in structural and computational biology. PMID:26578815
Ashford, Paul; Moss, David S; Alex, Alexander; Yeap, Siew K; Povia, Alice; Nobeli, Irene; Williams, Mark A
2012-03-14
Protein structures provide a valuable resource for rational drug design. For a protein with no known ligand, computational tools can predict surface pockets that are of suitable size and shape to accommodate a complementary small-molecule drug. However, pocket prediction against single static structures may miss features of pockets that arise from proteins' dynamic behaviour. In particular, ligand-binding conformations can be observed as transiently populated states of the apo protein, so it is possible to gain insight into ligand-bound forms by considering conformational variation in apo proteins. This variation can be explored by considering sets of related structures: computationally generated conformers, solution NMR ensembles, multiple crystal structures, homologues or homology models. It is non-trivial to compare pockets, either from different programs or across sets of structures. For a single structure, difficulties arise in defining particular pocket's boundaries. For a set of conformationally distinct structures the challenge is how to make reasonable comparisons between them given that a perfect structural alignment is not possible. We have developed a computational method, Provar, that provides a consistent representation of predicted binding pockets across sets of related protein structures. The outputs are probabilities that each atom or residue of the protein borders a predicted pocket. These probabilities can be readily visualised on a protein using existing molecular graphics software. We show how Provar simplifies comparison of the outputs of different pocket prediction algorithms, of pockets across multiple simulated conformations and between homologous structures. 
We demonstrate the benefits of use of multiple structures for protein-ligand and protein-protein interface analysis on a set of complexes and consider three case studies in detail: i) analysis of a kinase superfamily highlights the conserved occurrence of surface pockets at the active and regulatory sites; ii) a simulated ensemble of unliganded Bcl2 structures reveals extensions of a known ligand-binding pocket not apparent in the apo crystal structure; iii) visualisations of interleukin-2 and its homologues highlight conserved pockets at the known receptor interfaces and regions whose conformation is known to change on inhibitor binding. Through post-processing of the output of a variety of pocket prediction software, Provar provides a flexible approach to the analysis and visualization of the persistence or variability of pockets in sets of related protein structures.
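The per-residue pocket probabilities described in the abstract above can be illustrated with a simple frequency estimate: for each residue, the fraction of structures in the set in which that residue borders a predicted pocket. This is only a sketch under that assumption; it is not Provar's implementation, and the function name and residue numbers are hypothetical.

```python
from collections import Counter


def pocket_lining_probability(pocket_residues_per_structure):
    """Fraction of structures in which each residue borders a predicted pocket.

    The argument holds one collection of residue identifiers per structure
    (for example, the pocket-lining residues reported by a pocket prediction
    program for each conformer). Residues never seen in a pocket are omitted.
    """
    n_structures = len(pocket_residues_per_structure)
    counts = Counter()
    for residues in pocket_residues_per_structure:
        counts.update(set(residues))  # count each residue once per structure
    return {res: count / n_structures for res, count in counts.items()}


# Four hypothetical conformers of the same protein.
conformers = [{10, 11}, {11, 12}, {11}, {10, 11}]
print(pocket_lining_probability(conformers))
```

Values close to 1 mark residues that line a pocket in nearly every conformer, while intermediate values point to transient pockets of the kind the abstract discusses.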
PASS2: an automated database of protein alignments organised as structural superfamilies.
Bhaduri, Anirban; Pugalenthi, Ganesan; Sowdhamini, Ramanathan
2004-04-02
The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated, structural alignments for evolutionary related, sequentially distant proteins. An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members both based on sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members. 
The search engine has been improved for simpler browsing of the database. The database resolves alignments among the structural domains consisting of evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permit better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html
Structural study of the membrane protein MscL using cell-free expression and solid-state NMR
NASA Astrophysics Data System (ADS)
Abdine, Alaa; Verhoeven, Michiel A.; Park, Kyu-Ho; Ghazi, Alexandre; Guittet, Eric; Berrier, Catherine; Van Heijenoort, Carine; Warschawski, Dror E.
2010-05-01
High-resolution structures of membrane proteins have so far been obtained mostly by X-ray crystallography, on samples where the protein is surrounded by detergent. Recent developments of solid-state NMR have opened the way to a new approach for the study of integral membrane proteins inside a membrane. At the same time, the extension of cell-free expression to the production of membrane proteins allows for the production of proteins tailor made for NMR. We present here an in situ solid-state NMR study of a membrane protein selectively labeled through the use of cell-free expression. The sample consists of MscL (mechano-sensitive channel of large conductance), a 75 kDa pentameric α-helical ion channel from Escherichia coli, reconstituted in a hydrated lipid bilayer. Compared to a uniformly labeled protein sample, the spectral crowding is greatly reduced in the cell-free expressed protein sample. This approach may be a decisive step required for spectral assignment and structure determination of membrane proteins by solid-state NMR.
Strecker, Claas; Meyer, Bernd
2018-05-29
Protein flexibility poses a major challenge to docking of potential ligands in that the binding site can adopt different shapes. Docking algorithms usually keep the protein rigid and only allow the ligand to be treated as flexible. However, a wrong assessment of the shape of the binding pocket can prevent a ligand from adopting a correct pose. Ensemble docking is a simple yet promising method to solve this problem: Ligands are docked into multiple structures, and the results are subsequently merged. Selection of protein structures is a significant factor for this approach. In this work we perform a comprehensive and comparative study evaluating the impact of structure selection on ensemble docking. We perform ensemble docking with several crystal structures and with structures derived from molecular dynamics simulations of renin, an attractive target for antihypertensive drugs. Here, 500 ns of MD simulations revealed binding site shapes not found in any available crystal structure. We evaluate the importance of structure selection for ensemble docking by comparing binding pose prediction, ability to rank actives above nonactives (screening utility), and scoring accuracy. As a result, for ensemble definition k-means clustering appears to be better suited than hierarchical clustering with average linkage. The best performing ensemble consists of four crystal structures and is able to reproduce the native ligand poses better than any individual crystal structure. Moreover this ensemble outperforms 88% of all individual crystal structures in terms of screening utility as well as scoring accuracy. Similarly, ensembles of MD-derived structures perform on average better than 75% of any individual crystal structure in terms of scoring accuracy at all inspected ensemble sizes.
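The merging step of ensemble docking described in the abstract above (dock each ligand into several structures, then combine the results) can be sketched as follows. The abstract does not say how the results were merged; keeping the best (lowest) score per ligand is one common choice and is an assumption here, as are the function name, the structure labels and the scores.

```python
def merge_ensemble_scores(scores_by_structure):
    """Keep, for each ligand, its best docking score across all structures.

    `scores_by_structure` maps a structure label to a dict of
    ligand -> docking score, where lower scores are assumed to be better.
    Returns ligand -> (best_score, structure_that_gave_it).
    """
    best = {}
    for structure, ligand_scores in scores_by_structure.items():
        for ligand, score in ligand_scores.items():
            if ligand not in best or score < best[ligand][0]:
                best[ligand] = (score, structure)
    return best


# Hypothetical scores for two ligands docked into two receptor structures.
scores = {
    "crystal_A": {"lig1": -7.2, "lig2": -5.0},
    "md_cluster_3": {"lig1": -6.1, "lig2": -8.4},
}
merged = merge_ensemble_scores(scores)
ranking = sorted(merged, key=lambda lig: merged[lig][0])
print(merged, ranking)
```

Ranking ligands by the merged score is what allows an ensemble to be compared with a single structure in terms of screening utility.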
Yamaguchi, Akihiro; Go, Mitiko
2006-01-01
We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE (http://daisy.nagahama-i-bio.ac.jp/Famsbase/), which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total and the current FAMSBASE contains protein 3D structure of approximately 50% of the ORF products. However, cases in which a modeled 3D structure covers an entire ORF product are rare. When the portion of an ORF covered by a 3D structure is compared across the three kingdoms of life, approximately 60% of the ORFs in archaebacteria and eubacteria have modeled 3D structures covering almost the entire amino acid sequence, whereas the percentage falls to about 30% in eukaryotes. When annual differences in the number of ORFs with modeled 3D structure are calculated, the fraction of modeled 3D structures of soluble protein for archaebacteria has increased by 5%, and that for eubacteria by 7%, in the last 3 years. Assuming that this rate would be maintained and that determination of 3D structures for predicted disordered regions is unattainable, whole soluble protein model structures of prokaryotes without the putative disordered regions will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures we will have at those times are not the 3D structures of the entire proteins encoded in single ORFs, but the 3D structures of separate structural domains. Measuring or predicting spatial arrangements of structural domains in an ORF will then be a coming issue of structural genomics. PMID:17146617
Geometry motivated alternative view on local protein backbone structures.
Zacharias, Jan; Knapp, Ernst Walter
2013-11-01
We present an alternative to the classical Ramachandran plot (R-plot) to display local protein backbone structure. Instead of the (φ, ψ)-backbone angles relating to the chemical architecture of polypeptides, generic helical parameters are used. These are the rotation or twist angle ϑ and the helical rise parameter d. Plots with these parameters provide a different view on the nature of local protein backbone structures. They allow local structures to be displayed in polar (d, ϑ)-coordinates, which is not possible for an R-plot, where structural regimes connected by periodicity appear disconnected. There are other advantages as well, such as a clear discrimination of the handedness of a local structure and a larger spread of the different local structure domains, which can yield a better separation of different local secondary structure motifs. Compared to the R-plot, we are not aware of any major disadvantage in classifying local polypeptide structures with the (d, ϑ)-plot, except that it requires some elementary computations. To facilitate usage of the new (d, ϑ)-plot for protein structures we provide a web application (http://agknapp.chemie.fu-berlin.de/secsass), which shows the (d, ϑ)-plot side-by-side with the R-plot. © 2013 The Protein Society.
Accelerating large-scale protein structure alignments with graphics processing units
2012-01-01
Background Large-scale protein structure alignment, an indispensable tool to structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using massive parallel computing power of GPU. PMID:22357132
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chakraborty, Sandeep; Rao, Basuthkar J.; Baker, Nathan A.
2013-04-01
Phylogenetic analysis of proteins using multiple sequence alignment (MSA) assumes an underlying evolutionary relationship in these proteins which occasionally remains undetected due to considerable sequence divergence. Structural alignment programs have been developed to unravel such fuzzy relationships. However, none of these structure based methods have used electrostatic properties to discriminate between spatially equivalent residues. We present a methodology for MSA of a set of related proteins with known structures using electrostatic properties as an additional discriminator (STEEP). STEEP first extracts a profile, then generates a multiple structural superimposition providing a consolidated spatial framework for comparing residues and finally emits the MSA. Residues that are aligned differently by including or excluding electrostatic properties can be targeted by directed evolution experiments to transform the enzymatic properties of one protein into another. We have compared STEEP results to those obtained from a MSA program (ClustalW) and a structural alignment method (MUSTANG) for chymotrypsin serine proteases. Subsequently, we used PhyML to generate phylogenetic trees for the serine and metallo-β-lactamase superfamilies from the STEEP generated MSA, and corroborated the accepted relationships in these superfamilies. We have observed that STEEP acts as a functional classifier when electrostatic congruence is used as a discriminator, and thus identifies potential targets for directed evolution experiments. In summary, STEEP is unique among phylogenetic methods for its ability to use electrostatic congruence to specify mutations that might be the source of the functional divergence in a protein family. Based on our results, we also hypothesize that the active site and its close vicinity contains enough information to infer the correct phylogeny for related proteins.
de Moraes, Fábio R.; Neshich, Izabella A. P.; Mazoni, Ivan; Yano, Inácio H.; Pereira, José G. C.; Salim, José A.; Jardine, José G.; Neshich, Goran
2014-01-01
Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. 
The IFR classifier developed in this study is now integrated into the BlueStar STING suite of programs. Consequently, the prediction of protein-protein interfaces for all proteins available in the PDB is possible through STING_interfaces module, accessible at the following website: (http://www.cbi.cnptia.embrapa.br/SMS/predictions/index.html). PMID:24489849
Upadhyay, Vaibhav; Singh, Anupam; Jha, Divya; Singh, Akansha; Panda, Amulya K
2016-06-08
Formation of inclusion bodies poses a major hurdle in recovery of bioactive recombinant protein from Escherichia coli. Urea and guanidine hydrochloride have routinely been used to solubilize inclusion body proteins, but many times result in poor recovery of bioactive protein. High pH buffers, detergents and organic solvents like n-propanol have been successfully used as mild solubilization agents for high throughput recovery of bioactive protein from bacterial inclusion bodies. These mild solubilization agents preserve native-like secondary structures of proteins in inclusion body aggregates and result in improved recovery of bioactive protein as compared to conventional solubilization agents. Here we demonstrate solubilization of human growth hormone inclusion body aggregates using 30% trifluoroethanol in presence of 3 M urea and its refolding into bioactive form. Human growth hormone was expressed in E. coli M15 (pREP) cells in the form of inclusion bodies. Different concentrations of trifluoroethanol with or without addition of low concentration (3 M) of urea were used for solubilization of inclusion body aggregates. Thirty percent trifluoroethanol in combination with 3 M urea was found to be suitable for efficient solubilization of human growth hormone inclusion bodies. Solubilized protein was refolded by dilution and purified by anion exchange and size exclusion chromatography. Purified protein was analyzed for secondary and tertiary structure using different spectroscopic tools and was found to be bioactive by cell proliferation assay. To understand the mechanism of action of trifluoroethanol, secondary and tertiary structure of human growth hormone in trifluoroethanol was compared to that in presence of other denaturants like urea and guanidine hydrochloride. Trifluoroethanol was found to be stabilizing the secondary structure and destabilizing the tertiary structure of protein. 
Finally, it was observed that trifluoroethanol can be used to solubilize inclusion bodies of a number of proteins. Trifluoroethanol was found to be a suitable mild solubilization agent for bacterial inclusion bodies. Fully functional, bioactive human growth hormone was recovered in high yield from inclusion bodies using trifluoroethanol based solubilization buffer. It was also observed that trifluoroethanol has potential to solubilize inclusion bodies of different proteins.
Sudha, Govindarajan; Srinivasan, Narayanaswamy
2016-09-01
A comprehensive analysis of the quaternary features of distantly related homo-oligomeric proteins is the focus of the current study. This study has been performed at the levels of quaternary state, symmetry, and quaternary structure. Quaternary state and quaternary structure refer to the number of subunits and the spatial arrangement of subunits, respectively. Using a large dataset of available 3D structures of biologically relevant assemblies, we show that only 53% of the distantly related homo-oligomeric proteins have the same quaternary state. Considering these homologous homo-oligomers with the same quaternary state, conservation of quaternary structures is observed only in 38% of the pairs. In 36% of the pairs of distantly related homo-oligomers with different quaternary states, the larger assembly in a pair shows high structural similarity with the entire quaternary structure of the related protein with lower quaternary state; this is referred to as the "Russian doll effect." The differences in quaternary state and structure have been suggested to contribute to the functional diversity. Detailed investigations show that even though the gross functions of many distantly related homo-oligomers are the same, finer level differences in molecular functions are manifested by differences in quaternary states and structures. Comparison of structures of biological assemblies in distantly and closely related homo-oligomeric proteins throughout the study differentiates the effects of sequence divergence on the quaternary structures and function. Knowledge inferred from this study can provide insights for improved protein structure classification and function prediction of homo-oligomers. Proteins 2016; 84:1190-1202. © 2016 Wiley Periodicals, Inc.
Esque, Jérémy; Urbain, Aurélie; Etchebest, Catherine; de Brevern, Alexandre G
2015-11-01
Transmembrane proteins (TMPs) are major drug targets, but the knowledge of their precise topology structure remains highly limited compared with globular proteins. In spite of the difficulties in obtaining their structures, an important effort has been made these last years to increase their number from an experimental and computational point of view. In view of this emerging challenge, the development of computational methods to extract knowledge from these data is crucial for the better understanding of their functions and in improving the quality of structural models. Here, we revisit an efficient unsupervised learning procedure, called Hybrid Protein Model (HPM), which is applied to the analysis of transmembrane proteins belonging to the all-α structural class. HPM method is an original classification procedure that efficiently combines sequence and structure learning. The procedure was initially applied to the analysis of globular proteins. In the present case, HPM classifies a set of overlapping protein fragments, extracted from a non-redundant databank of TMP 3D structure. After fine-tuning of the learning parameters, the optimal classification results in 65 clusters. They represent at best similar relationships between sequence and local structure properties of TMPs. Interestingly, HPM distinguishes among the resulting clusters two helical regions with distinct hydrophobic patterns. This underlines the complexity of the topology of these proteins. The HPM classification enlightens unusual relationship between amino acids in TMP fragments, which can be useful to elaborate new amino acids substitution matrices. Finally, two challenging applications are described: the first one aims at annotating protein functions (channel or not), the second one intends to assess the quality of the structures (X-ray or models) via a new scoring function deduced from the HPM classification.
Lapkouski, Mikalai; Hofbauerova, Katerina; Sovova, Zofie; Ettrichova, Olga; González-Pérez, Sergio; Dulebo, Alexander; Kaftan, David; Kuta Smatanova, Ivana; Revuelta, Jose L.; Arellano, Juan B.; Carey, Jannette; Ettrich, Rüdiger
2012-01-01
Raman microscopy permits structural analysis of protein crystals in situ in hanging drops, allowing for comparison with Raman measurements in solution. Nevertheless, the two methods sometimes reveal subtle differences in structure that are often ascribed to the water layer surrounding the protein. The novel method of drop-coating deposition Raman spectroscopy (DCDR) exploits an intermediate phase that, although nominally “dry,” has been shown to preserve protein structural features present in solution. The potential of this new approach to bridge the structural gap between proteins in solution and in crystals is explored here with extrinsic protein PsbP of photosystem II from Spinacia oleracea. In the high-resolution (1.98 Å) x-ray crystal structure of PsbP reported here, several segments of the protein chain are present but unresolved. Analysis of the three kinds of Raman spectra of PsbP suggests that most of the subtle differences can indeed be attributed to the water envelope, which is shown here to have a similar Raman intensity in glassy and crystal states. Using molecular dynamics simulations cross-validated by Raman solution data, two unresolved segments of the PsbP crystal structure were modeled as loops, and the amino terminus was inferred to contain an additional beta segment. The complete PsbP structure was compared with that of the PsbP-like protein CyanoP, which plays a more peripheral role in photosystem II function. The comparison suggests possible interaction surfaces of PsbP with higher-plant photosystem II. This work provides the first complete structural picture of this key protein, and it represents the first systematic comparison of Raman data from solution, glassy, and crystalline states of a protein. PMID:23071614
Deoxycholate-Based Glycosides (DCGs) for Membrane Protein Stabilisation.
Bae, Hyoung Eun; Gotfryd, Kamil; Thomas, Jennifer; Hussain, Hazrat; Ehsan, Muhammad; Go, Juyeon; Loland, Claus J; Byrne, Bernadette; Chae, Pil Seok
2015-07-06
Detergents are an absolute requirement for studying the structure of membrane proteins. However, many conventional detergents fail to stabilise denaturation-sensitive membrane proteins, such as eukaryotic proteins and membrane protein complexes. New amphipathic agents with enhanced efficacy in stabilising membrane proteins will be helpful in overcoming the barriers to studying membrane protein structures. We have prepared a number of deoxycholate-based amphiphiles with carbohydrate head groups, designated deoxycholate-based glycosides (DCGs). These DCGs are the hydrophilic variants of previously reported deoxycholate-based N-oxides (DCAOs). Membrane proteins in these agents, particularly the branched diglucoside-bearing amphiphiles DCG-1 and DCG-2, displayed favourable behaviour compared to previously reported parent compounds (DCAOs) and conventional detergents (LDAO and DDM). Given their excellent properties, these agents should have significant potential for membrane protein studies. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Song, Wei; Guo, Jun-Tao
2015-01-01
Transcription factors regulate gene expression through binding to specific DNA sequences. How transcription factors achieve high binding specificity is still not well understood. In this paper, we investigated the role of protein flexibility in protein-DNA-binding specificity by comparative molecular dynamics (MD) simulations. Protein flexibility has been considered as a key factor in molecular recognition, which is intrinsically a dynamic process involving fine structural fitting between binding components. In this study, we performed comparative MD simulations on wild-type and F10V mutant P22 Arc repressor in both free and complex conformations. The F10V mutant has lower DNA-binding specificity though both the bound and unbound main-chain structures between the wild-type and F10V mutant Arc are highly similar. We found that the DNA-binding motif of wild-type Arc is structurally more flexible than the F10V mutant in the unbound state, especially for the six DNA base-contacting residues in each dimer. We demonstrated that the flexible side chains of wild-type Arc lead to a higher DNA-binding specificity through forming more hydrogen bonds with DNA bases upon binding. Our simulations also showed a possible conformational selection mechanism for Arc-DNA binding. These results indicate the important roles of protein flexibility and dynamic properties in protein-DNA-binding specificity.
Baek, Minkyung; Park, Taeyong; Heo, Lim; Park, Chiwook; Seok, Chaok
2017-07-03
Homo-oligomerization of proteins is abundant in nature, and is often intimately related with the physiological functions of proteins, such as in metabolism, signal transduction or immunity. Information on the homo-oligomer structure is therefore important to obtain a molecular-level understanding of protein functions and their regulation. Currently available web servers predict protein homo-oligomer structures either by template-based modeling using homo-oligomer templates selected from the protein structure database or by ab initio docking of monomer structures resolved by experiment or predicted by computation. The GalaxyHomomer server, freely accessible at http://galaxy.seoklab.org/homomer, carries out template-based modeling, ab initio docking or both depending on the availability of proper oligomer templates. It also incorporates recently developed model refinement methods that can consistently improve model quality. Moreover, the server provides additional options that can be chosen by the user depending on the availability of information on the monomer structure, oligomeric state and locations of unreliable/flexible loops or termini. The performance of the server was better than or comparable to that of other available methods when tested on benchmark sets and in a recent CASP performed in a blind fashion. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
2017-01-01
The accurate prediction of protein chemical shifts using a quantum mechanics (QM)-based method has been the subject of intense research for more than 20 years but so far empirical methods for chemical shift prediction have proven more accurate. In this paper we show that a QM-based predictor of a protein backbone and CB chemical shifts (ProCS15, PeerJ, 2016, 3, e1344) is of comparable accuracy to empirical chemical shift predictors after chemical shift-based structural refinement that removes small structural errors. We present a method by which quantum chemistry based predictions of isotropic chemical shielding values (ProCS15) can be used to refine protein structures using Markov Chain Monte Carlo (MCMC) simulations, relating the chemical shielding values to the experimental chemical shifts probabilistically. Two kinds of MCMC structural refinement simulations were performed using force field geometry optimized X-ray structures as starting points: simulated annealing of the starting structure and constant temperature MCMC simulation followed by simulated annealing of a representative ensemble structure. Annealing of the CHARMM structure changes the CA-RMSD by an average of 0.4 Å but lowers the chemical shift RMSD by 1.0 and 0.7 ppm for CA and N. Conformational averaging has a relatively small effect (0.1–0.2 ppm) on the overall agreement with carbon chemical shifts but lowers the error for nitrogen chemical shifts by 0.4 ppm. If an amino acid specific offset is included the ProCS15 predicted chemical shifts have RMSD values relative to experiments that are comparable to popular empirical chemical shift predictors. The annealed representative ensemble structures differ in CA-RMSD relative to the initial structures by an average of 2.0 Å, with >2.0 Å difference for six proteins. 
In four of the cases, the largest structural differences arise in structurally flexible regions of the protein as determined by NMR, and in the remaining two cases, the large structural change may be due to force field deficiencies. The overall accuracy of the empirical methods is slightly improved by annealing the CHARMM structure with ProCS15, which may suggest that the minor structural changes introduced by ProCS15-based annealing improve the accuracy of the protein structures. Having established that QM-based chemical shift prediction can deliver the same accuracy as empirical shift predictors, we hope this can help increase the accuracy of related approaches such as QM/MM or linear scaling approaches, or of interpreting protein structural dynamics from QM-derived chemical shifts. PMID:28451325
Quality assessment of protein model-structures based on structural and functional similarities
2012-01-01
Background Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the World Wide Protein Data Bank and the number of sequenced proteins is constantly widening. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists are able to automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. Results GOBA (Gene Ontology-Based Assessment) is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using the standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. Conclusions The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved 0.74 and 0.8 as a mean Pearson correlation to the observed quality of models in our CASP8 and CASP9-based validation sets. 
GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models. PMID:22998498
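The validation measure quoted in this abstract, the Pearson correlation between a quality score and the observed model quality, can be written out as a minimal sketch. The function name and the example values are hypothetical and are not taken from the GOBA software or its data sets.

```python
import math
from typing import Sequence


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered products and centered squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical predicted scores and observed qualities for four models.
    predicted = [0.2, 0.4, 0.5, 0.9]
    observed = [0.3, 0.35, 0.6, 0.8]
    print(round(pearson_correlation(predicted, observed), 3))
```

A mean correlation over targets, as reported in the abstract, would be obtained by computing this value per target and averaging the results.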
Dewhurst, Henry M; Choudhury, Shilpa; Torres, Matthew P
2015-08-01
Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)--a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits--conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit-N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. 
© 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Shen, Yang; Bax, Ad
2015-01-01
Summary Chemical shifts are obtained at the first stage of any protein structural study by NMR spectroscopy. Chemical shifts are known to be impacted by a wide range of structural factors and the artificial neural network based TALOS-N program has been trained to extract backbone and sidechain torsion angles from 1H, 15N and 13C shifts. The program is quite robust, and typically yields backbone torsion angles for more than 90% of the residues, and sidechain χ1 rotamer information for about half of these, in addition to reliably predicting secondary structure. The use of TALOS-N is illustrated for the protein DinI, and torsion angles obtained by TALOS-N analysis from the measured chemical shifts of its backbone and 13Cβ nuclei are compared to those seen in a prior, experimentally determined structure. The program is also particularly useful for generating torsion angle restraints, which then can be used during standard NMR protein structure calculations. PMID:25502373
Kumar, Avishek; Butler, Brandon M; Kumar, Sudhir; Ozkan, S Banu
2015-12-01
Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict if they are benign or not. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nsSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light onto disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine. Copyright © 2015 Elsevier Ltd. All rights reserved.
Improving consensus structure by eliminating averaging artifacts
KC, Dukka B
2009-01-01
Background Common structural biology methods (i.e., NMR and molecular dynamics) often produce ensembles of molecular structures. Consequently, averaging of 3D coordinates of molecular structures (proteins and RNA) is a frequent approach to obtain a consensus structure that is representative of the ensemble. However, when the structures are averaged, artifacts can result in unrealistic local geometries, including unphysical bond lengths and angles. Results Herein, we describe a method to derive representative structures while limiting the number of artifacts. Our approach is based on a Monte Carlo simulation technique that drives a starting structure (an extended or a 'close-by' structure) towards the 'averaged structure' using a harmonic pseudo energy function. To assess the performance of the algorithm, we applied our approach to Cα models of 1364 proteins generated by the TASSER structure prediction algorithm. The average RMSD of the refined model from the native structure for the set becomes worse by a mere 0.08 Å compared to the average RMSD of the averaged structures from the native structure (3.28 Å for refined structures and 3.36 Å for the averaged structures). However, the percentage of atoms involved in clashes is greatly reduced (from 63% to 1%); in fact, the majority of the refined proteins had zero clashes. Moreover, a small number (38) of refined structures resulted in lower RMSD to the native protein versus the averaged structure. Finally, compared to PULCHRA [1], our approach produces representative structure of similar RMSD quality, but with much fewer clashes. Conclusion The benchmarking results demonstrate that our approach for removing averaging artifacts can be very beneficial for the structural biology community. Furthermore, the same approach can be applied to almost any problem where averaging of 3D coordinates is performed. Namely, structure averaging is also commonly performed in RNA secondary prediction [2], which could also benefit from our approach. PMID:19267905
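The averaging artifact described in this abstract can be illustrated with a toy case. The sketch below is not the authors' implementation: the three-atom chains, the 3.8 Å virtual bond length, and the force constant `k` are illustrative assumptions. It averages two conformations atom by atom, shows that a virtual bond shrinks in the averaged structure, and defines a harmonic pseudo-energy of the kind that could pull a physically sensible starting structure towards the average.

```python
import math

def average_structures(ensemble):
    """Average coordinates atom by atom across all conformations."""
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(model[i][axis] for model in ensemble) / n_models for axis in range(3))
        for i in range(n_atoms)
    ]

def bond_lengths(coords):
    """Distances between consecutive atoms in the chain."""
    return [math.dist(a, b) for a, b in zip(coords, coords[1:])]

def harmonic_pseudo_energy(coords, target, k=1.0):
    """Sum of k * |x_i - target_i|^2; k is an assumed, arbitrary force constant."""
    return sum(k * math.dist(a, b) ** 2 for a, b in zip(coords, target))

# Two hypothetical 3-atom chains with 3.8 A virtual bonds.
# In model_b the last bond is rotated by 90 degrees relative to model_a.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]

averaged = average_structures([model_a, model_b])
lengths = bond_lengths(averaged)          # second bond is shorter than 3.8 A
energy_a = harmonic_pseudo_energy(model_a, averaged)
```

A Monte Carlo refinement in this spirit would propose geometry-preserving moves on a starting chain and accept or reject them based on the change in `harmonic_pseudo_energy`; the move set and acceptance schedule are not given in the abstract and are left out here.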
Chandrasekaran, Srinivas Niranj; Das, Jhuma; Dokholyan, Nikolay V.; Carter, Charles W.
2016-01-01
PATH rapidly computes a path and a transition state between crystal structures by minimizing the Onsager-Machlup action. It requires input parameters whose range of values can generate different transition-state structures that cannot be uniquely compared with those generated by other methods. We outline modifications to estimate these input parameters to circumvent these difficulties and validate the PATH transition states by showing consistency between transition-states derived by different algorithms for unrelated protein systems. Although functional protein conformational change trajectories are to a degree stochastic, they nonetheless pass through a well-defined transition state whose detailed structural properties can rapidly be identified using PATH. PMID:26958584
Minimalist design of water-soluble cross-β architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Biancalana, Matthew; Makabe, Koki; Koide, Shohei
Demonstrated successes of protein design and engineering suggest significant potential to produce diverse protein architectures and assemblies beyond those found in nature. Here, we describe a new class of synthetic protein architecture through the successful design and atomic structures of water-soluble cross-β proteins. The cross-β motif is formed from the lamination of successive β-sheet layers, and it is abundantly observed in the core of insoluble amyloid fibrils associated with protein-misfolding diseases. Despite its prominence, cross-β has been designed only in the context of insoluble aggregates of peptides or proteins. Cross-β's recalcitrance to protein engineering and conspicuous absence among the known atomic structures of natural proteins thus makes it a challenging target for design in a water-soluble form. Through comparative analysis of the cross-β structures of fibril-forming peptides, we identified rows of hydrophobic residues ('ladders') running across β-strands of each β-sheet layer as a minimal component of the cross-β motif. Grafting a single ladder of hydrophobic residues designed from the Alzheimer's amyloid-β peptide onto a large β-sheet protein formed a dimeric protein with a cross-β architecture that remained water-soluble, as revealed by solution analysis and x-ray crystal structures. These results demonstrate that the cross-β motif is a stable architecture in water-soluble polypeptides and can be readily designed. Our results provide a new route for accessing the cross-β structure and expanding the scope of protein design.
Minimalist design of water-soluble cross-beta architecture.
Biancalana, Matthew; Makabe, Koki; Koide, Shohei
2010-02-23
Demonstrated successes of protein design and engineering suggest significant potential to produce diverse protein architectures and assemblies beyond those found in nature. Here, we describe a new class of synthetic protein architecture through the successful design and atomic structures of water-soluble cross-beta proteins. The cross-beta motif is formed from the lamination of successive beta-sheet layers, and it is abundantly observed in the core of insoluble amyloid fibrils associated with protein-misfolding diseases. Despite its prominence, cross-beta has been designed only in the context of insoluble aggregates of peptides or proteins. Cross-beta's recalcitrance to protein engineering and conspicuous absence among the known atomic structures of natural proteins thus makes it a challenging target for design in a water-soluble form. Through comparative analysis of the cross-beta structures of fibril-forming peptides, we identified rows of hydrophobic residues ("ladders") running across beta-strands of each beta-sheet layer as a minimal component of the cross-beta motif. Grafting a single ladder of hydrophobic residues designed from the Alzheimer's amyloid-beta peptide onto a large beta-sheet protein formed a dimeric protein with a cross-beta architecture that remained water-soluble, as revealed by solution analysis and x-ray crystal structures. These results demonstrate that the cross-beta motif is a stable architecture in water-soluble polypeptides and can be readily designed. Our results provide a new route for accessing the cross-beta structure and expanding the scope of protein design.
Improving Protein Fold Recognition by Deep Learning Networks.
Jo, Taeho; Hou, Jie; Eickholt, Jesse; Cheng, Jianlin
2015-12-04
For accurate recognition of protein folds, a deep learning network method (DN-Fold) was developed to predict if a given query-template protein pair belongs to the same structural fold. The input used stemmed from the protein sequence and structural features extracted from the protein pair. We evaluated the performance of DN-Fold along with 18 different methods on Lindahl's benchmark dataset and on a large benchmark set extracted from SCOP 1.75 consisting of about one million protein pairs, at three different levels of fold recognition (i.e., protein family, superfamily, and fold) depending on the evolutionary distance between protein sequences. The correct recognition rate of ensembled DN-Fold for Top 1 predictions is 84.5%, 61.5%, and 33.6% and for Top 5 is 91.2%, 76.5%, and 60.7% at family, superfamily, and fold levels, respectively. We also evaluated the performance of single DN-Fold (DN-FoldS), which showed results comparable to those of ensemble DN-Fold at the family and superfamily levels. Finally, we extended the binary classification problem of fold recognition to a real-value regression task, which also shows promising performance. DN-Fold is freely available through a web server at http://iris.rnet.missouri.edu/dnfold.
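The Top 1 and Top 5 recognition rates quoted in this abstract follow a common evaluation pattern: a query counts as correctly recognized if at least one truly related template appears among its k highest-ranked templates. The sketch below shows that calculation under this assumed definition; the function name, the query and template identifiers, and the toy rankings are hypothetical and are not taken from DN-Fold or the benchmark.

```python
def top_k_rate(ranked_templates, true_templates, k):
    """Fraction of queries with at least one correct template in the top k."""
    hits = 0
    for query, ranking in ranked_templates.items():
        # A hit requires any of the first k ranked templates to be truly related.
        if any(template in true_templates[query] for template in ranking[:k]):
            hits += 1
    return hits / len(ranked_templates)

# Hypothetical rankings (best first) and ground-truth related templates.
ranked = {
    "q1": ["t1", "t2", "t3"],
    "q2": ["t9", "t4", "t5"],
    "q3": ["t7", "t8"],
    "q4": ["t6"],
}
truth = {
    "q1": {"t1"},
    "q2": {"t5"},
    "q3": {"t0"},
    "q4": {"t6"},
}

top1 = top_k_rate(ranked, truth, k=1)
top5 = top_k_rate(ranked, truth, k=5)
```

Evaluating at the family, superfamily, and fold levels would amount to calling the same function with a different ground-truth mapping for each level.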
Cross-Link Guided Molecular Modeling with ROSETTA
Leitner, Alexander; Rosenberger, George; Aebersold, Ruedi; Malmström, Lars
2013-01-01
Chemical cross-links identified by mass spectrometry generate distance restraints that reveal low-resolution structural information on proteins and protein complexes. The technology to reliably generate such data has become mature and robust enough to shift the focus to the question of how these distance restraints can be best integrated into molecular modeling calculations. Here, we introduce three workflows for incorporating distance restraints generated by chemical cross-linking and mass spectrometry into ROSETTA protocols for comparative and de novo modeling and protein-protein docking. We demonstrate that the cross-link validation and visualization software Xwalk facilitates successful cross-link data integration. Besides the protocols we introduce XLdb, a database of chemical cross-links from 14 different publications with 506 intra-protein and 62 inter-protein cross-links, where each cross-link can be mapped on an experimental structure from the Protein Data Bank. Finally, we demonstrate on a protein-protein docking reference data set the impact of virtual cross-links on protein docking calculations and show that an inter-protein cross-link can reduce on average the RMSD of a docking prediction by 5.0 Å. The methods and results presented here provide guidelines for the effective integration of chemical cross-link data in molecular modeling calculations and should advance the structural analysis of particularly large and transient protein complexes via hybrid structural biology methods. PMID:24069194
Sequence diagrams and the presentation of structural and evolutionary relationships among proteins.
Thomas, B R
1975-01-01
Protein sequences mapped on two-dimensional diagrams show characteristic patterns that should be of value in visualising sequence information and in distinguishing simpler structures. A convenient map form for comparative purposes is the alpha-helix diagram with amino acid distribution analogous to the surface of an alpha-helix oriented so that an alpha-helix structure corresponds on the diagram to a vertical band 3.6 residues wide. The sequence diagram for an alpha-keratin, high-sulphur protein suggests a new form of polypeptide helix based on a repeating unit of five which may be an important component of alpha-keratin fibres.
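One way to picture a diagram of this kind is as a helical net, in which the surface of an ideal alpha-helix is unrolled onto a plane. The sketch below is an assumed layout and not the author's construction: it uses the standard alpha-helix geometry of 3.6 residues per turn (100 degrees per residue) and a rise of 1.5 Å per residue, and the function name is hypothetical.

```python
def helical_net_position(index):
    """Place residue `index` (0-based) on an unrolled ideal alpha-helix.

    Returns (x, y): x is the position around the helix in residue units
    (0 <= x < 3.6), y is the height along the helix axis in angstroms.
    """
    angle = (index * 100) % 360      # 100 degrees per residue, exact in integers
    x = angle / 360 * 3.6            # a full turn spans 3.6 residue widths
    y = index * 1.5                  # 1.5 A rise per residue
    return x, y

# Positions for the first 19 residues of a hypothetical sequence.
positions = [helical_net_position(i) for i in range(19)]
```

In this layout residues 0 and 18 share the same horizontal position, because 18 residues complete exactly five turns, so residues that line up on one face of a helix appear as a vertical stripe.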
Crespo, Maria D.; Rubini, Marina
2011-01-01
Background Many strategies have been employed to increase the conformational stability of proteins. The use of 4-substituted proline analogs capable of inducing pre-organization in target proteins is an attractive tool to deliver additional conformational stability without perturbing the overall protein structure. Both peptides and proteins containing 4-fluorinated proline derivatives can be stabilized by forcing the pyrrolidine ring into its favored puckering conformation. The fluorinated pyrrolidine rings of proline can preferentially stabilize either a Cγ-exo or a Cγ-endo ring pucker depending on proline chirality (4R/4S) in a complex protein structure. To examine whether this rational strategy can be generally used for protein stabilization, we have chosen human ubiquitin as a model protein which contains three proline residues displaying Cγ-exo puckering. Methodology/Principal Findings While (2S,4R)-4-fluoroproline ((4R)-FPro) containing ubiquitin can be expressed in a related auxotrophic Escherichia coli strain, all attempts to incorporate (2S,4S)-4-fluoroproline ((4S)-FPro) failed. Our results indicate that (4R)-FPro favors the Cγ-exo conformation present in the wild type structure and stabilizes the protein structure due to a pre-organization effect. This was confirmed by thermal and guanidinium chloride-induced denaturation profile analyses, where we observed an increase in stability of −4.71 kJ·mol−1 in the case of (4R)-FPro containing ubiquitin ((4R)-FPro-ub) compared to wild type ubiquitin (wt-ub). As expected, activity assays revealed that (4R)-FPro-ub retained the full biological activity compared to wt-ub. Conclusions/Significance The results fully confirm the general applicability of incorporating fluoroproline derivatives for improving protein stability. In general, a rational design strategy that enforces the naturally occurring proline puckering conformation can be used to stabilize the desired target protein. PMID:21625626
P-proteins in Arabidopsis are heteromeric structures involved in rapid sieve tube sealing
Jekat, Stephan B.; Ernst, Antonia M.; von Bohl, Andreas; Zielonka, Sascia; Twyman, Richard M.; Noll, Gundula A.; Prüfer, Dirk
2013-01-01
Structural phloem proteins (P-proteins) are characteristic components of the sieve elements in all dicotyledonous and many monocotyledonous angiosperms. Tobacco P-proteins were recently confirmed to be encoded by the widespread sieve element occlusion (SEO) gene family, and tobacco SEO proteins were shown to be directly involved in sieve tube sealing thus preventing the loss of photosynthate. Analysis of the two Arabidopsis SEO proteins (AtSEOa and AtSEOb) indicated that the corresponding P-protein subunits do not act in a redundant manner. However, there are still pending questions regarding the interaction properties and specific functions of AtSEOa and AtSEOb as well as the general function of structural P-proteins in Arabidopsis. In this study, we characterized the Arabidopsis P-proteins in more detail. We used in planta bimolecular fluorescence complementation assays to confirm the predicted heteromeric interactions between AtSEOa and AtSEOb. Arabidopsis mutants depleted for one or both AtSEO proteins lacked the typical P-protein structures normally found in sieve elements, underlining the identity of AtSEO proteins as P-proteins and furthermore providing the means to determine the role of Arabidopsis P-proteins in sieve tube sealing. We therefore developed an assay based on phloem exudation. Mutants with reduced AtSEO expression levels lost twice as much photosynthate following injury as comparable wild-type plants, confirming that Arabidopsis P-proteins are indeed involved in sieve tube sealing. PMID:23840197
Significance of structural changes in proteins: expected errors in refined protein structures.
Stroud, R. M.; Fauman, E. B.
1995-01-01
A quantitative expression key to evaluating significant structural differences or induced shifts between any two protein structures is derived. Because crystallography leads to reports of a single (or sometimes dual) position for each atom, the significance of any structural change based on comparison of two structures depends critically on knowing the expected precision of each median atomic position reported, and on extracting it for each atom, from the information provided in the Protein Data Bank and in the publication. The differences between structures of protein molecules that should be identical, and that are normally distributed, indicating that they are not affected by crystal contacts, were analyzed with respect to many potential indicators of structure precision, so as to extract, essentially by "machine learning" principles, a generally applicable expression involving the highest correlates. Eighteen refined crystal structures from the Protein Data Bank, in which there are multiple molecules in the crystallographic asymmetric unit, were selected and compared. The thermal B factor, the connectivity of the atom, and the ratio of the number of reflections to the number of atoms used in refinement correlate best with the magnitude of the positional differences between regions of the structures that otherwise would be expected to be the same. These results are embodied in a six-parameter equation that can be applied to any crystallographically refined structure to estimate the expected uncertainty in position of each atom. Structure change in a macromolecule can thus be referenced to the expected uncertainty in atomic position as reflected in the variance between otherwise identical structures with the observed values of correlated parameters. PMID:8563637
Park, Hahnbeom; Lee, Gyu Rie; Heo, Lim; Seok, Chaok
2014-01-01
Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.
Zheng, Wenjun
2017-01-10
Dynactin, a large multiprotein complex, binds with the cytoplasmic dynein-1 motor and various adaptor proteins to allow recruitment and transportation of cellular cargoes toward the minus end of microtubules. The structure of the dynactin complex is built around an actin-like minifilament with a defined length, which has been visualized in a high-resolution structure of the dynactin filament determined by cryo-electron microscopy (cryo-EM). To understand the energetic basis of dynactin filament assembly, we used molecular dynamics simulation to probe the intersubunit interactions among the actin-like proteins, various capping proteins, and four extended regions of the dynactin shoulder. Our simulations revealed stronger intersubunit interactions at the barbed and pointed ends of the filament and involving the extended regions (compared with the interactions within the filament), which may energetically drive filament termination by the capping proteins and recruitment of the actin-like proteins by the extended regions, two key features of the dynactin filament assembly process. Next, we modeled the unknown binding configuration among dynactin, dynein tails, and a number of coiled-coil adaptor proteins (including several Bicaudal-D and related proteins and three HOOK proteins), and predicted a key set of charged residues involved in their electrostatic interactions. Our modeling is consistent with previous findings of conserved regions, functional sites, and disease mutations in the adaptor proteins and will provide a structural framework for future functional and mutational studies of these adaptor proteins. In sum, this study yielded rich structural and energetic information about dynactin and associated adaptor proteins that cannot be directly obtained from the cryo-EM structures with limited resolutions.
Etzkorn, Manuel; Raschle, Thomas; Hagn, Franz; Gelev, Vladimir; Rice, Amanda J; Walz, Thomas; Wagner, Gerhard
2013-03-05
Selecting a suitable membrane-mimicking environment is of fundamental importance for the investigation of membrane proteins. Nonconventional surfactants, such as amphipathic polymers (amphipols) and lipid bilayer nanodiscs, have been introduced as promising environments that may overcome intrinsic disadvantages of detergent micelle systems. However, structural insights into the effects of different environments on the embedded protein are limited. Here, we present a comparative study of the heptahelical membrane protein bacteriorhodopsin in detergent micelles, amphipols, and nanodiscs. Our results confirm that nonconventional environments can increase stability of functional bacteriorhodopsin, and demonstrate that well-folded heptahelical membrane proteins are, in principle, accessible by solution-NMR methods in amphipols and phospholipid nanodiscs. Our data distinguish regions of bacteriorhodopsin that mediate membrane/solvent contacts in the tested environments, whereas the protein's functional inner core remains almost unperturbed. The presented data allow comparing the investigated membrane mimetics in terms of NMR spectral quality and thermal stability required for structural studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
Similarity Measures for Protein Ensembles
Lindorff-Larsen, Kresten; Ferkinghoff-Borg, Jesper
2009-01-01
Analyses of similarities and changes in protein conformation can provide important information regarding protein function and evolution. Many scores, including the commonly used root mean square deviation, have therefore been developed to quantify the similarities of different protein conformations. However, instead of examining individual conformations it is in many cases more relevant to analyse ensembles of conformations that have been obtained either through experiments or from methods such as molecular dynamics simulations. We here present three approaches that can be used to compare conformational ensembles in the same way as the root mean square deviation is used to compare individual pairs of structures. The methods are based on the estimation of the probability distributions underlying the ensembles and subsequent comparison of these distributions. We first validate the methods using a synthetic example from molecular dynamics simulations. We then apply the algorithms to revisit the problem of ensemble averaging during structure determination of proteins, and find that an ensemble refinement method is able to recover the correct distribution of conformations better than standard single-molecule refinement. PMID:19145244
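The general idea in this abstract, estimating a probability distribution for each ensemble and then comparing the distributions, can be sketched in one dimension. The example below is a simplified illustration and not the authors' method: it assumes each ensemble is summarized by a single scalar feature per conformation (a hypothetical interatomic distance), estimates the distributions with fixed-bin histograms, and compares them with the Jensen-Shannon divergence, one common symmetric choice.

```python
import math

def histogram(values, edges):
    """Normalized histogram; bin b covers edges[b] <= v < edges[b + 1]."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for b in range(len(counts)):
            if edges[b] <= v < edges[b + 1]:
                counts[b] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical, 1 for disjoint."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence, skipping empty bins of x.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)

# Hypothetical distances (in angstroms) sampled from two ensembles.
ensemble_a = [4.0, 4.2, 4.1, 6.0]
ensemble_b = [6.1, 6.3, 5.9, 4.1]
edges = [3.0, 5.0, 7.0]

p = histogram(ensemble_a, edges)
q = histogram(ensemble_b, edges)
divergence = js_divergence(p, q)
```

Real ensembles live in a high-dimensional coordinate space, so a practical comparison would need a multivariate density estimate rather than a single-feature histogram; the scalar feature here only keeps the example small.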
Small-molecule ligand docking into comparative models with Rosetta
Combs, Steven A; DeLuca, Samuel L; DeLuca, Stephanie H; Lemmon, Gordon H; Nannemann, David P; Nguyen, Elizabeth D; Willis, Jordan R; Sheehan, Jonathan H; Meiler, Jens
2017-01-01
Structure-based drug design is frequently used to accelerate the development of small-molecule therapeutics. Although substantial progress has been made in X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, the availability of high-resolution structures is limited owing to the frequent inability to crystallize or obtain sufficient NMR restraints for large or flexible proteins. Computational methods can be used to both predict unknown protein structures and model ligand interactions when experimental data are unavailable. This paper describes a comprehensive and detailed protocol using the Rosetta modeling suite to dock small-molecule ligands into comparative models. In the protocol presented here, we review the comparative modeling process, including sequence alignment, threading and loop building. Next, we cover docking a small-molecule ligand into the protein comparative model. In addition, we discuss criteria that can improve ligand docking into comparative models. Finally, and importantly, we present a strategy for assessing model quality. The entire protocol is presented on a single example selected solely for didactic purposes. The results are therefore not representative and do not replace benchmarks published elsewhere. We also provide an additional tutorial so that the user can gain hands-on experience in using Rosetta. The protocol should take 5–7 h, with additional time allocated for computer generation of models. PMID:23744289
Ferrada, Evandro; Vergara, Ismael A; Melo, Francisco
2007-01-01
The correct discrimination between native and near-native protein conformations is essential for achieving accurate computer-based protein structure prediction. However, this has proven to be a difficult task, since currently available physical energy functions, empirical potentials and statistical scoring functions are still limited in achieving this goal consistently. In this work, we assess and compare the ability of different full atom knowledge-based potentials to discriminate between native protein structures and near-native protein conformations generated by comparative modeling. Using a benchmark of 152 near-native protein models and their corresponding native structures that encompass several different folds, we demonstrate that the incorporation of close non-bonded pairwise atom terms improves the discriminating power of the empirical potentials. Since the direct and unbiased derivation of close non-bonded terms from current experimental data is not possible, we obtained and used those terms from the corresponding pseudo-energy functions of a non-local knowledge-based potential. It is shown that this methodology significantly improves the discrimination between native and near-native protein conformations, suggesting that a proper description of close non-bonded terms is important to achieve a more complete and accurate description of native protein conformations. Some external knowledge-based energy functions that are widely used in model assessment performed poorly, indicating that the benchmark of models and the specific discrimination task tested in this work constitutes a difficult challenge.
Influence of metformin on mitochondrial subproteome in the brain of apoE knockout mice.
Suski, Maciej; Olszanecki, Rafał; Chmura, Łukasz; Stachowicz, Aneta; Madej, Józef; Okoń, Krzysztof; Adamek, Dariusz; Korbut, Ryszard
2016-02-05
Neurodegenerative diseases are a set of progressive, age-related brain disorders, characterized by an excessive accumulation of mutant proteins in certain regions of the brain. Such changes, collectively identified as causal factors of neurodegeneration, all impact mitochondria, imminently leading to their dysfunction. These observations predestine mitochondria as an attractive drug target for counteracting degenerative brain damage. The aim of this study was to use a differential proteomic approach to comprehensively assess the changes in mitochondrial protein expression in the brain of apoE-knockout mice (apoE(-/-)) and to investigate the influence of prolonged treatment with metformin, an indirect activator of AMP-activated protein kinase (AMPK), on the brain mitoproteome in apoE(-/-) mice. The quantitative assessment of the brain mitoproteome in apoE(-/-) mice revealed changes in the expression of 10 proteins as compared to healthy C57BL/6J mice and of 25 proteins in metformin-treated apoE(-/-) mice. Identified proteins mainly included apoptosis regulators, metabolic enzymes and structural proteins. In summary, our study provided proteomic characteristics suggesting the decrease of antioxidant defense and structural disturbances in the brain mitochondria of apoE(-/-) mice as compared to healthy controls. In this setting, the use of metformin changed the expression of several proteins primarily involved in metabolic processes, the regulation of apoptosis and the structural maintenance of mitochondria, which could potentially restore their native functionalities. Copyright © 2015 Elsevier B.V. All rights reserved.
Homochiral stereochemistry: the missing link of structure to energetics in protein folding.
Kumar, Anil; Ramakrishnan, Vibin; Ranbhor, Ranjit; Patel, Kirti; Durani, Susheel
2009-12-24
The notion is tested that homochiral stereochemistry being ubiquitous to protein structure could be critical to protein folding as well, causing it to become frustrated energetically providing the basis for its solvent- and sequence-mediated control. The proof in support of the notion is found in a consensus of experiment and computation according to which suitable oligopeptides are in their folding-unfolding equilibria, at both macrostate and microstate levels, susceptible to dielectric because of the conflict of peptide-chain electrostatics with interpeptide hydrogen bonds when the structure is poly-L but not when it is alternating-L,D. The argument is thus made that homochiral stereochemistry may in protein folding provide the unifying basis for its solvent- and sequence-mediated control based on screening of peptide-chain electrostatics under conflict with folding of the chain due to homochiral stereochemistry. Dielectric is brought into spotlight as the effect comparatively obscure but presumably critical to the folding in protein structure for its control.
Tao, Fei; Jiang, He; Chen, Wenwei; Zhang, Yongyong; Pan, Jiarong; Jiang, Jiaxin; Jia, Zhenbao
2018-05-07
Soy protein isolate (SPI) has promising applications in various food products because of its excellent functional properties and nutritional quality. The structural and emulsifying properties of covalently modified SPI by (-)-epigallocatechin-3-gallate (EGCG) were investigated. SPI was covalently modified by EGCG under alkaline conditions. SDS-PAGE analysis revealed that EGCG modification caused cross-linking of SPI proteins. Circular dichroism spectra demonstrated that the secondary structure of SPI proteins was changed by EGCG modification. In addition, the modifications resulted in the perturbation of the tertiary structure of SPI as evidenced by intrinsic fluorescence spectra and surface hydrophobicity measurements. Oil-in-water emulsions of modified SPI had smaller droplet sizes and better creaming stability compared to those from unmodified SPI. The covalent modification by EGCG improved the emulsifying property of SPI. This study provided an innovative approach for improving the emulsifying properties of proteins. This article is protected by copyright. All rights reserved.
Fitting Multimeric Protein Complexes into Electron Microscopy Maps Using 3D Zernike Descriptors
Esquivel-Rodríguez, Juan; Kihara, Daisuke
2012-01-01
A novel computational method for fitting high-resolution structures of multiple proteins into a cryoelectron microscopy map is presented. The method named EMLZerD generates a pool of candidate multiple protein docking conformations of component proteins, which are later compared with a provided electron microscopy (EM) density map to select the ones that fit well into the EM map. The comparison of docking conformations and the EM map is performed using the 3D Zernike descriptor (3DZD), a mathematical series expansion of three-dimensional functions. The 3DZD provides a unified representation of the surface shape of multimeric protein complex models and EM maps, which allows a convenient, fast quantitative comparison of the three dimensional structural data. Out of 19 multimeric complexes tested, near native complex structures with a root mean square deviation of less than 2.5 Å were obtained for 14 cases while medium range resolution structures with correct topology were computed for the additional 5 cases. PMID:22417139
Fitting multimeric protein complexes into electron microscopy maps using 3D Zernike descriptors.
Esquivel-Rodríguez, Juan; Kihara, Daisuke
2012-06-14
A novel computational method for fitting high-resolution structures of multiple proteins into a cryoelectron microscopy map is presented. The method named EMLZerD generates a pool of candidate multiple protein docking conformations of component proteins, which are later compared with a provided electron microscopy (EM) density map to select the ones that fit well into the EM map. The comparison of docking conformations and the EM map is performed using the 3D Zernike descriptor (3DZD), a mathematical series expansion of three-dimensional functions. The 3DZD provides a unified representation of the surface shape of multimeric protein complex models and EM maps, which allows a convenient, fast quantitative comparison of the three-dimensional structural data. Out of 19 multimeric complexes tested, near native complex structures with a root-mean-square deviation of less than 2.5 Å were obtained for 14 cases while medium range resolution structures with correct topology were computed for the additional 5 cases.
Comparative structural analysis of human DEAD-box RNA helicases.
Schütz, Patrick; Karlberg, Tobias; van den Berg, Susanne; Collins, Ruairi; Lehtiö, Lari; Högbom, Martin; Holmberg-Schiavone, Lovisa; Tempel, Wolfram; Park, Hee-Won; Hammarström, Martin; Moche, Martin; Thorsell, Ann-Gerd; Schüler, Herwig
2010-09-30
DEAD-box RNA helicases play various, often critical, roles in all processes where RNAs are involved. Members of this family of proteins are linked to human disease, including cancer and viral infections. DEAD-box proteins contain two conserved domains that both contribute to RNA and ATP binding. Despite recent advances, the molecular details of how these enzymes convert chemical energy into RNA remodeling are unknown. We present crystal structures of the isolated DEAD-domains of human DDX2A/eIF4A1, DDX2B/eIF4A2, DDX5, DDX10/DBP4, DDX18/myc-regulated DEAD-box protein, DDX20, DDX47, DDX52/ROK1, and DDX53/CAGE, and of the helicase domains of DDX25 and DDX41. Together with prior knowledge this enables a family-wide comparative structural analysis. We propose a general mechanism for opening of the RNA binding site. This analysis also provides insights into the diversity of DExD/H- proteins, with implications for understanding the functions of individual family members.
Comparative Structural Analysis of Human DEAD-Box RNA Helicases
Schütz, Patrick; Karlberg, Tobias; van den Berg, Susanne; Collins, Ruairi; Lehtiö, Lari; Högbom, Martin; Holmberg-Schiavone, Lovisa; Tempel, Wolfram; Park, Hee-Won; Hammarström, Martin; Moche, Martin; Thorsell, Ann-Gerd; Schüler, Herwig
2010-01-01
DEAD-box RNA helicases play various, often critical, roles in all processes where RNAs are involved. Members of this family of proteins are linked to human disease, including cancer and viral infections. DEAD-box proteins contain two conserved domains that both contribute to RNA and ATP binding. Despite recent advances, the molecular details of how these enzymes convert chemical energy into RNA remodeling are unknown. We present crystal structures of the isolated DEAD-domains of human DDX2A/eIF4A1, DDX2B/eIF4A2, DDX5, DDX10/DBP4, DDX18/myc-regulated DEAD-box protein, DDX20, DDX47, DDX52/ROK1, and DDX53/CAGE, and of the helicase domains of DDX25 and DDX41. Together with prior knowledge this enables a family-wide comparative structural analysis. We propose a general mechanism for opening of the RNA binding site. This analysis also provides insights into the diversity of DExD/H- proteins, with implications for understanding the functions of individual family members. PMID:20941364
Erban, Tomas; Harant, Karel; Hubalek, Martin; Vitamvas, Pavel; Kamler, Martin; Poltronieri, Palmiro; Tyl, Jan; Markovic, Martin; Titera, Dalibor
2015-09-11
We investigated pathogens in the parasitic honeybee mite Varroa destructor using nanoLC-MS/MS (TripleTOF) and 2D-E-MS/MS proteomics approaches supplemented with affinity-chromatography to concentrate trace target proteins. Peptides were detected from the currently uncharacterized Varroa destructor Macula-like virus (VdMLV), the deformed wing virus (DWV)-complex and the acute bee paralysis virus (ABPV). Peptide alignments revealed detection of complete structural DWV-complex block VP2-VP1-VP3, VDV-1 helicase and single-amino-acid substitution A/K/Q in VP1, the ABPV structural block VP1-VP4-VP2-VP3 including uncleaved VP4/VP2, and VdMLV coat protein. Isoforms of viral structural proteins of highest abundance were localized via 2D-E. The presence of all types of capsid/coat proteins of a particular virus suggested the presence of virions in Varroa. Also, matches between the MWs of viral structural proteins on 2D-E and their theoretical MWs indicated that viruses were not digested. The absence/scarce detection of non-structural proteins compared with high-abundance structural proteins suggests that the viruses did not replicate in the mite; hence, virions accumulate in the Varroa gut via hemolymph feeding. Hemolymph feeding also resulted in the detection of a variety of honeybee proteins. The advantages of MS-based proteomics for pathogen detection, false-positive pathogen detection, virus replication, posttranslational modifications, and the presence of honeybee proteins in Varroa are discussed.
Erban, Tomas; Harant, Karel; Hubalek, Martin; Vitamvas, Pavel; Kamler, Martin; Poltronieri, Palmiro; Tyl, Jan; Markovic, Martin; Titera, Dalibor
2015-01-01
We investigated pathogens in the parasitic honeybee mite Varroa destructor using nanoLC-MS/MS (TripleTOF) and 2D-E-MS/MS proteomics approaches supplemented with affinity-chromatography to concentrate trace target proteins. Peptides were detected from the currently uncharacterized Varroa destructor Macula-like virus (VdMLV), the deformed wing virus (DWV)-complex and the acute bee paralysis virus (ABPV). Peptide alignments revealed detection of complete structural DWV-complex block VP2-VP1-VP3, VDV-1 helicase and single-amino-acid substitution A/K/Q in VP1, the ABPV structural block VP1-VP4-VP2-VP3 including uncleaved VP4/VP2, and VdMLV coat protein. Isoforms of viral structural proteins of highest abundance were localized via 2D-E. The presence of all types of capsid/coat proteins of a particular virus suggested the presence of virions in Varroa. Also, matches between the MWs of viral structural proteins on 2D-E and their theoretical MWs indicated that viruses were not digested. The absence/scarce detection of non-structural proteins compared with high-abundance structural proteins suggests that the viruses did not replicate in the mite; hence, virions accumulate in the Varroa gut via hemolymph feeding. Hemolymph feeding also resulted in the detection of a variety of honeybee proteins. The advantages of MS-based proteomics for pathogen detection, false-positive pathogen detection, virus replication, posttranslational modifications, and the presence of honeybee proteins in Varroa are discussed. PMID:26358842
Oezguen, Numan; Zhou, Bin; Negi, Surendra S.; Ivanciuc, Ovidiu; Schein, Catherine H.; Labesse, Gilles; Braun, Werner
2008-01-01
Similarities in sequences and 3D structures of allergenic proteins provide vital clues to identify clinically relevant IgE cross-reactivities. However, experimental 3D structures are available in the Protein Data Bank for only 5% (45/829) of all allergens catalogued in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP). Here, an automated procedure was used to prepare 3D-models of all allergens where there was no experimentally determined 3D structure or high identity (95%) to another protein of known 3D structure. After a final selection by quality criteria, 433 reliable 3D models were retained and are available from our SDAP Website. The new 3D models extensively enhance our knowledge of allergen structures. As an example of their use, experimentally derived “continuous IgE epitopes” were mapped on 3 experimentally determined structures and 13 of our 3D-models of allergenic proteins. Large portions of these continuous sequences are not entirely on the surface and therefore cannot interact with IgE or other proteins. Only the surface exposed residues are constituents of “conformational IgE epitopes” which are not in all cases continuous in sequence. The surface exposed parts of the experimentally determined continuous IgE epitopes showed a distinct statistical distribution as compared to their presence in typical protein-protein interfaces. The amino acids Ala, Ser, Asn, Gly and particularly Lys have a high propensity to occur in IgE binding sites. The 3D-models will facilitate further analysis of the common properties of IgE binding sites of allergenic proteins. PMID:18621419
Anjos, Liliana; Morgado, Isabel; Guerreiro, Marta; Cardoso, João C R; Melo, Eduardo P; Power, Deborah M
2017-02-01
Cartilage acidic protein1 (CRTAC1) is an extracellular matrix protein of chondrogenic tissue in humans and its presence in bacteria indicates it is of ancient origin. Structural modeling of piscine CRTAC1 reveals it belongs to the large family of beta-propeller proteins that in mammals have been associated with diseases, including amyloid diseases such as Alzheimer's. In order to characterize the structure/function evolution of this new member of the beta-propeller family we exploited the unique characteristics of piscine duplicate genes Crtac1a and Crtac1b and compared their structural and biochemical modifications with human recombinant CRTAC1. We demonstrate that CRTAC1 has a beta-propeller structure that has been conserved during evolution and easily forms high molecular weight thermo-stable aggregates. We reveal for the first time the propensity of CRTAC1 to form amyloid-like structures, and hypothesize that the aggregating property of CRTAC1 may be related to its disease-association. We further contribute to the general understanding of the evolution and function of CRTAC1 and the beta-propeller family. Proteins 2017; 85:242-255. © 2016 Wiley Periodicals, Inc.
Rapid comparison of properties on protein surface
Sael, Lee; La, David; Li, Bin; Rustamov, Raif; Kihara, Daisuke
2008-01-01
The mapping of physicochemical characteristics onto the surface of a protein provides crucial insights into its function and evolution. This information can be further used in the characterization and identification of similarities within protein surface regions. We propose a novel method which quantitatively compares global and local properties on the protein surface. We have tested the method on comparison of electrostatic potentials and hydrophobicity. The method is based on 3D Zernike descriptors, which provides a compact representation of a given property defined on a protein surface. Compactness and rotational invariance of this descriptor enable fast comparison suitable for database searches. The usefulness of this method is exemplified by studying several protein families including globins, thermophilic and mesophilic proteins, and active sites of TIM β/α barrel proteins. In all the cases studied, the descriptor is able to cluster proteins into functionally relevant groups. The proposed approach can also be easily extended to other surface properties. This protein surface-based approach will add a new way of viewing and comparing proteins to conventional methods, which compare proteins in terms of their primary sequence or tertiary structure. PMID:18618695
Rapid comparison of properties on protein surface.
Sael, Lee; La, David; Li, Bin; Rustamov, Raif; Kihara, Daisuke
2008-10-01
The mapping of physicochemical characteristics onto the surface of a protein provides crucial insights into its function and evolution. This information can be further used in the characterization and identification of similarities within protein surface regions. We propose a novel method which quantitatively compares global and local properties on the protein surface. We have tested the method on comparison of electrostatic potentials and hydrophobicity. The method is based on 3D Zernike descriptors, which provides a compact representation of a given property defined on a protein surface. Compactness and rotational invariance of this descriptor enable fast comparison suitable for database searches. The usefulness of this method is exemplified by studying several protein families including globins, thermophilic and mesophilic proteins, and active sites of TIM beta/alpha barrel proteins. In all the cases studied, the descriptor is able to cluster proteins into functionally relevant groups. The proposed approach can also be easily extended to other surface properties. This protein surface-based approach will add a new way of viewing and comparing proteins to conventional methods, which compare proteins in terms of their primary sequence or tertiary structure.
Cryo-EM structure of the large subunit of the spinach chloroplast ribosome
Ahmed, Tofayel; Yin, Zhan; Bhushan, Shashi
2016-01-01
Protein synthesis in the chloroplast is mediated by the chloroplast ribosome (chloro-ribosome). Overall architecture of the chloro-ribosome is considerably similar to the Escherichia coli (E. coli) ribosome but certain differences are evident. The chloro-ribosome proteins are generally larger because of the presence of chloroplast-specific extensions in their N- and C-termini. The chloro-ribosome harbours six plastid-specific ribosomal proteins (PSRPs); four in the small subunit and two in the large subunit. Deletions and insertions occur throughout the rRNA sequence of the chloro-ribosome (except for the conserved peptidyl transferase center region) but the overall length of the rRNAs does not change significantly compared to E. coli. Although recent advancements in cryo-electron microscopy (cryo-EM) have provided detailed high-resolution structures of ribosomes from many different sources, a high-resolution structure of the chloro-ribosome is still lacking. Here, we present a cryo-EM structure of the large subunit of the chloro-ribosome from spinach (Spinacia oleracea) at an average resolution of 3.5 Å. The high-resolution map enabled us to localize and model chloro-ribosome proteins, chloroplast-specific protein extensions, two PSRPs (PSRP5 and 6) and three rRNA molecules present in the chloro-ribosome. Although comparable to E. coli, the polypeptide tunnel and the tunnel exit site show chloroplast-specific features. PMID:27762343
Kundu, Sangeeta; Roy, Debjani
2009-01-01
Comparative molecular dynamics simulations of psychrophilic type III antifreeze protein from the North-Atlantic ocean-pout Macrozoarces americanus and its corresponding mesophilic counterpart, the antifreeze-like domain of human sialic acid synthase, have been performed for 10 ns each at five different temperatures. Analyses of trajectories in terms of secondary structure content, solvent accessibility, intramolecular hydrogen bonds and protein-solvent interactions indicate distinct differences in these two proteins. The two proteins also follow dissimilar unfolding pathways. The overall flexibility calculated by the trace of the diagonalized covariance matrix shows similar flexibility for both proteins near their growth temperatures. However, at higher temperatures the psychrophilic protein shows greater overall flexibility than its mesophilic counterpart. Principal component analysis also indicates that the essential subspaces explored by the simulations of two proteins at different temperatures are non-overlapping and they show significantly different directions of motion. However, there are significant overlaps within the trajectories and similar directions of motion of each protein especially at 298 K, 310 K and 373 K. Overall, the psychrophilic protein shows greater conformational sampling of the phase space than its mesophilic counterpart. Our study may help in elucidating the molecular basis of thermostability of homologous proteins from two organisms living at different temperature conditions. Such an understanding is required for designing efficient proteins with characteristics for a particular application at desired working temperatures.
2000-05-05
This computer graphic depicts the relative complexity of crystallizing large proteins in order to study their structures through x-ray crystallography. Insulin is a vital protein whose structure has several subtle points that scientists are still trying to determine. Large molecules such as insulin are complex, with structures that are comparatively difficult to understand. For comparison, a sugar molecule (which many people have grown as hard crystals in science class) and a water molecule are shown. These images were produced with the Macmolecule program. Photo credit: NASA/Marshall Space Flight Center (MSFC)
Structural alignment of protein descriptors - a combinatorial model.
Antczak, Maciej; Kasprzak, Marta; Lukasiak, Piotr; Blazewicz, Jacek
2016-09-17
Structural alignment of proteins is one of the most challenging problems in molecular biology. The tertiary structure of a protein strictly correlates with its function and computationally predicted structures are nowadays a main premise for understanding the latter. However, computationally derived 3D models often exhibit deviations from the native structure. A way to confirm a model is a comparison with other structures. The structural alignment of a pair of proteins can be defined with the use of a concept of protein descriptors. The protein descriptors are local substructures of protein molecules, which allow us to divide the original problem into a set of subproblems and, consequently, to propose a more efficient algorithmic solution. In the literature, one can find many applications of the descriptors concept that prove its usefulness for insight into protein 3D structures, but the proposed approaches are presented rather from the biological perspective than from the computational or algorithmic point of view. Efficient algorithms for identification and structural comparison of descriptors can become crucial components of methods for structural quality assessment as well as tertiary structure prediction. In this paper, we propose a new combinatorial model and new polynomial-time algorithms for the structural alignment of descriptors. The model is based on the maximum-size assignment problem, which we define here and prove that it can be solved in polynomial time. We demonstrate suitability of this approach by comparison with an exact backtracking algorithm. Besides a simplification coming from the combinatorial modeling, both on the conceptual and complexity level, we gain with this approach high quality of obtained results, in terms of 3D alignment accuracy and processing efficiency. 
All the proposed algorithms were developed and integrated in a computationally efficient tool descs-standalone, which allows the user to identify and structurally compare descriptors of biological molecules, such as proteins and RNAs. Both PDB (Protein Data Bank) and mmCIF (macromolecular Crystallographic Information File) formats are supported. The proposed tool is available as an open source project stored on GitHub ( https://github.com/mantczak/descs-standalone ).
The Leptospiral Antigen Lp49 is a Two-Domain Protein with Putative Protein Binding Function
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oliveira Giuseppe, P.; Oliveira Neves, F.; Nascimento, A.
2008-01-01
Pathogenic Leptospira is the etiological agent of leptospirosis, a life-threatening disease that affects populations worldwide. Currently available vaccines have limited effectiveness and therapeutic interventions are complicated by the difficulty in making an early diagnosis of leptospirosis. The genome of Leptospira interrogans was recently sequenced and comparative genomic analysis contributed to the identification of surface antigens, potential candidates for development of new vaccines and serodiagnosis. Lp49 is a membrane-associated protein recognized by antibodies present in sera from early and convalescent phases of leptospirosis patients. Its crystal structure was determined by single-wavelength anomalous diffraction using selenomethionine-labelled crystals and refined at 2.0 Angstroms resolution. Lp49 is composed of two domains and belongs to the all-beta-proteins class. The N-terminal domain folds in an immunoglobulin-like beta-sandwich structure, whereas the C-terminal domain presents a seven-bladed beta-propeller fold. Structural analysis of Lp49 indicates putative protein-protein binding sites, suggesting a role in Leptospira-host interaction. This is the first crystal structure of a leptospiral antigen described to date.
Andersen, Ole Juul; Grouleff, Julie; Needham, Perri; Walker, Ross C; Jensen, Frank
2015-11-19
Current enhanced sampling molecular dynamics methods for studying large conformational changes in proteins suffer from certain limitations. These include, among others, the need for user defined collective variables, the prerequisite of both start and end point structures of the conformational change, and the need for a priori knowledge of the amount by which to boost specific parts of the potential. In this paper, a framework is proposed for a molecular dynamics method for studying ligand-induced conformational changes, in which the nonbonded interactions between the ligand and the protein are used to calculate a biasing force. The method requires only a single input structure, and does not entail the use of collective variables. We provide a proof-of-concept for accelerating conformational changes in three simple test molecules, as well as promising results for two proteins known to undergo domain closure upon ligand binding. For the ribose-binding protein, backbone root-mean-square deviations as low as 0.75 Å compared to the crystal structure of the closed conformation are obtained within 50 ns simulations, whereas no domain closures are observed in unbiased simulations. A skewed closed structure is obtained for the glutamine-binding protein at high bias values, indicating that specific protein-ligand interactions might suppress important protein-protein interactions.
Structural modeling of the N-terminal signal–receiving domain of IκBα
Yazdi, Samira; Durdagi, Serdar; Naumann, Michael; Stein, Matthias
2015-01-01
The transcription factor nuclear factor-κB (NF-κB) plays essential roles in many biological processes including cell growth, apoptosis and innate and adaptive immunity. The NF-κB inhibitor (IκBα) retains NF-κB in the cytoplasm and thus inhibits nuclear localization of NF-κB and its association with DNA. Recent protein crystal structures of the C-terminal part of IκBα in complex with NF-κB provided insights into the protein-protein interactions but could not reveal structural details about the N-terminal signal receiving domain (SRD). The SRD of IκBα contains a degron, formed following phosphorylation by IκB kinases (IKK). In current protein X-ray structures, however, the SRD is not resolved and assumed to be disordered. Here, we combined secondary structure annotation and domain threading followed by long molecular dynamics (MD) simulations and showed that the SRD possesses well-defined secondary structure elements. We show that the SRD contains 3 additional stable α-helices supplementing the six ARDs present in crystallized IκBα. The IκBα/NF-κB protein-protein complex remained intact and stable during the entire simulations. Also in solution, free IκBα retains its structural integrity. Differences in structural topology and dynamics were observed by comparing the structures of NF-κB free and NF-κB bound IκBα-complex. This study paves the way for investigating the signaling properties of the SRD in the IκBα degron. A detailed atomic-scale understanding of the molecular mechanism of NF-κB activation, regulation and the protein-protein interactions may assist in the design and development of novel chronic inflammation modulators. PMID:26157801
Raval, Alpan; Piana, Stefano; Eastwood, Michael P; Shaw, David E
2016-01-01
Molecular dynamics (MD) simulation is a well-established tool for the computational study of protein structure and dynamics, but its application to the important problem of protein structure prediction remains challenging, in part because extremely long timescales can be required to reach the native structure. Here, we examine the extent to which the use of low-resolution information in the form of residue-residue contacts, which can often be inferred from bioinformatics or experimental studies, can accelerate the determination of protein structure in simulation. We incorporated sets of 62, 31, or 15 contact-based restraints in MD simulations of ubiquitin, a benchmark system known to fold to the native state on the millisecond timescale in unrestrained simulations. One-third of the restrained simulations folded to the native state within a few tens of microseconds, a speedup of over an order of magnitude compared with unrestrained simulations and a demonstration of the potential for limited amounts of structural information to accelerate structure determination. Almost all of the remaining ubiquitin simulations reached near-native conformations within a few tens of microseconds, but remained trapped there, apparently due to the restraints. We discuss potential methodological improvements that would facilitate escape from these near-native traps and allow more simulations to quickly reach the native state. Finally, using a target from the Critical Assessment of protein Structure Prediction (CASP) experiment, we show that distance restraints can improve simulation accuracy: In our simulations, restraints stabilized the native state of the protein, enabling a reasonable structural model to be inferred. © 2015 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
A quasi-atomic model of human adenovirus type 5 capsid
Fabry, Céline M S; Rosa-Calatrava, Manuel; Conway, James F; Zubieta, Chloé; Cusack, Stephen; Ruigrok, Rob W H; Schoehn, Guy
2005-01-01
Adenoviruses infect a wide range of vertebrates including humans. Their icosahedral capsids are composed of three major proteins: the trimeric hexon forms the facets and the penton, a noncovalent complex of the pentameric penton base and trimeric fibre proteins, is located at the 12 capsid vertices. Several proteins (IIIa, VI, VIII and IX) stabilise the capsid. We have obtained a 10 Å resolution map of the human adenovirus 5 by image analysis from cryo-electron micrographs (cryoEMs). This map, in combination with the X-ray structures of the penton base and hexon, was used to build a quasi-atomic model of the arrangement of the two major capsid components and to analyse the hexon–hexon and hexon–penton interactions. The secondary proteins, notably VIII, were located by comparing cryoEM maps of native and pIX deletion mutant virions. Minor proteins IX and IIIa are located on the outside of the capsid, whereas protein VIII is organised with a T=2 lattice on the inner face of the capsid. The capsid organisation is compared with the known X-ray structure of bacteriophage PRD1. PMID:15861131
Kinjo, Akira R.; Bekker, Gert-Jan; Suzuki, Hirofumi; Tsuchiya, Yuko; Kawabata, Takeshi; Ikegawa, Yasuyo; Nakamura, Haruki
2017-01-01
The Protein Data Bank Japan (PDBj, http://pdbj.org), a member of the worldwide Protein Data Bank (wwPDB), accepts and processes the deposited data of experimentally determined macromolecular structures. While maintaining the archive in collaboration with other wwPDB partners, PDBj also provides a wide range of services and tools for analyzing structures and functions of proteins. We herein outline the updated web user interfaces together with RESTful web services and the backend relational database that support the former. To enhance the interoperability of the PDB data, we have previously developed PDB/RDF, PDB data in the Resource Description Framework (RDF) format, which is now a wwPDB standard called wwPDB/RDF. We have enhanced the connectivity of the wwPDB/RDF data by incorporating various external data resources. Services for searching, comparing and analyzing the ever-increasing large structures determined by hybrid methods are also described. PMID:27789697
Fraune, Johanna; Alsheimer, Manfred; Volff, Jean-Nicolas; Busch, Karoline; Fraune, Sebastian; Bosch, Thomas C G; Benavente, Ricardo
2012-10-09
The synaptonemal complex (SC) is a key structure of meiosis, mediating the stable pairing (synapsis) of homologous chromosomes during prophase I. Its remarkable tripartite structure is evolutionarily well conserved and can be found in almost all sexually reproducing organisms. However, comparison of the different SC protein components in the common meiosis model organisms Saccharomyces cerevisiae, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, and Mus musculus revealed no sequence homology. This discrepancy challenged the hypothesis that the SC arose only once in evolution. To pursue this matter we focused on the evolution of SYCP1 and SYCP3, the two major structural SC proteins of mammals. Remarkably, our comparative bioinformatic and expression studies revealed that SYCP1 and SYCP3 are also components of the SC in the basal metazoan Hydra. In contrast to previous assumptions, we therefore conclude that SYCP1 and SYCP3 form monophyletic groups of orthologous proteins across metazoans.
Ringer, Ashley L.; Senenko, Anastasia; Sherrill, C. David
2007-01-01
S/π interactions are prevalent in biochemistry and play an important role in protein folding and stabilization. Geometries of cysteine/aromatic interactions found in crystal structures from the Brookhaven Protein Data Bank (PDB) are analyzed and compared with the equilibrium configurations predicted by high-level quantum mechanical results for the H2S–benzene complex. A correlation is observed between the energetically favorable configurations on the quantum mechanical potential energy surface of the H2S–benzene model and the cysteine/aromatic configurations most frequently found in crystal structures of the PDB. In contrast to some previous PDB analyses, configurations with the sulfur over the aromatic ring are found to be the most important. Our results suggest that accurate quantum computations on models of noncovalent interactions may be helpful in understanding the structures of proteins and other complex systems. PMID:17766371
Fraune, Johanna; Alsheimer, Manfred; Volff, Jean-Nicolas; Busch, Karoline; Fraune, Sebastian; Bosch, Thomas C. G.; Benavente, Ricardo
2012-01-01
The synaptonemal complex (SC) is a key structure of meiosis, mediating the stable pairing (synapsis) of homologous chromosomes during prophase I. Its remarkable tripartite structure is evolutionarily well conserved and can be found in almost all sexually reproducing organisms. However, comparison of the different SC protein components in the common meiosis model organisms Saccharomyces cerevisiae, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, and Mus musculus revealed no sequence homology. This discrepancy challenged the hypothesis that the SC arose only once in evolution. To pursue this matter we focused on the evolution of SYCP1 and SYCP3, the two major structural SC proteins of mammals. Remarkably, our comparative bioinformatic and expression studies revealed that SYCP1 and SYCP3 are also components of the SC in the basal metazoan Hydra. In contrast to previous assumptions, we therefore conclude that SYCP1 and SYCP3 form monophyletic groups of orthologous proteins across metazoans. PMID:23012415
NASA Astrophysics Data System (ADS)
Liu, Hui; Su, Qinglong; Sheng, Daping; Zheng, Wei; Wang, Xin
2017-02-01
In this paper, FTIR spectroscopy was used to compare gastric cancer patients' red blood cells (RBCs) with healthy persons' RBCs. IR spectra were acquired with high resolution. The A1653/A1543 (the protein secondary structures), A1543/A2958 (the relative content of proteins and lipids), A1106/A1166 (the structure and content changes of sugars) and A1543/A1106 (the relative content of proteins and sugars) ratios of gastric cancer patients' RBCs were significantly different from those of healthy persons' RBCs. Curve fitting results showed that the protein secondary structures and sugars' structures had differences between gastric cancer patients' and healthy persons' RBCs. Additionally, FTIR spectroscopy could obtain 95% sensitivity, 70% specificity, 84.2% accuracy and 80.9% positive predictive value in combination with canonical discriminant analysis. The above results indicate FTIR spectroscopy may be useful for diagnosing gastric cancer.
Improved protein surface comparison and application to low-resolution protein structure data.
Sael, Lee; Kihara, Daisuke
2010-12-14
Recent advancements of experimental techniques for determining protein tertiary structures raise significant challenges for protein bioinformatics. With the number of known structures of unknown function expanding at a rapid pace, an urgent task is to provide reliable clues to their biological function on a large scale. Conventional approaches for structure comparison are not suitable for a real-time database search due to their slow speed. Moreover, a new challenge has arisen from recent techniques such as electron microscopy (EM), which provide low-resolution structure data. Previously, we have introduced a method for protein surface shape representation using the 3D Zernike descriptors (3DZDs). The 3DZD enables fast structure database searches, taking advantage of its rotation invariance and compact representation. The search results of protein surfaces represented with the 3DZD have shown good agreement with the existing structure classifications, but some discrepancies were also observed. Three surface representations are examined: the new representation of backbone atoms, the originally devised all-atom surface representation, and the combination of the all-atom surface with the backbone representation. All representations are encoded with the 3DZD. Also, we have investigated the applicability of the 3DZD for searching protein EM density maps of varying resolutions. The surface representations are evaluated on structure retrieval using two existing classifications, SCOP and the CE-based classification. Overall, the 3DZDs representing backbone atoms show better retrieval performance than the original all-atom surface representation. The performance further improved when the two representations were combined. Moreover, we observed that the 3DZD is also powerful in comparing low-resolution structures obtained by electron microscopy.
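The fast database search described in this abstract rests on comparing compact, rotation-invariant descriptor vectors. The following is a minimal sketch of that idea, not the authors' implementation: it assumes each structure has already been reduced to a fixed-length vector of 3DZD invariants, assumes Euclidean distance as the similarity measure (the abstract does not name the metric), and uses made-up three-element vectors and hypothetical names (`toy_db`, `rank_database`).

```python
import math

def descriptor_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_database(query, database):
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = [(name, descriptor_distance(query, vec))
              for name, vec in database.items()]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical toy descriptors; real 3DZD vectors are much longer.
toy_db = {
    "protA": [0.10, 0.40, 0.30],
    "protB": [0.90, 0.10, 0.05],
    "protC": [0.12, 0.38, 0.33],
}
hits = rank_database([0.11, 0.39, 0.31], toy_db)
for name, dist in hits:
    print(f"{name}\t{dist:.4f}")
```

Because the vectors are rotation invariant, no superposition step is needed before the comparison, which is what makes a linear scan over a large database cheap.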
Liu, Tong; Wang, Zheng
2018-01-01
The segment overlap score (SOV) has been used to evaluate the predicted protein secondary structures, a sequence composed of helix (H), strand (E), and coil (C), by comparing it with the native or reference secondary structures, another sequence of H, E, and C. SOV's advantage is that it can consider the size of continuous overlapping segments and assign extra allowance to longer continuous overlapping segments instead of only judging from the percentage of overlapping individual positions as Q3 score does. However, we have found a drawback from its previous definition, that is, it cannot ensure increasing allowance assignment when more residues in a segment are further predicted accurately. A new way of assigning allowance has been designed, which keeps all the advantages of the previous SOV score definitions and ensures that the amount of allowance assigned is incremental when more elements in a segment are predicted accurately. Furthermore, our improved SOV has achieved a higher correlation with the quality of protein models measured by GDT-TS score and TM-score, indicating its better abilities to evaluate tertiary structure quality at the secondary structure level. We analyzed the statistical significance of SOV scores and found the threshold values for distinguishing two protein structures (SOV_refine > 0.19) and indicating whether two proteins are under the same CATH fold (SOV_refine > 0.94 and > 0.90 for three- and eight-state secondary structures respectively). We provided another two example applications, which are when used as a machine learning feature for protein model quality assessment and comparing different definitions of topologically associating domains. We proved that our newly defined SOV score resulted in better performance. The SOV score can be widely used in bioinformatics research and other fields that need to compare two sequences of letters in which continuous segments have important meanings. 
We also generalized the previous SOV definitions so that they can work for sequences composed of more than three states (e.g., the eight-state definition of protein secondary structures). A standalone software package has been implemented in Perl with source code released. The software can be downloaded from http://dna.cs.miami.edu/SOV/.
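The contrast the abstract draws between Q3 and segment-based scoring can be illustrated with a short sketch. It computes only the per-position Q3 score and the segment decomposition that SOV-style scores operate on; it does not implement SOV or SOV_refine, whose allowance rules are not given in the abstract. The example strings are invented for illustration.

```python
from itertools import groupby

def q3(reference, predicted):
    """Percentage of positions whose predicted state matches the reference."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return 100.0 * matches / len(reference)

def segments(states):
    """Split a state string into (state, start, end) runs; end is inclusive."""
    out = []
    pos = 0
    for state, run in groupby(states):
        length = len(list(run))
        out.append((state, pos, pos + length - 1))
        pos += length
    return out

# Invented example: both predictions get the same Q3, but one keeps the
# helix as a single segment while the other breaks it into three pieces.
reference = "CHHHHHHHHC"
pred_contiguous = "CHHHHHHCCC"
pred_fragmented = "CHHCHHHCHC"

print(q3(reference, pred_contiguous), segments(pred_contiguous))
print(q3(reference, pred_fragmented), segments(pred_fragmented))
```

A per-position score cannot distinguish these two predictions, whereas a score defined on overlapping segments can, which is the motivation for SOV stated in the abstract.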
Joseph, Agnel Praveen; Srinivasan, Narayanaswamy; de Brevern, Alexandre G
2012-09-01
Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used Structural Alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation into a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB based alignments are used to obtain a three dimensional fit of the structures followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underline that the alignment quality is better than MULTIPROT, MUSTANG and the alignments in HOMSTRAD, in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicates that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
Corsaro, M Michela; Parrilli, Ermenegilda; Lanzetta, Rosa; Naldi, Teresa; Pieretti, Giuseppina; Lindner, Buko; Carpentieri, Andrea; Parrilli, Michelangelo; Tutino, M Luisa
2009-08-01
The role of lipopolysaccharides (LPSs) in the biogenesis of outer membrane proteins has been investigated in several studies. Some of these analyses showed that LPS is required for correct and efficient folding of outer membrane proteins; other studies support the idea of independence of outer membrane proteins biogenesis from LPS structure. In this article, we investigated the involvement of LPS structure in the anomalous aggregation of outer membrane proteins in an E. coli mutant strain (S17-1(lambdapir)). To achieve this aim, the LPS structure of the mutant strain was carefully determined and compared with the E. coli K-12 one. It turned out that LPS of these two strains differs in the inner core for the absence of a heptose residue (HepIII). We demonstrated that this difference is due to a mutation in waaQ, a gene encoding the transferase for the branch heptose HepIII residue. The mutation was complemented to find out if the restoration of LPS structure influenced the observed outer membrane proteins aggregation. Data reported in this work demonstrated that, in E. coli S17-1(lambdapir), there is no influence of LPS structure on the outer membrane proteins inclusion bodies formation.
i3Drefine software for protein 3D structure refinement and its assessment in CASP10.
Bhattacharya, Debswapna; Cheng, Jianlin
2013-01-01
Protein structure refinement refers to the process of improving the qualities of protein structures during structure modeling processes to bring them closer to their native states. Structure refinement has been drawing increasing attention in the community-wide Critical Assessment of techniques for Protein Structure prediction (CASP) experiments since its addition in the 8th CASP experiment. During the 9th and recently concluded 10th CASP experiments, a consistent growth in the number of refinement targets and participating groups has been witnessed. Yet, protein structure refinement still remains a largely unsolved problem, with the majority of participating groups in the CASP refinement category failing to consistently improve the quality of structures issued for refinement. To address this need, we developed a completely automated and computationally efficient protein 3D structure refinement method, i3Drefine, based on an iterative and highly convergent energy minimization algorithm with a powerful all-atom composite physics and knowledge-based force fields and hydrogen bonding (HB) network optimization technique. In the recent community-wide blind experiment, CASP10, i3Drefine (as 'MULTICOM-CONSTRUCT') was ranked as the best method in the server section as per the official assessment of the CASP10 experiment. Here we provide the community with free access to i3Drefine software and systematically analyse the performance of i3Drefine in strict blind mode on the refinement targets issued in the CASP10 refinement category and compare with other state-of-the-art refinement methods participating in CASP10. Our analysis demonstrates that i3Drefine is the only fully automated server participating in CASP10 exhibiting consistent improvement over the initial structures in both global and local structural quality metrics. An executable version of i3Drefine is freely available at http://protein.rnet.missouri.edu/i3drefine/.
Diehl, Carl; Wisniewska, Magdalena; Frick, Inga-Maria; Streicher, Werner; Björck, Lars; Malmström, Johan; Wikström, Mats
2016-01-01
Streptococcus pyogenes is one of the most significant bacterial pathogens in the human population mostly causing superficial and uncomplicated infections (pharyngitis and impetigo) but also invasive and life-threatening disease. We have previously identified a virulence determinant, protein sHIP, which is secreted at higher levels by an invasive compared to a non-invasive strain of S. pyogenes. The present work presents a further characterization of the structural and functional properties of this bacterial protein. Biophysical and structural studies have shown that protein sHIP forms stable tetramers both in the crystal and in solution. The tetramers are composed of four helix-loop-helix motifs with the loop regions connecting the helices displaying a high degree of flexibility. Owing to interactions at the tetramer interface, the observed tetramer can be described as a dimer of dimers. We identified three residues at the tetramer interface (Leu84, Leu88, Tyr95), which due to largely non-polar side-chains, could be important determinants for protein oligomerization. Based on these observations, we produced a sHIP variant in which these residues were mutated to alanines. Biophysical experiments clearly indicated that the sHIP mutant appears only as dimers in solution, confirming the importance of the interfacial residues for protein oligomerisation. Furthermore, we could show that the sHIP mutant interacts with intact histidine-rich glycoprotein (HRG) and the histidine-rich repeats in HRG, and inhibits their antibacterial activity to the same or even higher extent as compared to the wild type protein sHIP. We determined the crystal structure of the sHIP mutant, which, as a result of the high quality of the data, allowed us to improve the existing structural model of the protein. 
Finally, by employing NMR spectroscopy in solution, we generated a model for the complex between the sHIP mutant and an HRG-derived heparin-binding peptide, providing further molecular details into the interactions involving protein sHIP.
Bartels, Desirée; Baumann, Alexander; Maeder, Malte; Geske, Thomas; Heise, Esther Marie; von Schwartzenberg, Klaus; Classen, Birgit
2017-05-01
Arabinogalactan-proteins (AGPs) are important proteoglycans of plant cell walls. They seem to be present in most, if not all, seed plants, but their occurrence and structure in bryophytes are largely unknown and currently a focus of AGP research. With regard to the evolution of the plant cell wall, we isolated AGPs from the three mosses Sphagnum sp., Physcomitrella patens and Polytrichastrum formosum. The moss AGPs show structural characteristics common for AGPs of seed plants, but also unique features, especially 3-O-methyl-rhamnose (trivial name acofriose) as terminal monosaccharide not found in arabinogalactan-proteins of angiosperms and 1,2,3-linked galactose as branching point never found in arabinogalactan-proteins before. Copyright © 2017 Elsevier Ltd. All rights reserved.
Döring, Clemens; Hussein, Mohamed A; Jekle, Mario; Becker, Thomas
2017-08-15
For rye dough structure, it is hypothesised that the presence of arabinoxylan hinders the proteins from forming a coherent network. This hypothesis was investigated using fluorescent-stained antibodies that bind to the arabinoxylan chains. Image analysis proves that the arabinoxylan surrounds the proteins, negatively affecting protein networking. Further, it is hypothesised that the dosing of xylanase and transglutaminase has a positive impact on rye dough and bread characteristics; the findings in this study evidenced that this increases the protein network by up to 38% accompanied by a higher volume rise of 10.67%, compared to standard rye dough. These outcomes combine a product-oriented and physiochemical design of a recipe, targeting structural and functional relationships, and demonstrate a successful methodology for enhancing rye bread quality. Copyright © 2017 Elsevier Ltd. All rights reserved.
Water entrapment and structure ordering as protection mechanisms for protein structural preservation
NASA Astrophysics Data System (ADS)
Arsiccio, A.; Pisano, R.
2018-02-01
In this paper, molecular dynamics is used to further gain insight into the mechanisms by which typical pharmaceutical excipients preserve the protein structure. More specifically, the water entrapment scenario will be analyzed, which states that excipients form a cage around the protein, entrapping and slowing water molecules. Human growth hormone will be used as a model protein, but the results obtained are generally applicable. We will show that water entrapment, as well as the other mechanisms of protein stabilization in the dried state proposed so far, may be related to the formation of a dense hydrogen bonding network between excipient molecules. We will also present a simple phenomenological model capable of explaining the behavior and stabilizing effect provided by typical cryo- and lyo-protectants. This model uses, as input data, molecular properties which can be easily evaluated. We will finally show that the model predictions compare fairly well with experimental data.
Mapping of Ligand-Binding Cavities in Proteins
Andersson, C. David; Chen, Brian Y.; Linusson, Anna
2010-01-01
The complex interactions between proteins and small organic molecules (ligands) are intensively studied because they play key roles in biological processes and drug activities. Here, we present a novel approach to characterise and map the ligand-binding cavities of proteins without direct geometric comparison of structures, based on Principal Component Analysis of cavity properties (related mainly to size, polarity and charge). This approach can provide valuable information on the similarities, and dissimilarities, of binding cavities due to mutations, between-species differences and flexibility upon ligand-binding. The presented results show that information on ligand-binding cavity variations can complement information on protein similarity obtained from sequence comparisons. The predictive aspect of the method is exemplified by successful predictions of serine proteases that were not included in the model construction. The presented strategy to compare ligand-binding cavities of related and unrelated proteins has many potential applications within protein and medicinal chemistry, for example in the characterisation and mapping of “orphan structures”, selection of protein structures for docking studies in structure-based design and identification of proteins for selectivity screens in drug design programs. PMID:20034113
CABS-flex predictions of protein flexibility compared with NMR ensembles
Jamroz, Michal; Kolinski, Andrzej; Kmiecik, Sebastian
2014-01-01
Motivation: Identification of flexible regions of protein structures is important for understanding of their biological functions. Recently, we have developed a fast approach for predicting protein structure fluctuations from a single protein model: the CABS-flex. CABS-flex was shown to be an efficient alternative to conventional all-atom molecular dynamics (MD). In this work, we evaluate CABS-flex and MD predictions by comparison with protein structural variations within NMR ensembles. Results: Based on a benchmark set of 140 proteins, we show that the relative fluctuations of protein residues obtained from CABS-flex are well correlated to those of NMR ensembles. On average, this correlation is stronger than that between MD and NMR ensembles. In conclusion, CABS-flex is useful and complementary to MD in predicting protein regions that undergo conformational changes as well as the extent of such changes. Availability and implementation: The CABS-flex is freely available to all users at http://biocomp.chem.uw.edu.pl/CABSflex. Contact: sekmi@chem.uw.edu.pl Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24735558
CABS-flex predictions of protein flexibility compared with NMR ensembles.
Jamroz, Michal; Kolinski, Andrzej; Kmiecik, Sebastian
2014-08-01
Identification of flexible regions of protein structures is important for understanding of their biological functions. Recently, we have developed a fast approach for predicting protein structure fluctuations from a single protein model: the CABS-flex. CABS-flex was shown to be an efficient alternative to conventional all-atom molecular dynamics (MD). In this work, we evaluate CABS-flex and MD predictions by comparison with protein structural variations within NMR ensembles. Based on a benchmark set of 140 proteins, we show that the relative fluctuations of protein residues obtained from CABS-flex are well correlated to those of NMR ensembles. On average, this correlation is stronger than that between MD and NMR ensembles. In conclusion, CABS-flex is useful and complementary to MD in predicting protein regions that undergo conformational changes as well as the extent of such changes. The CABS-flex is freely available to all users at http://biocomp.chem.uw.edu.pl/CABSflex. sekmi@chem.uw.edu.pl Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
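The comparison summarised in this abstract amounts to computing per-residue fluctuations from an ensemble and correlating two such profiles. A minimal sketch under stated assumptions follows: the ensemble models are assumed to be already superimposed, one coordinate triple per residue is used, and Pearson correlation is an assumption (the abstract does not say which correlation coefficient was used). All coordinates and the `predicted` profile are invented.

```python
import math

def rmsf(ensemble):
    """Per-residue root-mean-square fluctuation over pre-superimposed models.

    ensemble: list of models, each a list of (x, y, z) tuples, one per residue.
    """
    n_models = len(ensemble)
    n_res = len(ensemble[0])
    profile = []
    for i in range(n_res):
        mean = [sum(model[i][k] for model in ensemble) / n_models
                for k in range(3)]
        msd = sum(sum((model[i][k] - mean[k]) ** 2 for k in range(3))
                  for model in ensemble) / n_models
        profile.append(math.sqrt(msd))
    return profile

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented two-model "NMR ensemble" of three residues.
nmr_like = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 2.0, 0.0), (10.0, 0.0, 4.0)],
]
reference_profile = rmsf(nmr_like)
predicted = [0.1, 0.5, 0.9]  # hypothetical predicted fluctuation profile
print(reference_profile, pearson(reference_profile, predicted))
```

Correlating profiles rather than absolute values matches the abstract's emphasis on relative fluctuations: a prediction can be on a different scale and still rank flexible regions correctly.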
Investigating homology between proteins using energetic profiles.
Wrabl, James O; Hilser, Vincent J
2010-03-26
Accumulated experimental observations demonstrate that protein stability is often preserved upon conservative point mutation. In contrast, less is known about the effects of large sequence or structure changes on the stability of a particular fold. Almost completely unknown is the degree to which stability of different regions of a protein is generally preserved throughout evolution. In this work, these questions are addressed through thermodynamic analysis of a large representative sample of protein fold space based on remote, yet accepted, homology. More than 3,000 proteins were computationally analyzed using the structural-thermodynamic algorithm COREX/BEST. Estimated position-specific stability (i.e., local Gibbs free energy of folding) and its component enthalpy and entropy were quantitatively compared between all proteins in the sample according to all-vs.-all pairwise structural alignment. It was discovered that the local stabilities of homologous pairs were significantly more correlated than those of non-homologous pairs, indicating that local stability was indeed generally conserved throughout evolution. However, the position-specific enthalpy and entropy underlying stability were less correlated, suggesting that the overall regional stability of a protein was more important than the thermodynamic mechanism utilized to achieve that stability. Finally, two different types of statistically exceptional evolutionary structure-thermodynamic relationships were noted. First, many homologous proteins contained regions of similar thermodynamics despite localized structure change, suggesting a thermodynamic mechanism enabling evolutionary fold change. Second, some homologous proteins with extremely similar structures nonetheless exhibited different local stabilities, a phenomenon previously observed experimentally in this laboratory. 
These two observations, in conjunction with the principal conclusion that homologous proteins generally conserved local stability, may provide guidance for a future thermodynamically informed classification of protein homology.
Maintenance of a Protein Structure in the Dynamic Evolution of TIMPs over 600 Million Years
Nicosia, Aldo; Maggio, Teresa; Costa, Salvatore; Salamone, Monica; Tagliavia, Marcello; Mazzola, Salvatore; Gianguzza, Fabrizio; Cuttitta, Angela
2016-01-01
Deciphering the events leading to protein evolution represents a challenge, especially for protein families showing complex evolutionary history. Among them, TIMPs represent an ancient eukaryotic protein family widely distributed in the animal kingdom. They are known to control the turnover of the extracellular matrix and are considered to have arisen early during metazoan evolution, arguably tuning essential features of tissue and epithelial organization. To probe the structure and molecular evolution of TIMPs within metazoans, we report the mining and structural characterization of a large data set of TIMPs over approximately 600 Myr. The TIMPs repertoire was explored starting from the Cnidaria phylum, coeval with the origins of connective tissue, to great apes and humans. Despite dramatic sequence differences compared with higher metazoans, the ancestral proteins displayed the canonical TIMP fold. Only small structural changes, represented by an α-helix located in the N-domain, have occurred over the evolution. Both the occurrence of such secondary structure elements and the relative solvent accessibility of the corresponding residues in the three-dimensional structures raise the possibility that these sites represent unconserved elements prone to accept variations. PMID:26957029
Koharudin, Leonardus M I; Kollipara, Sireesha; Aiken, Christopher; Gronenborn, Angela M
2012-09-28
Oscillatoria agardhii agglutinin homolog (OAAH) proteins belong to a recently discovered lectin family. All members contain a sequence repeat of ~66 amino acids, with the number of repeats varying among different family members. Apart from data for the founding member OAA, neither three-dimensional structures, information about carbohydrate binding specificities, nor antiviral activity data have been available up to now for any other members of the OAAH family. To elucidate the structural basis for the antiviral mechanism of OAAHs, we determined the crystal structures of Pseudomonas fluorescens and Myxococcus xanthus lectins. Both proteins exhibit the same fold, resembling the founding family member, OAA, with minor differences in loop conformations. Carbohydrate binding studies by NMR and x-ray structures of glycan-lectin complexes reveal that the number of sugar binding sites corresponds to the number of sequence repeats in each protein. As for OAA, tight and specific binding to α3,α6-mannopentaose was observed. All the OAAH proteins described here exhibit potent anti-HIV activity at comparable levels. Altogether, our results provide structural details of the protein-carbohydrate interaction for this novel lectin family and insights into the molecular basis of their HIV inactivation properties.
Grandison, Scott; Roberts, Carl; Morris, Richard J
2009-03-01
Protein structures are not static entities consisting of equally well-determined atomic coordinates. Proteins undergo continuous motion, and as catalytic machines, these movements can be of high relevance for understanding function. In addition to this strong biological motivation for considering shape changes is the necessity to correctly capture different levels of detail and error in protein structures. Some parts of a structural model are often poorly defined, and the atomic displacement parameters provide an excellent means to characterize the confidence in an atom's spatial coordinates. A mathematical framework for studying these shape changes, and handling positional variance is therefore of high importance. We present an approach for capturing various protein structure properties in a concise mathematical framework that allows us to compare features in a highly efficient manner. We demonstrate how three-dimensional Zernike moments can be employed to describe functions, not only on the surface of a protein but throughout the entire molecule. A number of proof-of-principle examples are given which demonstrate how this approach may be used in practice for the representation of movement and uncertainty.
Structural Insights into the Degradation of Mcl-1 Induced by BH3 Domains
DOE Office of Scientific and Technical Information (OSTI.GOV)
Czabotar, P.; Lee, E.; van Delft, M.
2007-01-01
Apoptosis is held in check by prosurvival proteins of the Bcl-2 family. The distantly related BH3-only proteins bind to and antagonize them, thereby promoting apoptosis. Whereas binding of the BH3-only protein Noxa to prosurvival Mcl-1 induces Mcl-1 degradation by the proteasome, binding of another BH3-only ligand, Bim, elevates Mcl-1 protein levels. We compared the three-dimensional structures of the complexes formed between BH3 peptides of both Bim and Noxa, and we show that a discrete C-terminal sequence of the Noxa BH3 is necessary to instigate Mcl-1 degradation.
Hydrogen atoms in protein structures: high-resolution X-ray diffraction structure of the DFPase
2013-01-01
Background: Hydrogen atoms represent about half of the total number of atoms in proteins and are often involved in substrate recognition and catalysis. Unfortunately, X-ray protein crystallography at usual resolution fails to determine their positions directly, mainly because light atoms display weak contributions to diffraction. However, sub-Ångstrom diffraction data, careful modeling and a proper refinement strategy can allow the positioning of a significant part of hydrogen atoms. Results: A comprehensive study on the X-ray structure of the diisopropyl-fluorophosphatase (DFPase) was performed, and the hydrogen atoms were modeled, including those of solvent molecules. This model was compared to the available neutron structure of DFPase, and differences in the protein and the active site solvation were noticed. Conclusions: A further examination of the DFPase X-ray structure provides substantial evidence about the presence of an activated water molecule that may constitute an interesting piece of information with regard to the enzymatic hydrolysis mechanism. PMID:23915572
Cao, Han; Ng, Marcus C K; Jusoh, Siti Azma; Tai, Hio Kuan; Siu, Shirley W I
2017-09-01
α-Helical transmembrane proteins are the most important drug targets in rational drug development. However, solving the experimental structures of these proteins remains difficult, therefore computational methods to accurately and efficiently predict the structures are in great demand. We present an improved structure prediction method TMDIM based on Park et al. (Proteins 57:577-585, 2004) for predicting bitopic transmembrane protein dimers. Three major algorithmic improvements are introduction of the packing type classification, the multiple-condition decoy filtering, and the cluster-based candidate selection. In a test of predicting nine known bitopic dimers, approximately 78% of our predictions achieved a successful fit (RMSD <2.0 Å) and 78% of the cases are better predicted than the two other methods compared. Our method provides an alternative for modeling TM bitopic dimers of unknown structures for further computational studies. TMDIM is freely available on the web at https://cbbio.cis.umac.mo/TMDIM. Website is implemented in PHP, MySQL and Apache, with all major browsers supported.
TMDIM: an improved algorithm for the structure prediction of transmembrane domains of bitopic dimers
NASA Astrophysics Data System (ADS)
Cao, Han; Ng, Marcus C. K.; Jusoh, Siti Azma; Tai, Hio Kuan; Siu, Shirley W. I.
2017-09-01
α-Helical transmembrane proteins are the most important drug targets in rational drug development. However, solving the experimental structures of these proteins remains difficult, therefore computational methods to accurately and efficiently predict the structures are in great demand. We present an improved structure prediction method TMDIM based on Park et al. (Proteins 57:577-585, 2004) for predicting bitopic transmembrane protein dimers. Three major algorithmic improvements are introduction of the packing type classification, the multiple-condition decoy filtering, and the cluster-based candidate selection. In a test of predicting nine known bitopic dimers, approximately 78% of our predictions achieved a successful fit (RMSD <2.0 Å) and 78% of the cases are better predicted than the two other methods compared. Our method provides an alternative for modeling TM bitopic dimers of unknown structures for further computational studies. TMDIM is freely available on the web at https://cbbio.cis.umac.mo/TMDIM. Website is implemented in PHP, MySQL and Apache, with all major browsers supported.
Fischer, Axel W.; Bordignon, Enrica; Bleicken, Stephanie; García-Sáez, Ana J.; Jeschke, Gunnar; Meiler, Jens
2016-01-01
Structure determination remains a challenge for many biologically important proteins. In particular, proteins that adopt multiple conformations often evade crystallization in all biologically relevant states. Although computational de novo protein folding approaches often sample biologically relevant conformations, the selection of the most accurate model for different functional states remains a formidable challenge, in particular, for proteins with more than about 150 residues. Electron paramagnetic resonance (EPR) spectroscopy can obtain limited structural information for proteins in well-defined biological states and thereby assist in selecting biologically relevant conformations. The present study demonstrates that de novo folding methods are able to accurately sample the folds of 192-residue long soluble monomeric Bcl-2-associated X protein (BAX). The tertiary structures of the monomeric and homodimeric forms of BAX were predicted using the primary structure as well as 25 and 11 EPR distance restraints, respectively. The predicted models were subsequently compared to respective NMR/X-ray structures of BAX. EPR restraints improve the protein-size normalized root-mean-square-deviation (RMSD100) of the most accurate models with respect to the NMR/crystal structure from 5.9 Å to 3.9 Å and from 5.7 Å to 3.3 Å, respectively. Additionally, the model discrimination is improved, which is demonstrated by an improvement of the enrichment from 5% to 15% and from 13% to 21%, respectively. PMID:27129417
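The abstract reports accuracy as a protein-size normalized RMSD (RMSD100) without giving the formula. The sketch below uses the normalization commonly attributed to Carugo and Pongor, RMSD100 = RMSD / (1 + ln sqrt(N/100)); that this exact form was used in the study is an assumption. The plain `rmsd` helper assumes the two coordinate sets are already optimally superimposed and is included only to make the example self-contained.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superimposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
                for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def rmsd100(rmsd_value, n_residues):
    """Normalize an RMSD to the value expected for a 100-residue protein.

    Assumed form: RMSD / (1 + ln(sqrt(N / 100))).
    """
    scale = 1.0 + math.log(math.sqrt(n_residues / 100.0))
    if scale <= 0.0:
        raise ValueError("normalization is undefined for very short chains")
    return rmsd_value / scale

# For a 100-residue protein the value is unchanged; for a 192-residue
# protein such as BAX the normalized value is smaller than the raw RMSD.
print(rmsd100(3.0, 100))
print(round(rmsd100(5.0, 192), 2))
```

The normalization makes deviations comparable across proteins of different length, since raw RMSD tends to grow with chain size.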
On the relationship between residue structural environment and sequence conservation in proteins.
Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao
2017-09-01
Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that the sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for a residue's structural environment, calculated only based on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroid. To determine whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation, here we compared Cα atoms with other substructures in their contributions to the sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding a similar structure-conservation relationship. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all-atom substructures. These results indicate that the Cα atoms of a protein structure alone could reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.
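The weighted contact number mentioned in this abstract can be computed from Cα coordinates alone. A minimal sketch follows, assuming the commonly cited inverse-square definition, WCN_i = Σ_{j≠i} 1 / r_ij², where r_ij is the distance between the Cα atoms of residues i and j; the abstract itself does not spell out the formula. The coordinates are invented and use unit spacing purely to keep the arithmetic easy to follow.

```python
def weighted_contact_number(ca_coords):
    """Return the WCN of each residue from a list of (x, y, z) C-alpha positions.

    Assumed definition: sum over all other residues of 1 / (squared distance).
    """
    wcn = []
    for i, (xi, yi, zi) in enumerate(ca_coords):
        total = 0.0
        for j, (xj, yj, zj) in enumerate(ca_coords):
            if i == j:
                continue
            dist_sq = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            total += 1.0 / dist_sq
        wcn.append(total)
    return wcn

# Invented collinear toy chain: the middle residue is the most densely packed.
toy_chain = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(weighted_contact_number(toy_chain))
```

Residues buried in the core accumulate many short-distance terms and so receive a high WCN, which is the packing-density signal the abstract relates to sequence conservation.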
Prediction of physical protein-protein interactions
NASA Astrophysics Data System (ADS)
Szilágyi, András; Grimm, Vera; Arakaki, Adrián K.; Skolnick, Jeffrey
2005-06-01
Many essential cellular processes such as signal transduction, transport, cellular motion and most regulatory mechanisms are mediated by protein-protein interactions. In recent years, new experimental techniques have been developed to discover the protein-protein interaction networks of several organisms. However, the accuracy and coverage of these techniques have proven to be limited, and computational approaches remain essential both to assist in the design and validation of experimental studies and for the prediction of interaction partners and detailed structures of protein complexes. Here, we provide a critical overview of existing structure-independent and structure-based computational methods. Although these techniques have significantly advanced in the past few years, we find that most of them are still in their infancy. We also provide an overview of experimental techniques for the detection of protein-protein interactions. Although the developments are promising, false positive and false negative results are common, and reliable detection is possible only by taking a consensus of different experimental approaches. The shortcomings of experimental techniques affect both the further development and the fair evaluation of computational prediction methods. For an adequate comparative evaluation of prediction and high-throughput experimental methods, an appropriately large benchmark set of biophysically characterized protein complexes would be needed, but is sorely lacking.
Constraint Logic Programming approach to protein structure prediction.
Dal Palù, Alessandro; Dovier, Agostino; Fogolari, Federico
2004-11-30
The protein structure prediction problem is one of the most challenging problems in biological sciences. Many approaches have been proposed using database information and/or simplified protein models. The protein structure prediction problem can be cast in the form of an optimization problem. Notwithstanding its importance, the problem has very seldom been tackled by Constraint Logic Programming, a declarative programming paradigm suitable for solving combinatorial optimization problems. Constraint Logic Programming techniques have been applied to the protein structure prediction problem on the face-centered cube lattice model. Molecular dynamics techniques, endowed with the notion of constraint, have also been exploited. Even using a very simplified model, Constraint Logic Programming on the face-centered cube lattice model allowed us to obtain acceptable results for a few small proteins. As a test implementation, their (known) secondary structure and the presence of disulfide bridges are used as constraints. Simplified structures obtained in this way have been converted to all-atom models with plausible structure. Results have been compared with a similar approach using a well-established technique such as molecular dynamics. The results obtained on small proteins show that Constraint Logic Programming techniques can be employed for studying simplified protein models, which can be converted into realistic all-atom models. The advantage of Constraint Logic Programming over other, much more explored, methodologies resides in the rapid software prototyping, in the easy way of encoding heuristics, and in exploiting all the advances made in this research area, e.g. in constraint propagation and its use for pruning the huge search space.
Brunak, S; Engelbrecht, J
1996-06-01
A direct comparison of experimentally determined protein structures and their corresponding protein coding mRNA sequences has been performed. We examine whether real world data support the hypothesis that clusters of rare codons correlate with the location of structural units in the resulting protein. The degeneracy of the genetic code allows for a biased selection of codons which may control the translational rate of the ribosome, and may thus in vivo have a catalyzing effect on the folding of the polypeptide chain. A complete search for GenBank nucleotide sequences coding for structural entries in the Brookhaven Protein Data Bank produced 719 protein chains with matching mRNA sequence, amino acid sequence, and secondary structure assignment. By neural network analysis, we found strong signals in mRNA sequence regions surrounding helices and sheets. These signals do not originate from the clustering of rare codons, but from the similarity of codons coding for very abundant amino acid residues at the N- and C-termini of helices and sheets. No correlation between the positioning of rare codons and the location of structural units was found. The mRNA signals were also compared with conserved nucleotide features of 16S-like ribosomal RNA sequences and related to mechanisms for maintaining the correct reading frame by the ribosome.
Quantifying the relationship between sequence and three-dimensional structure conservation in RNA
2010-01-01
Background: In recent years, the number of available RNA structures has grown rapidly, reflecting the increased interest in RNA biology. Similarly to the studies carried out two decades ago for proteins, which laid the foundations for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. Results: Here we introduce an all-against-all sequence- and three-dimensional (3D) structure-based comparison of a representative set of RNA structures, which has allowed us to quantitatively confirm that: (i) there is a measurable relationship between sequence and structure conservation that weakens for alignments resulting in below 60% sequence identity, (ii) evolution tends to conserve more RNA structure than sequence, and (iii) there is a twilight zone for RNA homology detection. Discussion: The computational analysis presented here quantitatively describes the relationship between sequence and structure for RNA molecules and defines a twilight zone region for detecting RNA homology. Our work could represent the theoretical basis and limitations for future developments in comparative RNA 3D structure prediction. PMID:20550657
Non-interacting surface solvation and dynamics in protein-protein interactions.
Visscher, Koen M; Kastritis, Panagiotis L; Bonvin, Alexandre M J J
2015-03-01
Protein-protein interactions control a plethora of cellular processes, including cell proliferation, differentiation, apoptosis, and signal transduction. Understanding how and why proteins interact will inevitably lead to novel structure-based drug design methods, as well as design of de novo binders with preferred interaction properties. At a structural and molecular level, interface and rim regions are not enough to fully account for the energetics of protein-protein binding, even for simple lock-and-key rigid binders. As we have recently shown, properties of the global surface might also play a role in protein-protein interactions. Here, we report on molecular dynamics simulations performed to understand solvent effects on protein-protein surfaces. We compare properties of the interface, rim, and non-interacting surface regions for five different complexes and their free components. Interface and rim residues become, as expected, less mobile upon complexation. However, non-interacting surface appears more flexible in the complex. Fluctuations of polar residues are always lower compared with charged ones, independent of the protein state. Further, stable water molecules are often observed around polar residues, in contrast to charged ones. Our analysis reveals that (a) upon complexation, the non-interacting surface can have a direct entropic compensation for the lower interface and rim entropy and (b) the mobility of the first hydration layer, which is linked to the stability of the protein-protein complex, is influenced by the local chemical properties of the surface. These findings corroborate previous hypotheses on the role of the hydration layer in shielding protein-protein complexes from unintended protein-protein interactions. © 2014 Wiley Periodicals, Inc.
Denesyuk, Alexander; Denessiouk, Konstantin; Johnson, Mark S
2018-02-01
An integrin-like β-propeller domain contains seven repeats of a four-stranded antiparallel β-sheet motif (blades). Previously we described a 3D structural motif within each blade of the integrin-type β-propeller. Here, we show unique structural links that join different blades of the β-propeller structure, which together with the structural motif for a single blade are repeated in a β-propeller to provide the functional top face of the barrel, found to be involved in protein-protein interactions and substrate recognition. We compare functional top face diagrams of the integrin-type β-propeller domain and two non-integrin type β-propeller domains of virginiamycin B lyase and WD Repeat-Containing Protein 5. Copyright © 2017 Elsevier Inc. All rights reserved.
Gollaher, C J; Fechner, K; Karlstad, M; Babayan, V K; Bistrian, B R
1993-01-01
This report investigates the effect of various levels of medium-chain/fish oil structured triglycerides on protein and energy metabolism in hypermetabolic rats. Male Sprague-Dawley rats (192 to 226 g) were continuously infused with isovolemic diets that provided 200 kcal/kg per day and 2 g of amino acid nitrogen per kilogram per day. The percentage of nonnitrogen calories as structured triglyceride was varied: no fat, 5%, 15%, or 30%. A 30% long-chain triglyceride diet was also provided as a control to compare the protein-sparing abilities of these two types of fat. Nitrogen excretion, plasma albumin, plasma triglycerides, and whole-body and liver and muscle protein kinetics were determined after 3 days of feeding. Whole-body protein breakdown, flux, and oxidation were similar in all groups. The 15% structured triglyceride diet maximized whole-body protein synthesis (p < .05). Liver fractional synthetic rate was significantly greater in animals receiving 5% of nonprotein calories as structured triglyceride (p < .05). Muscle fractional synthetic rate was unchanged. Plasma triglycerides were markedly elevated in the 30% structured triglyceride-fed rats. The 30% structured triglyceride diet maintained plasma albumin levels better than those diets containing no fat, 5% medium-chain triglyceride/fish oil structured triglyceride, or 30% long-chain triglycerides. Nitrogen excretion was lower in animals receiving 30% of nonnitrogen calories as a structured triglyceride than in those receiving 30% as long-chain triglycerides, but this difference did not reach statistical significance (p = .1). These data suggest that protein metabolism is optimized when structured triglyceride is provided at relatively low dietary fat intakes.
Ritchie, Andrew W; Webb, Lauren J
2015-11-05
Biological function emerges in large part from the interactions of biomacromolecules in the complex and dynamic environment of the living cell. For this reason, macromolecular interactions in biological systems are now a major focus of interest throughout the biochemical and biophysical communities. The affinity and specificity of macromolecular interactions are the result of both structural and electrostatic factors. Significant advances have been made in characterizing structural features of stable protein-protein interfaces through the techniques of modern structural biology, but much less is understood about how electrostatic factors promote and stabilize specific functional macromolecular interactions over all possible choices presented to a given molecule in a crowded environment. In this Feature Article, we describe how vibrational Stark effect (VSE) spectroscopy is being applied to measure electrostatic fields at protein-protein interfaces, focusing on measurements of guanosine triphosphate (GTP)-binding proteins of the Ras superfamily binding with structurally related but functionally distinct downstream effector proteins. In VSE spectroscopy, spectral shifts of a probe oscillator's energy are related directly to that probe's local electrostatic environment. By performing this experiment repeatedly throughout a protein-protein interface, an experimental map of measured electrostatic fields generated at that interface is determined. These data can be used to rationalize selective binding of similarly structured proteins in both in vitro and in vivo environments. Furthermore, these data can be used to compare to computational predictions of electrostatic fields to explore the level of simulation detail that is necessary to accurately predict our experimental findings.
Identification of structural domains in proteins by a graph heuristic.
Wernisch, L; Hunting, M; Wodak, S J
1999-05-15
A novel automatic procedure for identifying domains from protein atomic coordinates is presented. The procedure, termed STRUDL (STRUctural Domain Limits), does not take into account information on secondary structures and handles any number of domains made up of contiguous or non-contiguous chain segments. The core algorithm uses the Kernighan-Lin graph heuristic to partition the protein into residue sets which display minimum interactions between them. These interactions are deduced from the weighted Voronoi diagram. The generated partitions are accepted or rejected on the basis of optimized criteria, representing basic expected physical properties of structural domains. The graph heuristic approach is shown to be very effective: it closely approximates the exact solution provided by a branch and bound algorithm for a number of test proteins. In addition, the overall performance of STRUDL is assessed on a set of 787 representative proteins from the Protein Data Bank by comparison to domain definitions in the CATH protein classification. The domains assigned by STRUDL agree with the CATH assignments in at least 81% of the tested proteins. This result is comparable to that obtained previously using PUU (Holm and Sander, Proteins 1994;9:256-268), the only other available algorithm designed to identify domains with any number of non-contiguous chain segments. A detailed discussion of the structures for which our assignments differ from those in CATH brings to light some clear inconsistencies between the concept of structural domains based on minimizing inter-domain interactions and that of delimiting structural motifs that represent acceptable folding topologies or architectures. Considering both concepts as complementary and combining them in a layered approach might be the way forward.
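Illustrative note: the partitioning idea can be sketched with one pass of the classic Kernighan-Lin heuristic on a small weighted residue graph. This is not the STRUDL implementation: the toy weights stand in for the Voronoi-derived interactions, the classic heuristic keeps the two sets at fixed sizes, and all names are hypothetical.

```python
def cut_weight(w, part_a, part_b):
    """Total interaction weight crossing between the two residue sets."""
    return sum(w[i][j] for i in part_a for j in part_b)


def kl_pass(w, a, b):
    """One Kernighan-Lin pass on a symmetric weight matrix w."""
    work_a, work_b, locked = set(a), set(b), set()
    gains, swaps = [], []
    for _ in range(min(len(a), len(b))):
        # D[v] = external cost minus internal cost in the tentative partition.
        d = {}
        for side, other in ((work_a, work_b), (work_b, work_a)):
            for v in side:
                ext = sum(w[v][u] for u in other)
                inn = sum(w[v][u] for u in side if u != v)
                d[v] = ext - inn
        # Pick the unlocked pair whose exchange reduces the cut the most.
        best = None
        for x in work_a - locked:
            for y in work_b - locked:
                g = d[x] + d[y] - 2 * w[x][y]
                if best is None or g > best[0]:
                    best = (g, x, y)
        g, x, y = best
        work_a.remove(x); work_a.add(y)
        work_b.remove(y); work_b.add(x)
        locked.update((x, y))
        gains.append(g); swaps.append((x, y))
    # Keep only the prefix of swaps with the largest cumulative gain.
    best_k, best_sum, running = 0, 0.0, 0.0
    for k, g in enumerate(gains, 1):
        running += g
        if running > best_sum:
            best_sum, best_k = running, k
    final_a, final_b = set(a), set(b)
    for x, y in swaps[:best_k]:
        final_a.remove(x); final_a.add(y)
        final_b.remove(y); final_b.add(x)
    return sorted(final_a), sorted(final_b)


# Toy graph: two tightly packed residue triples joined by one weak contact.
n = 6
w = [[0.0] * n for _ in range(n)]
for i, j, v in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    w[i][j] = w[j][i] = v
dom_a, dom_b = kl_pass(w, [0, 1, 3], [2, 4, 5])
print(dom_a, dom_b, cut_weight(w, dom_a, dom_b))
```

Starting from a deliberately poor split, the pass exchanges residues 3 and 2 and recovers the two compact groups, leaving only the weak contact across the cut.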
Protein structure prediction with local adjust tabu search algorithm
2014-01-01
Background: Protein folding structure prediction is one of the most challenging problems in the bioinformatics domain. Because of the complexity of realistic protein structures, a simplified structure model and a computational method must be adopted in the research. The AB off-lattice model is one such simplified model, which considers only two classes of amino acids, hydrophobic (A) residues and hydrophilic (B) residues. Results: The main work of this paper is to discuss how to find the lowest energy configurations in the 2D and 3D off-lattice models using Fibonacci sequences and real protein sequences. To avoid falling into local minima and to converge faster to the global minimum, we introduce a novel method (SATS) to the protein structure problem, which combines the simulated annealing algorithm and the tabu search algorithm. Various strategies, such as a new encoding strategy, an adaptive neighborhood generation strategy and a local adjustment strategy, are adopted successfully to rapidly search for the optimal conformation corresponding to the lowest energy of the protein sequences. Experimental results show that some of the results obtained by the improved SATS are better than those reported in the previous literature, and we can be confident that the lowest energy folding state for short Fibonacci sequences has been found. Conclusions: Although the off-lattice models are not very realistic, they can reflect some important characteristics of realistic proteins. The 3D off-lattice model is found to resemble the native folding structure of realistic proteins more closely than the 2D off-lattice model. In addition, compared with some previous studies, the proposed hybrid algorithm can search the spatial folding structure of a protein chain more effectively and more quickly. PMID:25474708
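Illustrative note: a sketch of the objective that a search such as SATS would minimize, not the SATS search itself. The abstract gives neither the energy function nor the sequence rule; the sketch assumes the widely used 2D AB off-lattice energy (a bending term plus a Lennard-Jones-like term with species-dependent coefficients 1, 0.5 and -0.5) and the Fibonacci recursion S0 = A, S1 = B, S(i+1) = S(i-1) + S(i). All function names are hypothetical.

```python
import math


def fibonacci_sequence(k):
    """Return the k-th Fibonacci AB string (S0 = 'A', S1 = 'B')."""
    prev, cur = "A", "B"
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, cur = cur, prev + cur
    return cur


def coords_from_angles(angles):
    """Build 2D unit-bond coordinates from n-2 bend angles (radians)."""
    pts = [(0.0, 0.0), (1.0, 0.0)]
    heading = 0.0
    for t in angles:
        heading += t
        x, y = pts[-1]
        pts.append((x + math.cos(heading), y + math.sin(heading)))
    return pts


def pair_coeff(a, b):
    """Assumed coefficients: AA attract strongly, BB weakly, AB repel."""
    if a == "A" and b == "A":
        return 1.0
    if a == "B" and b == "B":
        return 0.5
    return -0.5


def ab_energy(seq, angles):
    """Assumed 2D AB off-lattice energy of a chain with the given bend angles."""
    pts = coords_from_angles(angles)
    bend = sum(0.25 * (1.0 - math.cos(t)) for t in angles)
    nonbonded = 0.0
    for i in range(len(seq) - 2):
        for j in range(i + 2, len(seq)):  # skip directly bonded neighbors
            r = math.dist(pts[i], pts[j])
            nonbonded += 4.0 * (r ** -12 - pair_coeff(seq[i], seq[j]) * r ** -6)
    return bend + nonbonded


seq = fibonacci_sequence(6)  # a 13-residue test sequence
print(seq, ab_energy(seq, [0.0] * (len(seq) - 2)))
```

A simulated annealing or tabu search loop would repeatedly perturb the angle vector and call `ab_energy` to accept or reject each move.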
Binding Ligand Prediction for Proteins Using Partial Matching of Local Surface Patches
Sael, Lee; Kihara, Daisuke
2010-01-01
Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group. PMID:21614188
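Illustrative note: the partial matching of surface patches can be sketched as a minimum-cost bipartite assignment between descriptor vectors. The sketch uses exhaustive search, which is only practical for a handful of patches, and Euclidean distance between toy vectors in place of real 3D Zernike descriptors; the cost definition and all names are assumptions rather than the authors' scoring scheme.

```python
import math
from itertools import permutations


def descriptor_distance(u, v):
    """Euclidean distance between two patch descriptor vectors."""
    return math.dist(u, v)


def best_patch_matching(query, target):
    """Match every query patch to a distinct target patch at minimum total cost.

    Returns (total_cost, [(query_index, target_index), ...]).
    Target patches left unmatched make the comparison partial.
    """
    if len(query) > len(target):
        raise ValueError("query must not have more patches than target")
    best_cost, best_pairs = math.inf, []
    for perm in permutations(range(len(target)), len(query)):
        cost = sum(descriptor_distance(query[i], target[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_pairs = cost, list(enumerate(perm))
    return best_cost, best_pairs


# Two query patches against three target patches (toy 2D descriptors).
query = [(0.0, 0.0), (1.0, 1.0)]
target = [(1.0, 1.1), (5.0, 5.0), (0.0, 0.1)]
print(best_patch_matching(query, target))
```

For realistic pocket sizes a polynomial-time assignment solver would replace the exhaustive loop, but the objective being minimized stays the same.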
Sequence composition and environment effects on residue fluctuations in protein structures
NASA Astrophysics Data System (ADS)
Ruvinsky, Anatoly M.; Vakser, Ilya A.
2010-10-01
Structure fluctuations in proteins affect a broad range of cell phenomena, including stability of proteins and their fragments, allosteric transitions, and energy transfer. This study presents a statistical-thermodynamic analysis of the relationship between the sequence composition and the distribution of residue fluctuations in protein-protein complexes. A one-node-per-residue elastic network model accounting for the nonhomogeneous protein mass distribution and the interatomic interactions through the renormalized inter-residue potential is developed. Two factors, a protein mass distribution and a residue environment, were found to determine the scale of residue fluctuations. Surface residues undergo larger fluctuations than core residues, in agreement with experimental observations. Ranking residues over the normalized scale of fluctuations yields a distinct classification of amino acids into three groups: (i) highly fluctuating: Gly, Ala, Ser, Pro, and Asp; (ii) moderately fluctuating: Thr, Asn, Gln, Lys, Glu, Arg, Val, and Cys; and (iii) weakly fluctuating: Ile, Leu, Met, Phe, Tyr, Trp, and His. The structural instability in proteins possibly relates to the high content of the highly fluctuating residues and a deficiency of the weakly fluctuating residues in irregular secondary structure elements (loops), chameleon sequences, and disordered proteins. Strong correlation between residue fluctuations and the sequence composition of protein loops supports this hypothesis. Comparing fluctuations of binding site residues (interface residues) with other surface residues shows that, on average, the interface is more rigid than the rest of the protein surface and Gly, Ala, Ser, Cys, Leu, and Trp have a propensity to form more stable docking patches on the interface. The findings have broad implications for understanding mechanisms of protein association and stability of protein structures.
Characterization of the Structural Gene Promoter of Aedes aegypti Densovirus
Ward, Todd W.; Kimmick, Michael W.; Afanasiev, Boris N.; Carlson, Jonathan O.
2001-01-01
Aedes aegypti densonucleosis virus (AeDNV) has two promoters that have been shown to be active by reporter gene expression analysis (B. N. Afanasiev, Y. V. Koslov, J. O. Carlson, and B. J. Beaty, Exp. Parasitol. 79:322–339, 1994). Northern blot analysis of cells infected with AeDNV revealed two transcripts 1,200 and 3,500 nucleotides in length that are assumed to express the structural protein (VP) gene and nonstructural protein genes, respectively. Primer extension was used to map the transcriptional start site of the structural protein gene. Surprisingly, the structural protein gene transcript began at an initiator consensus sequence, CAGT, 60 nucleotides upstream from the map unit 61 TATAA sequence previously thought to define the promoter. Constructs with the β-galactosidase gene fused to the structural protein gene were used to determine elements necessary for promoter function. Deletion or mutation of the initiator sequence, CAGT, reduced protein expression by 93%, whereas mutation of the TATAA sequence at map unit 61 had little effect. An additional open reading frame was observed upstream of the structural protein gene that can express β-galactosidase at a low level (20% of that of VP fusions). Expression of the AeDNV structural protein gene was shown to be stimulated by the major nonstructural protein NS1 (Afanasiev et al., Exp. Parasitol., 1994). To determine the sequences required for transactivation, expression of structural protein gene–β-galactosidase gene fusion constructs differing in AeDNV genome content was measured with and without NS1. The presence of NS1 led to an 8- to 10-fold increase in expression when either genomic end was present, compared to a 2-fold increase with a construct lacking the genomic ends. An even higher (37-fold) increase in expression occurred with both genomic ends present; however, this was in part due to template replication as shown by Southern blot analysis. These data indicate the location and importance of various elements necessary for efficient protein expression and transactivation from the structural protein gene promoter of AeDNV. PMID:11152505
Gomaa, Walaa M S; Mosaad, Gamal M; Yu, Peiqiang
2018-04-21
The objectives of this study were to: (1) Use molecular spectroscopy as a novel technique to quantify protein molecular structures in relation to its chemical profiles and bioenergy values in oil-seeds and co-products from bio-oil processing. (2) Determine and compare: (a) protein molecular structure using Fourier transform infrared (FT/IR-ATR) molecular spectroscopy technique; (b) bioactive compounds, anti-nutritional factors, and chemical composition; and (c) bioenergy values in oil seeds (canola seeds), co-products (meal or pellets) from bio-oil processing plants in Canada in comparison with China. (3) Determine the relationship between protein molecular structural features and nutrient profiles in oil-seeds and co-products from bio-oil processing. Our results showed the possibility to characterize protein molecular structure using FT/IR molecular spectroscopy. Processing induced changes between oil seeds and co-products were found in the chemical, bioenergy profiles and protein molecular structure. However, no strong correlation was found between the chemical and nutrient profiles of oil seeds (canola seeds) and their protein molecular structure. On the other hand, co-products were strongly correlated with protein molecular structure in the chemical profile and bioenergy values. Generally, comparisons of oil seeds (canola seeds) and co-products (meal or pellets) in Canada, in China, and between Canada and China indicated the presence of variations among different crusher plants and bio-oil processing products.
Duffy, Fergal J; O'Donovan, Darragh; Devocelle, Marc; Moran, Niamh; O'Connell, David J; Shields, Denis C
2015-03-23
Protein-protein and protein-peptide interactions are responsible for the vast majority of biological functions in vivo, but targeting these interactions with small molecules has historically been difficult. What is required are efficient combined computational and experimental screening methods to choose among a number of potential protein interfaces worthy of targeting lead macrocyclic compounds for further investigation. To achieve this, we have generated combinatorial 3D virtual libraries of short disulfide-bonded peptides and compared them to pharmacophore models of important protein-protein and protein-peptide structures, including short linear motifs (SLiMs), protein-binding peptides, and turn structures at protein-protein interfaces, built from 3D models available in the Protein Data Bank. We prepared a total of 372 reference pharmacophores, which were matched against 108,659 multiconformer cyclic peptides. After normalization to exclude nonspecific cyclic peptides, the top hits notably are enriched for mimetics of turn structures, including a turn at the interaction surface of human α thrombin, and also feature several protein-binding peptides. The top cyclic peptide hits also cover the critical "hot spot" interaction sites predicted from the interaction crystal structure. We have validated our method by testing cyclic peptides predicted to inhibit thrombin, a key protein in the blood coagulation pathway of important therapeutic interest, identifying a cyclic peptide inhibitor with lead-like activity. We conclude that protein interfaces most readily targetable by cyclic peptides and related macrocyclic drugs may be identified computationally among a set of candidate interfaces, accelerating the choice of interfaces against which lead compounds may be screened.
Sokalingam, Sriram; Raghunathan, Govindan; Soundrarajan, Nagasundarapandian; Lee, Sun-Gu
2012-01-01
Two positively charged basic amino acids, arginine and lysine, are mostly exposed on the protein surface and play important roles in protein stability by forming electrostatic interactions. In particular, the guanidinium group of arginine allows interactions in three possible directions, which enables arginine to form a larger number of electrostatic interactions than lysine. The higher pKa of the basic residue in arginine may also generate more stable ionic interactions than lysine. This paper reports an investigation of whether the advantageous properties of arginine over lysine can be utilized to enhance protein stability. A variant of green fluorescent protein (GFP) was created by mutating the maximum possible number of lysine residues on the surface to arginines while retaining the activity. When the stability of the variant was examined under a range of denaturing conditions, the variant was more stable than control GFP in the presence of chemical denaturants such as urea, alkaline pH and ionic detergents, but the thermal stability of the protein was not changed. The modeled structure of the variant indicated putative new salt bridges and hydrogen bond interactions that help improve the rigidity of the protein against different chemical denaturants. Structural analyses of the electrostatic interactions also confirmed that the geometric properties of the guanidinium group in arginine had such effects. On the other hand, the altered electrostatic interactions induced by the mutagenesis of surface lysines to arginines adversely affected protein folding, which decreased the productivity of the functional form of the variant. These results suggest that mutagenesis of surface lysines to arginines can be considered one of the parameters in protein stability engineering. PMID:22792305
Pérez-Munive, Clara; Blumenthal, Sonal S D; de la Espina, Susana Moreno Díaz
2012-01-01
Plant cells have a well organized nucleus and nuclear matrix, but lack orthologues of the main structural components of the metazoan nuclear matrix. Although data is limited, most plant nuclear structural proteins are coiled-coil proteins, such as the NIFs (nuclear intermediate filaments) in Pisum sativum that cross-react with anti-intermediate filament and anti-lamin antibodies, form filaments 6-12 nm in diameter in vitro, and may play the role of lamins. We have investigated the conservation and features of NIFs in a monocot species, Allium cepa, and compared them with onion lamin-like proteins. Polyclonal antisera against the pea 65 kDa NIF were used in 1D and 2D Western blots, ICM (immunofluorescence confocal microscopy) and IEM (immunoelectron microscopy). Their presence in the nuclear matrix was analysed by differential extraction of nuclei, and their association with structural spectrin-like proteins by co-immunoprecipitation and co-localization in ICM. NIF is a conserved structural component of the nucleus and its matrix in monocots with Mr and pI values similar to those of pea 65 kDa NIF, which localized to the nuclear envelope, perichromatin domains and foci, and to the nuclear matrix, interacting directly with structural nuclear spectrin-like proteins. Its similarities with some of the proteins described as onion lamin-like proteins suggest that they are highly related or perhaps the same proteins.
Effects of power ultrasound on oxidation and structure of beef proteins during curing processing.
Kang, Da-Cheng; Zou, Yun-He; Cheng, Yu-Ping; Xing, Lu-Juan; Zhou, Guang-Hong; Zhang, Wan-Gang
2016-11-01
The aim of this study was to evaluate the effects of power ultrasound intensity (PUS; 2.39, 6.23, 11.32 and 20.96 W cm⁻²) and treatment time (30, 60, 90 and 120 min) on the oxidation and structure of beef proteins during the brining procedure with 6% NaCl concentration. The investigation was conducted with an ultrasonic generator with a frequency of 20 kHz and fresh beef at 48 h after slaughter. Analysis of TBARS (thiobarbituric acid reactive substances) contents showed that PUS treatment significantly increased the extent of lipid oxidation compared to static brining (P < 0.05). As indicators of protein oxidation, the carbonyl contents were significantly affected by PUS (P < 0.05). SDS-PAGE analysis showed that PUS treatment increased protein aggregation through disulfide cross-linking, indicated by the decreasing content of total sulfhydryl groups, which would contribute to protein oxidation. In addition, changes in protein structure after PUS treatment are suggested by the increases in free sulfhydryl residues and protein surface hydrophobicity. Fourier transform infrared spectroscopy (FTIR) provided further information about the changes in protein secondary structures, with increases in β-sheet and decreases in α-helix contents after PUS processing. These results indicate that PUS leads to changes in structures and oxidation of beef proteins caused by mechanical effects of cavitation and the resultant generation of free radicals. Copyright © 2016 Elsevier B.V. All rights reserved.
Superimposition of protein structures with dynamically weighted RMSD.
Wu, Di; Wu, Zhijun
2010-02-01
In protein modeling, one often needs to superimpose a group of structures for a protein. A common way to do this is to translate and rotate the structures so that the square root of the sum of squares of coordinate differences of the atoms in the structures, called the root-mean-square deviation (RMSD) of the structures, is minimized. While it has provided a general way of aligning a group of structures, this approach has not taken into account the fact that different atoms may have different properties and they should be compared differently. For this reason, when superimposed with RMSD, the coordinate differences of different atoms should be evaluated with different weights. The resulting RMSD is called the weighted RMSD (wRMSD). Here we investigate the use of a special wRMSD for superimposing a group of structures with weights assigned to the atoms according to certain thermal motions of the atoms. We call such an RMSD the dynamically weighted RMSD (dRMSD). We show that the thermal motions of the atoms can be obtained from several sources such as the mean-square fluctuations that can be estimated by Gaussian network model analysis. We show that the superimposition of structures with dRMSD can successfully identify protein domains and protein motions, and that it has important implications in practice, e.g., in aligning the ensemble of structures determined by nuclear magnetic resonance.
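The weighted RMSD described in this abstract can be illustrated with a minimal sketch. It assumes the two structures are already superimposed and normalizes by the total weight; the inverse-fluctuation weighting in `fluctuation_weights` is a hypothetical choice for illustration, since the abstract does not give the exact form of the weights.

```python
import math


def weighted_rmsd(coords_a, coords_b, weights=None):
    """Weighted RMSD between two already-superimposed coordinate sets.

    coords_a, coords_b: lists of (x, y, z) tuples, one per atom.
    weights: one non-negative weight per atom; equal weights if omitted.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    if weights is None:
        weights = [1.0] * len(coords_a)
    if len(weights) != len(coords_a):
        raise ValueError("need exactly one weight per atom")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    weighted_sq = 0.0
    for (ax, ay, az), (bx, by, bz), w in zip(coords_a, coords_b, weights):
        weighted_sq += w * ((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2)
    return math.sqrt(weighted_sq / total_weight)


def fluctuation_weights(mean_square_fluctuations):
    """Hypothetical weighting: atoms that fluctuate more count less."""
    return [1.0 / f for f in mean_square_fluctuations]


# Two atoms displaced by 1 and 3 units along x.
a = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
b = [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
plain = weighted_rmsd(a, b)
damped = weighted_rmsd(a, b, fluctuation_weights([1.0, 9.0]))
```

With equal weights the large displacement of the second atom dominates; giving that atom a small weight (because it is assumed to be mobile) lowers the score, which is the effect the weighting is meant to achieve.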
Jafari, Rahim; Sadeghi, Mehdi; Mirzaie, Mehdi
2016-05-01
The approaches taken to represent and describe structural features of the macromolecules are of major importance when developing computational methods for studying and predicting their structures and interactions. This study attempts to explore the significance of Delaunay tessellation for the definition of atomic interactions by evaluating its impact on the performance of scoring protein-protein docking prediction. Two sets of knowledge-based scoring potentials are extracted from a training dataset of native protein-protein complexes. The potential of the first set is derived using atomic interactions extracted from Delaunay tessellated structures. The potential of the second set is calculated conventionally, that is, using atom pairs whose interactions were determined by their separation distances. The scoring potentials were tested against two different docking decoy sets and their performances were compared. The results show that, if properly optimized, the Delaunay-based scoring potentials can achieve higher success rate than the usual scoring potentials. These results and the results of a previous study on the use of Delaunay-based potentials in protein fold recognition, all point to the fact that Delaunay tessellation of protein structure can provide a more realistic definition of atomic interaction, and therefore, if appropriately utilized, may be able to improve the accuracy of pair potentials. Copyright © 2016 Elsevier Inc. All rights reserved.
Evolutionary Strategies for Protein Folding
NASA Astrophysics Data System (ADS)
Murthy Gopal, Srinivasa; Wenzel, Wolfgang
2006-03-01
The free energy approach for predicting the protein tertiary structure describes the native state of a protein as the global minimum of an appropriate free-energy force field. The low-energy region of the free-energy landscape of a protein is extremely rugged. Efficient optimization methods must therefore speed up the search for the global optimum by avoiding high-energy transition states, adopting large-scale moves or accepting unphysical intermediates. Here we investigate an evolutionary strategy (ES) for optimizing a protein conformation in our all-atom free-energy force field [1],[2]. A set of random conformations is evolved using an ES to get a diverse population containing low-energy structures. The ES is shown to balance energy improvement and yet maintain diversity in structures. The ES is implemented as a master-client model for distributed computing. Starting from random structures and by using this optimization technique, we were able to fold a 20 amino-acid helical protein and a 16 amino-acid beta hairpin [3]. We compare the ES to the basin hopping method. [1] T. Herges and W. Wenzel, Biophys. J. 87, 3100 (2004) [2] A. Verma and W. Wenzel, Stabilization and folding of beta-sheet and alpha-helical proteins in an all-atom free energy model (submitted) (2005) [3] S. M. Gopal and W. Wenzel, Evolutionary Strategies for Protein Folding (in preparation)
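As a rough illustration of the idea of evolving a population of conformations toward low energy, here is a generic (mu + lambda) evolutionary strategy on a toy rugged function. This is not the authors' method: `toy_energy` is a hypothetical stand-in for a free-energy force field, all parameter values are arbitrary, and the sketch neither enforces population diversity nor distributes work across clients.

```python
import math
import random


def toy_energy(x):
    """Hypothetical rugged landscape; global minimum of 0 at the origin."""
    return sum(xi * xi + 2.0 * (1.0 - math.cos(3.0 * xi)) for xi in x)


def evolve(energy, dim=4, mu=5, lam=20, generations=50, sigma=0.5, seed=0):
    """Generic (mu + lambda) evolutionary strategy.

    Returns the final population (sorted by energy) and the best energy
    recorded before the first generation and after each generation.
    """
    rng = random.Random(seed)
    # Start from random "conformations" (points in a dim-dimensional box).
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                  for _ in range(mu)]
    history = [min(energy(p) for p in population)]

    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            parent = rng.choice(population)
            # Mutation: Gaussian perturbation of every coordinate.
            offspring.append([xi + rng.gauss(0.0, sigma) for xi in parent])
        # Elitist selection: parents compete with offspring for mu slots.
        population = sorted(population + offspring, key=energy)[:mu]
        history.append(energy(population[0]))
    return population, history


population, history = evolve(toy_energy)
```

Because parents survive unless displaced by better offspring, the best energy in `history` never increases from one generation to the next.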
NASA Astrophysics Data System (ADS)
Khairudin, Nurul Bahiyah Ahmad; Wahab, Habibah A.
In the current work, the structure of the enzyme CC chemokine eotaxin-3 (1G2S) was chosen as a case study to investigate the effects of gas phase on the predicted protein conformation using molecular dynamics simulation. Generally, simulating proteins in the gas phase tends to suffer from various drawbacks, among them excessive numbers of protein-protein hydrogen bonds. However, current results showed that the effects of gas phase simulation on 1G2S did not amplify the protein-protein hydrogen bonds. It was also found that some of the hydrogen bonds which were crucial in maintaining the secondary structural elements were disrupted. The predicted models showed high values of RMSD, 11.5 Å and 13.5 Å for the vacuum and explicit solvent simulations, respectively, indicating that the conformers were very much different from the native conformation. Even though the RMSD value for the in vacuo model was slightly lower, it suffered from a lower fraction of native contacts, poor hydrogen bonding networks and fewer occurrences of secondary structural elements compared to the solvated model. This finding supports the notion that water plays a dominant role in guiding the protein to fold along the correct path.
Cortés-Ruiz, Juan A; Pacheco-Aguilar, Ramón; Ramírez-Suárez, Juan C; Lugo-Sánchez, Maria E; García-Orozco, Karina D; Sotelo-Mundo, Rogerio R; Peña-Ramos, Aida
2016-04-01
Conformational and thermal-rheological properties of acidic (APC) and neutral (NPC) protein concentrates were evaluated and compared to those of squid (Dosidicus gigas) muscle proteins (SM). Surface hydrophobicity, sulfhydryl status, secondary structure profile, differential scanning calorimetry and oscillatory dynamic rheology were used to evaluate the effect of treatments on protein properties. Acidic condition during the washing process (APC) promoted structural and conformational changes in the protein present in the concentrate produced. These changes were enhanced during the heat setting of the corresponding sol. Results demonstrate that washing squid muscle under the proposed acidic conditions is a feasible technological alternative for squid-based surimi production improving its yield and gel-forming ability. Copyright © 2015. Published by Elsevier Ltd.
Comparison of effect of gamma ray irradiation on wild-type and N-terminal mutants of αA-crystallin.
Ramkumar, Srinivasagan; Fujii, Noriko; Fujii, Norihiko; Thankappan, Bency; Sakaue, Hiroaki; Ingu, Kim; Natarajaseenivasan, Kalimuthusamy; Anbarasu, Kumarasamy
2014-01-01
To study the comparative structural and functional changes between wild-type (wt) and N-terminal congenital cataract causing αA-crystallin mutants (R12C, R21L, R49C, and R54C) upon exposure to different dosages of gamma rays. Alpha A crystallin N-terminal mutants were created with the site-directed mutagenesis method. The recombinantly overexpressed and purified wt and mutant proteins were used for further studies. A (60)Co source was used to generate gamma rays to irradiate wild and mutant proteins at dosages of 0.5, 1.0, and 2.0 kGy. The biophysical property of the gamma irradiated (GI) and non-gamma irradiated (NGI) αA-crystallin wt and N-terminal mutants were determined. Oligomeric size was determined by size exclusion high-performance liquid chromatography (HPLC), the secondary structure with circular dichroism (CD) spectrometry, conformation of proteins with surface hydrophobicity, and the functional characterization were determined regarding chaperone activity using the alcohol dehydrogenase (ADH) aggregation assay. αA-crystallin N-terminal mutants formed high molecular weight (HMW) cross-linked products as well as aggregates when exposed to GI compared to the NGI wt counterparts. Furthermore, all mutants exhibited changed β-sheet and random coil structure. The GI mutants demonstrated decreased surface hydrophobicity when compared to αA-crystallin wt at 0, 1.0, and 1.5 kGy; however, at 2.0 kGy a drastic increase in hydrophobicity was observed only in the mutant R54C, not the wt. In contrast, chaperone activity toward ADH was gradually elevated at the minimum level in all GI mutants, and significant elevation was observed in the R12C mutant. Our findings suggest that the N-terminal mutants of αA-crystallin are structurally and functionally more sensitive to GI when compared to their NGI counterparts and wt. Protein oxidation as a result of gamma irradiation drives the protein to cross-link and aggregate culminating in cataract formation.
Thapliyal, Charu; Jain, Neha; Chaudhuri, Pratima
2015-01-01
A protein, differing in origin, may exhibit variable physicochemical behaviour, difference in sequence homology, fold and function. Thus studying structure-function relationship of proteins from altered sources is meaningful in the sense that it may give rise to comparative aspects of their sequence-structure-function relationship. Dihydrofolate reductase is an enzyme involved in cell cycle regulation. It is a significant enzyme as a target for developing anticancer drugs. Hence, detailed understanding of structure-function relationships of wide variants of the enzyme dihydrofolate reductase would be important for developing an inhibitor or an antagonist against the enzyme involved in the cellular developmental processes. In this communication, we have reported the comparative structure-function relationship between E. coli and human dihydrofolate reductase. The differences in the unfolding behaviour of these two proteins have been investigated to understand various properties of these two proteins like relative stability differences and variation in conformational changes under identical denaturing conditions. The equilibrium unfolding mechanism of dihydrofolate reductase proteins using guanidine hydrochloride as a denaturant in the presence of various types of osmolytes has been monitored using loss in enzymatic activity, intrinsic tryptophan fluorescence and an extrinsic fluorophore 8-anilino-1-naphthalene-sulfonic acid as probes. It has been observed that osmolytes, such as 1 M sucrose and 30% glycerol, provided enhanced stability to both variants of dihydrofolate reductase. Their level of stabilisation has been observed to be dependent on intrinsic protein stability. It was observed that 100 mM proline does not show any significant stabilisation to either of the dihydrofolate reductases. In the present study, it has been observed that the human protein is relatively less stable than the E. coli counterpart.
Analysis of zinc binding sites in protein crystal structures.
Alberts, I L; Nadassy, K; Wodak, S J
1998-08-01
The geometrical properties of zinc binding sites in a dataset of high quality protein crystal structures deposited in the Protein Data Bank have been examined to identify important differences between zinc sites that are directly involved in catalysis and those that play a structural role. Coordination angles in the zinc primary coordination sphere are compared with ideal values for each coordination geometry, and zinc coordination distances are compared with those in small zinc complexes from the Cambridge Structural Database as a guide of expected trends. We find that distances and angles in the primary coordination sphere are in general close to the expected (or ideal) values. Deviations occur primarily for oxygen coordinating atoms and are found to be mainly due to H-bonding of the oxygen coordinating ligand to protein residues, bidentate binding arrangements, and multi-zinc sites. We find that H-bonding of oxygen containing residues (or water) to zinc bound histidines is almost universal in our dataset and defines the elec-His-Zn motif. Analysis of the stereochemistry shows that carboxyl elec-His-Zn motifs are geometrically rigid, while water elec-His-Zn motifs show the most geometrical variation. As catalytic motifs have a higher proportion of carboxyl elec atoms than structural motifs, they provide a more rigid framework for zinc binding. This is understood biologically, as a small distortion in the zinc position in an enzyme can have serious consequences on the enzymatic reaction. We also analyze the sequence pattern of the zinc ligands and residues that provide elecs, and identify conserved hydrophobic residues in the endopeptidases that also appear to contribute to stabilizing the catalytic zinc site. A zinc binding template in protein crystal structures is derived from these observations.
Hidden relationships between metalloproteins unveiled by structural comparison of their metal sites
NASA Astrophysics Data System (ADS)
Valasatava, Yana; Andreini, Claudia; Rosato, Antonio
2015-03-01
Metalloproteins account for a substantial fraction of all proteins. They incorporate metal atoms, which are required for their structure and/or function. Here we describe a new computational protocol to systematically compare and classify metal-binding sites on the basis of their structural similarity. These sites are extracted from the MetalPDB database of minimal functional sites (MFSs) in metal-binding biological macromolecules. Structural similarity is measured by the scoring function of the available MetalS2 program. Hierarchical clustering was used to organize MFSs into clusters, for each of which a representative MFS was identified. The comparison of all representative MFSs provided a thorough structure-based classification of the sites analyzed. As examples, the application of the proposed computational protocol to all heme-binding proteins and zinc-binding proteins of known structure highlighted the existence of structural subtypes, validated known evolutionary links and shed new light on the occurrence of similar sites in systems at different evolutionary distances. The present approach thus makes available an innovative viewpoint on metalloproteins, where the functionally crucial metal sites effectively lead the discovery of structural and functional relationships in a largely protein-independent manner.
Automated prediction of protein function and detection of functional sites from structure.
Pazos, Florencio; Sternberg, Michael J E
2004-10-12
Current structural genomics projects are yielding structures for proteins whose functions are unknown. Accordingly, there is a pressing requirement for computational methods for function prediction. Here we present PHUNCTIONER, an automatic method for structure-based function prediction using automatically extracted functional sites (residues associated to functions). The method relates proteins with the same function through structural alignments and extracts 3D profiles of conserved residues. Functional features to train the method are extracted from the Gene Ontology (GO) database. The method extracts these features from the entire GO hierarchy and hence is applicable across the whole range of function specificity. 3D profiles associated with 121 GO annotations were extracted. We tested the power of the method both for the prediction of function and for the extraction of functional sites. The success of function prediction by our method was compared with the standard homology-based method. In the zone of low sequence similarity (approximately 15%), our method assigns the correct GO annotation in 90% of the protein structures considered, approximately 20% higher than inheritance of function from the closest homologue.
Sivakolundu, Sivashankar G; Mabrouk, Patricia Ann
2003-05-01
The complete solution structure of ferrocytochrome c in 30% acetonitrile/70% water has been determined using high-field 1D and 2D (1)H NMR methods and deposited in the Protein Data Bank with codes 1LC1 and 1LC2. This is the first time a complete solution protein structure has been determined for a protein in nonaqueous media. Ferrocyt c retains a native protein secondary structure (five alpha-helices and two omega loops) in 30% acetonitrile. H18 and M80 residues are the axial heme ligands, as in aqueous solution. Residues believed to be axial heme ligands in the alkaline-like conformers of ferricyt c, specifically H33 and K72, are positioned close to the heme iron. The orientations of both heme propionates are markedly different in 30% acetonitrile/70% water. Comparative structural analysis of reduced cyt c in 30% acetonitrile/70% water solution with cyt c in different environments has given new insight into the cyt c folding mechanism, the electron transfer pathway, and cell apoptosis.
Simplified Protein Models: Predicting Folding Pathways and Structure Using Amino Acid Sequences
NASA Astrophysics Data System (ADS)
Adhikari, Aashish N.; Freed, Karl F.; Sosnick, Tobin R.
2013-07-01
We demonstrate the ability of simultaneously determining a protein’s folding pathway and structure using a properly formulated model without prior knowledge of the native structure. Our model employs a natural coordinate system for describing proteins and a search strategy inspired by the observation that real proteins fold in a sequential fashion by incrementally stabilizing nativelike substructures or “foldons.” Comparable folding pathways and structures are obtained for the twelve proteins recently studied using atomistic molecular dynamics simulations [K. Lindorff-Larsen, S. Piana, R. O. Dror, D. E. Shaw, Science 334, 517 (2011)], with our calculations running several orders of magnitude faster. We find that nativelike propensities in the unfolded state do not necessarily determine the order of structure formation, a departure from a major conclusion of the molecular dynamics study. Instead, our results support a more expansive view wherein intrinsic local structural propensities may be enhanced or overridden in the folding process by environmental context. The success of our search strategy validates it as an expedient mechanism for folding both in silico and in vivo.
Systematic Validation of Protein Force Fields against Experimental Data
Eastwood, Michael P.; Dror, Ron O.; Shaw, David E.
2012-01-01
Molecular dynamics simulations provide a vehicle for capturing the structures, motions, and interactions of biological macromolecules in full atomic detail. The accuracy of such simulations, however, is critically dependent on the force field—the mathematical model used to approximate the atomic-level forces acting on the simulated molecular system. Here we present a systematic and extensive evaluation of eight different protein force fields based on comparisons of experimental data with molecular dynamics simulations that reach a previously inaccessible timescale. First, through extensive comparisons with experimental NMR data, we examined the force fields' abilities to describe the structure and fluctuations of folded proteins. Second, we quantified potential biases towards different secondary structure types by comparing experimental and simulation data for small peptides that preferentially populate either helical or sheet-like structures. Third, we tested the force fields' abilities to fold two small proteins—one α-helical, the other with β-sheet structure. The results suggest that force fields have improved over time, and that the most recent versions, while not perfect, provide an accurate description of many structural and dynamical properties of proteins. PMID:22384157
Exploring Fold Space Preferences of New-born and Ancient Protein Superfamilies
Edwards, Hannah; Abeln, Sanne; Deane, Charlotte M.
2013-01-01
The evolution of proteins is one of the fundamental processes that has delivered the diversity and complexity of life we see around ourselves today. While we tend to define protein evolution in terms of sequence level mutations, insertions and deletions, it is hard to translate these processes to a more complete picture incorporating a polypeptide's structure and function. By considering how protein structures change over time we can gain an entirely new appreciation of their long-term evolutionary dynamics. In this work we seek to identify how populations of proteins at different stages of evolution explore their possible structure space. We use an annotation of superfamily age to this space and explore the relationship between these ages and a diverse set of properties pertaining to a superfamily's sequence, structure and function. We note several marked differences between the populations of newly evolved and ancient structures, such as in their length distributions, secondary structure content and tertiary packing arrangements. In particular, many of these differences suggest a less elaborate structure for newly evolved superfamilies when compared with their ancient counterparts. We show that the structural preferences we report are not a residual effect of a more fundamental relationship with function. Furthermore, we demonstrate the robustness of our results, using significant variation in the algorithm used to estimate the ages. We present these age estimates as a useful tool to analyse protein populations. In particular, we apply this in a comparison of domains containing greek key or jelly roll motifs. PMID:24244135
Lemieux, M Joanne
2007-01-01
The major facilitator superfamily (MFS) of transporters represents the largest family of secondary active transporters and has a diverse range of substrates. With structural information for four MFS transporters, we can see a strong structural commonality suggesting, as predicted, a common architecture for MFS transporters. The rate for crystal structure determination of MFS transporters is slow, making modeling of both prokaryotic and eukaryotic transporters more enticing. In this review, models of the eukaryotic transporters Glut1, G6PT, OCT1, OCT2 and Pho84, based on the crystal structure of the prokaryotic GlpT or on the crystal structure of LacY, are discussed. The techniques used to generate the different models are compared. In addition, the validity of these models and the strategy of using prokaryotic crystal structures to model eukaryotic proteins are discussed. For comparison, E. coli GlpT was modeled based on the E. coli LacY structure and compared to the crystal structure of GlpT, demonstrating that experimental evidence is essential for accurate modeling of membrane proteins.
qPIPSA: Relating enzymatic kinetic parameters and interaction fields
Gabdoulline, Razif R; Stein, Matthias; Wade, Rebecca C
2007-01-01
Background: The simulation of metabolic networks in quantitative systems biology requires the assignment of enzymatic kinetic parameters. Experimentally determined values are often not available and therefore computational methods to estimate these parameters are needed. It is possible to use the three-dimensional structure of an enzyme to perform simulations of a reaction and derive kinetic parameters. However, this is computationally demanding and requires detailed knowledge of the enzyme mechanism. We have therefore sought to develop a general, simple and computationally efficient procedure to relate protein structural information to enzymatic kinetic parameters that allows consistency between the kinetic and structural information to be checked and estimation of kinetic constants for structurally and mechanistically similar enzymes. Results: We describe qPIPSA: quantitative Protein Interaction Property Similarity Analysis. In this analysis, molecular interaction fields, for example, electrostatic potentials, are computed from the enzyme structures. Differences in molecular interaction fields between enzymes are then related to the ratios of their kinetic parameters. This procedure can be used to estimate unknown kinetic parameters when enzyme structural information is available and kinetic parameters have been measured for related enzymes or were obtained under different conditions. The detailed interaction of the enzyme with substrate or cofactors is not modeled and is assumed to be similar for all the proteins compared. The protein structure modeling protocol employed ensures that differences between models reflect genuine differences between the protein sequences, rather than random fluctuations in protein structure. Conclusion: Provided that the experimental conditions and the protein structural models refer to the same protein state or conformation, correlations between interaction fields and kinetic parameters can be established for sets of related enzymes. Outliers may arise due to variation in the importance of different contributions to the kinetic parameters, such as protein stability and conformational changes. The qPIPSA approach can assist in the validation as well as estimation of kinetic parameters, and provide insights into enzyme mechanism. PMID:17919319
Yan, Yumeng; Wen, Zeyu; Wang, Xinxiang; Huang, Sheng-You
2017-03-01
Protein-protein docking is an important computational tool for predicting protein-protein interactions. With the rapid development of proteomics projects, more and more experimental binding information ranging from mutagenesis data to three-dimensional structures of protein complexes is becoming available. Therefore, how to appropriately incorporate the biological information into traditional ab initio docking has been an important issue and challenge in the field of protein-protein docking. To address these challenges, we have developed a Hybrid DOCKing protocol of template-based and template-free approaches, referred to as HDOCK. The basic procedure of HDOCK is to model the structures of individual components based on the template complex by a template-based method if a template is available; otherwise, the component structures will be modeled based on monomer proteins by regular homology modeling. Then, the complex structure of the component models is predicted by traditional protein-protein docking. With the HDOCK protocol, we have participated in the CAPRI experiment for rounds 28-35. Out of the 25 CASP-CAPRI targets for oligomer modeling, our HDOCK protocol predicted correct models for 16 targets, ranking among the top algorithms in this challenge. Our docking method also made correct predictions on other CAPRI challenges such as protein-peptide binding for 6 out of 8 targets and water predictions for 2 out of 2 targets. The advantage of our hybrid docking approach over pure template-based docking was further confirmed by a comparative evaluation on 20 CASP-CAPRI targets. Proteins 2017; 85:497-512. © 2016 Wiley Periodicals, Inc.
Improving Protein Fold Recognition by Deep Learning Networks
NASA Astrophysics Data System (ADS)
Jo, Taeho; Hou, Jie; Eickholt, Jesse; Cheng, Jianlin
2015-12-01
For accurate recognition of protein folds, a deep learning network method (DN-Fold) was developed to predict if a given query-template protein pair belongs to the same structural fold. The input used stemmed from the protein sequence and structural features extracted from the protein pair. We evaluated the performance of DN-Fold along with 18 different methods on Lindahl’s benchmark dataset and on a large benchmark set extracted from SCOP 1.75 consisting of about one million protein pairs, at three different levels of fold recognition (i.e., protein family, superfamily, and fold) depending on the evolutionary distance between protein sequences. The correct recognition rate of ensembled DN-Fold for Top 1 predictions is 84.5%, 61.5%, and 33.6% and for Top 5 is 91.2%, 76.5%, and 60.7% at family, superfamily, and fold levels, respectively. We also evaluated the performance of single DN-Fold (DN-FoldS), which showed comparable results at the family and superfamily levels, compared to ensemble DN-Fold. Finally, we extended the binary classification problem of fold recognition to a real-value regression task, which also showed promising performance. DN-Fold is freely available through a web server at http://iris.rnet.missouri.edu/dnfold.
Merkley, Eric D; Rysavy, Steven; Kahraman, Abdullah; Hafen, Ryan P; Daggett, Valerie; Adkins, Joshua N
2014-06-01
Integrative structural biology attempts to model the structures of protein complexes that are challenging or intractable by classical structural methods (due to size, dynamics, or heterogeneity) by combining computational structural modeling with data from experimental methods. One such experimental method is chemical crosslinking mass spectrometry (XL-MS), in which protein complexes are crosslinked and characterized using liquid chromatography-mass spectrometry to pinpoint specific amino acid residues in close structural proximity. The commonly used lysine-reactive N-hydroxysuccinimide ester reagents disuccinimidylsuberate (DSS) and bis(sulfosuccinimidyl)suberate (BS(3)) have a linker arm that is 11.4 Å long when fully extended, allowing Cα (alpha carbon of protein backbone) atoms of crosslinked lysine residues to be up to ∼24 Å apart. However, XL-MS studies on proteins of known structure frequently report crosslinks that exceed this distance. Typically, a tolerance of ∼3 Å is added to the theoretical maximum to account for this observation, with limited justification for the chosen value. We used the Dynameomics database, a repository of high-quality molecular dynamics simulations of 807 proteins representative of diverse protein folds, to investigate the relationship between lysine-lysine distances in experimental starting structures and in simulation ensembles. We conclude that for DSS/BS(3), a distance constraint of 26-30 Å between Cα atoms is appropriate. This analysis provides a theoretical basis for the widespread practice of adding a tolerance to the crosslinker length when comparing XL-MS results to structures or in modeling. We also discuss the comparison of XL-MS results to MD simulations and known structures as a means to test and validate experimental XL-MS methods. © 2014 The Protein Society.
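A Cα-Cα distance constraint of this kind can be checked against a structure with a few lines of code. The sketch below is illustrative only: the function names, the dictionary layout for coordinates, and the example residue numbers are all hypothetical, and the default cutoff of 30 Å is simply the upper end of the 26-30 Å range quoted in the abstract.

```python
import math


def ca_distance(a, b):
    """Euclidean distance between two (x, y, z) positions in angstroms."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def violated_crosslinks(ca_coords, crosslinks, max_distance=30.0):
    """Return crosslinks whose C-alpha separation exceeds max_distance.

    ca_coords: dict mapping residue number -> (x, y, z) of its C-alpha atom.
    crosslinks: iterable of (residue_i, residue_j) pairs reported by XL-MS.
    """
    violations = []
    for res_i, res_j in crosslinks:
        distance = ca_distance(ca_coords[res_i], ca_coords[res_j])
        if distance > max_distance:
            violations.append((res_i, res_j, distance))
    return violations


# Hypothetical lysine C-alpha positions laid out along the z axis.
coords = {10: (0.0, 0.0, 0.0), 20: (0.0, 0.0, 25.0), 30: (0.0, 0.0, 40.0)}
links = [(10, 20), (10, 30), (20, 30)]
too_long = violated_crosslinks(coords, links)
too_long_strict = violated_crosslinks(coords, links, max_distance=24.0)
```

Running the check with both the theoretical maximum (about 24 Å) and the tolerance-extended cutoff shows which crosslinks are rescued by the added tolerance, here the 25 Å pair between residues 10 and 20.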
Improved protein surface comparison and application to low-resolution protein structure data
2010-01-01
Background: Recent advancements of experimental techniques for determining protein tertiary structures raise significant challenges for protein bioinformatics. With the number of known structures of unknown function expanding at a rapid pace, an urgent task is to provide reliable clues to their biological function on a large scale. Conventional approaches for structure comparison are not suitable for a real-time database search due to their slow speed. Moreover, a new challenge has arisen from recent techniques such as electron microscopy (EM), which provide low-resolution structure data. Previously, we have introduced a method for protein surface shape representation using the 3D Zernike descriptors (3DZDs). The 3DZD enables fast structure database searches, taking advantage of its rotation invariance and compact representation. The search results of protein surfaces represented with the 3DZD have shown good agreement with the existing structure classifications, but some discrepancies were also observed. Results: Three surface representations are examined: a new representation of backbone atoms, the originally devised all-atom surface representation, and the combination of the all-atom surface with the backbone representation. All representations are encoded with the 3DZD. Also, we have investigated the applicability of the 3DZD for searching protein EM density maps of varying resolutions. The surface representations are evaluated on structure retrieval using two existing classifications, SCOP and the CE-based classification. Conclusions: Overall, the 3DZDs representing backbone atoms show better retrieval performance than the original all-atom surface representation. The performance further improved when the two representations were combined. Moreover, we observed that the 3DZD is also powerful in comparing low-resolution structures obtained by electron microscopy. PMID:21172052
Analysis of Translocation-Competent Secretory Proteins by HDX-MS.
Tsirigotaki, A; Papanastasiou, M; Trelle, M B; Jørgensen, T J D; Economou, A
2017-01-01
Protein folding is an intricate and precise process in living cells. Most exported proteins evade cytoplasmic folding, become targeted to the membrane, and then trafficked into/across membranes. Their targeting and translocation-competent states are nonnatively folded. However, once they reach the appropriate cellular compartment, they can fold to their native states. The nonnative states of preproteins remain structurally poorly characterized since increased disorder, protein sizes, aggregation propensity, and the observation timescale are often limiting factors for typical structural approaches such as X-ray crystallography and NMR. Here, we present an alternative approach for the in vitro analysis of nonfolded translocation-competent protein states and their comparison with their native states. We make use of hydrogen/deuterium exchange coupled with mass spectrometry (HDX-MS), a method based on differentiated isotope exchange rates in structured vs unstructured protein states/regions, and highly dynamic vs more rigid regions. We present a complete structural characterization pipeline, starting from the preparation of the polypeptides to data analysis and interpretation. Proteolysis and mass spectrometric conditions for the analysis of the labeled proteins are discussed, followed by the analysis and interpretation of HDX-MS data. We highlight the suitability of HDX-MS for identifying short structured regions within otherwise highly flexible protein states, as illustrated by an exported protein example, experimentally tested in our lab. Finally, we discuss statistical analysis in comparative HDX-MS. The protocol is applicable to any protein and protein size, exhibiting slow or fast loss of translocation competence. It could be easily adapted to more complex assemblies, such as the interaction of chaperones with nonnative protein states. © 2017 Elsevier Inc. All rights reserved.
Butts, Carter T.; Bierma, Jan C.; Martin, Rachel W.
2016-01-01
In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a “ferment” similar to mammalian pepsin, an aspartic protease. Here we report a high-quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all-atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. PMID:27353064
Comparing pharmacophore models derived from crystallography and NMR ensembles
NASA Astrophysics Data System (ADS)
Ghanakota, Phani; Carlson, Heather A.
2017-11-01
NMR and X-ray crystallography are the two most widely used methods for determining protein structures. Our previous study examining NMR versus X-Ray sources of protein conformations showed improved performance with NMR structures when used in our Multiple Protein Structures (MPS) method for receptor-based pharmacophores (Damm, Carlson, J Am Chem Soc 129:8225-8235, 2007). However, that work was based on a single test case, HIV-1 protease, because of the rich data available for that system. New data for more systems are available now, which calls for further examination of the effect of different sources of protein conformations. The MPS technique was applied to Growth factor receptor bound protein 2 (Grb2), Src SH2 homology domain (Src-SH2), FK506-binding protein 1A (FKBP12), and Peroxisome proliferator-activated receptor-γ (PPAR-γ). Pharmacophore models from both crystal and NMR ensembles were able to discriminate between high-affinity, low-affinity, and decoy molecules. As we found in our original study, NMR models showed optimal performance when all elements were used. The crystal models had more pharmacophore elements compared to their NMR counterparts. The crystal-based models exhibited optimum performance only when pharmacophore elements were dropped. This supports our assertion that the higher flexibility in NMR ensembles helps focus the models on the most essential interactions with the protein. Our studies suggest that the "extra" pharmacophore elements seen at the periphery in X-ray models arise as a result of decreased protein flexibility and make very little contribution to model performance.
Random close packing in protein cores
NASA Astrophysics Data System (ADS)
Ohern, Corey
Shortly after the determination of the first protein x-ray crystal structures, researchers analyzed their cores and reported packing fractions ϕ ~ 0.75, a value that is similar to close packing of equal-sized spheres. A limitation of these analyses was the use of 'extended atom' models, rather than the more physically accurate 'explicit hydrogen' model. The validity of using the explicit hydrogen model is proved by its ability to predict the side chain dihedral angle distributions observed in proteins. We employ the explicit hydrogen model to calculate the packing fraction of the cores of over 200 high resolution protein structures. We find that these protein cores have ϕ ~ 0.55, which is comparable to random close-packing of non-spherical particles. This result provides a deeper understanding of the physical basis of protein structure that will enable predictions of the effects of amino acid mutations and design of new functional proteins. We gratefully acknowledge the support of the Raymond and Beverly Sackler Institute for Biological, Physical, and Engineering Sciences, National Library of Medicine training grant T15LM00705628 (J.C.G.), and National Science Foundation DMR-1307712 (L.R.).
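For reference, the close-packing value that the early estimate of ϕ ~ 0.75 is compared with can be checked with a few lines of arithmetic. The sketch below computes the packing fraction of equal spheres in a face-centred cubic cell, using the standard geometry of four spheres per cubic cell with edge 2√2·r, plus a generic helper for the ratio of occupied to total volume. It ignores atom overlaps and is not the explicit-hydrogen calculation described in the abstract.

```python
import math

def packing_fraction(occupied_volumes, total_volume):
    """Fraction of total_volume filled by non-overlapping occupied volumes."""
    return sum(occupied_volumes) / total_volume

def sphere_volume(radius):
    return (4.0 / 3.0) * math.pi * radius ** 3

def fcc_packing_fraction(radius=1.0):
    """Packing fraction of equal spheres in a face-centred cubic cell.

    The conventional cubic cell holds 4 spheres and has edge a = 2*sqrt(2)*r.
    """
    edge = 2.0 * math.sqrt(2.0) * radius
    return packing_fraction([sphere_volume(radius)] * 4, edge ** 3)

if __name__ == "__main__":
    print(round(fcc_packing_fraction(), 4))
```

The result equals π/(3√2), about 0.74, which is close to the 0.75 figure and well above the ϕ ~ 0.55 reported for the explicit hydrogen model.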
Huang, Sheng Yu; Chen, Sung Fang; Chen, Chun Hao; Huang, Hsuan Wei; Wu, Wen Guey; Sung, Wang Chou
2014-09-02
Snake venom consists of toxin proteins with multiple disulfide linkages to generate unique structures and biological functions. Determination of these cysteine connections usually requires the purification of each protein followed by structural analysis. In this study, dimethyl labeling coupled with LC-MS/MS and RADAR algorithm was developed to identify the disulfide bonds in crude snake venom. Without any protein separation, the disulfide linkages of several cytotoxins and PLA2 could be solved, including more than 20 disulfide bonds. The results show that this method is capable of analyzing protein mixture. In addition, the approach was also used to compare native cytotoxin 3 (CTX III) and its scrambled isomer, another category of protein mixture, for unknown disulfide bonds. Two disulfide-linked peptides were observed in the native CTX III, and 10 in its scrambled form, X-CTX III. This is the first study that reports a platform for the global cysteine connection analysis on a protein mixture. The proposed method is simple and automatic, offering an efficient tool for structural and functional studies of venom proteins.
Nishizawa, Toyohiko; Yoshimizu, Mamoru; Winton, James R.; Kimura, Takahisa
1991-01-01
Genomic RNA was extracted from purified virions of hirame rhabdovirus (HRV), infectious hematopoietic necrosis virus (IHNV), and viral hemorrhagic septicemia virus (VHSV). The full-length RNA was analyzed using formaldehyde agarose gel electrophoresis followed by ethidium bromide staining. Compared with an internal RNA size standard, all three viral genomic RNAs appeared to have identical relative mobilities and were estimated to be approximately 10.7 kilobases in length or about 3.7 megadaltons in molecular mass. Structural protein synthesis of HRV, IHNV, and VHSV was studied using cell cultures treated with actinomycin D. At 2 h intervals, proteins were labeled with 35S-methionine, extracted, and analyzed by SDS-polyacrylamide gel electrophoresis and autoradiography. The five structural proteins of each of the three viruses appeared in the following order : nucleoprotein (N), matrix protein 1 (M1), matrix protein 2 (M2), glycoprotein (G), and polymerase (L) reflecting both the approximate relative abundance of each protein within infected cells and the gene order within the viral genome.
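The two size estimates quoted above (about 10.7 kilobases and about 3.7 megadaltons) can be cross-checked with simple arithmetic. The sketch below only divides one figure by the other to obtain the average mass per nucleotide that they jointly imply; it does not assert what conversion factor the authors used, and the function names are illustrative.

```python
GENOME_LENGTH_NT = 10_700   # about 10.7 kilobases, from the abstract
GENOME_MASS_DA = 3.7e6      # about 3.7 megadaltons, from the abstract

def implied_mass_per_nucleotide(mass_da, length_nt):
    """Average mass per nucleotide (Da) implied by a mass/length pair."""
    return mass_da / length_nt

def estimated_mass_mda(length_nt, da_per_nt):
    """Molecular mass in megadaltons for a given length and per-nucleotide mass."""
    return length_nt * da_per_nt / 1e6

if __name__ == "__main__":
    per_nt = implied_mass_per_nucleotide(GENOME_MASS_DA, GENOME_LENGTH_NT)
    print(round(per_nt, 1), "Da per nucleotide")
```

The implied value is roughly 346 Da per nucleotide, so the two estimates are mutually consistent as reported.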
Kalinowska, Barbara; Banach, Mateusz; Konieczny, Leszek; Marchewka, Damian; Roterman, Irena
2014-01-01
This work discusses the role of unstructured polypeptide chain fragments in shaping the protein's hydrophobic core. Based on the "fuzzy oil drop" model, which assumes an idealized distribution of hydrophobicity density described by the 3D Gaussian, we can determine which fragments make up the core and pinpoint residues whose location conflicts with theoretical predictions. We show that the structural influence of the water environment determines the positions of disordered fragments, leading to the formation of a hydrophobic core overlaid by a hydrophilic mantle. This phenomenon is further described by studying selected proteins which are known to be unstable and contain intrinsically disordered fragments. Their properties are established quantitatively, explaining the causative relation between the protein's structure and function and facilitating further comparative analyses of various structural models. © 2014 Elsevier Inc. All rights reserved.
Papanikolopoulou, Katerina; Schoehn, Guy; Forge, Vincent; Forsyth, V Trevor; Riekel, Christian; Hernandez, Jean-François; Ruigrok, Rob W H; Mitraki, Anna
2005-01-28
Amyloid fibrils are fibrous beta-structures that derive from abnormal folding and assembly of peptides and proteins. Despite a wealth of structural studies on amyloids, the nature of the amyloid structure remains elusive; possible connections to natural, beta-structured fibrous motifs have been suggested. In this work we focus on understanding amyloid structure and formation from sequences of a natural, beta-structured fibrous protein. We show that short peptides (25 to 6 amino acids) corresponding to repetitive sequences from the adenovirus fiber shaft have an intrinsic capacity to form amyloid fibrils as judged by electron microscopy, Congo Red binding, infrared spectroscopy, and x-ray fiber diffraction. In the presence of the globular C-terminal domain of the protein that acts as a trimerization motif, the shaft sequences adopt a triple-stranded, beta-fibrous motif. We discuss the possible structure and arrangement of these sequences within the amyloid fibril, as compared with the one adopted within the native structure. A 6-amino acid peptide, corresponding to the last beta-strand of the shaft, was found to be sufficient to form amyloid fibrils. Structural analysis of these amyloid fibrils suggests that perpendicular stacking of beta-strand repeat units is an underlying common feature of amyloid formation.
Plegaria, Jefferson S; Dzul, Stephen P; Zuiderweg, Erik R P; Stemmler, Timothy L; Pecoraro, Vincent L
2015-05-12
De novo protein design is a biologically relevant approach that provides a novel process in elucidating protein folding and modeling the metal centers of metalloproteins in a completely unrelated or simplified fold. An integral step in de novo protein design is the establishment of a well-folded scaffold with one conformation, which is a fundamental characteristic of many native proteins. Here, we report the NMR solution structure of apo α3DIV at pH 7.0, a de novo designed three-helix bundle peptide containing a triscysteine motif (Cys18, Cys28, and Cys67) that binds toxic heavy metals. The structure comprises 1067 NOE restraints derived from multinuclear multidimensional NOESY, as well as 138 dihedral angles (ψ, φ, and χ1). The backbone and heavy atoms of the 20 lowest energy structures have a root mean square deviation from the mean structure of 0.79 (0.16) Å and 1.31 (0.15) Å, respectively. When compared to the parent structure α3D, the substitution of Leu residues to Cys enhanced the α-helical content of α3DIV while maintaining the same overall topology and fold. In addition, solution studies on the metalated species illustrated metal-induced stability. An increase in the melting temperatures was observed for Hg(II), Pb(II), or Cd(II) bound α3DIV by 18-24 °C compared to its apo counterpart. Further, the extended X-ray absorption fine structure analysis on Hg(II)-α3DIV produced an average Hg(II)-S bond length at 2.36 Å, indicating a trigonal T-shaped coordination environment. Overall, the structure of apo α3DIV reveals an asymmetric distorted triscysteine metal binding site, which offers a model for native metalloregulatory proteins with thiol-rich ligands that function in regulating toxic heavy metals, such as ArsR, CadC, MerR, and PbrR.
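The ensemble statistic quoted above, a root mean square deviation from the mean structure over the 20 lowest energy models, can be sketched as follows. This minimal Python example assumes the models are already superimposed and share the same atom ordering, and it reads the reported pair of numbers as a mean and a spread across models; both points are assumptions of the sketch rather than details stated in the abstract.

```python
import math

def mean_structure(models):
    """Average coordinates atom by atom over pre-superimposed models."""
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(model[i][k] for model in models) / n_models for k in range(3))
        for i in range(n_atoms)
    ]

def rmsd(coords_a, coords_b):
    """Root mean square deviation (Å) between two equally ordered atom lists."""
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

def ensemble_rmsd(models):
    """Mean and population standard deviation of each model's RMSD to the mean."""
    reference = mean_structure(models)
    values = [rmsd(model, reference) for model in models]
    average = sum(values) / len(values)
    spread = math.sqrt(sum((v - average) ** 2 for v in values) / len(values))
    return average, spread

if __name__ == "__main__":
    # Two toy one-atom "models"; real input would be backbone or heavy atoms.
    print(ensemble_rmsd([[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]))
```

Running the same function on backbone-only and on all heavy-atom coordinate lists would give the two kinds of values reported in the abstract.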
Baral, Pravas Kumar; Swayampakula, Mridula; Aguzzi, Adriano; James, Michael N G
2018-05-01
Conversion of the cellular prion protein PrPC into its pathogenic isoform PrPSc is the hallmark of prion diseases, fatal neurodegenerative diseases affecting many mammalian species including humans. Anti-prion monoclonal antibodies can arrest the progression of prion diseases by stabilizing the cellular form of the prion protein. Here, we present the crystal structure of the POM6 Fab fragment, in complex with the mouse prion protein (moPrP). The prion epitope of POM6 is in close proximity to the epitope recognized by the purportedly toxic antibody fragment, POM1 Fab also complexed with moPrP. The POM6 Fab recognizes a larger binding interface indicating a likely stronger binding compared to POM1. POM6 and POM1 exhibit distinct biological responses. Structural comparisons of the bound mouse prion proteins from the POM6 Fab:moPrP and POM1 Fab:moPrP complexes reveal several key regions of the prion protein that might be involved in initiating mis-folding events. The structural data of moPrP:POM6 Fab complex are available in the PDB under the accession number 6AQ7 (www.rcsb.org/pdb/search/structidSearch.do?structureId=6AQ7). © 2018 Federation of European Biochemical Societies.
InterPred: A pipeline to identify and model protein-protein interactions.
Mirabello, Claudio; Wallner, Björn
2017-06-01
Protein-protein interactions (PPIs) are crucial for protein function. There exist many techniques to identify PPIs experimentally, but determining the interactions in molecular detail is still difficult and very time-consuming. The fact that the number of PPIs is vastly larger than the number of individual proteins makes it practically impossible to characterize all interactions experimentally. Computational approaches that can bridge this gap and predict PPIs and model the interactions in molecular detail are greatly needed. Here we present InterPred, a fully automated pipeline that predicts and models PPIs from sequence using structural modeling combined with massive structural comparisons and molecular docking. A key component of the method is the use of a novel random forest classifier that integrates several structural features to distinguish correct from incorrect protein-protein interaction models. We show that InterPred represents a major improvement in protein-protein interaction detection with a performance comparable to or better than experimental high-throughput techniques. We also show that our full-atom protein-protein complex modeling pipeline performs better than state-of-the-art protein docking methods on a standard benchmark set. In addition, InterPred was also one of the top predictors in the latest CAPRI37 experiment. InterPred source code can be downloaded from http://wallnerlab.org/InterPred Proteins 2017; 85:1159-1170. © 2017 Wiley Periodicals, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pokkuluri, P. R.; Londer, Y. Y.; Yang, X.
2010-02-01
Periplasmic cytochromes c7 are important in electron transfer pathway(s) in Fe(III) respiration by Geobacter sulfurreducens. The genome of G. sulfurreducens encodes a family of five 10-kDa, three-heme cytochromes c7. The sequence identity between the five proteins (designated PpcA, PpcB, PpcC, PpcD, and PpcE) varies between 45% and 77%. Here, we report the high-resolution structures of PpcC, PpcD, and PpcE determined by X-ray diffraction. This new information made it possible to compare the sequences and structures of the entire family. The triheme cores are largely conserved but are not identical. We observed changes, due to different crystal packing, in the relative positions of the hemes between two molecules in the crystal. The overall protein fold of the cytochromes is similar. The structure of PpcD differs most from that of the other homologs, which is not obvious from the sequence comparisons of the family. Interestingly, PpcD is the only cytochrome c7 within the family that has higher abundance when G. sulfurreducens is grown on insoluble Fe(III) oxide compared to ferric citrate. The structures have the highest degree of conservation around 'heme IV'; the protein surface around this heme is positively charged in all of the proteins, and therefore all cytochromes c7 could interact with similar molecules involving this region. The structures and surface characteristics of the proteins near the other two hemes, 'heme I' and 'heme III', differ within the family. The above observations suggest that each of the five cytochromes c7 could interact with its own redox partner via an interface involving the regions of heme I and/or heme III; this provides a possible rationalization for the existence of five similar proteins in G. sulfurreducens.
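Pairwise sequence identities such as the 45% to 77% range quoted above are typically computed from an alignment. The following sketch shows one common convention, counting identical residues over aligned columns where neither sequence has a gap. The choice of denominator is an assumption (other conventions exist), and the toy sequences are not cytochrome c7 sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must be rows of the same alignment and so have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

if __name__ == "__main__":
    # Toy aligned fragments used only to exercise the function.
    print(percent_identity("ACDE", "ACDF"))
```

Applying the function to every pair in a five-member family yields ten values, whose minimum and maximum would correspond to a range of the kind reported.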
Relative Sizes of Organic Molecules
NASA Technical Reports Server (NTRS)
2000-01-01
This computer graphic depicts the relative complexity of crystallizing large proteins in order to study their structures through x-ray crystallography. Insulin is a vital protein whose structure has several subtle points that scientists are still trying to determine. Large molecules such as insulin are complex, with structures that are comparatively difficult to understand. For comparison, a sugar molecule (which many people have grown as hard crystals in science class) and a water molecule are shown. These images were produced with the Macmolecule program. Photo credit: NASA/Marshall Space Flight Center (MSFC)
Lätzer, Joachim; Shen, Tongye; Wolynes, Peter G
2008-02-19
We investigate how post-translational phosphorylation modifies the global conformation of a protein by changing its free energy landscape using two test proteins, cystatin and NtrC. We first examine the changes in a free energy landscape caused by phosphorylation using a model containing information about both structural forms. For cystatin the free energy cost is fairly large indicating a low probability of sampling the phosphorylated conformation in a perfectly funneled landscape. The predicted barrier for NtrC conformational transition is several times larger than the barrier for cystatin, indicating that the switch protein NtrC most probably follows a partial unfolding mechanism to move from one basin to the other. Principal component analysis and linear response theory show how the naturally occurring conformational changes in unmodified proteins are captured and stabilized by the change of interaction potential. We also develop a partially guided structure prediction Hamiltonian which is capable of predicting the global structure of a phosphorylated protein using only knowledge of the structure of the unphosphorylated protein or vice versa. This algorithm makes use of a generic transferable long-range residue contact potential along with details of structure short range in sequence. By comparing the results obtained with this guided transferable potential to those from the native-only, perfectly funneled Hamiltonians, we show that the transferable Hamiltonian correctly captures the nature of the global conformational changes induced by phosphorylation and can sample substantially correct structures for the modified protein with high probability.
Shrivastava, Dipty; Nain, Vikrant; Sahi, Shakti; Verma, Anju; Sharma, Priyanka; Sharma, Prakash Chand; Kumar, Polumetla Ananda
2011-01-22
Resistance (R) proteins recognize the molecular signature of pathogen infection and activate downstream hypersensitive response signalling in plants. R proteins work as a molecular switch for pathogen defence signalling and represent one of the largest plant gene families. Hence, understanding the molecular structure and function of R proteins has been of paramount importance for plant biologists. The present study is aimed at predicting the structure of the R protein signalling domains (CC-NBS) by creating a homology model, refining and optimising the model by molecular dynamics simulation, and comparing ADP and ATP binding. Based on sequence similarity with proteins of known structure, the CC-NBS domains were initially modelled using CED-4 (cell death abnormality protein) and APAF-1 (apoptotic protease activating factor) as multiple templates. The final CC-NBS structural model was built and optimized by molecular dynamics simulation for 5 nanoseconds (ns). Docking of ADP and ATP at the active site shows that both ligands bind specifically to the same residues, with a minor difference (1 kcal/mol) in binding energy. The sharing of the binding site by ADP and ATP and the low difference in their binding site make CC-NBS suitable for working as a molecular switch. Furthermore, structural superimposition shows that CC-NBS and the CARD (caspase recruitment domain) domain of CED-4 have a low RMSD value of 0.9 Å. Availability of a 3D structural model for both the CC and NBS domains will help in gaining deeper insight into these pathogen defence genes.
Query3d: a new method for high-throughput analysis of functional residues in protein structures.
Ausiello, Gabriele; Via, Allegra; Helmer-Citterich, Manuela
2005-12-01
The identification of local similarities between two protein structures can provide clues of a common function. Many different methods exist for searching for similar subsets of residues in proteins of known structure. However, the lack of functional and structural information on single residues, together with the low level of integration of this information in comparison methods, is a limitation that prevents these methods from being fully exploited in high-throughput analyses. Here we describe Query3d, a program that is both a structural DBMS (Database Management System) and a local comparison method. The method conserves a copy of all the residues of the Protein Data Bank annotated with a variety of functional and structural information. New annotations can be easily added from a variety of methods and known databases. The algorithm makes it possible to create complex queries based on the residues' function and then to compare only subsets of the selected residues. Functional information is also essential to speed up the comparison and the analysis of the results. With Query3d, users can easily obtain statistics on how many and which residues share certain properties in all proteins of known structure. At the same time, the method also finds their structural neighbours in the whole PDB. Programs and data can be accessed through the PdbFun web interface.
Query3d: a new method for high-throughput analysis of functional residues in protein structures
Ausiello, Gabriele; Via, Allegra; Helmer-Citterich, Manuela
2005-01-01
Background The identification of local similarities between two protein structures can provide clues of a common function. Many different methods exist for searching for similar subsets of residues in proteins of known structure. However, the lack of functional and structural information on single residues, together with the low level of integration of this information in comparison methods, is a limitation that prevents these methods from being fully exploited in high-throughput analyses. Results Here we describe Query3d, a program that is both a structural DBMS (Database Management System) and a local comparison method. The method conserves a copy of all the residues of the Protein Data Bank annotated with a variety of functional and structural information. New annotations can be easily added from a variety of methods and known databases. The algorithm makes it possible to create complex queries based on the residues' function and then to compare only subsets of the selected residues. Functional information is also essential to speed up the comparison and the analysis of the results. Conclusion With Query3d, users can easily obtain statistics on how many and which residues share certain properties in all proteins of known structure. At the same time, the method also finds their structural neighbours in the whole PDB. Programs and data can be accessed through the PdbFun web interface. PMID:16351754
Bronson, Jonathan; Lee, One-Sun; Saven, Jeffery G.
2006-01-01
Poor solubility and low expression levels often make membrane proteins difficult to study. An alternative to the use of detergents to solubilize these aggregation-prone proteins is the partial redesign of the sequence so as to confer water solubility. Recently, computationally assisted membrane protein solubilization (CAMPS) has been reported, where exposed hydrophobic residues on a protein's surface are computationally redesigned. Herein, the structure and fluctuations of a designed, water-soluble variant of KcsA (WSK-3) were studied using molecular dynamics simulations. The root mean square deviation of the protein from its starting structure, where the backbone coordinates are those of KcsA, was 1.8 Å. The structure of salt bridges involved in structural specificity and solubility were examined. The preferred configuration of ions and water in the selectivity filter of WSK-3 was consistent with the reported preferences for KcsA. The structure of the selectivity filter was maintained, which is consistent with WSK-3 having an affinity for agitoxin2 comparable to that of wild-type KcsA. In contrast to KcsA, the central cavity's side chains were observed to reorient, allowing water diffusion through the side of the cavity wall. These simulations provide an atomistic analysis of the CAMPS strategy and its implications for further investigations of membrane proteins. PMID:16299086
Performance of protein-structure predictions with the physics-based UNRES force field in CASP11.
Krupa, Paweł; Mozolewska, Magdalena A; Wiśniewska, Marta; Yin, Yanping; He, Yi; Sieradzan, Adam K; Ganzynkowicz, Robert; Lipska, Agnieszka G; Karczyńska, Agnieszka; Ślusarz, Magdalena; Ślusarz, Rafał; Giełdoń, Artur; Czaplewski, Cezary; Jagieła, Dawid; Zaborowski, Bartłomiej; Scheraga, Harold A; Liwo, Adam
2016-11-01
Participating as the Cornell-Gdansk group, we have used our physics-based coarse-grained UNited RESidue (UNRES) force field to predict protein structure in the 11th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP11). Our methodology involved extensive multiplexed replica exchange simulations of the target proteins with a recently improved UNRES force field to provide better reproductions of the local structures of polypeptide chains. All simulations were started from fully extended polypeptide chains, and no external information was included in the simulation process except for weak restraints on secondary structure to enable us to finish each prediction within the allowed 3-week time window. Because of simplified UNRES representation of polypeptide chains, use of enhanced sampling methods, code optimization and parallelization and sufficient computational resources, we were able to treat, for the first time, all 55 human prediction targets with sizes from 44 to 595 amino acid residues, the average size being 251 residues. Complete structures of six single-domain proteins were predicted accurately, with the highest accuracy being attained for the T0769, for which the CαRMSD was 3.8 Å for 97 residues of the experimental structure. Correct structures were also predicted for 13 domains of multi-domain proteins with accuracy comparable to that of the best template-based modeling methods. With further improvements of the UNRES force field that are now underway, our physics-based coarse-grained approach to protein-structure prediction will eventually reach global prediction capacity and, consequently, reliability in simulating protein structure and dynamics that are important in biochemical processes. Freely available on the web at http://www.unres.pl/ CONTACT: has5@cornell.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. 
Kuzu, Guray; Keskin, Ozlem; Nussinov, Ruth; Gursoy, Attila
2016-10-01
The structures of protein assemblies are important for elucidating cellular processes at the molecular level. Three-dimensional electron microscopy (3DEM) is a powerful method to identify the structures of assemblies, especially those that are challenging to study by crystallography. Here, a new approach, PRISM-EM, is reported to computationally generate plausible structural models using a procedure that combines crystallographic structures and density maps obtained from 3DEM. The predictions are validated against seven available structurally different crystallographic complexes. The models display mean deviations in the backbone of <5 Å. PRISM-EM was further tested on different benchmark sets; the accuracy was evaluated with respect to the structure of the complex, and the correlation with EM density maps and interface predictions were evaluated and compared with those obtained using other methods. PRISM-EM was then used to predict the structure of the ternary complex of the HIV-1 envelope glycoprotein trimer, the ligand CD4 and the neutralizing protein m36.
Structure Prediction of the Second Extracellular Loop in G-Protein-Coupled Receptors
Kmiecik, Sebastian; Jamroz, Michal; Kolinski, Michal
2014-01-01
G-protein-coupled receptors (GPCRs) play key roles in living organisms. Therefore, it is important to determine their functional structures. The second extracellular loop (ECL2) is a functionally important region of GPCRs, which poses significant challenge for computational structure prediction methods. In this work, we evaluated CABS, a well-established protein modeling tool for predicting ECL2 structure in 13 GPCRs. The ECL2s (with between 13 and 34 residues) are predicted in an environment of other extracellular loops being fully flexible and the transmembrane domain fixed in its x-ray conformation. The modeling procedure used theoretical predictions of ECL2 secondary structure and experimental constraints on disulfide bridges. Our approach yielded ensembles of low-energy conformers and the most populated conformers that contained models close to the available x-ray structures. The level of similarity between the predicted models and x-ray structures is comparable to that of other state-of-the-art computational methods. Our results extend other studies by including newly crystallized GPCRs. PMID:24896119
Zhang, Zhe; Schindler, Christina E. M.; Lange, Oliver F.; Zacharias, Martin
2015-01-01
The high-resolution refinement of docked protein-protein complexes can provide valuable structural and mechanistic insight into protein complex formation complementing experiment. Monte Carlo (MC) based approaches are frequently applied to sample putative interaction geometries of proteins including also possible conformational changes of the binding partners. In order to explore efficiency improvements of the MC sampling, several enhanced sampling techniques, including temperature or Hamiltonian replica exchange and well-tempered ensemble approaches, have been combined with the MC method and were evaluated on 20 protein complexes using unbound partner structures. The well-tempered ensemble method combined with a 2-dimensional temperature and Hamiltonian replica exchange scheme (WTE-H-REMC) was identified as the most efficient search strategy. Comparison with prolonged MC searches indicates that the WTE-H-REMC approach requires approximately 5 times fewer MC steps to identify near native docking geometries compared to conventional MC searches. PMID:26053419
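The exchange step that replica-exchange schemes rely on can be illustrated with a minimal sketch. It assumes plain temperature replica exchange with the standard Metropolis criterion; it is not the WTE-H-REMC scheme of the abstract, and the function names are hypothetical.

```python
import math
import random

def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis probability of exchanging the configurations of two replicas.

    beta_* are inverse temperatures 1/(k_B T); energy_* are the current
    potential energies of the two replicas (same energy units as 1/beta).
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # min(1, exp(delta)), written so that exp() is never called on a large value
    return 1.0 if delta >= 0 else math.exp(delta)

def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_probability(beta_i, beta_j, energy_i, energy_j)
```

A swap is always accepted when the colder replica currently holds the higher-energy configuration, which is what lets low-energy geometries migrate toward the lowest temperature.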
A protein-dependent side-chain rotamer library.
Bhuyan, Md Shariful Islam; Gao, Xin
2011-12-14
The protein side-chain packing problem has remained one of the key open problems in bioinformatics. The three main components of protein side-chain prediction methods are a rotamer library, an energy function and a search algorithm. Rotamer libraries summarize the existing knowledge of the experimentally determined structures quantitatively. Depending on how much contextual information is encoded, there are backbone-independent rotamer libraries and backbone-dependent rotamer libraries. Backbone-independent libraries only encode sequential information, whereas backbone-dependent libraries encode both sequential and locally structural information. However, side-chain conformations are determined by spatially local information, rather than sequentially local information. Since the backbone structure is given in the side-chain prediction problem, spatially local information should ideally be encoded into the rotamer libraries. In this paper, we propose a new type of backbone-dependent rotamer library, which encodes structural information of all the spatially neighboring residues. We call these protein-dependent rotamer libraries. Given any rotamer library and a protein backbone structure, we first model the protein structure as a Markov random field. Then the marginal distributions are estimated by the inference algorithms, without doing global optimization or search. The rotamers from the given library are then re-ranked and associated with the updated probabilities. Experimental results demonstrate that the proposed protein-dependent libraries significantly outperform the widely used backbone-dependent libraries in terms of the side-chain prediction accuracy and the rotamer ranking ability. Furthermore, without global optimization/search, the side-chain prediction power of the protein-dependent library is still comparable to the global-search-based side-chain prediction methods.
Comparative glycoproteomics of stem cells identifies new players in ricin toxicity.
Stadlmann, Johannes; Taubenschmid, Jasmin; Wenzel, Daniel; Gattinger, Anna; Dürnberger, Gerhard; Dusberger, Frederico; Elling, Ulrich; Mach, Lukas; Mechtler, Karl; Penninger, Josef M
2017-09-28
Glycosylation, the covalent attachment of carbohydrate structures onto proteins, is the most abundant post-translational modification. Over 50% of human proteins are glycosylated, which alters their activities in diverse fundamental biological processes. Despite the importance of glycosylation in biology, the identification and functional validation of complex glycoproteins has remained largely unexplored. Here we develop a novel quantitative approach to identify intact glycopeptides from comparative proteomic data sets, allowing us not only to infer complex glycan structures but also to directly map them to sites within the associated proteins at the proteome scale. We apply this method to human and mouse embryonic stem cells to illuminate the stem cell glycoproteome. This analysis nearly doubles the number of experimentally confirmed glycoproteins, identifies previously unknown glycosylation sites and multiple glycosylated stemness factors, and uncovers evolutionarily conserved as well as species-specific glycoproteins in embryonic stem cells. The specificity of our method is confirmed using sister stem cells carrying repairable mutations in enzymes required for fucosylation, Fut9 and Slc35c1. Ablation of fucosylation confers resistance to the bioweapon ricin, and we discover proteins that carry a fucosylation-dependent sugar code for ricin toxicity. Mutations disrupting a subset of these proteins render cells ricin resistant, revealing new players that orchestrate ricin toxicity. Our comparative glycoproteomics platform, SugarQb, enables genome-wide insights into protein glycosylation and glycan modifications in complex biological systems.
Naitow, Hisashi; Matsuura, Yoshinori; Tono, Kensuke; Joti, Yasumasa; Kameshima, Takashi; Hatsui, Takaki; Yabashi, Makina; Tanaka, Rie; Tanaka, Tomoyuki; Sugahara, Michihiro; Kobayashi, Jun; Nango, Eriko; Iwata, So; Kunishima, Naoki
2017-08-01
Serial femtosecond crystallography (SFX) with an X-ray free-electron laser is used for the structural determination of proteins from a large number of microcrystals at room temperature. To examine the feasibility of pharmaceutical applications of SFX, a ligand-soaking experiment using thermolysin microcrystals has been performed using SFX. The results were compared with those from a conventional experiment with synchrotron radiation (SR) at 100 K. A protein-ligand complex structure was successfully obtained from an SFX experiment using microcrystals soaked with a small-molecule ligand; both oil-based and water-based crystal carriers gave essentially the same results. In a comparison of the SFX and SR structures, clear differences were observed in the unit-cell parameters, in the alternate conformation of side chains, in the degree of water coordination and in the ligand-binding mode.
Kihara, Daisuke; Sael, Lee; Chikhi, Rayan; Esquivel-Rodriguez, Juan
2011-09-01
The tertiary structures of proteins have been solved at an increasing pace in recent years. To capitalize on the enormous effort invested in accumulating the structure data, efficient and effective computational methods need to be developed for comparing, searching, and investigating interactions of protein structures. We introduce the 3D Zernike descriptor (3DZD), an emerging technique to describe molecular surfaces. The 3DZD is a series expansion of a mathematical three-dimensional function, and thus a tertiary structure is represented compactly by a vector of coefficients of terms in the series. A strong advantage of the 3DZD is that it is invariant to rotation of the target object to be represented. These two characteristics of the 3DZD allow rapid comparison of surface shapes, which is sufficient for real-time structure database screening. In this article, we review various applications of the 3DZD, which have been recently proposed.
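Because each structure is reduced to a fixed-length, rotation-invariant coefficient vector, screening a database comes down to vector comparison. The sketch below assumes the descriptors have already been computed elsewhere and uses Euclidean distance as the similarity measure; the function names and the dictionary layout are hypothetical.

```python
import numpy as np

def descriptor_distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))

def rank_database(query, database):
    """Rank database entries by increasing distance to the query descriptor.

    database maps an entry name to its precomputed descriptor vector.
    Returns a list of (name, distance) pairs, closest first.
    """
    scored = [(name, descriptor_distance(query, vec))
              for name, vec in database.items()]
    return sorted(scored, key=lambda item: item[1])
```

No structural superposition is needed before comparing, which is the practical benefit of the rotation invariance described above.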
DOE Office of Scientific and Technical Information (OSTI.GOV)
Soriano, Erika V.; McCloskey, Diane E.; Kinsland, Cynthia
2008-04-01
The crystal structures of two arginine decarboxylase mutant proteins provide insights into the mechanisms of pyruvoyl-group formation and the decarboxylation reaction. Pyruvoyl-dependent arginine decarboxylase (PvlArgDC) catalyzes the first step of the polyamine-biosynthetic pathway in plants and some archaebacteria. The pyruvoyl group of PvlArgDC is generated by an internal autoserinolysis reaction at an absolutely conserved serine residue in the proenzyme, resulting in two polypeptide chains. Based on the native structure of PvlArgDC from Methanococcus jannaschii, the conserved residues Asn47 and Glu109 were proposed to be involved in the decarboxylation and autoprocessing reactions. N47A and E109Q mutant proteins were prepared and the three-dimensional structure of each protein was determined at 2.0 Å resolution. The N47A and E109Q mutant proteins showed reduced decarboxylation activity compared with the wild-type PvlArgDC. These residues may also be important for the autoprocessing reaction, which utilizes a mechanism similar to that of the decarboxylation reaction.
Protein simulation using coarse-grained two-bead multipole force field with polarizable water models
NASA Astrophysics Data System (ADS)
Li, Min; Zhang, John Z. H.
2017-02-01
A recently developed two-bead multipole force field (TMFF) is employed in coarse-grained (CG) molecular dynamics (MD) simulation of proteins in combination with polarizable CG water models, the Martini polarizable water model, and modified big multipole water model. Significant improvement in simulated structures and dynamics of proteins is observed in terms of both the root-mean-square deviations (RMSDs) of the structures and residue root-mean-square fluctuations (RMSFs) from the native ones in the present simulation compared with the simulation result with Martini's non-polarizable water model. Our result shows that TMFF simulation using CG water models gives much more stable secondary structures of proteins without the need for adding extra interaction potentials to constrain the secondary structures. Our result also shows that by increasing the MD time step from 2 fs to 6 fs, the RMSD and RMSF results are still in excellent agreement with those from all-atom simulations. The current study clearly demonstrates that the application of TMFF together with a polarizable CG water model significantly improves the accuracy and efficiency of CG simulation of proteins.
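The two measures named in the abstract, RMSD and RMSF, can be sketched as follows. This is a minimal illustration that assumes all frames have already been superimposed on a common reference; no fitting step is included, and the function names are hypothetical.

```python
import numpy as np

def rmsd(coords, reference):
    """Root-mean-square deviation between two (n_atoms, 3) coordinate sets.

    Assumes the two structures are already aligned.
    """
    diff = np.asarray(coords, float) - np.asarray(reference, float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation about the mean position.

    trajectory has shape (n_frames, n_atoms, 3); returns an (n_atoms,) array.
    """
    traj = np.asarray(trajectory, float)
    mean = traj.mean(axis=0)
    return np.sqrt(((traj - mean) ** 2).sum(axis=2).mean(axis=0))
```

RMSD gives one number per frame relative to a reference structure, while RMSF gives one number per atom or residue over the whole trajectory.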
The neuronal porosome complex in health and disease
Naik, Akshata R; Lewis, Kenneth T
2015-01-01
Cup-shaped secretory portals at the cell plasma membrane called porosomes mediate the precision release of intravesicular material from cells. Membrane-bound secretory vesicles transiently dock and fuse at the base of porosomes facing the cytosol to expel pressurized intravesicular contents from the cell during secretion. The structure, isolation, composition, and functional reconstitution of the neuronal porosome complex have greatly progressed, providing a molecular understanding of its function in health and disease. Neuronal porosomes are 15 nm cup-shaped lipoprotein structures composed of nearly 40 proteins, compared to the 120 nm nuclear pore complex composed of >500 protein molecules. Membrane proteins compose the porosome complex, making it practically impossible to solve its atomic structure. However, atomic force microscopy and small-angle X-ray solution scattering studies have provided three-dimensional structural details of the native neuronal porosome at sub-nanometer resolution, providing insights into the molecular mechanism of its function. The participation of several porosome proteins previously implicated in neurotransmission and neurological disorders further attests to the crosstalk between porosome proteins and their coordinated involvement in release of neurotransmitter at the synapse. PMID:26264442
A monomeric TIM-barrel structure from Pyrococcus furiosus is optimized for extreme temperatures.
Repo, Heidi; Oeemig, Jesper S; Djupsjöbacka, Janica; Iwaï, Hideo; Heikinheimo, Pirkko
2012-11-01
The structure of phosphoribosyl anthranilate isomerase (TrpF) from the hyperthermophilic archaeon Pyrococcus furiosus (PfTrpF) has been determined at 1.75 Å resolution. The PfTrpF structure has a monomeric TIM-barrel fold which differs from the dimeric structures of two other known thermophilic TrpF proteins. A comparison of the PfTrpF structure with the two known bacterial thermophilic TrpF structures and the structure of a related mesophilic protein from Escherichia coli (EcTrpF) is presented. The thermophilic TrpF structures contain a higher proportion of ion pairs and charged residues compared with the mesophilic EcTrpF. These residues contribute to the closure of the central barrel and the stabilization of the barrel and the surrounding α-helices. In the monomeric PfTrpF conserved structural water molecules are mostly absent; instead, the structural waters are replaced by direct side-chain-main-chain interactions. As a consequence of these combined mechanisms, the P. furiosus enzyme is a thermodynamically stable and entropically optimized monomeric TIM-barrel enzyme which defines a good framework for further protein engineering for industrial applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haskins, William E.; Leavell, Michael D.; Lane, Pamela
2005-03-01
Membrane proteins make up a diverse and important subset of proteins for which structural information is limited. In this study, chemical cross-linking and mass spectrometry were used to explore the structure of the G-protein-coupled photoreceptor bovine rhodopsin in the dark-state conformation. All experiments were performed in rod outer segment membranes using amino acid 'handles' in the native protein sequence, thus minimizing perturbations to the native protein structure. Cysteine and lysine residues were covalently cross-linked using commercially available reagents with a range of linker arm lengths. Following chemical digestion of cross-linked protein, cross-linked peptides were identified by accurate mass measurement using liquid chromatography-Fourier transform mass spectrometry and an automated data analysis pipeline. Assignments were confirmed and, if necessary, resolved, by tandem MS. The relative reactivity of lysine residues participating in cross-links was evaluated by labeling with NHS-esters. A distinct pattern of cross-link formation within the C-terminal domain, and between loop I and the C-terminal domain, emerged. Theoretical distances based on cross-linking were compared to inter-atomic distances determined from the energy-minimized X-ray crystal structure and Monte Carlo conformational search procedures. In general, the observed cross-links can be explained by re-positioning participating side-chains without significantly altering backbone structure. One exception, between C316 and K325, requires backbone motion to bring the reactive atoms into sufficient proximity for cross-linking. Evidence from other studies suggests that residues around K325 form a region of high backbone mobility. These findings show that cross-linking studies can provide insight into the structural dynamics of membrane proteins in their native environment.
Kuang, Xingyan; Dhroso, Andi; Han, Jing Ginger; Shyu, Chi-Ren; Korkin, Dmitry
2016-01-01
Macromolecular interactions are formed between proteins, DNA and RNA molecules. Being a principal building block in macromolecular assemblies and pathways, the interactions underlie most cellular functions. Malfunctioning of macromolecular interactions is also linked to a number of diseases. Structural knowledge of the macromolecular interaction allows one to understand the interaction’s mechanism, determine its functional implications and characterize the effects of genetic variations, such as single nucleotide polymorphisms, on the interaction. Unfortunately, until now the interactions mediated by different types of macromolecules, e.g. protein–protein interactions or protein–DNA interactions, have been collected into individual and unrelated structural databases. This presents a significant obstacle in the analysis of macromolecular interactions. For instance, the homogeneous structural interaction databases prevent scientists from studying structural interactions of different types but occurring in the same macromolecular complex. Here, we introduce DOMMINO 2.0, a structural Database Of Macro-Molecular INteractiOns. Compared to DOMMINO 1.0, a comprehensive database on protein-protein interactions, DOMMINO 2.0 includes the interactions between all three basic types of macromolecules extracted from PDB files. DOMMINO 2.0 is automatically updated on a weekly basis. It currently includes ∼1 040 000 interactions between two polypeptide subunits (e.g. domains, peptides, termini and interdomain linkers), ∼43 000 RNA-mediated interactions, and ∼12 000 DNA-mediated interactions. All protein structures in the database are annotated using SCOP and SUPERFAMILY family annotation. As a result, protein-mediated interactions involving protein domains, interdomain linkers, C- and N- termini, and peptides are identified.
Our database provides an intuitive web interface, allowing one to investigate interactions at three different resolution levels: whole subunit network, binary interaction and interaction interface. Database URL: http://dommino.org PMID:26827237
Srinivasan, E; Rajasekaran, R
2017-07-25
The genetic substitution mutation of Cys146Arg in the SOD1 protein is predominantly found in the Japanese population suffering from familial amyotrophic lateral sclerosis (FALS). A complete study of the biophysical aspects of this particular missense mutation through conformational analysis and producing free energy landscapes could provide an insight into the pathogenic mechanism of ALS disease. In this study, we utilized general molecular dynamics simulations along with computational predictions to assess the structural characterization of the protein as well as the conformational preferences of monomeric wild type and mutant SOD1. Our static analysis, accomplished through multiple programs, predicted the deleterious and destabilizing effect of mutant SOD1. Subsequently, comparative molecular dynamic studies performed on the wild type and mutant SOD1 indicated a loss in the protein conformational stability and flexibility. We observed the mutational consequences not only in local but also in long-range variations in the structural properties of the SOD1 protein. Long-range intramolecular protein interactions decrease upon mutation, resulting in less compact structures in the mutant protein rather than in the wild type, suggesting that the mutant structures are less stable than the wild type SOD1. We also presented the free energy landscape to study the collective motion of protein conformations through principal component analysis for the wild type and mutant SOD1. Overall, the study assisted in revealing the cause of the structural destabilization and protein misfolding via structural characterization, secondary structure composition and free energy landscapes. Hence, the computational framework in our study provides a valuable direction for the search for the cure against fatal FALS.
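A free energy landscape of the kind mentioned above is commonly estimated by Boltzmann inversion of the sampled population along the first two principal components. The following sketch assumes the projections `pc1` and `pc2` have already been computed; the function name, the bin count and the default temperature are illustrative choices, not values from the study.

```python
import numpy as np

K_B = 0.0083144626  # gas constant in kJ/(mol K)

def free_energy_landscape(pc1, pc2, temperature=300.0, bins=30):
    """Estimate F(pc1, pc2) = -k_B T ln(P / P_max) from trajectory projections.

    Returns the free energy grid in kJ/mol together with the bin edges.
    Unvisited bins are set to +inf; the most populated bin is zero.
    """
    hist, xedges, yedges = np.histogram2d(pc1, pc2, bins=bins)
    prob = hist / hist.sum()
    free_energy = np.full(prob.shape, np.inf)
    occupied = prob > 0
    free_energy[occupied] = -K_B * temperature * np.log(prob[occupied] / prob.max())
    return free_energy, xedges, yedges
```

Comparing the grids obtained for wild type and mutant trajectories shows whether the mutation broadens or splits the low-energy basin.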
Heendeniya, Ravindra G; Yu, Peiqiang
2017-03-20
Alfalfa (Medicago sativa L.) genotypes transformed with Lc-bHLH and Lc transcription genes were developed with the intention of stimulating proanthocyanidin synthesis in the aerial parts of the plant. To our knowledge, there are no studies on the effect of single-gene and two-gene transformation on chemical functional groups and molecular structure changes in these plants. The objective of this study was to use advanced molecular spectroscopy with multivariate chemometrics to determine chemical functional group intensity and molecular structure changes in alfalfa plants when co-expressing Lc-bHLH and C1-MYB transcriptive flavonoid regulatory genes in comparison with non-transgenic (NT) and AC Grazeland (ACGL) genotypes. The results showed that, compared to the NT genotype, the presence of double genes (Lc and C1) increased ratios of both the area and peak height of protein structural Amide I/II and the height ratio of α-helix to β-sheet. In carbohydrate-related spectral analysis, the double gene-transformed alfalfa genotypes exhibited lower peak heights at 1370, 1240, 1153, and 1020 cm⁻¹ compared to the NT genotype. Furthermore, the effect of double gene transformation on carbohydrate molecular structure was clearly revealed in the principal component analysis of the spectra. In conclusion, single or double transformation of Lc and C1 genes resulted in changing functional groups and molecular structure related to proteins and carbohydrates compared to the NT alfalfa genotype. The current study provided molecular structural information on the transgenic alfalfa plants and provided an insight into the impact of transgenes on protein and carbohydrate properties and their molecular structure changes.
Molecular Dynamics Analysis of Lysozyme Protein in Ethanol-Water Mixed Solvent Environment
NASA Astrophysics Data System (ADS)
Ochije, Henry Ikechukwu
The effect of protein-solvent interaction on protein structure is widely studied using both experimental and computational techniques. Despite such extensive studies, the molecular-level behavior of proteins in even some simple solvents is still not fully understood. This work focuses on detailed molecular dynamics simulations to study the solvent effect on lysozyme protein, using water, alcohol and different concentrations of water-alcohol mixtures as solvents. The lysozyme protein structure in water, alcohol and alcohol-water mixture (0-12% alcohol) was studied using the GROMACS molecular dynamics simulation code. Compared to the water environment, the lysozyme structure showed remarkable changes in solvents with increasing alcohol concentration. In particular, significant changes were observed in the protein secondary structure involving alpha helices. The influence of alcohol on the lysozyme protein was investigated by studying thermodynamic and structural properties. With increasing ethanol concentration we observed a systematic increase in total energy, enthalpy, root mean square deviation (RMSD), and radius of gyration. These quantities were described using a polynomial interpolation approach. Using the resulting polynomial equation, we could determine the above quantities for any intermediate alcohol percentage. In order to validate this approach, we selected an intermediate ethanol percentage and carried out a full MD simulation. The results from the MD simulation were in reasonably good agreement with those obtained using the polynomial approach. Hence, the polynomial-based method proposed here eliminates the need for computationally intensive full MD analysis for the concentrations within the range (0-12%) studied in this work.
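The interpolation idea can be sketched with a least-squares polynomial fit. The abstract does not state the polynomial degree or the fitting procedure, so the degree and the function name below are assumptions, and any numbers passed in are the user's own simulation averages.

```python
import numpy as np

def fit_property(concentrations, values, degree=2):
    """Fit a polynomial to a simulated property versus ethanol percentage.

    concentrations: ethanol percentages at which full MD runs were done.
    values: the corresponding property (e.g. radius of gyration).
    degree: polynomial degree (an assumption; not given in the abstract).
    Returns a callable that estimates the property at any concentration.
    """
    coeffs = np.polyfit(concentrations, values, degree)
    return np.poly1d(coeffs)
```

Calling the returned object at an intermediate percentage gives the interpolated estimate, which can then be checked against one additional full simulation as the abstract describes. Estimates outside the fitted range (here 0-12%) are extrapolations and should not be trusted.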
Multiple solvent crystal structures of ribonuclease A: An assessment of the method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dechene, Michelle; Wink, Glenna; Smith, Mychal
2010-11-12
The multiple solvent crystal structures (MSCS) method uses organic solvents to map the surfaces of proteins. It identifies binding sites and allows for a more thorough examination of protein plasticity and hydration than could be achieved by a single structure. The crystal structures of bovine pancreatic ribonuclease A (RNAse A) soaked in the following organic solvents are presented: 50% dioxane, 50% dimethylformamide, 70% dimethylsulfoxide, 70% 1,6-hexanediol, 70% isopropanol, 50% R,S,R-bisfuran alcohol, 70% t-butanol, 50% trifluoroethanol, or 1.0M trimethylamine-N-oxide. This set of structures is compared with four sets of crystal structures of RNAse A from the protein data bank (PDB) and with the solution NMR structure to assess the validity of previously untested assumptions associated with MSCS analysis. Plasticity from MSCS is the same as from PDB structures obtained in the same crystal form and deviates only at crystal contacts when compared to structures from a diverse set of crystal environments. Furthermore, there is a good correlation between plasticity as observed by MSCS and the dynamic regions seen by NMR. Conserved water binding sites are identified by MSCS to be those that are conserved in the sets of structures taken from the PDB. Comparison of the MSCS structures with inhibitor-bound crystal structures of RNAse A reveals that the organic solvent molecules identify key interactions made by inhibitor molecules, highlighting ligand binding hot-spots in the active site. The present work firmly establishes the relevance of information obtained by MSCS.
Vishwanath, Sneha
2018-01-01
The majority of the proteins encoded in the genomes of eukaryotes contain more than one domain. Reasons for high prevalence of multi-domain proteins in various organisms have been attributed to higher stability and functional and folding advantages over single-domain proteins. Despite these advantages, many proteins are composed of only one domain while their homologous domains are part of multi-domain proteins. In the study presented here, differences in the properties of protein domains in single-domain and multi-domain systems and their influence on functions are discussed. We studied 20 pairs of identical protein domains, which were crystallized in two forms (a) tethered to other protein domains and (b) tethered to fewer protein domains than (a) or not tethered to any protein domain. Results suggest that tethering of domains in multi-domain proteins influences the structural, dynamic and energetic properties of the constituent protein domains. 50% of the protein domain pairs show significant structural deviations while 90% of the protein domain pairs show differences in dynamics and 12% of the residues show differences in the energetics. To gain further insights on the influence of tethering on the function of the domains, 4 pairs of homologous protein domains, where one of them is a full-length single-domain protein and the other protein domain is a part of a multi-domain protein, were studied. Analyses showed that identical and structurally equivalent functional residues show differential dynamics in homologous protein domains, though comparable dynamics between in-silico generated chimera protein and multi-domain proteins were observed. From these observations, the differences observed in the functions of homologous proteins could be attributed to the presence of a tethered domain.
Overall, we conclude that tethered domains in multi-domain proteins not only provide stability or folding advantages but also influence pathways resulting in differences in function or regulatory properties. PMID:29432415
Vishwanath, Sneha; de Brevern, Alexandre G; Srinivasan, Narayanaswamy
2018-02-01
The majority of the proteins encoded in the genomes of eukaryotes contain more than one domain. Reasons for high prevalence of multi-domain proteins in various organisms have been attributed to higher stability and functional and folding advantages over single-domain proteins. Despite these advantages, many proteins are composed of only one domain while their homologous domains are part of multi-domain proteins. In the study presented here, differences in the properties of protein domains in single-domain and multi-domain systems and their influence on functions are discussed. We studied 20 pairs of identical protein domains, which were crystallized in two forms (a) tethered to other protein domains and (b) tethered to fewer protein domains than (a) or not tethered to any protein domain. Results suggest that tethering of domains in multi-domain proteins influences the structural, dynamic and energetic properties of the constituent protein domains. 50% of the protein domain pairs show significant structural deviations while 90% of the protein domain pairs show differences in dynamics and 12% of the residues show differences in the energetics. To gain further insights on the influence of tethering on the function of the domains, 4 pairs of homologous protein domains, where one of them is a full-length single-domain protein and the other protein domain is a part of a multi-domain protein, were studied. Analyses showed that identical and structurally equivalent functional residues show differential dynamics in homologous protein domains, though comparable dynamics between in-silico generated chimera protein and multi-domain proteins were observed. From these observations, the differences observed in the functions of homologous proteins could be attributed to the presence of a tethered domain.
Overall, we conclude that tethered domains in multi-domain proteins not only provide stability or folding advantages but also influence pathways resulting in differences in function or regulatory properties.
ProBiS-CHARMMing: Web Interface for Prediction and Optimization of Ligands in Protein Binding Sites.
Konc, Janez; Miller, Benjamin T; Štular, Tanja; Lešnik, Samo; Woodcock, H Lee; Brooks, Bernard R; Janežič, Dušanka
2015-11-23
Proteins often exist only as apo structures (unligated) in the Protein Data Bank, with their corresponding holo structures (with ligands) unavailable. However, apoproteins may not represent the amino-acid residue arrangement upon ligand binding well, which is especially problematic for molecular docking. We developed the ProBiS-CHARMMing web interface by connecting the ProBiS ( http://probis.cmm.ki.si ) and CHARMMing ( http://www.charmming.org ) web servers into one functional unit that enables prediction of protein-ligand complexes and allows for their geometry optimization and interaction energy calculation. The ProBiS web server predicts ligands (small compounds, proteins, nucleic acids, and single-atom ligands) that may bind to a query protein. This is achieved by comparing its surface structure against a nonredundant database of protein structures and finding those that have binding sites similar to that of the query protein. Existing ligands found in the similar binding sites are then transposed to the query according to predictions from ProBiS. The CHARMMing web server enables, among other things, minimization and potential energy calculation for a wide variety of biomolecular systems, and it is used here to optimize the geometry of the predicted protein-ligand complex structures using the CHARMM force field and to calculate their interaction energies with the corresponding query proteins. We show how ProBiS-CHARMMing can be used to predict ligands and their poses for a particular binding site, and minimize the predicted protein-ligand complexes to obtain representations of holoproteins. The ProBiS-CHARMMing web interface is freely available for academic users at http://probis.nih.gov.
Structural domains and main-chain flexibility in prion proteins.
Blinov, N; Berjanskii, M; Wishart, D S; Stepanova, M
2009-02-24
In this study we describe a novel approach to define structural domains and to characterize the local flexibility in both human and chicken prion proteins. The approach we use is based on a comprehensive theory of collective dynamics in proteins that was recently developed. This method determines the essential collective coordinates, which can be found from molecular dynamics trajectories via principal component analysis. Under this particular framework, we are able to identify the domains where atoms move coherently while at the same time determining the local main-chain flexibility for each residue. We have verified this approach by comparing our results for the predicted dynamic domain systems with the computed main-chain flexibility profiles and the NMR-derived random coil indexes for human and chicken prion proteins. The three sets of data show excellent agreement. Additionally, we demonstrate that the dynamic domains calculated in this fashion provide a highly sensitive measure of protein collective structure and dynamics. Furthermore, such an analysis is capable of revealing structural and dynamic properties of proteins that are inaccessible to the conventional assessment of secondary structure. Using the collective dynamic simulation approach described here along with high-temperature simulations of unfolding of human prion protein, we have explored whether locations of relatively low stability could be identified where the unfolding process could potentially be facilitated. According to our analysis, the locations of relatively low stability may be associated with the beta-sheet formed by strands S1 and S2 and the adjacent loops, whereas helix HC appears to be a relatively stable part of the protein. We suggest that this kind of structural analysis may provide a useful background for a more quantitative assessment of potential routes of spontaneous misfolding in prion proteins.
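Extracting collective coordinates from a trajectory by principal component analysis can be sketched as an eigen-decomposition of the coordinate covariance matrix. This is a generic illustration, not the authors' domain-identification method; it assumes the frames are already superimposed, and the function name is hypothetical.

```python
import numpy as np

def essential_modes(trajectory, n_modes=3):
    """Principal component analysis of an aligned MD trajectory.

    trajectory has shape (n_frames, n_atoms, 3).
    Returns (variances, modes, projections):
      variances   -- the n_modes largest covariance eigenvalues
      modes       -- matching eigenvectors, shape (3 * n_atoms, n_modes)
      projections -- each frame projected onto the modes, (n_frames, n_modes)
    """
    traj = np.asarray(trajectory, float)
    flat = traj.reshape(traj.shape[0], -1)        # one row per frame
    centered = flat - flat.mean(axis=0)           # remove the mean structure
    cov = centered.T @ centered / (flat.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_modes]   # keep the largest ones
    modes = eigvecs[:, order]
    return eigvals[order], modes, centered @ modes
```

The leading modes capture the largest-amplitude concerted motions; atoms with similar components in those modes move coherently, which is the starting point for grouping them into dynamic domains.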
Molecular Dynamics Analysis of Lysozyme Protein in Ethanol- Water Mixed Solvent
2012-01-01
molecular dynamics simulations of solvent effect on lysozyme protein, using water, ethanol, and different concentrations of water-ethanol mixtures as...understood. This work focuses on detailed molecular dynamics simulations of solvent effect on lysozyme protein, using water, ethanol, and different...using GROMACS molecular dynamics simulation (MD) code. Compared to water environment, the lysozyme structure showed remarkable changes in water
Li-Byarlay, Hongmei; Pittendrigh, Barry R.; Murdock, Larry L.
2016-01-01
Plants produce proteins such as protease inhibitors and lectins as defenses against herbivorous insects and pathogens. However, no systematic studies have explored the structural responses in the midguts of insects when challenged with plant defensive proteins and lectins across different species. In this study, we fed two kinds of protease inhibitors and lectins to the fruit fly Drosophila melanogaster and alpha-amylase inhibitors and lectins to the cowpea bruchid Callosobruchus maculatus. We assessed the changes in midgut cell structures by comparing them with such structures in insects receiving normal diets or subjected to food deprivation. Using light and transmission electron microscopy in both species, we observed structural changes in the midgut peritrophic matrix as well as shortened microvilli on the surfaces of midgut epithelial cells in D. melanogaster. Dietary inhibitors and lectins caused similar lesions in the epithelial cells but not much change in the peritrophic matrix in both species. We also noted structural damages in the Drosophila midgut after six hours of starvation and changes were still present after 12 hours. Our study provided the first evidence of key structural changes of midguts using a comparative approach between a dipteran and a coleopteran. Our particular observation and discussion on plant–insect interaction and dietary stress are relevant for future mode of action studies of plant defensive protein in insect physiology. PMID:27594789
Local and global anatomy of antibody-protein antigen recognition.
Wang, Meryl; Zhu, David; Zhu, Jianwei; Nussinov, Ruth; Ma, Buyong
2018-05-01
Deciphering antibody-protein antigen recognition is of fundamental and practical significance. We constructed an antibody structural dataset, partitioned it into human and murine subgroups, and compared it with nonantibody protein-protein complexes. We investigated the physicochemical properties of regions on and away from the antibody-antigen interfaces, including net charge, overall antibody charge distributions, and their potential role in antigen interaction. We observed that amino acid preference in antibody-protein antigen recognition is entropy driven, with residues having low side-chain entropy appearing to compensate for the high backbone entropy in interaction with protein antigens. Antibodies prefer charged and polar antigen residues and bridging water molecules. They also prefer positive net charge, presumably to promote interaction with negatively charged protein antigens, which are common in proteomes. Antibody-antigen interfaces have large percentages of Tyr, Ser, and Asp, but little Lys. Electrostatic and hydrophobic interactions in the Ag binding sites might be coupled with Fab domains through organized charge and residue distributions away from the binding interfaces. Here we describe some features of antibody-antigen interfaces and of Fab domains as compared with nonantibody protein-protein interactions. The distributions of interface residues in human and murine antibodies do not differ significantly. Overall, our results provide not only a local but also a global anatomy of antibody structures. Copyright © 2017 John Wiley & Sons, Ltd.
Structure and non-structure of centrosomal proteins.
Dos Santos, Helena G; Abia, David; Janowski, Robert; Mortuza, Gulnahar; Bertero, Michela G; Boutin, Maïlys; Guarín, Nayibe; Méndez-Giraldez, Raúl; Nuñez, Alfonso; Pedrero, Juan G; Redondo, Pilar; Sanz, María; Speroni, Silvia; Teichert, Florian; Bruix, Marta; Carazo, José M; Gonzalez, Cayetano; Reina, José; Valpuesta, José M; Vernos, Isabelle; Zabala, Juan C; Montoya, Guillermo; Coll, Miquel; Bastolla, Ugo; Serrano, Luis
2013-01-01
Here we perform a large-scale study of the structural properties and the expression of proteins that constitute the human Centrosome. Centrosomal proteins tend to be larger than generic human proteins (control set), since their genes contain on average more exons (20.3 versus 14.6). They are rich in predicted disordered regions, which cover 57% of their length, compared to 39% in the general human proteome. They also contain several regions that are dually predicted to be disordered and coiled-coil at the same time: 55 proteins (15%) contain disordered and coiled-coil fragments that cover more than 20% of their length. Helices prevail over strands in regions homologous to known structures (47% predicted helical residues against 17% predicted as strands), and even more in the whole centrosomal proteome (52% against 7%), while for control human proteins 34.5% of the residues are predicted as helical and 12.8% are predicted as strands. This difference is mainly due to residues predicted as disordered and helical (30% in centrosomal and 9.4% in control proteins), which may correspond to alpha-helix forming molecular recognition features (α-MoRFs). We performed expression assays for 120 full-length centrosomal proteins and 72 domain constructs that we have predicted to be globular. These full-length proteins are often insoluble: only 39 out of 120 expressed proteins (32%) and 19 out of 72 domains (26%) were soluble. We built or retrieved structural models for 277 out of 361 human proteins whose centrosomal localization has been experimentally verified. We could not find any suitable structural template with more than 20% sequence identity for 84 centrosomal proteins (23%), for which around 74% of the residues are predicted to be disordered or coiled-coils. The three-dimensional models that we built are available at http://ub.cbm.uam.es/centrosome/models/index.php.
Deng, Lei; Fan, Chao; Zeng, Zhiwen
2017-12-28
Direct prediction of the three-dimensional (3D) structures of proteins from one-dimensional (1D) sequences is a challenging problem. Significant structural characteristics such as solvent accessibility and contact number are essential for deriving restraints in modeling protein folding and protein 3D structure. Thus, accurately predicting these features is a critical step for 3D protein structure building. In this study, we present DeepSacon, a computational method that can effectively predict protein solvent accessibility and contact number by using a deep neural network, which is built based on a stacked autoencoder and a dropout method. The results demonstrate that our proposed DeepSacon achieves a significant improvement in the prediction quality compared with the state-of-the-art methods. We obtain 0.70 three-state accuracy for solvent accessibility, 0.33 15-state accuracy and 0.74 Pearson Correlation Coefficient (PCC) for the contact number on the 5729 monomeric soluble globular protein dataset. We also evaluate the performance on the CASP11 benchmark dataset, where DeepSacon achieves 0.68 three-state accuracy and 0.69 PCC for solvent accessibility and contact number, respectively. We have shown that DeepSacon can reliably predict solvent accessibility and contact number with a stacked sparse autoencoder and a dropout approach.
Yang, Jing; He, Bao-Ji; Jang, Richard; Zhang, Yang; Shen, Hong-Bin
2015-01-01
Abstract Motivation: Cysteine-rich proteins cover many important families in nature but there are currently no methods specifically designed for modeling the structure of these proteins. The accuracy of disulfide connectivity pattern prediction, particularly for the proteins of higher-order connections, e.g. >3 bonds, is too low to effectively assist structure assembly simulations. Results: We propose a new hierarchical order reduction protocol called Cyscon for disulfide-bonding prediction. The most confident disulfide bonds are first identified and bonding prediction is then focused on the remaining cysteine residues based on SVR training. Compared with purely machine learning-based approaches, Cyscon improved the average accuracy of connectivity pattern prediction by 21.9%. For proteins with more than 5 disulfide bonds, Cyscon improved the accuracy by 585% on the benchmark set of PDBCYS. When applied to 158 non-redundant cysteine-rich proteins, Cyscon predictions helped increase (or decrease) the TM-score (or RMSD) of the ab initio QUARK modeling by 12.1% (or 14.4%). This result demonstrates a new avenue to improve the ab initio structure modeling for cysteine-rich proteins. Availability and implementation: http://www.csbio.sjtu.edu.cn/bioinf/Cyscon/ Contact: zhng@umich.edu or hbshen@sjtu.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26254435
Nguyen, Hai; Pérez, Alberto; Bermeo, Sherry; Simmerling, Carlos
2016-01-01
The Generalized Born (GB) implicit solvent model has undergone significant improvements in accuracy for modeling of proteins and small molecules. However, GB still remains a less widely explored option for nucleic acid simulations, in part because fast GB models are often unable to maintain stable nucleic acid structures, or they introduce structural bias in proteins, leading to difficulty in application of GB models in simulations of protein-nucleic acid complexes. Recently, GB-neck2 was developed to improve the behavior of protein simulations. In an effort to create a more accurate model for nucleic acids, a similar procedure to the development of GB-neck2 is described here for nucleic acids. The resulting parameter set significantly reduces absolute and relative energy error relative to Poisson Boltzmann for both nucleic acids and nucleic acid-protein complexes, when compared to its predecessor GB-neck model. This improvement in solvation energy calculation translates to increased structural stability for simulations of DNA and RNA duplexes, quadruplexes, and protein-nucleic acid complexes. The GB-neck2 model also enables successful folding of small DNA and RNA hairpins to near native structures as determined from comparison with experiment. The functional form and all required parameters are provided here and also implemented in the AMBER software. PMID:26574454
NASA Astrophysics Data System (ADS)
Gaines, J. C.; Clark, A. H.; Regan, L.; O'Hern, C. S.
2017-07-01
Proteins are biological polymers that underlie all cellular functions. The first high-resolution protein structures were determined by x-ray crystallography in the 1960s. Since then, there has been continued interest in understanding and predicting protein structure and stability. It is well-established that a large contribution to protein stability originates from the sequestration from solvent of hydrophobic residues in the protein core. How are such hydrophobic residues arranged in the core; how can one best model the packing of these residues, and are residues loosely packed with multiple allowed side chain conformations or densely packed with a single allowed side chain conformation? Here we show that to properly model the packing of residues in protein cores it is essential that amino acids are represented by appropriately calibrated atom sizes, and that hydrogen atoms are explicitly included. We show that protein cores possess a packing fraction of φ ≈ 0.56 , which is significantly less than the typically quoted value of 0.74 obtained using the extended atom representation. We also compare the results for the packing of amino acids in protein cores to results obtained for jammed packings from discrete element simulations of spheres, elongated particles, and composite particles with bumpy surfaces. We show that amino acids in protein cores pack as densely as disordered jammed packings of particles with similar values for the aspect ratio and bumpiness as found for amino acids. Knowing the structural properties of protein cores is of both fundamental and practical importance. Practically, it enables the assessment of changes in the structure and stability of proteins arising from amino acid mutations (such as those identified as a result of the massive human genome sequencing efforts) and the design of new folded, stable proteins and protein-protein interactions with tunable specificity and affinity.
A Model Comparison for Characterizing Protein Motions from Structure
NASA Astrophysics Data System (ADS)
David, Charles; Jacobs, Donald
2011-10-01
A comparative study is made using three computational models that characterize native state dynamics starting from known protein structures taken from four distinct SCOP classifications. A geometrical simulation is performed, and the results are compared to the elastic network model and molecular dynamics. The essential dynamics is quantified by a direct analysis of a mode subspace constructed from ANM and a principal component analysis on both the FRODA and MD trajectories using root mean square inner product and principal angles. Relative subspace sizes and overlaps are visualized using the projection of displacement vectors on the model modes. Additionally, a mode subspace is constructed from PCA on an exemplar set of X-ray crystal structures in order to determine similarity with respect to the generated ensembles. Quantitative analysis reveals there is significant overlap across the three model subspaces and the model independent subspace. These results indicate that structure is the key determinant for native state dynamics.
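The root mean square inner product (RMSIP) used above to compare mode subspaces has a compact standard definition: for two sets of D orthonormal modes, RMSIP = sqrt((1/D) Σ_i Σ_j (u_i · v_j)^2). The sketch below assumes both mode sets are already orthonormal and of equal size; the function name `rmsip` is illustrative and not taken from the study.

```python
import math


def rmsip(modes_a, modes_b):
    """Root mean square inner product between two orthonormal mode sets.

    Returns 1.0 when the two sets span the same subspace and 0.0 when
    the subspaces are mutually orthogonal.
    """
    if len(modes_a) != len(modes_b):
        raise ValueError("both mode sets must contain the same number of modes")
    total = 0.0
    for u in modes_a:
        for v in modes_b:
            dot = sum(x * y for x, y in zip(u, v))
            total += dot * dot
    return math.sqrt(total / len(modes_a))


# Two bases of the same plane (one rotated by 30 degrees) overlap fully.
c, s = math.cos(math.pi / 6), math.sin(math.pi / 6)
plane = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
rotated = [[c, s, 0.0], [-s, c, 0.0]]
same_subspace_score = rmsip(plane, rotated)
```

Because the score depends only on the subspaces and not on the particular basis, it is suited to comparing modes from different models whose individual ordering may differ.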
Ligation site in proteins recognized in silico
Brylinski, Michal; Konieczny, Leszek; Roterman, Irena
2006-01-01
Recognition of a ligation site in a protein molecule is important for identifying its biological activity. A model for in silico recognition of ligation sites in proteins is presented. The idealized hydrophobic core stabilizing protein structure is represented by a three-dimensional Gaussian function. The experimentally observed distribution of hydrophobicity compared with the theoretical distribution reveals differences. The area of high differences indicates the ligation site. Availability: http://bioinformatics.cm-uj.krakow.pl/activesite PMID:17597871
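The comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the Gaussian is centered on the centroid of the residue positions, the three widths `sigmas` are supplied by the caller, and the observed per-residue hydrophobicity is taken as a given input because the abstract does not say how it is computed. All function names are hypothetical.

```python
import math


def theoretical_hydrophobicity(coords, sigmas):
    """Normalized 3-D Gaussian evaluated at each residue position."""
    n = len(coords)
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    values = []
    for p in coords:
        exponent = sum(
            (p[k] - center[k]) ** 2 / (2.0 * sigmas[k] ** 2) for k in range(3)
        )
        values.append(math.exp(-exponent))
    total = sum(values)
    return [v / total for v in values]


def hydrophobicity_difference(coords, observed, sigmas):
    """Theoretical minus observed hydrophobicity, both normalized to sum to 1.

    Residues with large absolute differences are candidate ligation-site
    positions in the sense described in the abstract.
    """
    theoretical = theoretical_hydrophobicity(coords, sigmas)
    observed_total = sum(observed)
    return [t - o / observed_total for t, o in zip(theoretical, observed)]


# Toy example: one central point and four surface points, uniform observed values.
toy_coords = [(0, 0, 0), (2, 0, 0), (-2, 0, 0), (0, 2, 0), (0, -2, 0)]
toy_diff = hydrophobicity_difference(toy_coords, [1.0] * 5, (1.0, 1.0, 1.0))
```

In this toy case the central point has a positive difference, because the idealized core expects more hydrophobicity there than the uniform observed profile provides.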
Lenton, Samuel; Walsh, Danielle L; Rhys, Natasha H; Soper, Alan K; Dougan, Lorna
2016-07-21
Halophilic organisms have adapted to survive in high salt environments, where mesophilic organisms would perish. One of the biggest challenges faced by halophilic proteins is the ability to maintain both the structure and function at molar concentrations of salt. A distinct adaptation of halophilic proteins, compared to mesophilic homologues, is the abundance of aspartic acid on the protein surface. Mutagenesis and crystallographic studies of halophilic proteins suggest an important role for solvent interactions with the surface aspartic acid residues. This interaction, between the regions of the acidic protein surface and the solvent, is thought to maintain a hydration layer around the protein at molar salt concentrations thereby allowing halophilic proteins to retain their functional state. Here we present neutron diffraction data of the monomeric zwitterionic form of aspartic acid solutions at physiological pH in 0.25 M and 2.5 M concentration of potassium chloride, to mimic mesophilic and halophilic-like environmental conditions. We have used isotopic substitution in combination with empirical potential structure refinement to extract atomic-scale information from the data. Our study provides structural insights that support the hypothesis that carboxyl groups on acidic residues bind water more tightly under high salt conditions, in support of the residue-ion interaction model of halophilic protein stabilisation. Furthermore our data show that in the presence of high salt the self-association between the zwitterionic form of aspartic acid molecules is reduced, suggesting a possible mechanism through which protein aggregation is prevented.
Mode localization in the cooperative dynamics of protein recognition
NASA Astrophysics Data System (ADS)
Copperman, J.; Guenza, M. G.
2016-07-01
The biological function of proteins is encoded in their structure and expressed through the mediation of their dynamics. This paper presents a study on the correlation between local fluctuations, binding, and biological function for two sample proteins, starting from the Langevin Equation for Protein Dynamics (LE4PD). The LE4PD is a microscopic and residue-specific coarse-grained approach to protein dynamics, which starts from the static structural ensemble of a protein and predicts the dynamics analytically. It has been shown to be accurate in its prediction of NMR relaxation experiments and Debye-Waller factors. The LE4PD is solved in a set of diffusive modes which span a vast range of time scales of the protein dynamics, and provides a detailed picture of the mode-dependent localization of the fluctuation as a function of the primary structure of the protein. To investigate the dynamics of protein complexes, the theory is implemented here to treat the coarse-grained dynamics of interacting macromolecules. As an example, calculations of the dynamics of monomeric and dimerized HIV protease and the free Insulin Growth Factor II Receptor (IGF2R) domain 11 and its IGF2R:IGF2 complex are presented. Either simulation-derived or experimentally measured NMR conformers are used as input structural ensembles to the theory. The picture that emerges suggests a dynamical heterogeneous protein where biologically active regions provide energetically comparable conformational states that are trapped by a reacting partner in agreement with the conformation-selection mechanism of binding.
SCOWLP classification: Structural comparison and analysis of protein binding regions
Teyra, Joan; Paszkowski-Rogacz, Maciej; Anders, Gerd; Pisabarro, M Teresa
2008-01-01
Background Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interacting information for specific protein binding regions at atomic level. Such classification might be of potential use for deciphering protein interaction networks, understanding protein function, rational engineering and design. Description Protein binding regions (PBRs) might be ideally described as well-defined separated regions that share no interacting residues with one another. However, PBRs are often irregular, discontinuous and can share a wide range of interacting residues among them. The criteria to define an individual binding region can be often arbitrary and may differ from other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information of protein domains, peptides and interfacial solvent from the SCOWLP database and we classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlapping of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature especially interesting for those protein families with conflictive binding regions.
The hierarchical classification of PBRs is implemented into the SCOWLP database and extends the SCOP classification with three additional family sub-levels: Binding Region, Interface and Contacting Domains. SCOWLP contains 9,334 binding regions distributed within 2,561 families. In 65% of the cases we observe families containing more than one binding region. Besides, 22% of the regions are forming complex with more than one different protein family. Conclusion The current SCOWLP classification and its web application represent a framework for the study of protein interfaces and comparative analysis of protein family binding regions. This comparison can be performed at atomic level and allows the user to study interactome conservation and variability. The new SCOWLP classification may be of great utility for reconstruction of protein complexes, understanding protein networks and ligand design. SCOWLP will be updated with every SCOP release. The web application is available at . PMID:18182098
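The classification procedure in this abstract (a residue-overlap similarity index followed by complete-linkage agglomerative clustering at a chosen cut-off) can be sketched as below. The abstract does not give the exact similarity formula, so the overlap coefficient used here is an assumption, as are the function names; residue numbers are assumed to be already mapped onto a common alignment.

```python
def overlap_similarity(region_a, region_b):
    """Assumed similarity index: shared residues over the smaller region size."""
    if not region_a or not region_b:
        return 0.0
    return len(region_a & region_b) / min(len(region_a), len(region_b))


def complete_linkage_clusters(regions, cutoff):
    """Merge clusters while the least similar cross-cluster pair stays >= cutoff."""
    clusters = [[i] for i in range(len(regions))]

    def linkage(c1, c2):
        # Complete linkage: similarity between clusters is the minimum
        # similarity over all cross-cluster pairs.
        return min(overlap_similarity(regions[i], regions[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                score = linkage(clusters[x], clusters[y])
                if best is None or score > best[0]:
                    best = (score, x, y)
        if best[0] < cutoff:
            break
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Four toy binding regions given as sets of aligned residue positions.
toy_regions = [{1, 2, 3, 4}, {2, 3, 4, 5}, {10, 11, 12}, {11, 12, 13}]
loose = complete_linkage_clusters(toy_regions, cutoff=0.5)
strict = complete_linkage_clusters(toy_regions, cutoff=0.7)
```

Running the same data at several cut-offs, as in the last two lines, mirrors the abstract's point that different similarity thresholds give coarser or finer groupings of binding regions.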
D'Onofrio, Giuseppe; Ghosh, Tapash Chandra
2005-01-17
Fluctuations and increments of both C(3) and G(3) levels along the human coding sequences were investigated comparing two sets of Xenopus/human orthologous genes. The first set of genes shows minor differences of the GC(3) levels, the second shows considerable increments of the GC(3) levels in the human genes. In both data sets, the fluctuations of C(3) and G(3) levels along the coding sequences correlated with the secondary structures of the encoded proteins. The human genes that underwent the compositional transition showed a different increment of the C(3) and G(3) levels within and among the structural units of the proteins. The relative synonymous codon usage (RSCU) of several amino acids were also affected during the compositional transition, showing that there exists a correlation between RSCU and protein secondary structures in human genes. The importance of natural selection for the formation of isochore organization of the human genome has been discussed on the basis of these results.
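The two quantities this abstract relies on, the GC level at third codon positions (GC3) and the relative synonymous codon usage (RSCU), follow standard definitions. RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below assumes the standard genetic code and an in-frame coding sequence; the function names are illustrative.

```python
from collections import Counter

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(cds):
    """Split an in-frame coding sequence into complete codons."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def gc3(cds):
    """Fraction of codons with G or C at the third position."""
    codons = split_codons(cds)
    return sum(codon[2] in "GC" for codon in codons) / len(codons)


def rscu(cds):
    """RSCU per codon, for amino acids that occur in the sequence."""
    counts = Counter(c for c in split_codons(cds) if c in CODON_TABLE)
    families = {}
    for codon, amino_acid in CODON_TABLE.items():
        families.setdefault(amino_acid, []).append(codon)
    values = {}
    for amino_acid, family in families.items():
        total = sum(counts[c] for c in family)
        if amino_acid == "*" or total == 0:
            continue  # skip stop codons and absent amino acids
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


toy_cds = "GCCGCCGCCGCTAAAAAG"  # Ala x4 (three GCC, one GCT), Lys x2
toy_rscu = rscu(toy_cds)
toy_gc3 = gc3(toy_cds)
```

An RSCU above 1 marks a codon used more often than expected under equal usage, which is the kind of shift the abstract reports during the compositional transition.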
Structural and functional characterization of the hazelnut allergen Cor a 8
DOE Office of Scientific and Technical Information (OSTI.GOV)
Offermann, Lesa R.; Bublin, Merima; Perdue, Makenzie L.; ...
2015-09-28
Nonspecific lipid transfer proteins (nsLTPs) are basic proteins, stabilized by four disulfide bonds, and are expressed throughout the plant kingdom. These proteins are also known as important allergens in fruits and tree nuts. In this study, the nsLTP from hazelnuts, Cor a 8, was purified and its crystal structure determined. The protein is stable at low pH and refolds after thermal denaturation. Molecular dynamics simulations were used to provide an insight into conformational changes of Cor a 8 upon ligand binding. When known epitope areas from Pru p 3 were compared to those of Cor a 8, differences were obvious, which may contribute to limited cross-reactivity between peach and hazelnut allergens. The differences in epitope regions may contribute to limited cross-reactivity between Cor a 8 and nsLTPs from other plant sources. The structure of Cor a 8 represents the first resolved structure of a hazelnut allergen.
Gastric protein hydrolysis of raw and roasted almonds in the growing pig.
Bornhorst, Gail M; Drechsler, Krista C; Montoya, Carlos A; Rutherfurd, Shane M; Moughan, Paul J; Singh, R Paul
2016-11-15
Gastric protein hydrolysis may influence gastric emptying rate and subsequent protein digestibility in the small intestine. This study examined the gastric hydrolysis of dietary protein from raw and roasted almonds in the growing pig as a model for the adult human. The gastric hydrolysis of almond proteins was quantified by performing tricine-sodium dodecyl sulfate-polyacrylamide gel electrophoresis and subsequent image analysis. There was an interaction between digestion time, stomach region, and almond type for gastric protein hydrolysis (p<0.05). Gastric emptying rate of protein was a significant (p<0.05) covariate in the gastric protein hydrolysis. In general, greater gastric protein hydrolysis was observed in raw almonds (compared to roasted almonds), hypothesized to be related to structural changes in almond proteins during roasting. Greater gastric protein hydrolysis was observed in the distal stomach (compared to the proximal stomach), likely related to the lower pH in the distal stomach. Copyright © 2016 Elsevier Ltd. All rights reserved.
Global genetic diversity of the Plasmodium vivax transmission-blocking vaccine candidate Pvs48/45.
Vallejo, Andres F; Martinez, Nora L; Tobon, Alejandra; Alger, Jackeline; Lacerda, Marcus V; Kajava, Andrey V; Arévalo-Herrera, Myriam; Herrera, Sócrates
2016-04-12
Plasmodium vivax 48/45 protein is expressed on the surface of gametocytes/gametes and plays a key role in gamete fusion during fertilization. This protein was recently expressed in Escherichia coli host as a recombinant product that was highly immunogenic in mice and monkeys and induced antibodies with high transmission-blocking activity, suggesting its potential as a P. vivax transmission-blocking vaccine candidate. To determine sequence polymorphism of natural parasite isolates and its potential influence on the protein structure, all pvs48/45 sequences reported in databases from around the world as well as those from low-transmission settings of Latin America were compared. Plasmodium vivax parasite isolates from malaria-endemic regions of Colombia, Brazil and Honduras (n = 60) were used to sequence the Pvs48/45 gene, and compared to those previously reported to GenBank and PlasmoDB (n = 222). Pvs48/45 gene haplotypes were analysed to determine the functional significance of genetic variation in protein structure and vaccine potential. Nine non-synonymous substitutions (E35K, Y196H, H211N, K250N, D335Y, E353Q, A376T, K390T, K418R) and three synonymous substitutions (I73, T149, C156) that define seven different haplotypes were found among the 282 isolates from nine countries when compared with the Sal I reference sequence. Nucleotide diversity (π) was 0.00173 for worldwide samples (range 0.00033-0.00216), resulting in relatively high diversity in Myanmar and Colombia, and low diversity in Mexico, Peru and South Korea. The two most frequent substitutions (E353Q: 41.9 %, K250N: 39.5 %) were predicted to be located in antigenic regions without affecting putative B cell epitopes or the tertiary protein structure. There is limited sequence polymorphism in pvs48/45 with noted geographical clustering among Asian and American isolates. 
The low genetic diversity of the protein does not influence the predicted antigenicity or protein structure and, therefore, supports its further development as transmission-blocking vaccine candidate.
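Nucleotide diversity (π), reported in this abstract, is conventionally defined as the average number of pairwise nucleotide differences per site among the sampled sequences. The sketch below assumes aligned sequences of equal length with no gaps or ambiguous bases; the function name is illustrative, and the study's own software is not specified in the abstract.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site across aligned sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(
        sum(a != b for a, b in zip(first, second))
        for first, second in combinations(sequences, 2)
    )
    n_pairs = n * (n - 1) / 2
    return differences / (n_pairs * length)


# Toy alignment: 7 differences over 6 pairs and 4 sites gives 7/24.
toy_alignment = ["ACGT", "ACGT", "ACGA", "TCGA"]
toy_pi = nucleotide_diversity(toy_alignment)
```

Computing the value separately for the isolates of each country would give per-population figures of the kind summarized in the abstract's reported range.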
Lery, Letícia M S; Bitar, Mainá; Costa, Mauricio G S; Rössle, Shaila C S; Bisch, Paulo M
2010-12-22
G. diazotrophicus and A. vinelandii are aerobic nitrogen-fixing bacteria. Although oxygen is essential for the survival of these organisms, it irreversibly inhibits nitrogenase, the complex responsible for nitrogen fixation. Both microorganisms deal with this paradox through compensatory mechanisms. In A. vinelandii a conformational protection mechanism occurs through the interaction between the nitrogenase complex and the FeSII protein. Previous studies suggested the existence of a similar system in G. diazotrophicus, but the putative protein involved was not yet described. This study intends to identify the protein coding gene in the recently sequenced genome of G. diazotrophicus and also provide detailed structural information of nitrogenase conformational protection in both organisms. Genomic analysis of G. diazotrophicus sequences revealed a protein coding ORF (Gdia0615) enclosing a conserved "fer2" domain, typical of the ferredoxin family and found in A. vinelandii FeSII. Comparative models of both FeSII and Gdia0615 disclosed a conserved beta-grasp fold. Cysteine residues that coordinate the 2[Fe-S] cluster are in conserved positions towards the metallocluster. Analysis of solvent accessible residues and electrostatic surfaces unveiled a hydrophobic dimerization interface. Dimers assembled by molecular docking presented a stable behaviour and a proper accommodation of regions possibly involved in binding of FeSII to nitrogenase throughout molecular dynamics simulations in aqueous solution. Molecular modeling of the nitrogenase complex of G. diazotrophicus was performed and models were compared to the crystal structure of A. vinelandii nitrogenase. Docking experiments of FeSII and Gdia0615 with its corresponding nitrogenase complex pointed out in both systems a putative binding site presenting shape and charge complementarities at the Fe-protein/MoFe-protein complex interface. The identification of the putative FeSII coding gene in the G. diazotrophicus genome represents a large step towards the understanding of the conformational protection mechanism of nitrogenase against oxygen. In addition, this is the first study regarding the structural complementarities of FeSII-nitrogenase interactions in diazotrophic bacteria. The combination of bioinformatic tools for genome analysis, comparative protein modeling, docking calculations and molecular dynamics provided a powerful strategy for the elucidation of molecular mechanisms and structural features of FeSII-nitrogenase interaction.
Serial Femtosecond Crystallography of G Protein-Coupled Receptors
Liu, Wei; Wacker, Daniel; Gati, Cornelius; Han, Gye Won; James, Daniel; Wang, Dingjie; Nelson, Garrett; Weierstall, Uwe; Katritch, Vsevolod; Barty, Anton; Zatsepin, Nadia A.; Li, Dianfan; Messerschmidt, Marc; Boutet, Sébastien; Williams, Garth J.; Koglin, Jason E.; Seibert, M. Marvin; Wang, Chong; Shah, Syed T.A.; Basu, Shibom; Fromme, Raimund; Kupitz, Christopher; Rendek, Kimberley N.; Grotjohann, Ingo; Fromme, Petra; Kirian, Richard A.; Beyerlein, Kenneth R.; White, Thomas A.; Chapman, Henry N.; Caffrey, Martin; Spence, John C.H.; Stevens, Raymond C.; Cherezov, Vadim
2014-01-01
X-ray crystallography of G protein-coupled receptors and other membrane proteins is hampered by difficulties associated with growing sufficiently large crystals that withstand radiation damage and yield high-resolution data at synchrotron sources. Here we used an x-ray free-electron laser (XFEL) with individual 50-fs duration x-ray pulses to minimize radiation damage and obtained a high-resolution room temperature structure of a human serotonin receptor using sub-10 µm microcrystals grown in a membrane mimetic matrix known as lipidic cubic phase. Compared to the structure solved by traditional microcrystallography from cryo-cooled crystals of about two orders of magnitude larger volume, the room temperature XFEL structure displays a distinct distribution of thermal motions and conformations of residues that likely more accurately represent the receptor structure and dynamics in a cellular environment. PMID:24357322
An automated method for modeling proteins on known templates using distance geometry.
Srinivasan, S; March, C J; Sudarsanam, S
1993-02-01
We present an automated method incorporated into a software package, FOLDER, to fold a protein sequence on a given three-dimensional (3D) template. Starting with the sequence alignment of a family of homologous proteins, tertiary structures are modeled using the known 3D structure of one member of the family as a template. Homologous interatomic distances from the template are used as constraints. For nonhomologous regions in the model protein, the lower and the upper bounds for the interatomic distances are imposed by steric constraints and the globular dimensions of the template, respectively. Distance geometry is used to embed an ensemble of structures consistent with these distance bounds. Structures are selected from this ensemble based on minimal distance error criteria, after a penalty function optimization step. These structures are then refined using energy optimization methods. The method is tested by simulating the alpha-chain of horse hemoglobin using the alpha-chain of human hemoglobin as the template and by comparing the generated models with the crystal structure of the alpha-chain of horse hemoglobin. We also test the packing efficiency of this method by reconstructing the atomic positions of the interior side chains beyond C beta atoms of a protein domain from a known 3D structure. In both test cases, models retain the template constraints and any additionally imposed constraints while the packing of the interior residues is optimized with no short contacts or bond deformations. To demonstrate the use of this method in simulating structures of proteins with nonhomologous disulfides, we construct a model of murine interleukin (IL)-4 using the NMR structure of human IL-4 as the template. The resulting geometry of the nonhomologous disulfide in the model structure for murine IL-4 is consistent with standard disulfide geometry.
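The abstract above describes lower and upper bounds on interatomic distances that are then embedded by distance geometry. A step commonly performed before embedding is bound smoothing, in which the triangle inequality is used to tighten loose upper bounds. The following is a minimal sketch of that generic step only; it is not taken from FOLDER, and the function name and the three-atom example are hypothetical.

```python
from itertools import product


def smooth_upper_bounds(upper):
    """Tighten a symmetric matrix of upper distance bounds in place.

    Enforces the triangle inequality upper[i][j] <= upper[i][k] + upper[k][j]
    with a Floyd-Warshall style sweep (k is the outermost loop variable).
    This is a generic distance-geometry step, not the FOLDER implementation.
    """
    n = len(upper)
    for k, i, j in product(range(n), repeat=3):
        via_k = upper[i][k] + upper[k][j]
        if via_k < upper[i][j]:
            upper[i][j] = via_k
    return upper


# Hypothetical example: atoms 0-1 and 1-2 are tightly bounded (1.5 each),
# while the 0-2 bound starts at a loose "globular dimension" of 10.0.
bounds = [
    [0.0, 1.5, 10.0],
    [1.5, 0.0, 1.5],
    [10.0, 1.5, 0.0],
]
smooth_upper_bounds(bounds)
print(bounds[0][2])
```

In this example the loose 0-2 bound is reduced to the sum of the two tight bounds. Lower bounds can be smoothed analogously with the inverse triangle inequality, which is omitted here for brevity.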
NASA Astrophysics Data System (ADS)
Georlette, D.; Bentahir, M.; Claverie, P.; Collins, T.; D'amico, S.; Delille, D.; Feller, G.; Gratia, E.; Hoyoux, A.; Lonhienne, T.; Meuwis, M.-a.; Zecchinon, L.; Gerday, Ch.
In the last few years, increased attention has been focused on enzymes produced by cold-adapted micro-organisms. It has emerged that psychrophilic enzymes represent an extremely powerful tool both in protein folding investigations and for biotechnological purposes. Such enzymes are characterised by an increased thermosensitivity and, for most of them, by a higher catalytic efficiency at low and moderate temperatures when compared to their mesophilic counterparts. The high thermosensitivity probably originates from an increased flexibility of either a selected area of the molecular edifice or the overall protein structure, providing enhanced abilities to undergo conformational changes during catalysis at low temperatures. Structure modelling and recent crystallographic data have helped to elucidate the structural parameters that could be involved in this higher resilience. It was demonstrated that each psychrophilic enzyme adopts its own adaptive strategy. It appears, moreover, that there is a continuum in the strategy of protein adaptation to temperature, as the previously mentioned structural parameters are implicated in the stability of thermophilic proteins. Additional 3D crystal structures and site-directed and random mutagenesis experiments should now be undertaken to further investigate the stability-flexibility-activity relationship.
Ye, Huijun; Wang, Libing; Huang, Renliang; Su, Rongxin; Liu, Boshi; Qi, Wei; He, Zhimin
2015-10-14
The aim of this study was to explore the influence of amphiphilic and zwitterionic structures on the resistance of protein adsorption to peptide self-assembled monolayers (SAMs) and gain insight into the associated antifouling mechanism. Two kinds of cysteine-terminated heptapeptides were studied. One peptide had alternating hydrophobic and hydrophilic residues with an amphiphilic sequence of CYSYSYS. The other peptide (CRERERE) was zwitterionic. Both peptides were covalently attached onto gold substrates via gold-thiol bond formation. Surface plasmon resonance analysis results showed that both peptide SAMs had ultralow or low protein adsorption amounts of 1.97-11.78 ng/cm2 in the presence of single proteins. The zwitterionic peptide showed relatively higher antifouling ability with single proteins and natural complex protein media. We performed molecular dynamics simulations to understand their respective antifouling behaviors. The results indicated that strong surface hydration of peptide SAMs contributes to fouling resistance by impeding interactions with proteins. Compared to the CYSYSYS peptide, more water molecules were predicted to form hydrogen-bonding interactions with the zwitterionic CRERERE peptide, which is in agreement with the antifouling test results. These findings reveal a clear relation between peptide structures and resistance to protein adsorption, facilitating the development of novel peptide-containing antifouling materials.
i3Drefine Software for Protein 3D Structure Refinement and Its Assessment in CASP10
Bhattacharya, Debswapna; Cheng, Jianlin
2013-01-01
Protein structure refinement refers to the process of improving the quality of protein structures during structure modeling to bring them closer to their native states. Structure refinement has been drawing increasing attention in the community-wide Critical Assessment of techniques for Protein Structure prediction (CASP) experiments since its addition in the 8th CASP experiment. During the 9th and recently concluded 10th CASP experiments, a consistent growth in the number of refinement targets and participating groups has been witnessed. Yet, protein structure refinement still remains a largely unsolved problem, with the majority of participating groups in the CASP refinement category failing to consistently improve the quality of structures issued for refinement. To address this need, we developed a completely automated and computationally efficient protein 3D structure refinement method, i3Drefine, based on an iterative and highly convergent energy minimization algorithm with powerful all-atom composite physics and knowledge-based force fields and a hydrogen bonding (HB) network optimization technique. In the recent community-wide blind experiment, CASP10, i3Drefine (as ‘MULTICOM-CONSTRUCT’) was ranked as the best method in the server section as per the official assessment of the CASP10 experiment. Here we provide the community with free access to the i3Drefine software, systematically analyse the performance of i3Drefine in strict blind mode on the refinement targets issued in the CASP10 refinement category, and compare it with other state-of-the-art refinement methods participating in CASP10. Our analysis demonstrates that i3Drefine is the only fully automated server participating in CASP10 exhibiting consistent improvement over the initial structures in both global and local structural quality metrics. An executable version of i3Drefine is freely available at http://protein.rnet.missouri.edu/i3drefine/. PMID:23894517
The impact of protein disulfide bonds on the amyloid fibril morphology
Kurouski, Dmitry
2014-01-01
Amyloid fibrils are associated with many neurodegenerative diseases. Being formed from more than 20 different proteins that are functionally or structurally unrelated, amyloid fibrils share a common cross-β core structure. It is a well-accepted hypothesis that fibril biological activity and the associated toxicity vary with their morphology. Partial denaturation of a native protein usually precedes the initial stage of fibrillation, namely the nucleation process. Low pH and elevated temperature, typical conditions of amyloid fibril formation in vitro, result in partial denaturation of the proteins. Cleavage of disulfide bonds typically results in significant disruption of the native protein structure and in the formation of the molten globule state. Herein we report on a comparative investigation of fibril formation by apo-α-lactalbumin and its analog that contains only one of the four original disulfide bonds using deep UV resonance and non-resonance Raman spectroscopy and atomic force microscopy. Significant differences in the aggregation mechanism and the resulting fibril morphology were found. PMID:24693331
Li, Bai; Lin, Mu; Liu, Qiao; Li, Ya; Zhou, Changjun
2015-10-01
Protein folding is a fundamental topic in molecular biology. Conventional experimental techniques for protein structure identification or protein folding recognition require strict laboratory requirements and heavy operating burdens, which have largely limited their applications. Alternatively, computer-aided techniques have been developed to optimize protein structures or to predict the protein folding process. In this paper, we utilize a 3D off-lattice model to describe the original protein folding scheme as a simplified energy-optimal numerical problem, where all types of amino acid residues are binarized into hydrophobic and hydrophilic ones. We apply a balance-evolution artificial bee colony (BE-ABC) algorithm as the minimization solver, which is featured by the adaptive adjustment of search intensity to cater for the varying needs during the entire optimization process. In this work, we establish a benchmark case set with 13 real protein sequences from the Protein Data Bank database and evaluate the convergence performance of BE-ABC algorithm through strict comparisons with several state-of-the-art ABC variants in short-term numerical experiments. Besides that, our obtained best-so-far protein structures are compared to the ones in comprehensive previous literature. This study also provides preliminary insights into how artificial intelligence techniques can be applied to reveal the dynamics of protein folding. Graphical Abstract Protein folding optimization using 3D off-lattice model and advanced optimization techniques.
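The abstract above reduces folding to minimizing an energy over a 3D off-lattice chain of hydrophobic and hydrophilic residues. As a rough illustration of the kind of objective such a solver minimizes, the sketch below evaluates one commonly used form of the AB off-lattice energy: a bending term plus a Lennard-Jones-like term between non-adjacent residues. The exact functional form and the coefficients (1.0 for A-A, 0.5 for B-B, -0.5 for A-B) are assumptions based on the widely used AB model, not details confirmed by the abstract, and the function names are hypothetical.

```python
import math


def pair_coefficient(a, b):
    """Assumed AB-model coefficients: A = hydrophobic, B = hydrophilic."""
    if a == "A" and b == "A":
        return 1.0
    if a == "B" and b == "B":
        return 0.5
    return -0.5


def ab_energy(coords, sequence):
    """Energy of a chain given 3D residue coordinates and an A/B sequence.

    bending   = sum over consecutive bond pairs of (1 - cos(bend angle)) / 4
    nonbonded = sum over residue pairs at least two apart in the chain of
                4 * (r**-12 - C * r**-6)
    """
    # Unit vectors along each bond of the chain.
    bonds = []
    for p, q in zip(coords, coords[1:]):
        delta = [qi - pi for pi, qi in zip(p, q)]
        length = math.sqrt(sum(x * x for x in delta))
        bonds.append([x / length for x in delta])

    # cos(bend angle) is the dot product of successive unit bond vectors.
    bending = sum(
        (1.0 - sum(x * y for x, y in zip(u, v))) / 4.0
        for u, v in zip(bonds, bonds[1:])
    )

    nonbonded = 0.0
    n = len(coords)
    for i in range(n - 2):
        for j in range(i + 2, n):
            r = math.dist(coords[i], coords[j])
            c = pair_coefficient(sequence[i], sequence[j])
            nonbonded += 4.0 * (r ** -12 - c * r ** -6)

    return bending + nonbonded


# A straight three-residue hydrophobic chain with unit bond lengths.
print(ab_energy([(0, 0, 0), (1, 0, 0), (2, 0, 0)], "AAA"))
```

An optimizer such as the BE-ABC algorithm described above would search over chain conformations to lower this value; the search itself is not sketched here.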
Oberli, Marion; Douard, Véronique; Beaumont, Martin; Jaoui, Daphné; Devime, Fabienne; Laurent, Sandy; Chaumontet, Catherine; Mat, Damien; Le Feunteun, Steven; Michon, Camille; Davila, Anne-Marie; Fromentin, Gilles; Tomé, Daniel; Souchon, Isabelle; Leclerc, Marion; Gaudichon, Claire; Blachier, François
2018-01-01
Food structure is a key factor controlling digestion and nutrient absorption. We test the hypothesis that protein emulsion structure in the diet may affect digestive and absorptive processes. Rats (n = 40) are fed for 3 weeks with two diets chemically identical but based on lipid-protein liquid-fine (LFE) or gelled-coarse (GCE) emulsions that differ at the macro- and microstructure levels. After an overnight fasting, they ingest a 15N-labeled LFE or GCE test meal and are euthanized 0, 15 min, 1 h, and 5 h later. 15N enrichment in intestinal contents and blood are measured. Gastric emptying, protein digestion kinetics, 15N absorption, and incorporation in blood protein and urea are faster with LFE than GCE. At 15 min time point, LFE group shows higher increase in GIP portal levels than GCE. Three weeks of dietary adaptation leads to higher expression of cationic amino acid transporters in ileum of LFE compared to GCE. LFE diet raises cecal butyrate and isovalerate proportion relative to GCE, suggesting increased protein fermentation. LFE diet increases fecal Parabacteroides relative abundance but decreases Bifidobacterium, Sutterella, Parasutterella genera, and Clostridium cluster XIV abundance. Protein emulsion structure regulates digestion kinetics and gastrointestinal physiology, and could be targeted to improve food health value. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
SBION: A Program for Analyses of Salt-Bridges from Multiple Structure Files.
Gupta, Parth Sarthi Sen; Mondal, Sudipta; Mondal, Buddhadev; Islam, Rifat Nawaz Ul; Banerjee, Shyamashree; Bandyopadhyay, Amal K
2014-01-01
Salt-bridges and network salt-bridges are specific electrostatic interactions that contribute to the overall stability of proteins. In the hierarchical protein folding model, these interactions play a crucial role in the nucleation process. The advent and growth of protein structure databases and their availability in the public domain have created an urgent need for context-dependent rapid analysis of salt-bridges. While such analysis of a single protein is cumbersome and time-consuming, batch analyses need efficient software for a rapid topological scan of a large number of proteins to extract details on (i) the fraction of salt-bridge residues (acidic and basic), (ii) chain-specific intra-molecular salt-bridges, (iii) inter-molecular salt-bridges (protein-protein interactions) in all possible binary combinations, (iv) network salt-bridges and (v) the secondary structure distribution of salt-bridge residues. To the best of our knowledge, such efficient software is not available in the public domain. At this juncture, we have developed a program, SBION, which can perform all the above-mentioned computations for any number of proteins with any number of chains at any given ion-pair distance. It is highly efficient, fast, error-free and user friendly. Finally, we would say that SBION indeed possesses potential for applications in structural and comparative bioinformatics studies. SBION is freely available for non-commercial/academic institutions on formal request to the corresponding author (akbanerjee@biotech.buruniv.ac.in).
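To make the kind of computation described above concrete, the sketch below finds ion pairs between charged side-chain atoms within a distance cutoff and labels each as intra- or inter-chain. It is an illustrative assumption, not SBION's code: the atom record layout, the 4.0 Å default cutoff, the choice of charged atoms (including both histidine ring nitrogens), and the function name are all hypothetical.

```python
import math

# Side-chain atoms treated as charged in this sketch (an assumption).
ACIDIC = {("ASP", "OD1"), ("ASP", "OD2"), ("GLU", "OE1"), ("GLU", "OE2")}
BASIC = {
    ("LYS", "NZ"),
    ("ARG", "NE"), ("ARG", "NH1"), ("ARG", "NH2"),
    ("HIS", "ND1"), ("HIS", "NE2"),
}


def find_salt_bridges(atoms, cutoff=4.0):
    """Return sorted residue-level ion pairs within `cutoff` angstroms.

    atoms: list of (chain, resname, resnum, atomname, (x, y, z)) tuples,
    a hypothetical layout standing in for parsed structure-file records.
    Each result is (acid_chain, acid_res, acid_num,
                    base_chain, base_res, base_num, "intra" | "inter").
    """
    acids = [a for a in atoms if (a[1], a[3]) in ACIDIC]
    bases = [a for a in atoms if (a[1], a[3]) in BASIC]
    bridges = set()  # a set collapses multiple atom pairs per residue pair
    for acid in acids:
        for base in bases:
            if math.dist(acid[4], base[4]) <= cutoff:
                kind = "intra" if acid[0] == base[0] else "inter"
                bridges.add(
                    (acid[0], acid[1], acid[2], base[0], base[1], base[2], kind)
                )
    return sorted(bridges)


# Hypothetical four-atom example spanning two chains.
example_atoms = [
    ("A", "ASP", 10, "OD1", (0.0, 0.0, 0.0)),
    ("A", "LYS", 20, "NZ", (3.0, 0.0, 0.0)),
    ("B", "ARG", 5, "NH1", (0.0, 3.5, 0.0)),
    ("B", "GLU", 7, "OE1", (50.0, 50.0, 50.0)),
]
for bridge in find_salt_bridges(example_atoms):
    print(bridge)
```

Network salt-bridges could be derived from this output by grouping residues that participate in more than one pair; that step is left out of the sketch.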
Barik, Sailen
2008-01-01
The significance of the intron-exon structure of genes is a mystery. As eukaryotic proteins are made up of modular functional domains, each exon was suspected to encode some form of module; however, the definition of a module remained vague. Comparison of pre-mRNA splice junctions with the three-dimensional architecture of its protein product from different eukaryotes revealed that the junctions were far less likely to occur inside the α-helices and β-strands of proteins than within the more flexible linker regions (‘turns’ and ‘loops’) connecting them. The splice junctions were equally distributed in the different types of linkers and throughout the linker sequence, although a slight preference for the central region of the linker was observed. The avoidance of the α-helix and the β-strand by splice junctions suggests the existence of a selection pressure against their disruption, perhaps underscoring the investment made by nature in building these intricate secondary structures. A corollary is that the helix and the strand are the smallest integral architectural units of a protein and represent the minimal modules in the evolution of protein structure. These results should find use in comparative genomics, designing of cloning strategies, and in the mutual verification of genome sequences with protein structures. PMID:15381847
Barik, Sailen
2004-09-01
The significance of the intron-exon structure of genes is a mystery. As eukaryotic proteins are made up of modular functional domains, each exon was suspected to encode some form of module; however, the definition of a module remained vague. Comparison of pre-mRNA splice junctions with the three-dimensional architecture of its protein product from different eukaryotes revealed that the junctions were far less likely to occur inside the alpha-helices and beta-strands of proteins than within the more flexible linker regions ('turns' and 'loops') connecting them. The splice junctions were equally distributed in the different types of linkers and throughout the linker sequence, although a slight preference for the central region of the linker was observed. The avoidance of the alpha-helix and the beta-strand by splice junctions suggests the existence of a selection pressure against their disruption, perhaps underscoring the investment made by nature in building these intricate secondary structures. A corollary is that the helix and the strand are the smallest integral architectural units of a protein and represent the minimal modules in the evolution of protein structure. These results should find use in comparative genomics, designing of cloning strategies, and in the mutual verification of genome sequences with protein structures.
Effects of immunosuppressive treatment on protein expression in rat kidney
Kędzierska, Karolina; Sporniak-Tutak, Katarzyna; Sindrewicz, Krzysztof; Bober, Joanna; Domański, Leszek; Parafiniuk, Mirosław; Urasińska, Elżbieta; Ciechanowicz, Andrzej; Domański, Maciej; Smektała, Tomasz; Masiuk, Marek; Skrzypczak, Wiesław; Ożgo, Małgorzata; Kabat-Koperska, Joanna; Ciechanowski, Kazimierz
2014-01-01
The structural proteins of renal tubular epithelial cells may become a target for the toxic metabolites of immunosuppressants. These metabolites can modify the properties of the proteins, thereby affecting cell function, which is a possible explanation for the mechanism of immunosuppressive agents’ toxicity. In our study, we evaluated the effect of two immunosuppressive strategies on protein expression in the kidneys of Wistar rats. Fragments of the rat kidneys were homogenized after cooling in liquid nitrogen and then dissolved in lysis buffer. The protein concentration in the samples was determined using a protein assay kit, and the proteins were separated by two-dimensional electrophoresis. The obtained gels were then stained with Coomassie Brilliant Blue, and their images were analyzed to evaluate differences in protein expression. Identification of selected proteins was then performed using mass spectrometry. We found that the immunosuppressive drugs used in popular regimens induce a series of changes in protein expression in target organs. The expression of proteins involved in drug, glucose, amino acid, and lipid metabolism was pronounced. However, to a lesser extent, we also observed changes in nuclear, structural, and transport proteins’ synthesis. Very slight differences were observed between the group receiving cyclosporine, mycophenolate mofetil, and glucocorticoids (CMG) and the control group. In contrast, compared to the control group, animals receiving tacrolimus, mycophenolate mofetil, and glucocorticoids (TMG) exhibited higher expression of proteins responsible for renal drug metabolism and lower expression levels of cytoplasmic actin and the major urinary protein. In the TMG group, we observed higher expression of proteins responsible for drug metabolism and a decrease in the expression of respiratory chain enzymes (thioredoxin-2) and markers of distal renal tubular damage (heart fatty acid-binding protein) compared to expression in the CMG group. 
The consequences of the reported changes in protein expression require further study. PMID:25328384
Ultrafast protein structure-based virtual screening with Panther
NASA Astrophysics Data System (ADS)
Niinivehmas, Sanna P.; Salokas, Kari; Lätti, Sakari; Raunio, Hannu; Pentikäinen, Olli T.
2015-10-01
Molecular docking is by far the most common method used in protein structure-based virtual screening. This paper presents Panther, a novel ultrafast multipurpose docking tool. In Panther, a simple shape-electrostatic model of the ligand-binding area of the protein is created by utilizing the protein crystal structure. The features of the possible ligands are then compared to the model by using a similarity search algorithm. On average, one ligand can be processed in a few minutes by using classical docking methods, whereas using Panther processing takes <1 s. The presented Panther protocol can be used in several applications, such as speeding up the early phases of drug discovery projects, reducing the number of failures in the clinical phase of the drug development process, and estimating the environmental toxicity of chemicals. Panther-code is available in our web pages (http://www.jyu.fi/panther) free of charge after registration.
Solution x-ray scattering and structure formation in protein dynamics
NASA Astrophysics Data System (ADS)
Nasedkin, Alexandr; Davidsson, Jan; Niemi, Antti J.; Peng, Xubiao
2017-12-01
We propose a computationally effective approach that builds on Landau mean-field theory in combination with modern nonequilibrium statistical mechanics to model and interpret protein dynamics and structure formation in small- to wide-angle x-ray scattering (S/WAXS) experiments. We develop the methodology by analyzing experimental data in the case of Engrailed homeodomain protein as an example. We demonstrate how to interpret S/WAXS data qualitatively with a good precision and over an extended temperature range. We explain experimental observations in terms of protein phase structure, and we make predictions for future experiments and for how to analyze data at different ambient temperature values. We conclude that the approach we propose has the potential to become a highly accurate, computationally effective, and predictive tool for analyzing S/WAXS data. For this, we compare our results with those obtained previously in an all-atom molecular dynamics simulation.
KFC Server: interactive forecasting of protein interaction hot spots
Darnell, Steven J.; LeGault, Laura; Mitchell, Julie C.
2008-01-01
The KFC Server is a web-based implementation of the KFC (Knowledge-based FADE and Contacts) model, a machine learning approach for the prediction of binding hot spots, or the subset of residues that account for most of a protein interface's binding free energy. The server facilitates the automated analysis of a user submitted protein–protein or protein–DNA interface and the visualization of its hot spot predictions. For each residue in the interface, the KFC Server characterizes its local structural environment, compares that environment to the environments of experimentally determined hot spots and predicts if the interface residue is a hot spot. After the computational analysis, the user can visualize the results using an interactive job viewer able to quickly highlight predicted hot spots and surrounding structural features within the protein structure. The KFC Server is accessible at http://kfc.mitchell-lab.org. PMID:18539611
FRAGSION: ultra-fast protein fragment library generation by IOHMM sampling.
Bhattacharya, Debswapna; Adhikari, Badri; Li, Jilong; Cheng, Jianlin
2016-07-01
The speed, accuracy and robustness of building a protein fragment library have important implications in de novo protein structure prediction, since fragment-based methods are among the most successful approaches in template-free modeling (FM). The majority of existing fragment detection methods rely on database-driven search strategies to identify candidate fragments, which are inherently time-consuming and often hinder the possibility of locating longer fragments due to the limited sizes of databases. Also, it is difficult to alleviate the effect of noisy sequence-based predicted features, such as secondary structures, on the quality of fragments. Here, we present FRAGSION, a database-free method to efficiently generate a protein fragment library by sampling from an Input-Output Hidden Markov Model. FRAGSION offers some unique features compared to existing approaches in that it (i) is lightning-fast, consuming only a few seconds of CPU time to generate a fragment library for a protein of typical length (300 residues); (ii) can generate dynamic-size fragments of any length (even for the whole protein sequence) and (iii) offers ways to handle noise in predicted secondary structure during fragment sampling. On an FM dataset from the most recent Critical Assessment of Structure Prediction, we demonstrate that FRAGSION provides advantages over the state-of-the-art fragment picking protocol of the ROSETTA suite by speeding up computation by several orders of magnitude while achieving comparable performance in fragment quality. Source code and executable versions of FRAGSION for Linux and MacOS are freely available to non-commercial users at http://sysbio.rnet.missouri.edu/FRAGSION/. It is bundled with a manual and example data. Contact: chengji@missouri.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Nagy, Julia; Eilert, Tobias; Michaelis, Jens
2018-03-01
Modern hybrid structural analysis methods have opened new possibilities to analyze and resolve flexible protein complexes where conventional crystallographic methods have reached their limits. Here, the Fast-Nano-Positioning System (Fast-NPS), a Bayesian parameter estimation-based analysis method and software, is an interesting method since it allows for the localization of unknown fluorescent dye molecules attached to macromolecular complexes based on single-molecule Förster resonance energy transfer (smFRET) measurements. However, the precision, accuracy, and reliability of structural models derived from results based on such complex calculation schemes are oftentimes difficult to evaluate. Therefore, we present two proof-of-principle benchmark studies where we use smFRET data to localize supposedly unknown positions on a DNA as well as on a protein-nucleic acid complex. Since we use complexes where structural information is available, we can compare Fast-NPS localization to the existing structural data. In particular, we compare different dye models and discuss how both accuracy and precision can be optimized.
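The localization described above rests on the distance dependence of FRET efficiency. As a reminder of that underlying relation only, the sketch below implements the standard Förster equation, E = 1 / (1 + (r / R0)^6), and its inverse. It is not the Fast-NPS method, which the abstract describes as Bayesian parameter estimation with explicit dye models; the function names and the 50 Å Förster radius in the example are hypothetical.

```python
def fret_efficiency(distance, forster_radius):
    """Transfer efficiency for a donor-acceptor separation.

    Both arguments must be in the same length unit (for example angstroms).
    """
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


def distance_from_efficiency(efficiency, forster_radius):
    """Invert the Förster relation to recover the separation."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# At a separation equal to the Förster radius the efficiency is one half.
print(fret_efficiency(50.0, 50.0))
print(distance_from_efficiency(0.5, 50.0))
```

Because of the sixth-power dependence, measured efficiencies are most informative for separations near the Förster radius, which is one reason the choice of dye model affects the accuracy and precision discussed above.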
Prediction and Dissection of Protein-RNA Interactions by Molecular Descriptors.
Liu, Zhi-Ping; Chen, Luonan
2016-01-01
Protein-RNA interactions play crucial roles in numerous biological processes. However, detecting the interactions and binding sites between protein and RNA by traditional experiments is still time-consuming and labor-intensive. Thus, it is important to develop bioinformatics methods for predicting protein-RNA interactions and binding sites. Accurate prediction of protein-RNA interactions and recognition will greatly help to decipher the interaction mechanisms between protein and RNA, as well as to improve RNA-related protein engineering and drug design. In this work, we summarize the current bioinformatics strategies for predicting protein-RNA interactions and dissecting protein-RNA interaction mechanisms from local structure binding motifs. In particular, we focus on the feature-based machine learning methods, in which the molecular descriptors of protein and RNA are extracted and integrated as feature vectors representing the interaction events and recognition residues. In addition, the available methods are classified and compared comprehensively. The molecular descriptors are expected to elucidate the binding mechanisms of protein-RNA interaction and reveal the functional implications from a structural complementarity perspective.
Structure of the buffalo secretory signalling glycoprotein at 2.8 Å resolution
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ethayathulla, Abdul S.; Srivastava, Devendra B.; Kumar, Janesh
2007-04-01
The crystal structure of a signalling glycoprotein isolated from buffalo dry secretions (SPB-40) has been determined at 2.8 Å resolution. Two unique residues, Tyr120 and Glu269, found in SPB-40 distort the shape of the sugar-binding groove considerably. The water structure in the groove is also different. The conformations of three flexible loops, His188–His197, Phe202–Arg212 and Tyr244–Pro260, also differ from those found in other structurally similar proteins. The crystal structure of a 40 kDa signalling glycoprotein from buffalo (SPB-40) has been determined at 2.8 Å resolution. SPB-40 acts as a protective signalling factor by binding to viable cells during the early phase of involution, during which extensive tissue remodelling occurs. It was isolated from the dry secretions of Murrah buffalo. It was purified and crystallized using the hanging-drop vapour-diffusion method with 19% ethanol as the precipitant. The protein was also cloned and its complete nucleotide and amino-acid sequences were determined. When compared with the sequences of other members of the family, the sequence of SPB-40 revealed two very important mutations in the sugar-binding region, in which Tyr120 changed to Trp120 and Glu269 changed to Trp269. The structure showed a significant distortion in the shape of the sugar-binding groove. The water structure in the groove is also drastically altered. The folding of the protein chain in the flexible region comprising segments His188–His197, Phe202–Arg212 and Tyr244–Pro260 shows large variations when compared with other proteins of the family.
Porphyrin mediated photo-modification of the structure and function of human serum albumin
NASA Astrophysics Data System (ADS)
Rozinek, Sarah C.
Photosensitization reactions involve irradiating (with visible light) molecules with a high efficiency for either electron transfer or entering an excited triplet state (photosensitizer). Such reactions are applied to photodynamic cancer therapy, many medical laser-treatments, and a potential array of disinfection and pest elimination techniques. To understand the biophysical mechanisms of how these applications are effective at the protein level, the group of Dr. Brancaleon (UTSA) has investigated the irradiation of several dye-protein combinations, and discovered effects on protein structure and function. To further that work, we have investigated irradiation of the protein, human serum albumin (HSA), photosensitized by either protoporphyrin IX (PPIX) or meso-tetrakis(4-sulfonatophenyl)porphyrin (TSPP). HSA is the most abundant plasma protein, making it a likely substrate in PDT, and it possesses a specific binding pocket for iron-PPIX (heme) and possibly other porphyrin derivatives. The results of our research are summarized as follows. First, a thorough characterization of the binding of each photosensitizer to albumin was completed, elucidating a probable binding location for TSPP. Next, fluorescence lifetime emission of the single tryptophan residue, alongside circular dichroism, found tertiary structural changes around tryptophan and an overall 20% decrease in protein secondary structure after irradiation with TSPP bound. Finally, to determine if protein function was lost after photosensitization, size exclusion chromatography found modified albumin still recognizable by its receptor-protein, and comparative ex vivo up-take studies revealed that modified albumin is not processed the same way as native albumin in live tapeworm larva (Mesocestoides corti). Thus we found that visible light can induce partial unfolding of a protein by using a photo-activated ligand. These small structural modifications were sufficient to affect the protein's biological function.
Solution structure of leptospiral LigA4 Big domain
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mei, Song; Zhang, Jiahai; Zhang, Xuecheng
Pathogenic Leptospira species express immunoglobulin-like proteins which serve as adhesins to bind to the extracellular matrices of host cells. Leptospiral immunoglobulin-like protein A (LigA), a surface-exposed protein containing tandem repeats of bacterial immunoglobulin-like (Big) domains, has been shown to be involved in the interaction of pathogenic Leptospira with mammalian hosts. In this study, the solution structure of the fourth Big domain of LigA (LigA4 Big domain) from Leptospira interrogans was solved by nuclear magnetic resonance (NMR). The structure of the LigA4 Big domain displays a bacterial immunoglobulin-like fold similar to that of other Big domains, implying some common structural aspects of the Big domain family. On the other hand, it displays some structural characteristics significantly different from the classic Ig-like domain. Furthermore, Stains-all assay and NMR chemical shift perturbation revealed the Ca²⁺ binding property of the LigA4 Big domain. Highlights: • Determining the solution structure of a bacterial immunoglobulin-like domain from a surface protein of Leptospira. • The solution structure shows some structural characteristics significantly different from the classic Ig-like domains. • A potential Ca²⁺-binding site was identified by Stains-all assay and NMR chemical shift perturbation.
Stability of local secondary structure determines selectivity of viral RNA chaperones.
Bravo, Jack P K; Borodavka, Alexander; Barth, Anders; Calabrese, Antonio N; Mojzes, Peter; Cockburn, Joseph J B; Lamb, Don C; Tuma, Roman
2018-05-18
To maintain genome integrity, segmented double-stranded RNA viruses of the Reoviridae family must accurately select and package a complete set of up to a dozen distinct genomic RNAs. It is thought that the high fidelity segmented genome assembly involves multiple sequence-specific RNA-RNA interactions between single-stranded RNA segment precursors. These are mediated by virus-encoded non-structural proteins with RNA chaperone-like activities, such as rotavirus (RV) NSP2 and avian reovirus σNS. Here, we compared the abilities of NSP2 and σNS to mediate sequence-specific interactions between RV genomic segment precursors. Despite their similar activities, NSP2 successfully promotes inter-segment association, while σNS fails to do so. To understand the mechanisms underlying such selectivity in promoting inter-molecular duplex formation, we compared RNA-binding and helix-unwinding activities of both proteins. We demonstrate that octameric NSP2 binds structured RNAs with high affinity, resulting in efficient intramolecular RNA helix disruption. Hexameric σNS oligomerizes into an octamer that binds two RNAs, yet it exhibits only limited RNA-unwinding activity compared to NSP2. Thus, the formation of intersegment RNA-RNA interactions is governed by both helix-unwinding capacity of the chaperones and stability of RNA structure. We propose that this protein-mediated RNA selection mechanism may underpin the high fidelity assembly of multi-segmented RNA genomes in Reoviridae.
Evolution driven structural changes in CENP-E motor domain.
Kumar, Ambuj; Kamaraj, Balu; Sethumadhavan, Rao; Purohit, Rituraj
2013-06-01
Genetic evolution corresponds to various biochemical changes that are vital for the development of new functional traits. Phylogenetic analysis has provided important insight into the genetic closeness among species and their evolutionary relationships. Centromere-associated protein-E (CENP-E) is vital for maintaining the cell cycle, and checkpoint signal mechanisms are vital for the recruitment of other essential kinetochore proteins. In this study we have focused on the evolution-driven structural changes in the CENP-E motor domain within the primate lineage. Through molecular dynamics simulation and computational chemistry approaches we examined the changes in ATP binding affinity and conformational deviations in the human CENP-E motor domain as compared to the other primates. Root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg) and principal component analysis (PCA) results together suggested a gain in stability as we move from tarsier towards human. This study provides significant insight into how the cell cycle proteins and their corresponding biochemical activities are evolving and illustrates the potency of a theoretical approach for assessing, in a single study, the structural, functional, and dynamical aspects of protein evolution.
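The three trajectory measures named in this abstract (RMSD, RMSF, Rg) have standard definitions that can be illustrated with a minimal NumPy sketch. This is not the authors' analysis pipeline: the function names are hypothetical, and the coordinates are assumed to be already superimposed on a common reference frame, a step that real MD analysis tools perform before computing RMSD or RMSF.

```python
import numpy as np

def rmsd(a, b):
    """RMSD between two (n_atoms, 3) coordinate sets.
    Assumption: the two structures are already superimposed."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def rmsf(traj):
    """Per-atom RMSF over a (n_frames, n_atoms, 3) trajectory.
    Assumption: all frames are pre-aligned to a common reference."""
    traj = np.asarray(traj, dtype=float)
    mean_pos = traj.mean(axis=0)                     # average structure
    sq_dev = ((traj - mean_pos) ** 2).sum(axis=2)    # squared deviation per frame, per atom
    return np.sqrt(sq_dev.mean(axis=0))              # time average, then square root

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of one (n_atoms, 3) frame.
    With masses=None every atom is given equal weight."""
    coords = np.asarray(coords, dtype=float)
    if masses is None:
        masses = np.ones(len(coords))
    center = np.average(coords, axis=0, weights=masses)
    sq_dist = ((coords - center) ** 2).sum(axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))
```

In this picture, a lower and flatter RMSD trace, smaller per-residue RMSF values, and a steadier Rg over the simulation are the kind of signals usually read as greater structural stability.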
A Comparative Study of Human Saposins.
Garrido-Arandia, María; Cuevas-Zuviría, Bruno; Díaz-Perales, Araceli; Pacios, Luis F
2018-02-14
Saposins are small proteins implicated in trafficking and loading of lipids onto Cluster of Differentiation 1 (CD1) receptor proteins that in turn present lipid antigens to T cells and a variety of T-cell receptors, thus playing a crucial role in innate and adaptive immune responses in humans. Despite their low sequence identity, the four types of human saposins share a similar folding pattern consisting of four helices linked by three conserved disulfide bridges. However, their lipid-binding abilities as well as their activities in extracting, transporting and loading onto CD1 molecules a variety of sphingo- and phospholipids in biological membranes display two striking characteristics: a strong pH-dependence and a structural change between a compact, closed conformation and an open conformation. In this work, we present a comparative computational study of structural, electrostatic, and dynamic features of human saposins based upon their available experimental structures. By means of structural alignments, surface analyses, calculation of pH-dependent protonation states, Poisson-Boltzmann electrostatic potentials, and molecular dynamics simulations at three pH values representative of biological media where saposins fulfill their function, our results shed light on their intrinsic features. The similarities and differences in this class of proteins depend on tiny variations of local structural details that allow saposins to be key players in triggering responses in the human immune system.
2014-01-01
Background: A limiting factor in performing proteomics analysis on cancerous cells is the difficulty in obtaining sufficient amounts of starting material. Cell lines can be used as a simplified model system for studying changes that accompany tumorigenesis. This study used two-dimensional gel electrophoresis (2DE) to compare the whole cell proteome of oral cancer cell lines vs normal cells in an attempt to identify cancer associated proteins. Results: Three primary cell cultures of normal cells with a limited lifespan without hTERT immortalization have been successfully established. 2DE was used to compare the whole cell proteome of these cells with that of three oral cancer cell lines. Twenty-four protein spots were found to have changed in abundance. MALDI TOF/TOF was then used to determine the identity of these proteins. Identified proteins were classified into seven functional categories – structural proteins, enzymes, regulatory proteins, chaperones and others. IPA core analysis predicted that 18 proteins were related to cancer with involvements in hyperplasia, metastasis, invasion, growth and tumorigenesis. The mRNA expressions of two proteins – 14-3-3 protein sigma and Stress-induced-phosphoprotein 1 – were found to correlate with the corresponding proteins’ abundance. Conclusions: The outcome of this analysis demonstrated that a comparative study of whole cell proteome of cancer versus normal cell lines can be used to identify cancer associated proteins. PMID:24422745
Mohammad Zadeh, Elham; O'Keefe, Sean F; Kim, Young-Teck; Cho, Jin-Hun
2018-04-01
The effects of transglutaminase on soy protein isolate (SPI) film-forming solution and films were investigated through rheological behavior and physicochemical properties under different manufacturing conditions (enzyme treatments, enzyme incubation times, and protein denaturation temperatures). The enzymatic crosslinking reaction and changes in molecular weight distribution were confirmed by viscosity measurement and SDS-PAGE, respectively, compared to 2 controls: the nonenzyme treated and the deactivated enzyme treated. Films treated with either the enzyme or the deactivated enzyme showed a significant increase in tensile strength (TS), percent elongation (%E), and initial contact angle compared to the nonenzyme control film, due to the bulk stabilizers in the commercial enzyme. Water absorption, protein solubility, Fourier transform infrared (FTIR) spectroscopy and X-ray diffraction (XRD) revealed that enzyme treatment altered the SPI film matrix at the molecular-structure level, resulting in changes in physicochemical properties. Based on our observations, enzymatic treatment under appropriate conditions is a practical and feasible way to control the physical properties of protein-based biopolymeric films for many different scientific and industrial areas. Enzymes can make bridges selectively among different amino acids in the structure of the protein matrix. Therefore, the protein network is changed after enzyme treatment. The behavior of biopolymeric materials depends on the network structure, which determines their suitability for different applications such as bioplastics used in food and pharmaceutical products. In the current research, transglutaminase was applied to a soy protein matrix in different forms, activated and deactivated, and under different preparation conditions to investigate its effects on different properties of the new bioplastic film. © 2018 Institute of Food Technologists®.
Teo, T C; DeMichele, S J; Selleck, K M; Babayan, V K; Blackburn, G L; Bistrian, B R
1989-01-01
The effects of enteral feeding with safflower oil or a structured lipid (SL) derived from 60% medium-chain triglyceride (MCT) and 40% fish oil (MCT/fish oil) on protein and energy metabolism were compared in gastrostomy-fed burned rats (30% body surface area) by measuring oxygen consumption, carbon dioxide production, nitrogen balance, total liver protein, whole-body leucine kinetics, and rectus muscle and liver protein fractional synthetic rates (FSR, %/day). Male Sprague-Dawley rats (195 ± 5 g) received 50 ml/day of an enteral regimen containing 50 kcal, 2 g amino acids, and 40% nonprotein calories as lipid for three days. Protein kinetics were estimated by using a continuous L-[1-14C] leucine infusion technique on day 2. Thermally injured rats enterally fed MCT/fish oil yielded significantly higher daily and cumulative nitrogen balances (p ≤ 0.025) and rectus muscle (39%) FSR (p ≤ 0.05) when compared with safflower oil. MCT/fish oil showed a 22% decrease (p ≤ 0.005) in percent flux oxidized and a 7% (p ≤ 0.05) decrease in total energy expenditure (TEE) versus safflower oil. A 15% increase in liver FSR was accompanied by a significant elevation (p ≤ 0.025) in total liver protein with MCT/fish oil. This novel SL shares the properties of other structured lipids in that it reduces the net protein catabolic effects of burn injury, in part, by influencing tissue protein synthetic rates. The reduction in TEE is unique to MCT/fish oil and may relate to the ability of fish oil to diminish the injury response. PMID:2500898