NASA Astrophysics Data System (ADS)
Huang, Zhao
2011-12-01
Compared to 'conventional' materials made from metal, glass, or ceramics, protein-based materials have unique mechanical properties. Furthermore, the morphology, mechanical properties, and functionality of protein-based materials may be optimized via sequence engineering for use in a variety of applications, including textile materials, biosensors, and tissue engineering scaffolds. The development of recombinant DNA technology has enabled the production and engineering of protein-based materials ex vivo. However, harsh production conditions can compromise the mechanical properties of protein-based materials and diminish their ability to incorporate functional proteins. Developing a new generation of protein-based materials is crucial to (i) improve materials assembly conditions, (ii) create novel mechanical properties, and (iii) expand the capacity to carry functional protein/peptide sequences. This thesis describes development of novel protein-based materials using Ultrabithorax, a member of the Hox family of proteins that regulate developmental pathways in Drosophila melanogaster. The experiments presented (i) establish the conditions required for the assembly of Ubx-based materials, (ii) generate a wide range of Ubx morphologies, (iii) examine the mechanical properties of Ubx fibers, (iv) incorporate protein functions to Ubx-based materials via gene fusion, (v) pattern protein functions within the Ubx materials, and (vi) examine the biocompatibility of Ubx materials in vitro. Ubx-based materials assemble at mild conditions compatible with protein folding and activity, which enables Ubx chimeric materials to retain the function of appended proteins in spatial patterns determined by materials assembly. Ubx-based materials also display mechanical properties comparable to existing protein-based materials and demonstrate good biocompatibility with living cells in vitro. 
Taken together, this research demonstrates the unique features and future potential of novel Ubx-based materials.
Network-based function prediction and interactomics: the case for metabolic enzymes.
Janga, S C; Díaz-Mejía, J Javier; Moreno-Hagelsieb, G
2011-01-01
As sequencing technologies increase in power, determining the functions of unknown proteins encoded by the DNA sequences so produced becomes a major challenge. Functional annotation is commonly done on the basis of amino-acid sequence similarity alone. Long after sequence similarity becomes undetectable by pair-wise comparison, profile-based identification of homologs can often succeed due to the conservation of position-specific patterns, important for a protein's three-dimensional folding and function. Nevertheless, prediction of protein function from homology-driven approaches is not without problems. Homologous proteins might evolve different functions, and the power of homology detection has already started to reach its maximum. Computational methods for inferring protein function, which exploit the context of a protein in cellular networks, have come to be built on top of homology-based approaches. These network-based functional inference techniques both provide a first-hand hint into a protein's functional role and offer complementary insights to traditional methods for understanding the function of uncharacterized proteins. Most recent network-based approaches aim to integrate diverse kinds of functional interactions to boost both coverage and confidence level. These techniques not only promise to solve the moonlighting aspect of proteins by annotating proteins with multiple functions, but also increase our understanding of the interplay between different functional classes in a cell. In this article we review the state of the art in network-based function prediction and describe some of the underlying difficulties and successes. Given the volume of high-throughput data that is being reported, the time is ripe to employ these network-based approaches, which can be used to unravel the functions of the uncharacterized proteins accumulating in the genomic databases. © 2010 Elsevier Inc. All rights reserved.
Protein function prediction using neighbor relativity in protein-protein interaction network.
Moosavi, Sobhan; Rahgozar, Masoud; Rahimi, Amir
2013-04-01
There is a large gap between the number of discovered proteins and the number of functionally annotated ones. Due to the high cost of determining protein function by wet-lab research, function prediction has become a major task for computational biology and bioinformatics. Some studies utilize protein interaction information to predict functions for un-annotated proteins. In this paper, we propose a novel approach called "Neighbor Relativity Coefficient" (NRC), based on interaction network topology, which estimates the functional similarity between two proteins. NRC is calculated for each pair of proteins based on their graph-based features, including distance, common neighbors and the number of paths between them. In order to ascribe function to an un-annotated protein, NRC estimates a weight for each neighbor to transfer its annotation to the unknown protein. Finally, the unknown protein is annotated with the top-scoring transferred functions. We also investigate the effect of using different coefficients for various types of functions. The proposed method has been evaluated on Saccharomyces cerevisiae and Homo sapiens interaction networks. The performance analysis demonstrates that NRC yields better results in comparison with previous protein function prediction approaches that utilize interaction networks. Copyright © 2012 Elsevier Ltd. All rights reserved.
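The annotation-transfer step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the NRC formula, so the weight used here (one plus the number of shared neighbors) is a hypothetical stand-in, and the function names and toy network are invented for the example.

```python
from collections import defaultdict

def common_neighbor_weight(graph, u, v):
    # Stand-in similarity: 1 + number of shared neighbors.
    # This is NOT the published NRC formula, which the abstract does not state.
    return 1 + len(graph[u] & graph[v])

def predict_functions(graph, annotations, target, top_k=1):
    # Each annotated neighbor transfers its functions to the target,
    # weighted by its similarity to the target.
    scores = defaultdict(float)
    for neighbor in graph[target]:
        weight = common_neighbor_weight(graph, target, neighbor)
        for function in annotations.get(neighbor, ()):
            scores[function] += weight
    # Rank by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [function for function, _ in ranked[:top_k]]

# Toy undirected interaction network (hypothetical proteins and labels).
graph = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2"},
    "P4": {"P1"},
}
annotations = {"P2": {"kinase"}, "P3": {"kinase"}, "P4": {"transport"}}

print(predict_functions(graph, annotations, "P1", top_k=2))
```

In the toy network, P2 and P3 each share a neighbor with P1, so their shared label outranks the label carried by P4 alone.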
Exploring Mouse Protein Function via Multiple Approaches.
Huang, Guohua; Chu, Chen; Huang, Tao; Kong, Xiangyin; Zhang, Yunhua; Zhang, Ning; Cai, Yu-Dong
2016-01-01
Although the number of available protein sequences is growing exponentially, functional protein annotations lag far behind. Therefore, accurate identification of protein functions remains one of the major challenges in molecular biology. In this study, we presented a novel approach to predict mouse protein functions. The approach was a sequential combination of a similarity-based approach, an interaction-based approach and a pseudo amino acid composition-based approach. The method achieved an accuracy of about 0.8450 for the 1st-order predictions in the leave-one-out and ten-fold cross-validations. For the results yielded by the leave-one-out cross-validation, although the similarity-based approach alone achieved an accuracy of 0.8756, it was unable to predict the functions of proteins with no homologues. Comparatively, the pseudo amino acid composition-based approach alone reached an accuracy of 0.6786. Although the accuracy was lower than that of the previous approach, it could predict the functions of almost all proteins, even proteins with no homologues. Therefore, the combined method balanced the advantages and disadvantages of both approaches to achieve efficient performance. Furthermore, the results yielded by the ten-fold cross-validation indicate that the combined method is still effective and stable when no close homologs are available. However, the accuracy of the predicted functions can only be determined according to known protein functions based on current knowledge. Many protein functions remain unknown. By exploring the functions of proteins for which the 1st-order predicted functions are wrong but the 2nd-order predicted functions are correct, the 1st-order wrongly predicted functions were shown to be closely associated with the genes encoding the proteins. The so-called wrongly predicted functions could also potentially be correct upon future experimental verification.
Therefore, the accuracy of the presented method may be much higher in reality.
PMID:27846315
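The sequential combination described in the mouse protein abstract (similarity first, then interaction, then pseudo amino acid composition) can be read as a fallback chain. The sketch below is a minimal illustration of that reading, not the authors' implementation; the predictor functions, protein identifiers, and GO terms are toy placeholders.

```python
def sequential_predict(protein, predictors):
    # Try each predictor in order; return the first non-empty prediction
    # together with the name of the approach that produced it.
    for name, predict in predictors:
        functions = predict(protein)
        if functions:
            return name, functions
    return None, []

# Hypothetical lookup tables standing in for real predictors.
similarity_hits = {"Q1": ["GO:0003677"]}   # protein with a close homologue
interaction_hits = {"Q2": ["GO:0005515"]}  # protein with annotated partners

predictors = [
    ("similarity", lambda p: similarity_hits.get(p, [])),
    ("interaction", lambda p: interaction_hits.get(p, [])),
    # Composition-based models can score any sequence, so this stage
    # always answers (here with a fixed toy label).
    ("composition", lambda p: ["GO:0003824"]),
]

for protein in ("Q1", "Q2", "Q3"):
    print(protein, sequential_predict(protein, predictors))
```

Ordering the stages from most to least accurate matches the trade-off the abstract reports: the similarity stage is the most accurate but cannot handle proteins without homologues, while the composition stage covers almost everything at lower accuracy.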
Modular protein domains: an engineering approach toward functional biomaterials.
Lin, Charng-Yu; Liu, Julie C
2016-08-01
Protein domains and peptide sequences are a powerful tool for conferring specific functions to engineered biomaterials. Protein sequences with a wide variety of functionalities, including structure, bioactivity, protein-protein interactions, and stimuli responsiveness, have been identified, and advances in molecular biology continue to pinpoint new sequences. Protein domains can be combined to make recombinant proteins with multiple functionalities. The high fidelity of the protein translation machinery results in exquisite control over the sequence of recombinant proteins and the resulting properties of protein-based materials. In this review, we discuss protein domains and peptide sequences in the context of functional protein-based materials, composite materials, and their biological applications. Copyright © 2016 Elsevier Ltd. All rights reserved.
Wan, Cen; Lees, Jonathan G; Minneci, Federico; Orengo, Christine A; Jones, David T
2017-10-01
Accurate gene or protein function prediction is a key challenge in the post-genome era. Most current methods perform well on molecular function prediction, but struggle to provide useful annotations relating to biological process functions due to the limited power of sequence-based features in that functional domain. In this work, we systematically evaluate the predictive power of temporal transcription expression profiles for protein function prediction in Drosophila melanogaster. Our results show significantly better performance on predicting protein function when transcription expression profile-based features are integrated with sequence-derived features, compared with the sequence-derived features alone. We also observe that the combination of expression-based and sequence-based features leads to further improvement of accuracy on predicting all three domains of gene function. Based on the optimal feature combinations, we then propose a novel multi-classifier-based function prediction method for Drosophila melanogaster proteins, FFPred-fly+. Interpreting our machine learning models also allows us to identify some of the underlying links between biological processes and developmental stages of Drosophila melanogaster.
Le, Duc-Hau
2015-01-01
Protein complexes formed by non-covalent interaction among proteins play important roles in cellular functions. Computational and purification methods have been used to identify many protein complexes and their cellular functions. However, their roles in causing disease have not yet been well characterized. There exist only a few studies for the identification of disease-associated protein complexes. However, they mostly utilize complicated heterogeneous networks which are constructed based on an out-of-date database of phenotype similarity network collected from literature. In addition, they apply only to diseases for which tissue-specific data exist. In this study, we propose a method to identify novel disease-protein complex associations. First, we introduce a framework to construct functional similarity protein complex networks where two protein complexes are functionally connected by either shared protein elements, shared annotating GO terms or based on protein interactions between elements in each protein complex. Second, we propose a simple but effective neighborhood-based algorithm, which yields a local similarity measure, to rank disease candidate protein complexes. Comparing the predictive performance of our proposed algorithm with that of two state-of-the-art network propagation algorithms, including one we used in our previous study, we found that it performed statistically significantly better than these two algorithms for all the constructed functional similarity protein complex networks. In addition, it ran about 32 times faster than these two algorithms. Moreover, our proposed method always achieved high performance in terms of AUC values irrespective of the ways to construct the functional similarity protein complex networks and the algorithms used. The performance of our method was also higher than that reported for some existing methods which were based on complicated heterogeneous networks.
Finally, we also tested our method with prostate cancer and selected the top 100 highly ranked candidate protein complexes. Interestingly, 69 of them were supported by evidence, since at least one of their protein elements is known to be associated with prostate cancer. Our proposed method, including the framework to construct functional similarity protein complex networks and the neighborhood-based algorithm on these networks, could be used for identification of novel disease-protein complex associations.
Identification of widespread adenosine nucleotide binding in Mycobacterium tuberculosis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ansong, Charles; Ortega, Corrie; Payne, Samuel H.
The annotation of protein function is almost completely performed by in silico approaches. However, computational prediction of protein function is frequently incomplete and error prone. In Mycobacterium tuberculosis (Mtb), ~25% of all genes have no predicted function and are annotated as hypothetical proteins. This lack of functional information severely limits our understanding of Mtb pathogenicity. Current tools for experimental functional annotation are limited and often do not scale to entire protein families. Here, we report a generally applicable chemical biology platform to functionally annotate bacterial proteins by combining activity-based protein profiling (ABPP) and quantitative LC-MS-based proteomics. As an example of this approach for high-throughput protein functional validation and discovery, we experimentally annotate the families of ATP-binding proteins in Mtb. Our data experimentally validate prior in silico predictions of >250 ATPases and adenosine nucleotide-binding proteins, and reveal 73 hypothetical proteins as novel ATP-binding proteins. We identify adenosine cofactor interactions with many hypothetical proteins containing a diversity of unrelated sequences, providing a new and expanded view of adenosine nucleotide binding in Mtb. Furthermore, many of these hypothetical proteins are both unique to Mycobacteria and essential for infection, suggesting specialized functions in mycobacterial physiology and pathogenicity. Thus, we provide a generally applicable approach for high throughput protein function discovery and validation, and highlight several ways in which application of activity-based proteomics data can improve the quality of functional annotations to facilitate novel biological insights.
Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang
2018-03-10
Homology-based transferal remains the major approach to computational protein function annotation, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's homology-based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598 for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotation methods. Detailed data analysis shows that the major advantage of MetaGO lies in the new functional homolog detections from partner's homology-based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence homology-based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/. Copyright © 2018. Published by Elsevier Ltd.
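For readers unfamiliar with the F-measure quoted in the MetaGO abstract, the sketch below computes it for a single protein as the harmonic mean of precision and recall over sets of GO terms. It is a generic illustration with placeholder term labels; the abstract does not say exactly how the reported averages were computed (for example, whether a maximum over confidence thresholds was taken), so this should not be read as the benchmark's protocol.

```python
def f_measure(predicted, actual):
    # Precision: fraction of predicted terms that are correct.
    # Recall: fraction of true terms that were predicted.
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Placeholder term labels, not real annotations.
predicted_terms = {"term_a", "term_b", "term_c"}
true_terms = {"term_b", "term_c", "term_d", "term_e"}

print(round(f_measure(predicted_terms, true_terms), 4))
```

Here two of three predictions are correct (precision 2/3) and two of four true terms are recovered (recall 1/2), giving an F-measure of 4/7.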
Revealing protein functions based on relationships of interacting proteins and GO terms.
Teng, Zhixia; Guo, Maozu; Liu, Xiaoyan; Tian, Zhen; Che, Kai
2017-09-20
In recent years, numerous computational methods have predicted protein function based on the protein-protein interaction (PPI) network. These methods assume that two proteins share the same function if they interact with each other. However, recent studies report that the functions of two interacting proteins may be merely related, which can mislead the prediction of protein function. Therefore, there is a need to investigate the functional relationship between interacting proteins. In this paper, the functional relationship between interacting proteins is studied and a novel method, called GoDIN, is proposed to annotate functions of interacting proteins in the Gene Ontology (GO) context. It is assumed that the functional difference between interacting proteins can be expressed by the semantic difference between a GO term and its relatives. Thus, the method uses a GO term and its relatives to annotate the interacting proteins separately according to their functional roles in the PPI network. The method is validated by a series of experiments and compared with a related method. The experimental results confirm the assumption and suggest that GoDIN is effective at predicting protein functions. This study demonstrates that: (1) interacting proteins are not equal in the PPI network, and their functions may be the same, similar, or merely related; (2) the functional difference between interacting proteins can be measured by their degrees in the PPI network; (3) the functional relationship between interacting proteins can be expressed by the relationship between a GO term and its relatives.
Genome-wide protein-protein interactions and protein function exploration in cyanobacteria
Lv, Qi; Ma, Weimin; Liu, Hui; Li, Jiang; Wang, Huan; Lu, Fang; Zhao, Chen; Shi, Tieliu
2015-01-01
Genome-wide network analysis is well established for studying proteins of unknown function. Here, we explored protein functions and biological mechanisms based on an inferred high-confidence protein-protein interaction (PPI) network in cyanobacteria. We integrated data from seven different sources and predicted 1,997 PPIs, which were evaluated by experiments in molecular mechanism, text mining of the literature for direct/indirect evidence, and “interologs” in conservation. Combining the predicted PPIs with known PPIs, we obtained 4,715 non-redundant PPIs (involving 3,231 proteins covering over 90% of the genome) to generate the PPI network. Based on the PPI network, Gene Ontology (GO) terms were assigned to proteins of unknown function. Functional modules were identified by dissecting the PPI network into sub-networks and analyzing pathway enrichment, with which we investigated novel functions of the underlying proteins in protein complexes and pathways. Examples of photosynthesis and DNA repair indicate that the network approach is a powerful tool in protein function analysis. Overall, this systems biology approach provides new insight into posterior functional analysis of PPIs in cyanobacteria. PMID:26490033
Protein-protein interaction network-based detection of functionally similar proteins within species.
Song, Baoxing; Wang, Fen; Guo, Yang; Sang, Qing; Liu, Min; Li, Dengyun; Fang, Wei; Zhang, Deli
2012-07-01
Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identification of these proteins is of significant importance for understanding biological functions, evolution of protein families, progression of co-evolution, convergent evolution, and other phenomena that cannot be studied by detecting functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After representing protein-protein interaction networks as graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a modified shortest path method to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, some functionally similar proteins with low sequence similarity that cannot be detected by sequence alignment were identified. By analyzing the results, we found that it is sometimes difficult to separate homologous from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that the precision of our method was excellent. Copyright © 2012 Wiley Periodicals, Inc.
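The abstract does not spell out its "1-hop method", so the following is one plausible reading, offered only as a sketch: for each protein, take the subgraph induced by the protein and its direct interaction partners. The function name and the toy network are assumptions made for the example.

```python
def one_hop_subgraph(graph, center):
    # Nodes: the center protein plus every direct interaction partner.
    nodes = {center} | graph[center]
    # Edges: all interactions whose two endpoints both lie in the node set,
    # stored as sorted tuples so each undirected edge appears once.
    edges = {
        tuple(sorted((u, v)))
        for u in nodes
        for v in graph[u]
        if v in nodes
    }
    return nodes, edges

# Toy undirected PPI network as an adjacency mapping (hypothetical proteins).
ppi = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

nodes, edges = one_hop_subgraph(ppi, "A")
print(sorted(nodes), sorted(edges))
```

Subgraphs built this way could then be compared pairwise; the comparison step (the modified shortest path method mentioned in the abstract) is not reproduced here.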
Structure and function of seed storage proteins in faba bean (Vicia faba L.).
Liu, Yujiao; Wu, Xuexia; Hou, Wanwei; Li, Ping; Sha, Weichao; Tian, Yingying
2017-05-01
The protein subunit is the most important basic unit of protein, and its study can unravel the structure and function of seed storage proteins in faba bean. In this study, we identified six specific protein subunits in faba bean (cv. Qinghai 13) by combining liquid chromatography (LC), liquid chromatography-electrospray ionization tandem mass spectrometry (LC-ESI-MS/MS) and bioinformatics. The results suggested a diversity of seed storage proteins in faba bean, and a total of 16 proteins (four GroEL molecular chaperones and 12 plant-specific proteins) were identified from 97-, 96-, 64-, 47-, 42-, and 38-kD-specific protein subunits in faba bean based on the peptide sequence. We also analyzed the composition and abundance of the amino acids, the physicochemical characteristics, secondary structure, three-dimensional structure, transmembrane domain, and possible subcellular localization of these identified proteins in faba bean seed, and finally predicted their function and structure. The three-dimensional structures were generated by homology modeling, and the protein function was analyzed based on the annotation from the non-redundant protein database (NR database, NCBI) and function analysis of the optimal models. The objective of this study was to identify the seed storage proteins in faba bean and confirm the structure and function of these proteins. Our results can be useful for the study of protein nutrition and for achieving breeding goals for optimal protein quality in faba bean.
Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S
2015-09-01
The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs depending on the edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding on the appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods.
© 2015 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
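The idea of reading clusters off a similarity network at a single edge threshold, as discussed in the abstract above, can be illustrated with a short sketch. It assumes pairwise similarity scores are already available (from whichever edge metric is chosen) and treats clusters as connected components; the scores and protein names are made up, and the original study's tools (BLAST, TM-Align, DASP) are not invoked.

```python
def threshold_clusters(scores, threshold):
    # Build an undirected graph keeping only edges with score >= threshold.
    adjacency = {}
    for (a, b), score in scores.items():
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    # Collect connected components with an iterative depth-first search.
    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

# Hypothetical pairwise similarity scores in [0, 1].
scores = {("A", "B"): 0.9, ("B", "C"): 0.6, ("C", "D"): 0.2}

print(threshold_clusters(scores, 0.5))
print(threshold_clusters(scores, 0.8))
```

Raising the threshold splits clusters apart, which is why the choice of edge metric and threshold determines whether the resulting groups line up with a functional hierarchy.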
USDA-ARS's Scientific Manuscript database
In addition to microarray technology, which provides a robust method to study protein function in a rapid, economical, and proteome-wide fashion, plasmid-based functional proteomics is an important technology for rapidly obtaining large quantities of protein and determining protein function across a...
Profiling protein function with small molecule microarrays
Winssinger, Nicolas; Ficarro, Scott; Schultz, Peter G.; Harris, Jennifer L.
2002-01-01
The regulation of protein function through posttranslational modification, local environment, and protein–protein interaction is critical to cellular function. The ability to analyze on a genome-wide scale protein functional activity rather than changes in protein abundance or structure would provide important new insights into complex biological processes. Herein, we report the application of a spatially addressable small molecule microarray to an activity-based profile of proteases in crude cell lysates. The potential of this small molecule-based profiling technology is demonstrated by the detection of caspase activation upon induction of apoptosis, characterization of the activated caspase, and inhibition of the caspase-executed apoptotic phenotype using the small molecule inhibitor identified in the microarray-based profile. PMID:12167675
A topological approach for protein classification
Cang, Zixuan; Mu, Lin; Wu, Kedi; ...
2015-11-04
Here, protein function and dynamics are closely related to its sequence and structure. However, prediction of protein function and dynamics from its sequence and structure is still a fundamental challenge in molecular biology. Protein classification, which is typically done through measuring the similarity between proteins based on protein sequence or physical information, serves as a crucial step toward the understanding of protein function and dynamics.
Nanochemistry of Protein-Based Delivery Agents
Rajendran, Subin R. C. K.; Udenigwe, Chibuike C.; Yada, Rickey Y.
2016-01-01
The past decade has seen an increased interest in the conversion of food proteins into functional biomaterials, including their use for loading and delivery of physiologically active compounds such as nutraceuticals and pharmaceuticals. Proteins possess a competitive advantage over other platforms for the development of nanodelivery systems since they are biocompatible, amphipathic, and widely available. Proteins also have unique molecular structures and diverse functional groups that can be selectively modified to alter encapsulation and release properties. A number of physical and chemical methods have been used for preparing protein nanoformulations, each based on different underlying protein chemistry. This review focuses on the chemistry of the reorganization and/or modification of proteins into functional nanostructures for delivery, from the perspective of their preparation, functionality, stability and physiological behavior. PMID:27489854
Nanochemistry of protein-based delivery agents
NASA Astrophysics Data System (ADS)
Rajendran, Subin; Udenigwe, Chibuike; Yada, Rickey
2016-07-01
The past decade has seen an increased interest in the conversion of food proteins into functional biomaterials, including their use for loading and delivery of physiologically active compounds such as nutraceuticals and pharmaceuticals. Proteins possess a competitive advantage over other platforms for the development of nanodelivery systems since they are biocompatible, amphipathic, and widely available. Proteins also have unique molecular structures and diverse functional groups that can be selectively modified to alter encapsulation and release properties. A number of physical and chemical methods have been used for preparing protein nanoformulations, each based on different underlying protein chemistry. This review focuses on the chemistry of the reorganization and/or modification of proteins into functional nanostructures for delivery, from the perspective of their preparation, functionality, stability and physiological behavior.
Functional Evolution of PLP-dependent Enzymes based on Active-Site Structural Similarities
Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert
2014-01-01
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5’-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the Comparison of Protein Active Site Structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. PMID:24920327
Functional evolution of PLP-dependent enzymes based on active-site structural similarities.
Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert
2014-10-01
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5'-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the comparison of protein active site structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. © 2014 Wiley Periodicals, Inc.
Structure refinement of membrane proteins via molecular dynamics simulations.
Dutagaci, Bercem; Heo, Lim; Feig, Michael
2018-07-01
A refinement protocol based on physics-based techniques established for water-soluble proteins is tested for membrane protein structures. Initial structures were generated by homology modeling and sampled via molecular dynamics simulations in explicit lipid bilayer and aqueous solvent systems. Snapshots from the simulations were selected based on scoring with either knowledge-based or implicit membrane-based scoring functions and averaged to obtain refined models. The protocol resulted in consistent and significant refinement of the membrane protein structures, similar to the performance of refinement methods for soluble proteins. Refinement success was similar between sampling in the presence of lipid bilayers and aqueous solvent, but the presence of lipid bilayers may benefit the improvement of lipid-facing residues. Scoring with knowledge-based functions (DFIRE and RWplus) was found to be as good as scoring using implicit membrane-based scoring functions, suggesting that differences in internal packing are more important than orientations relative to the membrane during the refinement of membrane protein homology models. © 2018 Wiley Periodicals, Inc.
AptRank: an adaptive PageRank model for protein function prediction on bi-relational graphs.
Jiang, Biaobin; Kloster, Kyle; Gleich, David F; Gribskov, Michael
2017-06-15
Diffusion-based network models are widely used for protein function prediction using protein network data and have been shown to outperform neighborhood-based and module-based methods. Recent studies have shown that integrating the hierarchical structure of the Gene Ontology (GO) data dramatically improves prediction accuracy. However, previous methods usually either used the GO hierarchy to refine the prediction results of multiple classifiers, or flattened the hierarchy into a function-function similarity kernel. No study has taken the GO hierarchy into account together with the protein network as a two-layer network model. We first construct a Bi-relational graph (Birg) model comprised of both protein-protein association and function-function hierarchical networks. We then propose two diffusion-based methods, BirgRank and AptRank, both of which use PageRank to diffuse information on this two-layer graph model. BirgRank is a direct application of traditional PageRank with fixed decay parameters. In contrast, AptRank utilizes an adaptive diffusion mechanism to improve the performance of BirgRank. We evaluate the ability of both methods to predict protein function on yeast, fly and human protein datasets, and compare with four previous methods: GeneMANIA, TMC, ProteinRank and clusDCA. We design four different validation strategies: missing function prediction, de novo function prediction, guided function prediction and newly discovered function prediction to comprehensively evaluate predictability of all six methods. We find that both BirgRank and AptRank outperform the previous methods, especially in missing function prediction when using only 10% of the data for training. The MATLAB code is available at https://github.rcac.purdue.edu/mgribsko/aptrank . gribskov@purdue.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. 
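The diffusion idea in the AptRank abstract can be illustrated with a minimal sketch of fixed-decay PageRank with restart on a toy two-layer graph. This is not the authors' implementation (their MATLAB code is at the URL given in the abstract), and it does not reproduce AptRank's adaptive mechanism. The node names, the edge list, the `alpha` value, and the function name `personalized_pagerank` are all assumptions made for illustration.

```python
def personalized_pagerank(edges, seeds, alpha=0.85, iterations=100):
    """Diffuse score from seed nodes over an undirected graph.

    edges: iterable of (u, v) pairs; seeds: nodes where the walk restarts.
    alpha is the fixed decay (probability of following an edge).
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    seeds = set(seeds)
    if not seeds or seeds - set(neighbors):
        raise ValueError("seeds must be a non-empty subset of the graph nodes")

    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in neighbors}
    score = dict(restart)
    for _ in range(iterations):
        # Restart mass first, then each node spreads its score evenly to neighbors.
        updated = {n: (1.0 - alpha) * restart[n] for n in neighbors}
        for node, nbrs in neighbors.items():
            share = alpha * score[node] / len(nbrs)
            for m in nbrs:
                updated[m] += share
        score = updated
    return score


# Hypothetical toy graph: P* are proteins, F* are function terms.
toy_edges = [
    ("P1", "P2"), ("P2", "P3"),  # protein-protein associations
    ("P1", "F1"),                # known annotation of P1
    ("F1", "F0"),                # F0 is the parent term of F1 in the hierarchy
]
scores = personalized_pagerank(toy_edges, seeds=["P3"])
ranked_functions = sorted(["F0", "F1"], key=lambda f: scores[f], reverse=True)
```

Here the query protein P3 has no annotation of its own; the diffusion ranks the function terms by how much score reaches them through the protein layer and the hierarchy layer.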
Houston, Simon; Lithgow, Karen Vivien; Osbak, Kara Krista; Kenyon, Chris Richard; Cameron, Caroline E
2018-05-16
Syphilis continues to be a major global health threat with 11 million new infections each year, and a global burden of 36 million cases. The causative agent of syphilis, Treponema pallidum subspecies pallidum, is a highly virulent bacterium; however, the molecular mechanisms underlying T. pallidum pathogenesis remain to be definitively identified. This is because T. pallidum is currently uncultivatable, inherently fragile and thus difficult to work with, and phylogenetically distinct with no conventional virulence factor homologs found in other pathogens. In fact, approximately 30% of its predicted protein-coding genes have no known orthologs or assigned functions. Here we employed a structural bioinformatics approach using Phyre2-based tertiary structure modeling to improve our understanding of T. pallidum protein function on a proteome-wide scale. Phyre2-based tertiary structure modeling generated high-confidence predictions for 80% of the T. pallidum proteome (780/978 predicted proteins). Tertiary structure modeling also inferred the same function as primary structure-based annotations from genome sequencing pipelines for 525/605 proteins (87%), which represents 54% (525/978) of all T. pallidum proteins. Of the 175 T. pallidum proteins modeled with high confidence that were not assigned functions in the previously annotated published proteome, 167 (95%) were able to be assigned predicted functions. Twenty-one of the 175 hypothetical proteins modeled with high confidence were also predicted to exhibit significant structural similarity with proteins experimentally confirmed to be required for virulence in other pathogens. Phyre2-based structural modeling is a powerful bioinformatics tool that has provided insight into the potential structure and function of the majority of T. pallidum proteins and helped validate the primary structure-based annotation of more than 50% of all T. pallidum proteins with high confidence. This work represents the first T. pallidum proteome-wide structural modeling study and is one of few studies to apply this approach for the functional annotation of a whole proteome.
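As a quick consistency check, the percentages quoted in the abstract above can be recomputed from the stated counts. The sketch below uses only numbers that appear in the abstract; the reading that the 605 previously annotated and 175 hypothetical proteins together make up the 780 high-confidence models is an inference from the arithmetic (605 + 175 = 780), not something the abstract states outright.

```python
# Counts quoted in the abstract above.
TOTAL_PROTEINS = 978        # predicted T. pallidum proteins
HIGH_CONFIDENCE = 780       # modeled with high confidence
PREVIOUSLY_ANNOTATED = 605  # compared against sequence-based annotations
SAME_FUNCTION = 525         # structure-based and sequence-based functions agree
HYPOTHETICAL = 175          # high-confidence models with no prior function
NEWLY_ASSIGNED = 167        # hypothetical proteins given a predicted function


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


recomputed = {
    "high-confidence coverage": percent(HIGH_CONFIDENCE, TOTAL_PROTEINS),
    "agreement among annotated": percent(SAME_FUNCTION, PREVIOUSLY_ANNOTATED),
    "agreement over whole proteome": percent(SAME_FUNCTION, TOTAL_PROTEINS),
    "hypotheticals newly assigned": percent(NEWLY_ASSIGNED, HYPOTHETICAL),
}

for label, value in recomputed.items():
    print(f"{label}: {value}%")
```

The recomputed values match the 80%, 87%, 54%, and 95% reported in the abstract.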
Li, Yang; Yang, Jianyi
2017-04-24
The prediction of protein-ligand binding affinity has recently been improved remarkably by machine-learning-based scoring functions. For example, using a set of simple descriptors representing the atomic distance counts, the RF-Score improves the Pearson correlation coefficient to about 0.8 on the core set of the PDBbind 2007 database, which is significantly higher than the performance of any conventional scoring function on the same benchmark. A few studies have been made to discuss the performance of machine-learning-based methods, but the reason for this improvement remains unclear. In this study, by systemically controlling the structural and sequence similarity between the training and test proteins of the PDBbind benchmark, we demonstrate that protein structural and sequence similarity makes a significant impact on machine-learning-based methods. After removal of training proteins that are highly similar to the test proteins identified by structure alignment and sequence alignment, machine-learning-based methods trained on the new training sets do not outperform the conventional scoring functions any more. On the contrary, the performance of conventional functions like X-Score is relatively stable no matter what training data are used to fit the weights of its energy terms.
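The benchmark metric named in the abstract above, the Pearson correlation coefficient between predicted and experimental binding affinities, can be computed as in the following sketch. The affinity values are made-up placeholders, not data from PDBbind, and `pearson_r` is a name chosen here for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical affinities (e.g. pKd) for a handful of test complexes.
experimental = [4.2, 5.1, 6.3, 7.0, 8.4]
predicted = [4.8, 5.0, 5.9, 7.4, 7.9]
r = pearson_r(predicted, experimental)
```

When comparing scoring functions in this way, the same test complexes should be used for every method, and, as the abstract argues, the overlap between training and test proteins should be controlled before the coefficients are compared.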
Concomitant prediction of function and fold at the domain level with GO-based profiles.
Lopez, Daniel; Pazos, Florencio
2013-01-01
Predicting the function of newly sequenced proteins is crucial due to the pace at which these raw sequences are being obtained. Almost all resources for predicting protein function assign functional terms to whole chains, and do not distinguish which particular domain is responsible for the allocated function. This is not a limitation of the methodologies themselves; rather, it arises because the functional annotation databases that these methods use for transferring functional terms to new proteins are annotated on a whole-chain basis. Nevertheless, domains are the basic evolutionary and often functional units of proteins. In many cases, the domains of a protein chain have distinct molecular functions, independent from each other. For that reason, resources with functional annotations at the domain level, as well as methodologies for predicting function for individual domains adapted to these resources, are required. We present a methodology for predicting the molecular function of individual domains, based on a previously developed database of functional annotations at the domain level. The approach, which we show outperforms a standard method based on sequence searches in assigning function, concomitantly predicts the structural fold of the domains and can give hints on the functionally important residues associated with the predicted function.
Boyanova, Desislava; Nilla, Santosh; Klau, Gunnar W.; Dandekar, Thomas; Müller, Tobias; Dittrich, Marcus
2014-01-01
The continuously evolving field of proteomics produces increasing amounts of data while improving the quality of protein identifications. Although quantitative measurements are becoming more popular, many proteomic studies are still based on non-quantitative methods for protein identification. These studies result in potentially large sets of identified proteins, where the biological interpretation of proteins can be challenging. Systems biology develops innovative network-based methods, which allow an integrated analysis of these data. Here we present a novel approach, which combines prior knowledge of protein-protein interactions (PPI) with proteomics data using functional similarity measurements of interacting proteins. This integrated network analysis exactly identifies network modules with a maximal consistent functional similarity reflecting biological processes of the investigated cells. We validated our approach on small (H9N2 virus-infected gastric cells) and large (blood constituents) proteomic data sets. Using this novel algorithm, we identified characteristic functional modules in virus-infected cells, comprising key signaling proteins (e.g. the stress-related kinase RAF1) and demonstrate that this method allows a module-based functional characterization of cell types. Analysis of a large proteome data set of blood constituents resulted in clear separation of blood cells according to their developmental origin. A detailed investigation of the T-cell proteome further illustrates how the algorithm partitions large networks into functional subnetworks each representing specific cellular functions. These results demonstrate that the integrated network approach not only allows a detailed analysis of proteome networks but also yields a functional decomposition of complex proteomic data sets and thereby provides deeper insights into the underlying cellular processes of the investigated system. PMID:24807868
McLaughlin, William A; Chen, Ken; Hou, Tingjun; Wang, Wei
2007-01-01
Background Protein domains coordinate to perform multifaceted cellular functions, and domain combinations serve as the functional building blocks of the cell. The available methods to identify functional domain combinations are limited in their scope, e.g. to the identification of combinations falling within individual proteins or within specific regions in a translated genome. Further effort is needed to identify groups of domains that span across two or more proteins and are linked by a cooperative function. Such functional domain combinations can be useful for protein annotation. Results Using a new computational method, we have identified 114 groups of domains, referred to as domain assembly units (DASSEM units), in the proteome of budding yeast Saccharomyces cerevisiae. The units participate in many important cellular processes such as transcription regulation, translation initiation, and mRNA splicing. Within the units the domains were found to function in a cooperative manner; and each domain contributed to a different aspect of the unit's overall function. The member domains of DASSEM units were found to be significantly enriched among proteins contained in transcription modules, defined as genes sharing similar expression profiles and presumably similar functions. The observation further confirmed the functional coherence of DASSEM units. The functional linkages of units were found in both functionally characterized and uncharacterized proteins, which enabled the assessment of protein function based on domain composition. Conclusion A new computational method was developed to identify groups of domains that are linked by a common function in the proteome of Saccharomyces cerevisiae. These groups can either lie within individual proteins or span across different proteins. We propose that the functional linkages among the domains within the DASSEM units can be used as a non-homology based tool to annotate uncharacterized proteins. PMID:17937820
Visualizing and Clustering Protein Similarity Networks: Sequences, Structures, and Functions.
Mai, Te-Lun; Hu, Geng-Ming; Chen, Chi-Ming
2016-07-01
Research in the recent decade has demonstrated the usefulness of protein network knowledge in furthering the study of molecular evolution of proteins, understanding the robustness of cells to perturbation, and annotating new protein functions. In this study, we aimed to provide a general clustering approach to visualize the sequence-structure-function relationship of protein networks, and investigate possible causes for inconsistency in the protein classifications based on sequences, structures, and functions. Such visualization of protein networks could facilitate our understanding of the overall relationship among proteins and help researchers comprehend various protein databases. As a demonstration, we clustered 1437 enzymes by their sequences and structures using the minimum span clustering (MSC) method. The general structure of this protein network was delineated at two clustering resolutions, and the second level MSC clustering was found to be highly similar to existing enzyme classifications. The clustering of these enzymes based on sequence, structure, and function information is consistent with each other. For proteases, the Jaccard's similarity coefficient is 0.86 between sequence and function classifications, 0.82 between sequence and structure classifications, and 0.78 between structure and function classifications. From our clustering results, we discussed possible examples of divergent evolution and convergent evolution of enzymes. Our clustering approach provides a panoramic view of the sequence-structure-function network of proteins, helps visualize the relation between related proteins intuitively, and is useful in predicting the structure and function of newly determined protein sequences.
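The abstract above reports Jaccard similarity coefficients between classifications but does not say how the coefficient was defined for clusterings. One common choice, sketched here as an assumption, is the pair-counting form: the number of item pairs grouped together in both classifications divided by the number grouped together in at least one. The enzyme names and cluster labels below are hypothetical.

```python
from itertools import combinations


def co_clustered_pairs(labels):
    """Set of unordered item pairs that share a cluster label."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(labels), 2)
        if labels[a] == labels[b]
    }


def jaccard_between_partitions(first, second):
    """Pair-counting Jaccard coefficient between two clusterings of the same items."""
    if set(first) != set(second):
        raise ValueError("both clusterings must cover the same items")
    pairs_first = co_clustered_pairs(first)
    pairs_second = co_clustered_pairs(second)
    union = pairs_first | pairs_second
    if not union:
        return 1.0  # every item is a singleton in both clusterings
    return len(pairs_first & pairs_second) / len(union)


# Hypothetical classifications of four enzymes.
by_sequence = {"E1": "A", "E2": "A", "E3": "A", "E4": "B"}
by_function = {"E1": "x", "E2": "x", "E3": "y", "E4": "y"}
similarity = jaccard_between_partitions(by_sequence, by_function)
```

In this toy case only the pair (E1, E2) is grouped together under both classifications, out of four pairs grouped together under at least one, so the coefficient is 0.25.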
A Survey of Computational Intelligence Techniques in Protein Function Prediction
Tiwari, Arvind Kumar; Srivastava, Rajeev
2014-01-01
In the past, there was massive growth in knowledge of unknown proteins with the advancement of high-throughput microarray technologies. Protein function prediction is the most challenging problem in bioinformatics. Homology-based approaches were formerly used to predict protein function, but they failed when a new protein differed from previously characterized ones. Therefore, to alleviate the problems associated with traditional homology-based approaches, numerous computational intelligence techniques have been proposed in the recent past. This paper presents a state-of-the-art comprehensive review of various computational intelligence techniques for protein function prediction using sequence, structure, protein-protein interaction network, and gene expression data, as applied in areas such as prediction of DNA and RNA binding sites, subcellular localization, enzyme functions, signal peptides, catalytic residues, nuclear/G-protein coupled receptors, membrane proteins, and pathway analysis from gene expression datasets. This paper also summarizes the results obtained by many researchers who addressed these problems using computational intelligence techniques with appropriate datasets to improve prediction performance. The summary shows that ensemble classifiers and integration of multiple heterogeneous data are useful for protein function prediction. PMID:25574395
Protein Structure and Function Prediction Using I-TASSER
Yang, Jianyi; Zhang, Yang
2016-01-01
I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets. PMID:26678386
Electronegativity and intrinsic disorder of preeclampsia-related proteins.
Polanco, Carlos; Castañón-González, Jorge Alberto; Uversky, Vladimir N; Buhse, Thomas; Samaniego Mendoza, José Lino; Calva, Juan J
2017-01-01
Preeclampsia, hemorrhage, and infection are the leading causes of maternal death in underdeveloped countries. Since several proteins associated with preeclampsia are known, we conducted a computational study which evaluated the commonness and potential functionality of intrinsic disorder of these proteins and also made an attempt to characterize their origin. The origin of the preeclampsia-related proteins was assessed with a supervised technique, a Polarity Index Method (PIM), which evaluates the electronegativity of proteins based solely on their sequence. The commonness of intrinsic disorder was evaluated using several disorder predictors from the PONDR family, the charge-hydropathy plot (CH-plot) and cumulative distribution function (CDF) analyses, and using the MobiDB web-based tool, whereas potential functionality of intrinsic disorder was studied with the D2P2 resource and ANCHOR predictor of disorder-based binding sites, and the STRING tool was used to build the interactivity networks of the preeclampsia-related proteins. Peculiarities of the PIM-derived polar profile of the group of preeclampsia-related proteins were then compared with profiles of a group of lipoproteins, antimicrobial peptides, angiogenesis-related proteins, and the intrinsically disordered proteins. Our results showed a high graphical correlation between preeclampsia proteins, lipoproteins, and the angiogenesis proteins. We also showed that many preeclampsia-related proteins contain numerous functional disordered regions. Therefore, these bioinformatics results led us to assume that the preeclampsia proteins are highly associated with the lipoproteins group, and that some preeclampsia-related proteins contain significant amounts of functional disorders.
Cui, Jian; Liu, Jinghua; Li, Yuhua; Shi, Tieliu
2011-01-01
Mitochondria are major players in the production of energy and host several key reactions involved in basic metabolism and biosynthesis of essential molecules. Currently, the majority of nucleus-encoded mitochondrial proteins are unknown, even for the model plant Arabidopsis. We report a computational framework for predicting Arabidopsis mitochondrial proteins based on a probabilistic model, called Naive Bayesian Network, which integrates disparate genomic data generated from eight bioinformatics tools, multiple orthologous mappings, protein domain properties and co-expression patterns using 1,027 microarray profiles. Through this approach, we predicted 2,311 candidate mitochondrial proteins with 84.67% accuracy and a 2.53% false positive rate (FPR). Together with the experimentally confirmed proteins, 2,585 mitochondrial proteins (named CoreMitoP) were identified. We explored those proteins with unknown functions based on a protein-protein interaction network (PIN) and annotated novel functions for 26.65% of the CoreMitoP proteins. Moreover, we found newly predicted mitochondrial proteins embedded in particular subnetworks of the PIN, mainly functioning in response to diverse environmental stresses such as salt, drought, cold, and wounding. Candidate mitochondrial proteins involved in those physiological activities provide useful targets for further investigation. Assigned functions also provide comprehensive information for the Arabidopsis mitochondrial proteome. PMID:21297957
Text Mining Improves Prediction of Protein Functional Sites
Cohn, Judith D.; Ravikumar, Komandur E.
2012-01-01
We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388
Binding ligand prediction for proteins using partial matching of local surface patches.
Sael, Lee; Kihara, Daisuke
2010-01-01
Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group.
Binding Ligand Prediction for Proteins Using Partial Matching of Local Surface Patches
Sael, Lee; Kihara, Daisuke
2010-01-01
Functional elucidation of uncharacterized protein structures is an important task in bioinformatics. We report our new approach for structure-based function prediction which captures local surface features of ligand binding pockets. Function of proteins, specifically, binding ligands of proteins, can be predicted by finding similar local surface regions of known proteins. To enable partial comparison of binding sites in proteins, a weighted bipartite matching algorithm is used to match pairs of surface patches. The surface patches are encoded with the 3D Zernike descriptors. Unlike the existing methods which compare global characteristics of the protein fold or the global pocket shape, the local surface patch method can find functional similarity between non-homologous proteins and binding pockets for flexible ligand molecules. The proposed method improves prediction results over global pocket shape-based method which was previously developed by our group. PMID:21614188
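The matching step described in the abstract above, pairing surface patches of a query pocket with patches of a known pocket, can be illustrated with a brute-force minimum-cost assignment. This is only a sketch of weighted bipartite matching in general: the abstract does not give the authors' algorithm, weights, or scoring details, and the cost values below stand in for hypothetical distances between 3D Zernike descriptors. Exhaustive search is practical only for a handful of patches.

```python
from itertools import permutations


def best_patch_matching(cost):
    """Minimum-cost one-to-one matching of query patches to target patches.

    cost[i][j] is the dissimilarity between query patch i and target patch j.
    The query may have fewer patches than the target (a partial match of the
    target pocket), but not more.
    """
    if not cost or not cost[0]:
        raise ValueError("cost matrix must be non-empty")
    n_query, n_target = len(cost), len(cost[0])
    if n_query > n_target:
        raise ValueError("query must not have more patches than the target")

    best_total, best_targets = None, None
    # Try every ordered choice of distinct target patches for the query patches.
    for targets in permutations(range(n_target), n_query):
        total = sum(cost[i][j] for i, j in enumerate(targets))
        if best_total is None or total < best_total:
            best_total, best_targets = total, targets
    return best_total, list(enumerate(best_targets))


# Hypothetical descriptor distances: 2 query patches vs. 3 target patches.
patch_cost = [
    [2, 9, 5],
    [8, 1, 7],
]
total_cost, pairs = best_patch_matching(patch_cost)
```

For larger pockets, a polynomial-time assignment algorithm such as the Hungarian method would replace the exhaustive loop; the interface can stay the same.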
Functional equivalency inferred from "authoritative sources" in networks of homologous proteins.
Natarajan, Shreedhar; Jakobsson, Eric
2009-06-12
A one-on-one mapping of protein functionality across different species is a critical component of comparative analysis. This paper presents a heuristic algorithm for discovering the Most Likely Functional Counterparts (MoLFunCs) of a protein, based on simple concepts from network theory. A key feature of our algorithm is utilization of the user's knowledge to assign high confidence to selected functional identification. We show use of the algorithm to retrieve functional equivalents for 7 membrane proteins, from an exploration of almost 40 genomes from multiple online resources. We verify the functional equivalency of our dataset through a series of tests that include sequence, structure and function comparisons. Comparison is made to the OMA methodology, which also identifies one-on-one mapping between proteins from different species. Based on that comparison, we believe that incorporation of user's knowledge as a key aspect of the technique adds value to purely statistical formal methods.
Functional Equivalency Inferred from “Authoritative Sources” in Networks of Homologous Proteins
Natarajan, Shreedhar; Jakobsson, Eric
2009-01-01
A one-on-one mapping of protein functionality across different species is a critical component of comparative analysis. This paper presents a heuristic algorithm for discovering the Most Likely Functional Counterparts (MoLFunCs) of a protein, based on simple concepts from network theory. A key feature of our algorithm is utilization of the user's knowledge to assign high confidence to selected functional identification. We show use of the algorithm to retrieve functional equivalents for 7 membrane proteins, from an exploration of almost 40 genomes from multiple online resources. We verify the functional equivalency of our dataset through a series of tests that include sequence, structure and function comparisons. Comparison is made to the OMA methodology, which also identifies one-on-one mapping between proteins from different species. Based on that comparison, we believe that incorporation of user's knowledge as a key aspect of the technique adds value to purely statistical formal methods. PMID:19521530
Simulation-Based Validation of the p53 Transcriptional Activity with Hybrid Functional Petri Net.
Doi, Atsushi; Nagasaki, Masao; Matsuno, Hiroshi; Miyano, Satoru
2011-01-01
MDM2 and p19ARF are essential proteins in cancer pathways, forming a complex with protein p53 to control the transcriptional activity of protein p53. It is confirmed that protein p53 loses its transcriptional activity by forming the functional dimer with protein MDM2. However, it is still unclear whether protein p53 keeps its transcriptional activity when it forms the trimer with proteins MDM2 and p19ARF. We have observed mutual behaviors among genes p53, MDM2, p19ARF and their products on a computational model with a hybrid functional Petri net (HFPN) constructed from information described in the literature. The simulation results suggested that protein p53 should have transcriptional activity in the form of the trimer of proteins p53, MDM2, and p19ARF. This paper also discusses the advantages of the HFPN-based modeling method in terms of pathway description for simulations.
LenVarDB: database of length-variant protein domains.
Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan
2014-01-01
Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.
Predicting protein complex geometries with a neural network.
Chae, Myong-Ho; Krull, Florian; Lorenzen, Stephan; Knapp, Ernst-Walter
2010-03-01
A major challenge of the protein docking problem is to define scoring functions that can distinguish near-native protein complex geometries from a large number of non-native geometries (decoys) generated with noncomplexed protein structures (unbound docking). In this study, we have constructed a neural network that employs the information from atom-pair distance distributions of a large number of decoys to predict protein complex geometries. We found that docking prediction can be significantly improved using two different types of polar hydrogen atoms. To train the neural network, 2000 near-native decoys of even distance distribution were used for each of the 185 considered protein complexes. The neural network normalizes the information from different protein complexes using an additional protein complex identity input neuron for each complex. The parameters of the neural network were determined such that they mimic a scoring funnel in the neighborhood of the native complex structure. The neural network approach avoids the reference state problem, which occurs in deriving knowledge-based energy functions for scoring. We show that a distance-dependent atom pair potential performs much better than a simple atom-pair contact potential. We have compared the performance of our scoring function with other empirical and knowledge-based scoring functions such as ZDOCK 3.0, ZRANK, ITScore-PP, EMPIRE, and RosettaDock. In spite of the simplicity of the method and its functional form, our neural network-based scoring function achieves a reasonable performance in rigid-body unbound docking of proteins. Proteins 2010. (c) 2009 Wiley-Liss, Inc.
2014-01-01
Background: Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs, but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, and is additionally supported by the analysis in this paper, that such maps preserve functional co-localization of the protein structure space. Methods: Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. Results: We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership.
Conclusions: This work opens exciting avenues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools. PMID:25080993
Evolution-Based Functional Decomposition of Proteins
Rivoire, Olivier; Reynolds, Kimberly A.; Ranganathan, Rama
2016-01-01
The essential biological properties of proteins—folding, biochemical activities, and the capacity to adapt—arise from the global pattern of interactions between amino acid residues. The statistical coupling analysis (SCA) is an approach to defining this pattern that involves the study of amino acid coevolution in an ensemble of sequences comprising a protein family. This approach indicates a functional architecture within proteins in which the basic units are coupled networks of amino acids termed sectors. This evolution-based decomposition has potential for new understandings of the structural basis for protein function. To facilitate its usage, we present here the principles and practice of the SCA and introduce new methods for sector analysis in a python-based software package (pySCA). We show that the pattern of amino acid interactions within sectors is linked to the divergence of functional lineages in a multiple sequence alignment—a model for how sector properties might be differentially tuned in members of a protein family. This work provides new tools for studying proteins and for generally testing the concept of sectors as the principal units of function and adaptive variation. PMID:27254668
Nadzirin, Nurul; Firdaus-Raih, Mohd
2012-10-08
Proteins of uncharacterized function form a large part of many of the currently available biological databases, and this situation exists even in the Protein Data Bank (PDB). Our analysis of recent PDB data revealed that only 42.53% of PDB entries (1084 coordinate files) that were categorized under "unknown function" are true examples of proteins of unknown function at this point in time. The remaining 1465 entries annotated as such could have their annotations re-assessed, based on the availability of direct functional characterization experiments for the protein itself, or for homologous sequences or structures, thus enabling computational function inference.
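The two counts quoted above are consistent with the reported percentage, as this short check shows. The variable names are illustrative; only the numbers 1084 and 1465 come from the abstract.

```python
unknown = 1084        # entries judged to be true "unknown function" cases
reassessable = 1465   # entries whose annotation could be re-assessed
total = unknown + reassessable

# Share of "unknown function" entries that are genuinely uncharacterized
percent_unknown = round(100 * unknown / total, 2)
print(total, percent_unknown)
```

The total of 2549 entries is implied by the abstract rather than stated in it.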
DOE Office of Scientific and Technical Information (OSTI.GOV)
Giulliani, S. E.; Frank, A. E.; Collart, F. R.
2008-12-08
We have used a fluorescence-based thermal shift (FTS) assay to identify amino acids that bind to solute-binding proteins in the bacterial ABC transporter family. The assay was validated with a set of six proteins with known binding specificity and was consistently able to map proteins with their known binding ligands. The assay also identified additional candidate binding ligands for several of the amino acid-binding proteins in the validation set. We extended this approach to additional targets and demonstrated the ability of the FTS assay to unambiguously identify preferential binding for several homologues of amino acid-binding proteins with known specificity and to functionally annotate proteins of unknown binding specificity. The assay is implemented in a microwell plate format and provides a rapid approach to validate an anticipated function or to screen proteins of unknown function. The ABC-type transporter family is ubiquitous and transports a variety of biological compounds, but the current annotation of the ligand-binding proteins is limited to mostly generic descriptions of function. The results illustrate the feasibility of the FTS assay to improve the functional annotation of binding proteins associated with ABC-type transporters and suggest that this approach can also be extended to other protein families.
Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S
2015-01-01
The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. PMID:26073648
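To make the idea of clustering a similarity network "at a single edge threshold" concrete, here is a minimal sketch that keeps only edges whose score meets a threshold and returns the resulting connected components. The scores and protein labels are invented, and the union-find helper is an assumption for illustration; it is not the DASP, BLAST, or TM-Align pipeline used in the study.

```python
def clusters_at_threshold(scores, threshold):
    """Connected components of the network keeping edges with score >= threshold."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for (a, b), s in scores.items():
        ra, rb = find(a), find(b)           # registers both nodes even if the edge is dropped
        if s >= threshold and ra != rb:
            parent[ra] = rb

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical pairwise similarity scores between four proteins
scores = {("A", "B"): 0.9, ("B", "C"): 0.4, ("C", "D"): 0.8}
strict = clusters_at_threshold(scores, 0.5)
loose = clusters_at_threshold(scores, 0.3)
print(strict, loose)
```

Lowering the threshold merges the two subgroups into one cluster, which is why the choice of edge metric and threshold matters when comparing network topology against a curated functional hierarchy.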
In silico identification of functional regions in proteins.
Nimrod, Guy; Glaser, Fabian; Steinberg, David; Ben-Tal, Nir; Pupko, Tal
2005-06-01
In silico prediction of functional regions on protein surfaces, i.e. sites of interaction with DNA, ligands, substrates and other proteins, is of utmost importance in various applications in the emerging fields of proteomics and structural genomics. When a sufficient number of homologs is found, powerful prediction schemes can be based on the observation that evolutionarily conserved regions are often functionally important. Typically, only the principal functionally important region of the protein is detected, while secondary functional regions with weaker conservation signals are overlooked. Moreover, it is challenging to unambiguously identify the boundaries of the functional regions. We present a new methodology, called PatchFinder, that automatically identifies patches of conserved residues that are located in close proximity to each other on the protein surface. PatchFinder is based on the following steps: (1) Assignment of conservation scores to each amino acid position on the protein surface. (2) Assignment of a score to each putative patch, based on its likelihood to be functionally important. The patch of maximum likelihood is considered to be the main functionally important region, and the search is continued for non-overlapping patches of secondary importance. We examined the accuracy of the method using the IGPS enzyme, the SH2 domain and a benchmark set of 112 proteins. These examples demonstrated that PatchFinder is capable of identifying both the main and secondary functional patches. The PatchFinder program is available at: http://ashtoret.tau.ac.il/~nimrodg/
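The two-step idea (score each surface position, then score candidate patches and continue with non-overlapping ones) can be sketched as below. This is not the PatchFinder algorithm: the likelihood score is replaced by a plain mean conservation score, patches are simple spheres around a seed residue, and the residue labels, coordinates, and radius are all hypothetical.

```python
import math

def best_patches(coords, conservation, radius=6.0, n_patches=2):
    """Greedy search for non-overlapping, spatially compact, conserved patches."""
    used, patches = set(), []
    for _ in range(n_patches):
        best = None
        for seed in coords:
            if seed in used:
                continue
            # Candidate patch: unused residues within `radius` of the seed
            members = [r for r in coords
                       if r not in used
                       and math.dist(coords[seed], coords[r]) <= radius]
            score = sum(conservation[r] for r in members) / len(members)
            if best is None or score > best[0]:
                best = (score, sorted(members))
        if best is None:
            break
        patches.append(best)
        used.update(best[1])          # later patches may not reuse these residues
    return patches

# Toy surface residues (coordinates in angstroms) and conservation scores in [0, 1]
coords = {"R10": (0, 0, 0), "K12": (3, 0, 0), "D40": (20, 0, 0),
          "E41": (22, 0, 0), "G80": (40, 0, 0)}
conservation = {"R10": 0.9, "K12": 0.8, "D40": 0.6, "E41": 0.5, "G80": 0.1}
patches = best_patches(coords, conservation)
print(patches)
```

A mean score tends to favour small patches, so a real implementation would need a size-aware statistic, which is presumably one role of the likelihood score described in the abstract.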
Modelling protein functional domains in signal transduction using Maude
NASA Technical Reports Server (NTRS)
Sriram, M. G.
2003-01-01
Modelling of protein-protein interactions in signal transduction is receiving increased attention in computational biology. This paper describes recent research in the application of Maude, a symbolic language founded on rewriting logic, to the modelling of functional domains within signalling proteins. Protein functional domains (PFDs) are a critical focus of modern signal transduction research. In general, Maude models can simulate biological signalling networks and produce specific testable hypotheses at various levels of abstraction. Developing symbolic models of signalling proteins containing functional domains is important because of the potential to generate analyses of complex signalling networks based on structure-function relationships.
[Supercomputer investigation of the protein-ligand system low-energy minima].
Oferkin, I V; Sulimov, A V; Katkova, E V; Kutov, D K; Grigoriev, F V; Kondakova, O A; Sulimov, V B
2015-01-01
The accuracy of protein-ligand binding energy calculations and ligand positioning is strongly influenced by the choice of the docking target function. This work evaluates five different target functions used in docking: functions based on the MMFF94 force field and functions based on the PM7 quantum-chemical method, with or without an implicit solvent model (PCM, COSMO or SGB). For this purpose, the ligand positions corresponding to the minima of the target function were compared with the experimentally known ligand positions in the protein active site (crystal ligand positions). Each function was examined on the same test set of 16 protein-ligand complexes. The new parallelized docking program FLM, based on a Monte Carlo search algorithm, was developed to perform a comprehensive low-energy minima search and to calculate the protein-ligand binding energy. This study demonstrates that the docking target function based on the MMFF94 force field can be used to detect the crystal or near-crystal positions of the ligand by finding the low-energy local minima spectrum of the target function. The importance of accounting for the solvent in the docking process for accurate ligand positioning is also shown. The accuracy of the ligand positioning, as well as the correlation between the calculated and experimentally determined protein-ligand binding energies, improves when the MMFF94 force field is replaced by the new PM7 method with an implicit solvent model.
Molecular Dynamics Information Improves cis-Peptide-Based Function Annotation of Proteins.
Das, Sreetama; Bhadra, Pratiti; Ramakumar, Suryanarayanarao; Pal, Debnath
2017-08-04
cis-Peptide bonds, whose occurrence in proteins is rare but evolutionarily conserved, are implicated to play an important role in protein function. This has led to their previous use in a homology-independent, fragment-match-based protein function annotation method. However, proteins are not static molecules; dynamics is integral to their activity. This is nicely epitomized by the geometric isomerization of cis-peptide to trans form for molecular activity. Hence we have incorporated both static (cis-peptide) and dynamics information to improve the prediction of protein molecular function. Our results show that cis-peptide information alone cannot detect functional matches in cases where cis-trans isomerization exists but 3D coordinates have been obtained for only the trans isomer or when the cis-peptide bond is incorrectly assigned as trans. On the contrary, use of dynamics information alone includes false-positive matches for cases where fragments with similar secondary structure show similar dynamics, but the proteins do not share a common function. Combining the two methods reduces errors while detecting the true matches, thereby enhancing the utility of our method in function annotation. A combined approach, therefore, opens up new avenues of improving existing automated function annotation methodologies.
von Grotthuss, Marcin; Plewczynski, Dariusz; Ginalski, Krzysztof; Rychlewski, Leszek; Shakhnovich, Eugene I
2006-02-06
The number of protein structures from structural genomics centers in the Protein Data Bank (PDB) is increasing dramatically. Many of these structures are functionally unannotated because they have no sequence similarity to proteins of known function. However, it is possible to successfully infer function using only structural similarity. Here we present the PDB-UF database, a web-accessible collection of predictions of enzymatic properties using structure-function relationships. The assignments were conducted for three-dimensional protein structures of unknown function that come from structural genomics initiatives. We show that 4 hypothetical proteins (with PDB accession codes: 1VH0, 1NS5, 1O6D, and 1TO0), for which standard BLAST tools such as PSI-BLAST or RPS-BLAST failed to assign any function, are probably methyltransferase enzymes. We suggest that the structure-based prediction of an EC number should be conducted using a different similarity score cutoff for different protein folds. Moreover, performing the annotation using two different algorithms can reduce the rate of false positive assignments. We believe that the presented web-based repository will help to decrease the number of protein structures that have functions marked as "unknown" in the PDB file. http://paradox.harvard.edu/PDB-UF and http://bioinfo.pl/PDB-UF.
Functionalizing Microporous Membranes for Protein Purification and Protein Digestion
NASA Astrophysics Data System (ADS)
Dong, Jinlan; Bruening, Merlin L.
2015-07-01
This review examines advances in the functionalization of microporous membranes for protein purification and the development of protease-containing membranes for controlled protein digestion prior to mass spectrometry analysis. Recent studies confirm that membranes are superior to bead-based columns for rapid protein capture, presumably because convective mass transport in membrane pores rapidly brings proteins to binding sites. Modification of porous membranes with functional polymeric films or TiO2 nanoparticles yields materials that selectively capture species ranging from phosphopeptides to His-tagged proteins, and protein-binding capacities often exceed those of commercial beads. Thin membranes also provide a convenient framework for creating enzyme-containing reactors that afford control over residence times. With millisecond residence times, reactors with immobilized proteases limit protein digestion to increase sequence coverage in mass spectrometry analysis and facilitate elucidation of protein structures. This review emphasizes the advantages of membrane-based techniques and concludes with some challenges for their practical application.
Proteomics profiling of interactome dynamics by colocalisation analysis (COLA).
Mardakheh, Faraz K; Sailem, Heba Z; Kümper, Sandra; Tape, Christopher J; McCully, Ryan R; Paul, Angela; Anjomani-Virmouni, Sara; Jørgensen, Claus; Poulogiannis, George; Marshall, Christopher J; Bakal, Chris
2016-12-20
Localisation and protein function are intimately linked in eukaryotes, as proteins are localised to specific compartments where they come into proximity of other functionally relevant proteins. Significant co-localisation of two proteins can therefore be indicative of their functional association. We here present COLA, a proteomics based strategy coupled with a bioinformatics framework to detect protein-protein co-localisations on a global scale. COLA reveals functional interactions by matching proteins with significant similarity in their subcellular localisation signatures. The rapid nature of COLA allows mapping of interactome dynamics across different conditions or treatments with high precision.
Bhagavat, Raghu; Sankar, Santhosh; Srinivasan, Narayanaswamy; Chandra, Nagasuma
2018-03-06
Protein-ligand interactions form the basis of most cellular events. Identifying ligand binding pockets in proteins will greatly facilitate rationalizing and predicting protein function. Ligand binding sites are unknown for many proteins of known three-dimensional (3D) structure, creating a gap in our understanding of protein structure-function relationships. To bridge this gap, we detect pockets in proteins of known 3D structures, using computational techniques. This augmented pocketome (PocketDB) consists of 249,096 pockets, which is about seven times larger than what is currently known. We deduce possible ligand associations for about 46% of the newly identified pockets. The augmented pocketome, when subjected to clustering based on similarities among pockets, yielded 2,161 site types, which are associated with 1,037 ligand types, together providing fold-site-type-ligand-type associations. The PocketDB resource facilitates a structure-based function annotation, delineation of the structural basis of ligand recognition, and provides functional clues for domains of unknown functions, allosteric proteins, and druggable pockets. Copyright © 2018 Elsevier Ltd. All rights reserved.
Automatic annotation of protein motif function with Gene Ontology terms.
Lu, Xinghua; Zhai, Chengxiang; Gopalakrishnan, Vanathi; Buchanan, Bruce G
2004-09-02
Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist for identifying candidate protein motifs at the whole genome level. However, a much needed and important task is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO) project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary, and these annotations serve well as a knowledge base. This paper presents methods to mine the GO knowledge base and use the association between the GO terms assigned to a sequence and the motifs matched by the same sequence as evidence for predicting the functions of novel protein motifs automatically. The task of assigning GO terms to protein motifs is viewed as both a binary classification and an information retrieval problem, where PROSITE motifs are used as samples for model training and functional prediction. The mutual information of a motif and a GO term association is found to be a very useful feature. We take advantage of the known motifs to train a logistic regression classifier, which allows us to combine mutual information with other frequency-based features and obtain a probability of correct association. The trained logistic regression model has intuitively meaningful and logically plausible parameter values, and performs very well empirically according to our evaluation criteria. In this research, different methods for automatic annotation of protein motifs have been investigated. Empirical results demonstrate that the methods have great potential for detecting and augmenting information about the functions of newly discovered candidate protein motifs.
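The mutual information feature mentioned above can be computed from a 2x2 table of how often a motif and a GO term co-occur across annotated sequences. The sketch below uses the standard definition of mutual information for two binary variables; the function name, argument layout, and counts are assumptions, and the paper may normalize or estimate the quantity differently.

```python
import math

def mutual_information(n_both, n_motif_only, n_term_only, n_neither):
    """Mutual information (in bits) between motif presence and GO-term presence."""
    total = n_both + n_motif_only + n_term_only + n_neither
    # Joint distribution over (motif present?, term present?)
    joint = {
        (1, 1): n_both / total,
        (1, 0): n_motif_only / total,
        (0, 1): n_term_only / total,
        (0, 0): n_neither / total,
    }
    p_motif = {1: joint[1, 1] + joint[1, 0], 0: joint[0, 1] + joint[0, 0]}
    p_term = {1: joint[1, 1] + joint[0, 1], 0: joint[1, 0] + joint[0, 0]}
    mi = 0.0
    for (m, t), p in joint.items():
        if p > 0:                      # empty cells contribute nothing
            mi += p * math.log2(p / (p_motif[m] * p_term[t]))
    return mi

mi_independent = mutual_information(25, 25, 25, 25)   # no association
mi_perfect = mutual_information(50, 0, 0, 50)         # motif and term always co-occur
print(mi_independent, mi_perfect)
```

A value near zero means the motif tells us nothing about the term, while larger values indicate a stronger association that a downstream classifier can weigh alongside frequency-based features.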
An ensemble framework for clustering protein-protein interaction networks.
Asur, Sitaram; Ucar, Duygu; Parthasarathy, Srinivasan
2007-07-01
Protein-Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. The presence of biologically relevant functional modules in these networks has been theorized by many researchers. However, the application of traditional clustering algorithms for extracting these modules has not been successful, largely due to the presence of noisy false positive interactions as well as specific topological challenges in the network. In this article, we propose an ensemble clustering framework to address this problem. For base clustering, we introduce two topology-based distance metrics to counteract the effects of noise. We develop a PCA-based consensus clustering technique, designed to reduce the dimensionality of the consensus problem and yield informative clusters. We also develop a soft consensus clustering variant to assign multifaceted proteins to multiple functional groups. We conduct an empirical evaluation of different consensus techniques using topology-based, information theoretic and domain-specific validation metrics and show that our approaches can provide significant benefits over other state-of-the-art approaches. Our analysis of the consensus clusters obtained demonstrates that ensemble clustering can (a) produce improved biologically significant functional groupings; and (b) facilitate soft clustering by discovering multiple functional associations for proteins. Supplementary data are available at Bioinformatics online.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Jing; Ma, Zihao; Carr, Steven A.
Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application Gene2Net (http://cptac.gene2net.org) developed based on the three protein coexpression networks revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression based gene function prediction. Proteomics should be integrated if not preferred in gene function and human disease studies.
Molecular & Cellular Proteomics 16: 10.1074/mcp.M116.060301, 121–134, 2017.
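A coexpression network of the kind described in this record can be sketched by correlating expression profiles across samples and keeping strongly correlated pairs as edges. Pearson correlation, the 0.8 threshold, and the gene names and values below are assumptions for illustration; the abstract does not state which correlation measure or cutoff the authors used.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpression_edges(profiles, min_r=0.8):
    """Return (gene, gene, r) for every pair whose correlation is at least min_r."""
    genes = sorted(profiles)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson(profiles[g], profiles[h])
            if r >= min_r:
                edges.append((g, h, round(r, 3)))
    return edges

# Hypothetical abundance profiles across four samples
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
edges = coexpression_edges(profiles)
print(edges)
```

Under guilt-by-association, an unannotated gene would then inherit candidate functions from its annotated neighbours in this network; running the same procedure on mRNA and protein profiles gives the two networks whose wiring the study compares.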
Bandyopadhyay, Deepak; Huan, Jun; Prins, Jan; Snoeyink, Jack; Wang, Wei; Tropsha, Alexander
2009-11-01
Protein function prediction is one of the central problems in computational biology. We present a novel automated protein structure-based function prediction method using libraries of local residue packing patterns that are common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph where residue vertices (residue name used as a vertex label) are connected by geometrical proximity edges. The approach employs two steps. First, it uses a fast subgraph mining algorithm to find all occurrences of family-specific labeled subgraphs for all well characterized protein structural and functional families. Second, it queries a new structure for occurrences of a set of motifs characteristic of a known family, using a graph index to speed up Ullmann's subgraph isomorphism algorithm. The confidence of function inference from structure depends on the number of family-specific motifs found in the query structure compared with their distribution in a large non-redundant database of proteins. This method can assign a new structure to a specific functional family in cases where sequence alignments, sequence patterns, structural superposition and active site templates fail to provide accurate annotation.
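The graph representation described above (residues as labeled vertices, edges between spatially close residues) can be sketched as follows. A plain distance cutoff on one representative point per residue is an assumption made here for simplicity; the abstract only says "geometrical proximity edges" and the published method may define proximity differently. Residue names, coordinates, and the 8 Å cutoff are illustrative.

```python
import math

def residue_graph(residues, cutoff=8.0):
    """Build a labeled proximity graph.

    residues: list of (residue_name, (x, y, z)) tuples.
    Returns (labels, edges) where edges holds index pairs (i, j) with i < j.
    """
    edges = set()
    for i in range(len(residues)):
        for j in range(i + 1, len(residues)):
            if math.dist(residues[i][1], residues[j][1]) <= cutoff:
                edges.add((i, j))
    labels = [name for name, _ in residues]
    return labels, edges

# Toy structure: three residues close together and one far away
residues = [
    ("HIS", (0.0, 0.0, 0.0)),
    ("ASP", (5.0, 0.0, 0.0)),
    ("SER", (9.0, 0.0, 0.0)),
    ("GLY", (30.0, 0.0, 0.0)),
]
labels, edges = residue_graph(residues)
print(labels, sorted(edges))
```

Family-specific motifs would then be small labeled subgraphs of such graphs, and querying a new structure amounts to subgraph isomorphism tests against its graph.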
2012-01-01
Background: Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for treating complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) has become an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. Results: We have developed a Space-Related Pharmamotifs method (called SRPmotif) to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. Conclusions: SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and for drug discovery.
PMID:23281852
Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon
2012-01-01
Discovering a compound that inhibits multiple proteins (i.e. polypharmacological targets) is a new paradigm for treating complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) has become an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. We have developed a Space-Related Pharmamotifs method (called SRPmotif) to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of the polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and for drug discovery.
Improved protein model quality assessments by changing the target function.
Uziela, Karolis; Menéndez Hurtado, David; Shu, Nanjiang; Wallner, Björn; Elofsson, Arne
2018-06-01
Protein modeling quality is an important part of protein structure prediction. We have for more than a decade developed a set of methods for this problem. We have used various types of description of the protein and different machine learning methodologies. However, common to all these methods has been the target function used for training. The target function in ProQ describes the local quality of a residue in a protein model. In all versions of ProQ the target function has been the S-score. However, other quality estimation functions also exist, which can be divided into superposition- and contact-based methods. The superposition-based methods, such as S-score, are based on a rigid body superposition of a protein model and the native structure, while the contact-based methods compare the local environment of each residue. Here, we examine the effects of retraining our latest predictor, ProQ3D, using identical inputs but different target functions. We find that the contact-based methods are easier to predict and that predictors trained on these measures provide some advantages when it comes to identifying the best model. One possible reason for this is that contact based methods are better at estimating the quality of multi-domain targets. However, training on the S-score gives the best correlation with the GDT_TS score, which is commonly used in CASP to score the global model quality. To take the advantage of both of these features we provide an updated version of ProQ3D that predicts local and global model quality estimates based on different quality estimates. © 2018 Wiley Periodicals, Inc.
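For readers unfamiliar with the superposition-based target, the S-score is commonly written as S_i = 1 / (1 + (d_i / d0)^2), where d_i is the deviation of residue i after superimposing the model on the native structure. The sketch below uses that form with d0 = 3 Å, a value often quoted for the ProQ family; both the formula and the constant should be checked against the ProQ3D paper before use, and the deviations are made up.

```python
def s_score(deviation, d0=3.0):
    """Local S-score for one residue: 1 at zero deviation, decaying towards 0."""
    return 1.0 / (1.0 + (deviation / d0) ** 2)

def global_s_score(deviations, d0=3.0):
    """Global quality taken here as the mean of the local scores (an assumption)."""
    return sum(s_score(d, d0) for d in deviations) / len(deviations)

# Hypothetical per-residue deviations in angstroms
deviations = [0.0, 3.0, 3.0]
local = [s_score(d) for d in deviations]
overall = global_s_score(deviations)
print(local, overall)
```

Contact-based targets differ in that they compare each residue's local environment rather than its position after a rigid-body superposition, which is one plausible reason they behave better on multi-domain targets.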
Protein Function Prediction: Problems and Pitfalls.
Pearson, William R
2015-09-03
The characterization of new genomes based on their protein sets has been revolutionized by new sequencing technologies, but biologists seeking to exploit new sequence information are often frustrated by the challenges associated with accurately assigning biological functions to newly identified proteins. Here, we highlight some of the challenges in functional inference from sequence similarity. Investigators can improve the accuracy of function prediction by (1) being conservative about the evolutionary distance to a protein of known function; (2) considering the ambiguous meaning of "functional similarity," and (3) being aware of the limitations of annotations in functional databases. Protein function prediction does not offer "one-size-fits-all" solutions. Prediction strategies work better when the idiosyncrasies of function and functional annotation are better understood. Copyright © 2015 John Wiley & Sons, Inc.
Protein engineering and its applications in food industry.
Kapoor, Swati; Rafiq, Aasima; Sharma, Savita
2017-07-24
Protein engineering is a young discipline that has branched out from the field of genetic engineering. It draws on available knowledge about protein structure and function, tools and instruments, software, bioinformatics databases, available cloned genes, vectors, recombinant strains and other materials that could lead to a change in the protein backbone. A protein produced properly by a genetic engineering process is one that is able to fold correctly and perform its particular function(s) efficiently even after being subjected to engineering practices. A protein can be modified through its gene or chemically; however, modification through the gene is easier. There is no specific limit on the tools of protein engineering: any technique that changes the amino acid composition of a protein and results in the modification of protein structure or function falls within its scope. Nevertheless, some common tools are used to reach a specific target. More active industrial and pharmaceutical proteins have been developed through protein engineering, both to introduce new functions and to change a protein's interaction with its surrounding environment. A variety of protein engineering applications have been reported in the literature. These applications range from biocatalysis for food and industry to environmental, medical and nanobiotechnology applications. Successful combinations of various protein engineering methods have led to successful results in the food industry and have created scope to maintain the quality of the finished product after processing.
Functional annotation from the genome sequence of the giant panda.
Huo, Tong; Zhang, Yinjie; Lin, Jianping
2012-08-01
The giant panda is one of the most critically endangered species due to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially specific trait-related proteins, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided into two groups: 20,179 proteins whose functions can be predicted by GeneScan formed the known-function group, whereas 822 proteins whose functions cannot be predicted by GeneScan comprised the unknown-function group. For the known-function group, we further classified the proteins by molecular function, biological process, cellular component, and tissue specificity. For the unknown-function group, we developed a strategy in which the proteins were filtered by cross-Blast to identify panda-specific proteins under the assumption that proteins related to the panda-specific traits in the unknown-function group exist. After this filtering procedure, we identified 32 proteins (2 of which are membrane proteins) specific to the giant panda genome as compared against the dog and horse genomes. Based on their amino acid sequences, these 32 proteins were further analyzed by functional classification using SVM-Prot, motif prediction using MyHits, and interacting protein prediction using the Database of Interacting Proteins. Nineteen proteins were predicted to be zinc-binding proteins, thus affecting the activities of nucleic acids. The 32 panda-specific proteins will be further investigated by structural and functional analysis.
Protein-Based Drug-Delivery Materials.
Jao, Dave; Xue, Ye; Medina, Jethro; Hu, Xiao
2017-05-09
There is a pressing need for long-term, controlled drug release for sustained treatment of chronic or persistent medical conditions and diseases. Guided drug delivery is difficult because therapeutic compounds need to survive numerous transport barriers and binding targets throughout the body. Nanoscale protein-based polymers are increasingly used for drug and vaccine delivery to cross these biological barriers and travel through the blood circulation to their molecular site of action. Compared to synthetic polymers, protein-based polymers have the advantages of good biocompatibility, biodegradability, environmental sustainability, cost effectiveness and availability. This review addresses the sources of protein-based polymers, compares their similarities and differences, and highlights characteristic properties and functionality of these protein materials for sustained and controlled drug release. Targeted drug delivery using highly functional multicomponent protein composites to guide active drugs to the site of interest will also be discussed. A systematic elucidation of how drug-delivery efficiency depends on the molecular weight, particle size, shape, morphology, and porosity of materials will then be presented, with a view to achieving increased drug absorption. Finally, several important biomedical applications of protein-based materials with drug-delivery function are summarized, including bone healing, antibiotic release, wound healing, and corneal regeneration, as well as diabetes, neuroinflammation and cancer treatments.
Lubin, Johnathan W; Rao, Timsi; Mandell, Edward K; Wuttke, Deborah S; Lundblad, Victoria
2013-03-01
Mutations that confer the loss of a single biochemical property (separation-of-function mutations) can often uncover a previously unknown role for a protein in a particular biological process. However, most mutations are identified based on loss-of-function phenotypes, which cannot differentiate between separation-of-function alleles vs. mutations that encode unstable/unfolded proteins. An alternative approach is to use overexpression dominant-negative (ODN) phenotypes to identify mutant proteins that disrupt function in an otherwise wild-type strain when overexpressed. This is based on the assumption that such mutant proteins retain an overall structure that is comparable to that of the wild-type protein and are able to compete with the endogenous protein (Herskowitz 1987). To test this, the in vivo phenotypes of mutations in the Est3 telomerase subunit from Saccharomyces cerevisiae were compared with the in vitro secondary structure of these mutant proteins as analyzed by circular-dichroism spectroscopy, which demonstrates that ODN is a more sensitive assessment of protein stability than the commonly used method of monitoring protein levels from extracts. Reverse mutagenesis of EST3, which targeted different categories of amino acids, also showed that mutating highly conserved charged residues to the oppositely charged amino acid had an increased likelihood of generating a severely defective est3(-) mutation, which nevertheless encoded a structurally stable protein. These results suggest that charge-swap mutagenesis directed at a limited subset of highly conserved charged residues, combined with ODN screening to eliminate partially unfolded proteins, may provide a widely applicable and efficient strategy for generating separation-of-function mutations.
Ferrada, Evandro; Vergara, Ismael A; Melo, Francisco
2007-01-01
The correct discrimination between native and near-native protein conformations is essential for achieving accurate computer-based protein structure prediction. However, this has proven to be a difficult task, since currently available physical energy functions, empirical potentials and statistical scoring functions are still limited in achieving this goal consistently. In this work, we assess and compare the ability of different full atom knowledge-based potentials to discriminate between native protein structures and near-native protein conformations generated by comparative modeling. Using a benchmark of 152 near-native protein models and their corresponding native structures that encompass several different folds, we demonstrate that the incorporation of close non-bonded pairwise atom terms improves the discriminating power of the empirical potentials. Since the direct and unbiased derivation of close non-bonded terms from current experimental data is not possible, we obtained and used those terms from the corresponding pseudo-energy functions of a non-local knowledge-based potential. It is shown that this methodology significantly improves the discrimination between native and near-native protein conformations, suggesting that a proper description of close non-bonded terms is important to achieve a more complete and accurate description of native protein conformations. Some external knowledge-based energy functions that are widely used in model assessment performed poorly, indicating that the benchmark of models and the specific discrimination task tested in this work constitutes a difficult challenge.
Exploring the evolution of protein function in Archaea.
Goncearenco, Alexander; Berezovsky, Igor N
2012-05-30
Despite recent progress in studies of the evolution of protein function, the questions of what the first functional protein domains were and what their basic building blocks were remain unresolved. Previously, we introduced the concept of elementary functional loops (EFLs), which are the functional units of enzymes that provide elementary reactions in biochemical transformations. They are presumably descendants of primordial catalytic peptides. We analyzed distant evolutionary connections between protein functions in Archaea based on the EFLs comprising them. We show examples of the involvement of EFLs in new functional domains, as well as reutilization of EFLs and functional domains in building multidomain structures and protein complexes. Our analysis of the archaeal superkingdom yields the dominating mechanisms in different periods of protein evolution, which resulted in several levels of the organization of biochemical function. First, functional domains emerged as combinations of prebiotic peptides with very basic functions, such as nucleotide/phosphate and metal cofactor binding. Second, domain recombination brought multidomain proteins and complexes onto the evolutionary scene. Later, reutilization and de novo design of functional domains and elementary functional loops complemented the evolution of protein function.
NASA Astrophysics Data System (ADS)
Huber, Matthias C.; Schreiber, Andreas; von Olshausen, Philipp; Varga, Balázs R.; Kretz, Oliver; Joch, Barbara; Barnert, Sabine; Schubert, Rolf; Eimer, Stefan; Kele, Péter; Schiller, Stefan M.
2015-01-01
Nanoscale biological materials formed by the assembly of defined block-domain proteins control the formation of cellular compartments such as organelles. Here, we introduce an approach to intentionally ‘program’ the de novo synthesis and self-assembly of genetically encoded amphiphilic proteins to form cellular compartments, or organelles, in Escherichia coli. These proteins serve as building blocks for the formation of artificial compartments in vivo in a similar way to lipid-based organelles. We investigated the formation of these organelles using epifluorescence microscopy, total internal reflection fluorescence microscopy and transmission electron microscopy. The in vivo modification of these protein-based de novo organelles, by means of site-specific incorporation of unnatural amino acids, allows the introduction of artificial chemical functionalities. Co-localization of membrane proteins results in the formation of functionalized artificial organelles combining artificial and natural cellular function. Adding these protein structures to the cellular machinery may have consequences in nanobiotechnology, synthetic biology and materials science, including the constitution of artificial cells and bio-based metamaterials.
Nagata, Koji
2010-01-01
Peptides and proteins with similar amino acid sequences can have different biological functions. Knowledge of their three-dimensional molecular structures is critically important in identifying their functional determinants. In this review, I describe the results of our and other groups' structure-based functional characterization of insect insulin-like peptides, a crustacean hyperglycemic hormone-family peptide, a mammalian epidermal growth factor-family protein, and an intracellular signaling domain that recognizes proline-rich sequence.
Functionality of alternative protein in gluten-free product development.
Deora, Navneet Singh; Deswal, Aastha; Mishra, Hari Niwas
2015-07-01
Celiac disease is an immune-mediated disease triggered in genetically susceptible individuals by ingested gluten from wheat, rye, barley, and other closely related cereal grains. The current treatment for celiac disease is life-long adherence to a strict gluten-exclusion diet. The replacement of gluten presents a significant technological challenge, as it is an essential structure-building protein, which is necessary for formulating high-quality baked goods. A major limitation in the production of gluten-free products is the lack of protein functionality in non-wheat cereals. Additionally, commercial gluten-free mixes usually contain only carbohydrates, which may significantly limit the amount of protein in the diet. In the recent past, various approaches have been attempted to incorporate protein-based ingredients and to modify their functional properties for gluten-free product development. This review aims to highlight the functionality of the alternative protein-based ingredients that can be utilized for gluten-free product development, both functionally and nutritionally. © The Author(s) 2014.
Phagonaute: A web-based interface for phage synteny browsing and protein function prediction.
Delattre, Hadrien; Souiai, Oussema; Fagoonee, Khema; Guerois, Raphaël; Petit, Marie-Agnès
2016-09-01
Distant homology search tools are of great help in predicting viral protein functions. However, due to the lack of profile databases dedicated to viruses, they can lack sensitivity. We constructed HMM profiles for more than 80,000 proteins from both phages and archaeal viruses, and performed all pairwise comparisons with the HHsearch program. The whole resulting database can be explored through a user-friendly "Phagonaute" interface to help predict functions. Results are displayed together with their genetic context, to strengthen inferences based on remote homology. Beyond function prediction, this tool permits the detection of co-occurrences, often indicative of proteins completing a task together, and the observation of conserved patterns across large evolutionary distances. As a test, Herpes simplex virus I was added to Phagonaute, and 25% of its proteome matched bacterial or archaeal viral protein counterparts. Phagonaute should therefore help virologists in their quest for protein functions and evolutionary relationships. Copyright © 2016 Elsevier Inc. All rights reserved.
G-LoSA for Prediction of Protein-Ligand Binding Sites and Structures.
Lee, Hui Sun; Im, Wonpil
2017-01-01
Recent advances in high-throughput structure determination and computational protein structure prediction have significantly enriched the universe of protein structure. However, there is still a large gap between the number of available protein structures and the number of proteins whose function is annotated with high accuracy. Computational structure-based protein function prediction has emerged to reduce this knowledge gap. The identification of a ligand binding site and its structure is critical to the determination of a protein's molecular function. We present a computational methodology for predicting small-molecule ligand binding sites and ligand structures using G-LoSA, our protein local structure alignment and similarity measurement tool. All the computational procedures described here can be easily implemented using G-LoSA Toolkit, a package of standalone software programs and preprocessed PDB structure libraries. G-LoSA and G-LoSA Toolkit are freely available to academic users at http://compbio.lehigh.edu/GLoSA. We also present a case study to show the potential of our template-based approach harnessing G-LoSA for protein function prediction.
Decomposition of Proteins into Dynamic Units from Atomic Cross-Correlation Functions.
Calligari, Paolo; Gerolin, Marco; Abergel, Daniel; Polimeno, Antonino
2017-01-10
In this article, we present a clustering method of atoms in proteins based on the analysis of the correlation times of interatomic distance correlation functions computed from MD simulations. The goal is to provide a coarse-grained description of the protein in terms of fewer elements that can be treated as dynamically independent subunits. Importantly, this domain decomposition method does not take into account structural properties of the protein. Instead, the clustering of protein residues in terms of networks of dynamically correlated domains is defined on the basis of the effective correlation times of the pair distance correlation functions. For these properties, our method stands as a complementary analysis to the customary protein decomposition in terms of quasi-rigid, structure-based domains. Results obtained for a prototypal protein structure illustrate the approach proposed.
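The quantity on which this clustering rests, an effective correlation time of an interatomic distance correlation function, can be sketched as follows. This is a minimal illustration and not the authors' procedure: it assumes a distance time series sampled at a fixed interval, normalizes the autocorrelation function of its fluctuations, and integrates it with the trapezoid rule up to the first non-positive value. The function names, the truncation rule, and the toy series are all assumptions of this example.

```python
def autocorrelation(series, max_lag):
    """Normalized autocorrelation of the fluctuations of a time series.

    Assumes the series is not constant (non-zero variance).
    """
    n = len(series)
    mean = sum(series) / n
    fluct = [x - mean for x in series]
    var = sum(f * f for f in fluct) / n
    acf = []
    for lag in range(max_lag + 1):
        # Average product of fluctuations separated by `lag` frames.
        c = sum(fluct[t] * fluct[t + lag] for t in range(n - lag)) / (n - lag)
        acf.append(c / var)
    return acf


def correlation_time(series, dt, max_lag):
    """Effective correlation time: integral of the normalized autocorrelation.

    Integration (trapezoid rule) stops at the first non-positive value, an
    assumed truncation to avoid accumulating noise from the tail.
    """
    acf = autocorrelation(series, max_lag)
    tau = 0.0
    for k in range(1, len(acf)):
        if acf[k] <= 0.0:
            break
        tau += 0.5 * (acf[k - 1] + acf[k]) * dt
    return tau


if __name__ == "__main__":
    # Toy distance series (arbitrary units), one value per MD frame.
    fast = [1.0, -1.0] * 4            # decorrelates within one frame
    slow = ([0.0] * 4 + [1.0] * 4) * 2  # stays correlated over several frames
    print(correlation_time(fast, dt=1.0, max_lag=5))
    print(correlation_time(slow, dt=1.0, max_lag=5))
```

Atom pairs whose distances fluctuate with long correlation times relative to other pairs could then be treated differently from fast-decorrelating pairs when grouping atoms; how the published method turns these times into dynamic domains is not specified in the abstract.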
Lee, Hasup; Baek, Minkyung; Lee, Gyu Rie; Park, Sangwoo; Seok, Chaok
2017-03-01
Many proteins function as homo- or hetero-oligomers; therefore, attempts to understand and regulate protein functions require knowledge of protein oligomer structures. The number of available experimental protein structures is increasing, and oligomer structures can be predicted using the experimental structures of related proteins as templates. However, template-based models may have errors due to sequence differences between the target and template proteins, which can lead to functional differences. Such structural differences may be predicted by loop modeling of local regions or refinement of the overall structure. In CAPRI (Critical Assessment of PRotein Interactions) round 30, we used recently developed features of the GALAXY protein modeling package, including template-based structure prediction, loop modeling, model refinement, and protein-protein docking, to predict protein complex structures from amino acid sequences. Out of the 25 CAPRI targets, medium and acceptable quality models were obtained for 14 and 1 target(s), respectively, for which proper oligomer or monomer templates could be detected. Symmetric interface loop modeling on oligomer model structures successfully improved model quality, while loop modeling on monomer model structures failed. Overall refinement of the predicted oligomer structures consistently improved the model quality, in particular in interface contacts. Proteins 2017; 85:399-407. © 2016 Wiley Periodicals, Inc.
Automated prediction of protein function and detection of functional sites from structure.
Pazos, Florencio; Sternberg, Michael J E
2004-10-12
Current structural genomics projects are yielding structures for proteins whose functions are unknown. Accordingly, there is a pressing requirement for computational methods for function prediction. Here we present PHUNCTIONER, an automatic method for structure-based function prediction using automatically extracted functional sites (residues associated to functions). The method relates proteins with the same function through structural alignments and extracts 3D profiles of conserved residues. Functional features to train the method are extracted from the Gene Ontology (GO) database. The method extracts these features from the entire GO hierarchy and hence is applicable across the whole range of function specificity. 3D profiles associated with 121 GO annotations were extracted. We tested the power of the method both for the prediction of function and for the extraction of functional sites. The success of function prediction by our method was compared with the standard homology-based method. In the zone of low sequence similarity (approximately 15%), our method assigns the correct GO annotation in 90% of the protein structures considered, approximately 20% higher than inheritance of function from the closest homologue.
Li, Ying Hong; Xu, Jing Yu; Tao, Lin; Li, Xiao Feng; Li, Shuang; Zeng, Xian; Chen, Shang Ying; Zhang, Peng; Qin, Chu; Zhang, Cheng; Chen, Zhe; Zhu, Feng; Chen, Yu Zong
2016-01-01
Knowledge of protein function is important for biological, medical and therapeutic studies, but the functions of many proteins are still unknown. There is a need for improved functional prediction methods. Our SVM-Prot web-server employed a machine learning method for predicting protein functional families from protein sequences irrespective of similarity, which complemented similarity-based and other methods in predicting diverse classes of proteins, including distantly-related proteins and homologous proteins of different functions. Since its publication in 2003, we have made major improvements to SVM-Prot with (1) expanded coverage from 54 to 192 functional families, (2) more diverse protein descriptors for protein representation, (3) improved predictive performance due to the use of more enriched training datasets and a greater variety of protein descriptors, (4) a newly integrated BLAST analysis option for assessing proteins in the SVM-Prot predicted functional families that are similar in sequence to a query protein, and (5) a newly added batch submission option for supporting the classification of multiple proteins. Moreover, two more machine learning approaches, K nearest neighbor and probabilistic neural networks, were added to facilitate collective assessment of protein functions by multiple methods. SVM-Prot can be accessed at http://bidd2.nus.edu.sg/cgi-bin/svmprot/svmprot.cgi.
Structure-Based Phylogenetic Analysis of the Lipocalin Superfamily.
Lakshmi, Balasubramanian; Mishra, Madhulika; Srinivasan, Narayanaswamy; Archunan, Govindaraju
2015-01-01
Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study of 39 protein domains from the lipocalin superfamily suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. Detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster according to their mode of ligand binding, even though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers for assigning putative functions to the domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families whose members show poor sequence similarity but high structural similarity.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Calinisan, Venice; Gravem, Dana; Chen, Ray Ping-Hsu
2005-06-17
Members of the protein 4.1 family of adapter proteins are expressed in a broad panel of tissues including various epithelia where they likely play an important role in maintenance of cell architecture and polarity and in control of cell proliferation. We have recently characterized the structure and distribution of three members of the protein 4.1 family, 4.1B, 4.1R and 4.1N, in mouse kidney. We describe here binding partners for renal 4.1 proteins, identified through the screening of a rat kidney yeast two-hybrid system cDNA library. The identification of putative protein 4.1-based complexes enables us to envision potential functions for 4.1 proteins in kidney: organization of signaling complexes, response to osmotic stress, protein trafficking, and control of cell proliferation. We discuss the relevance of these protein 4.1-based interactions in kidney physio-pathology in the context of their previously identified functions in other cells and tissues. Specifically, we will focus on renal 4.1 protein interactions with beta amyloid precursor protein (beta-APP), 14-3-3 proteins, and the cell swelling-activated chloride channel pICln. We also discuss the functional relevance of another member of the protein 4.1 superfamily, ezrin, in kidney physiopathology.
Prediction of scaffold proteins based on protein interaction and domain architectures.
Oh, Kimin; Yi, Gwan-Su
2016-07-28
Scaffold proteins are known to be crucial regulators of various cellular functions, assembling multiple proteins involved in signaling and metabolic pathways. Identification of scaffold proteins and the study of their molecular mechanisms can open a new aspect of cellular systemic regulation, and the results can be applied in the fields of medicine and engineering. Although the regulatory roles of dozens of scaffold proteins have been highlighted, only one computational approach has so far been used to find scaffold proteins from interactomes. That approach was limited in finding diverse types of scaffold proteins because its criteria were restricted to the classical scaffold proteins. In this paper, we suggest a systematic approach to predict scaffold proteins on a large scale from interactomes and to characterize their roles comprehensively. We classified a total of 10,419 basic scaffold protein candidates in protein interactomes into three classes according to the structural evidence for scaffolding, such as domain architectures, domain interactions and protein complexes. Finally, we could define 2716 highly reliable scaffold protein candidates and their characterized functional features. To assess the accuracy of our prediction, gold standard positive and negative data sets were constructed. We prepared 158 gold standard positive data and 844 gold standard negative data based on functional information from the Gene Ontology consortium. The precision, sensitivity and specificity of our testing were 80.3%, 51.0%, and 98.5%, respectively. Through function enrichment analysis of the highly reliable scaffold proteins, we could confirm the significantly enriched functions that are related to scaffold protein binding. We also identified functional associations between scaffold proteins and their recruited proteins. Furthermore, we found that the disease association of scaffold proteins is higher than that of kinases.
In conclusion, we could predict a larger number of scaffold proteins and analyze their functional characteristics. A deeper understanding of the roles of scaffold proteins from this study will provide greater opportunity to find therapeutic or engineering applications of scaffold proteins using their functional characteristics.
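The three measures reported for the gold standard evaluation follow the standard confusion-matrix definitions, which can be sketched as below. The counts in the usage example are hypothetical and chosen only to make the arithmetic easy to follow; they are not the counts from this study, which the abstract does not report.

```python
def classification_metrics(tp, fp, tn, fn):
    """Precision, sensitivity and specificity from confusion-matrix counts.

    tp: predicted scaffold, truly positive     fp: predicted scaffold, truly negative
    tn: predicted non-scaffold, truly negative fn: predicted non-scaffold, truly positive
    """
    return {
        "precision": tp / (tp + fp),      # fraction of predictions that are correct
        "sensitivity": tp / (tp + fn),    # fraction of true positives recovered
        "specificity": tn / (tn + fp),    # fraction of true negatives rejected
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    metrics = classification_metrics(tp=8, fp=2, tn=18, fn=8)
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```

A predictor with high precision and specificity but moderate sensitivity, as in the pattern reported above, makes few false positive calls while missing a substantial share of true scaffold proteins.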
MoonProt: a database for proteins that are known to moonlight
Mani, Mathew; Chen, Chang; Amblee, Vaishak; Liu, Haipeng; Mathur, Tanu; Zwicke, Grant; Zabad, Shadi; Patel, Bansi; Thakkar, Jagravi; Jeffery, Constance J.
2015-01-01
Moonlighting proteins comprise a class of multifunctional proteins in which a single polypeptide chain performs multiple biochemical functions that are not due to gene fusions, multiple RNA splice variants or pleiotropic effects. The known moonlighting proteins perform a variety of diverse functions in many different cell types and species, and information about their structures and functions is scattered in many publications. We have constructed the manually curated, searchable, internet-based MoonProt Database (http://www.moonlightingproteins.org) with information about the over 200 proteins that have been experimentally verified to be moonlighting proteins. The availability of this organized information provides a more complete picture of what is currently known about moonlighting proteins. The database will also aid researchers in other fields, including determining the functions of genes identified in genome sequencing projects, interpreting data from proteomics projects and annotating protein sequence and structural databases. In addition, information about the structures and functions of moonlighting proteins can be helpful in understanding how novel protein functional sites evolved on an ancient protein scaffold, which can also help in the design of proteins with novel functions. PMID:25324305
Self-Assembled Materials Made from Functional Recombinant Proteins.
Jang, Yeongseon; Champion, Julie A
2016-10-18
Proteins are potent molecules that can be used as therapeutics, sensors, and biocatalysts with many advantages over small-molecule counterparts due to the specificity of their activity based on their amino acid sequence and folded three-dimensional structure. However, they also have significant limitations in their stability, localization, and recovery when used in soluble form. These opportunities and challenges have motivated the creation of materials from such functional proteins in order to protect and present them in a way that enhances their function. We have designed functional recombinant fusion proteins capable of self-assembling into materials with unique structures that maintain or improve the functionality of the protein. Fusion of either a functional protein or an assembly domain to a leucine zipper domain makes the materials design strategy modular, based on the high affinity between leucine zippers. The self-assembly domains, including elastin-like polypeptides (ELPs) and defined-sequence random coil polypeptides, can be fused with a leucine zipper motif in order to promote assembly of the fusion proteins into larger structures upon specific stimuli such as temperature and ionic strength. Fusion of other functional domains with the counterpart leucine zipper motif endows the self-assembled materials with protein-specific functions such as fluorescence or catalytic activity. In this Account, we describe several examples of materials assembled from functional fusion proteins as well as the structural characterization, functionality, and understanding of the assembly mechanism. The first example is zipper fusion proteins containing ELPs that assemble into particles when introduced to a model extracellular matrix and subsequently disassemble over time to release the functional protein for drug delivery applications. Under different conditions, the same fusion proteins can self-assemble into hollow vesicles. 
The vesicles display a functional protein on the surface and can also carry protein, small-molecule, or nanoparticle cargo in the vesicle lumen. To create a material with a more complex hierarchical structure, we combined calcium phosphate with zipper fusion proteins containing random coil polypeptides to produce hybrid protein-inorganic supraparticles with high surface area and porous structure. The use of a functional enzyme created supraparticles with the ability to degrade inflammatory cytokines. Our characterization of these protein materials revealed that the molecular interactions are complex because of the large size of the protein building blocks, their folded structures, and the number of potential interactions including hydrophobic interactions, electrostatic interactions, van der Waals forces, and specific affinity-based interactions. It is difficult or even impossible to predict the structures a priori. However, once the basic assembly principles are understood, there is opportunity to tune the material properties, such as size, through control of the self-assembly conditions. Our future efforts on the fundamental side will focus on identifying the phase space of self-assembly of these fusion proteins and additional experimental levers with which to control and tune the resulting materials. On the application side, we are investigating an array of different functional proteins to expand the use of these structures in both therapeutic protein delivery and biocatalysis.
Hardy, John G; Pfaff, André; Leal-Egaña, Aldo; Müller, Axel H E; Scheibel, Thomas R
2014-07-01
Silk protein-based materials are promising biomaterials for application as tissue scaffolds, due to their processability, biocompatibility, and biodegradability. The preparation of films composed of an engineered spider silk protein (eADF4(C16)) and their functionalization with glycopolymers are described. The glycopolymers bind proteins found in the extracellular matrix, providing a biomimetic coating on the films that improves cell adhesion to the surfaces of engineered spider silk films. Such silk-based materials have potential as coatings for degradable implantable devices. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Zhang, Zheng; Chen, Shengfu; Jiang, Shaoyi
2006-12-01
We introduce a dual-functional biocompatible material based on zwitterionic poly(carboxybetaine methacrylate) (polyCBMA), which not only highly resists protein adsorption/cell adhesion, but also has abundant functional groups convenient for the immobilization of biological ligands, such as proteins. The dual-functional properties are unique to carboxybetaine moieties and are not found in other nonfouling moieties such as ethylene glycol, phosphobetaine, and sulfobetaine. The unique properties are demonstrated in this work by grafting a polyCBMA polymer onto a surface or by preparing a polyCBMA-based hydrogel. PolyCBMA brushes with a thickness of 10-15 nm were grafted on a gold surface using the surface-initiated atom transfer radical polymerization method. Protein adsorption was analyzed using a surface plasmon resonance sensor. The surface grafted with polyCBMA very largely prevented the nonspecific adsorption of three test proteins, that is, fibrinogen, lysozyme, and human chorionic gonadotropin (hCG). The immobilization of anti-hCG on the surface resulted in the specific binding of hCG while maintaining a high resistance to nonspecific protein adsorption. Transparent polyCBMA-based hydrogel disks were decorated with immobilized fibronectin. Aortic endothelial cells did not bind to the polyCBMA controls, but appeared to adhere well and spread on the fibronectin-modified surface. With their dual functionality and biomimetic nature, polyCBMA-based materials are very promising for their applications in medical diagnostics, biomaterials/tissue engineering, and drug delivery.
[Visualization and Functional Regulation of Live Cell Proteins Based on Labeling Probe Design].
Mizukami, Shin; Kikuchi, Kazuya
2016-01-01
There are several approaches to understanding the physiological roles of biomolecules: (1) by observing the localization or activities of biomolecules (based on microscopic imaging experiments with fluorescent proteins or fluorescent probes) and (2) by investigating the cellular response via activation or suppression of functions of the target molecule (by using inhibitors, antagonists, siRNAs, etc.). In this context, protein-labeling technology serves as a powerful tool that can be used in various experiments, such as for fluorescence imaging of target proteins. Recently, we developed a protein-labeling technology that uses a mutant β-lactamase (a bacterial hydrolase) as the tag protein. In this protein-labeling technology, also referred to as the BL-tag technology, various β-lactam compounds were used as specific ligands that were covalently labeled to the tag. One major advantage of this labeling technology is that various functions can be carried out by suitably designing both the functional moieties such as the fluorophore and the β-lactam ligand structure. In this review, we briefly introduce the BL-tag technology and describe our future outlook for this technology, such as in fluorescence imaging of biomolecules and functional regulation of cellular proteins in living cells.
A human protein atlas for normal and cancer tissues based on antibody proteomics.
Uhlén, Mathias; Björling, Erik; Agaton, Charlotta; Szigyarto, Cristina Al-Khalili; Amini, Bahram; Andersen, Elisabet; Andersson, Ann-Catrin; Angelidou, Pia; Asplund, Anna; Asplund, Caroline; Berglund, Lisa; Bergström, Kristina; Brumer, Harry; Cerjan, Dijana; Ekström, Marica; Elobeid, Adila; Eriksson, Cecilia; Fagerberg, Linn; Falk, Ronny; Fall, Jenny; Forsberg, Mattias; Björklund, Marcus Gry; Gumbel, Kristoffer; Halimi, Asif; Hallin, Inga; Hamsten, Carl; Hansson, Marianne; Hedhammar, My; Hercules, Görel; Kampf, Caroline; Larsson, Karin; Lindskog, Mats; Lodewyckx, Wald; Lund, Jan; Lundeberg, Joakim; Magnusson, Kristina; Malm, Erik; Nilsson, Peter; Odling, Jenny; Oksvold, Per; Olsson, Ingmarie; Oster, Emma; Ottosson, Jenny; Paavilainen, Linda; Persson, Anja; Rimini, Rebecca; Rockberg, Johan; Runeson, Marcus; Sivertsson, Asa; Sköllermo, Anna; Steen, Johanna; Stenvall, Maria; Sterky, Fredrik; Strömberg, Sara; Sundberg, Mårten; Tegel, Hanna; Tourle, Samuel; Wahlund, Eva; Waldén, Annelie; Wan, Jinghong; Wernérus, Henrik; Westberg, Joakim; Wester, Kenneth; Wrethagen, Ulla; Xu, Lan Lan; Hober, Sophia; Pontén, Fredrik
2005-12-01
Antibody-based proteomics provides a powerful approach for the functional study of the human proteome involving the systematic generation of protein-specific affinity reagents. We used this strategy to construct a comprehensive, antibody-based protein atlas for expression and localization profiles in 48 normal human tissues and 20 different cancers. Here we report a new publicly available database containing, in the first version, approximately 400,000 high resolution images corresponding to more than 700 antibodies toward human proteins. Each image has been annotated by a certified pathologist to provide a knowledge base for functional studies and to allow queries about protein profiles in normal and disease tissues. Our results suggest it should be possible to extend this analysis to the majority of all human proteins thus providing a valuable tool for medical and biological research.
Synthesis and characterization of recombinant abductin-based proteins.
Su, Renay S-C; Renner, Julie N; Liu, Julie C
2013-12-09
Recombinant proteins are promising tools for tissue engineering and drug delivery applications. Protein-based biomaterials have several advantages over natural and synthetic polymers, including precise control over amino acid composition and molecular weight, modular swapping of functional domains, and tunable mechanical and physical properties. In this work, we describe recombinant proteins based on abductin, an elastomeric protein that is found in the inner hinge of bivalves and functions as a coil spring to keep shells open. We illustrate, for the first time, the design, cloning, expression, and purification of a recombinant protein based on consensus abductin sequences derived from Argopecten irradians. The molecular weight of the protein was confirmed by mass spectrometry, and the protein was 94% pure. Circular dichroism studies showed that the dominant structures of abductin-based proteins were polyproline II helix structures in aqueous solution and type II β-turns in trifluoroethanol. Dynamic light scattering studies illustrated that the abductin-based proteins exhibit reversible upper critical solution temperature behavior and irreversible aggregation behavior at high temperatures. A LIVE/DEAD assay revealed that human umbilical vein endothelial cells had a viability of 98 ± 4% after being cultured for two days on the abductin-based protein. Initial cell spreading on the abductin-based protein was similar to that on bovine serum albumin. These studies thus demonstrate the potential of abductin-based proteins in tissue engineering and drug delivery applications due to their cytocompatibility and response to temperature.
Functionalization of protein-based nanocages for drug delivery applications.
Schoonen, Lise; van Hest, Jan C M
2014-07-07
Traditional drug delivery strategies involve drugs which are not targeted towards the desired tissue. This can lead to undesired side effects, as normal cells are affected by the drugs as well. Therefore, new systems are now being developed which combine targeting functionalities with encapsulation of drug cargo. Protein nanocages are highly promising drug delivery platforms due to their perfectly defined structures, biocompatibility, biodegradability and low toxicity. A variety of protein nanocages have been modified and functionalized for these types of applications. In this review, we aim to give an overview of different types of modifications of protein-based nanocontainers for drug delivery applications.
GalaxyDock BP2 score: a hybrid scoring function for accurate protein-ligand docking
NASA Astrophysics Data System (ADS)
Baek, Minkyung; Shin, Woong-Hee; Chung, Hwan Won; Seok, Chaok
2017-07-01
Protein-ligand docking is a useful tool for providing atomic-level understanding of protein functions in nature and design principles for artificial ligands or proteins with desired properties. The ability to identify the true binding pose of a ligand to a target protein among numerous possible candidate poses is an essential requirement for successful protein-ligand docking. Many previously developed docking scoring functions were trained to reproduce experimental binding affinities and were also used for scoring binding poses. However, in this study, we developed a new docking scoring function, called GalaxyDock BP2 Score, by directly training the scoring power of binding poses. This function is a hybrid of physics-based, empirical, and knowledge-based score terms that are balanced to strengthen the advantages of each component. The performance of the new scoring function exhibits significant improvement over existing scoring functions in decoy pose discrimination tests. In addition, when the score is used with the GalaxyDock2 protein-ligand docking program, it outperformed other state-of-the-art docking programs in docking tests on the Astex diverse set, the Cross2009 benchmark set, and the Astex non-native set. GalaxyDock BP2 Score and GalaxyDock2 with this score are freely available at http://galaxy.seoklab.org/softwares/galaxydock.html.
Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren
2016-11-01
Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. 
Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.
Tan, Kemin; Chang, Changsoo; Cuff, Marianne; Osipiuk, Jerzy; Landorf, Elizabeth; Mack, Jamey C; Zerbs, Sarah; Joachimiak, Andrzej; Collart, Frank R
2013-10-01
Lignin comprises 15-25% of plant biomass and represents a major environmental carbon source for utilization by soil microorganisms. Access to this energy resource requires the action of fungal and bacterial enzymes to break down the lignin polymer into a complex assortment of aromatic compounds that can be transported into the cells. To improve our understanding of the utilization of lignin by microorganisms, we characterized the molecular properties of solute binding proteins of ATP-binding cassette transporter proteins that interact with these compounds. A combination of functional screens and structural studies characterized the binding specificity of the solute binding proteins for aromatic compounds derived from lignin such as p-coumarate, 3-phenylpropionic acid and compounds with more complex ring substitutions. A ligand screen based on thermal stabilization identified several binding protein clusters that exhibit preferences based on the size or number of aromatic ring substituents. Multiple X-ray crystal structures of protein-ligand complexes for these clusters identified the molecular basis of the binding specificity for the lignin-derived aromatic compounds. The screens and structural data provide new functional assignments for these solute-binding proteins which can be used to infer their transport specificity. This knowledge of the functional roles and molecular binding specificity of these proteins will support the identification of the specific enzymes and regulatory proteins of peripheral pathways that funnel these compounds to central metabolic pathways and will improve the predictive power of sequence-based functional annotation methods for this family of proteins. Copyright © 2013 Wiley Periodicals, Inc.
Tan, Kemin; Chang, Changsoo; Cuff, Marianne; Osipiuk, Jerzy; Landorf, Elizabeth; Mack, Jamey C.; Zerbs, Sarah; Joachimiak, Andrzej; Collart, Frank R.
2013-01-01
Lignin comprises 15-25% of plant biomass and represents a major environmental carbon source for utilization by soil microorganisms. Access to this energy resource requires the action of fungal and bacterial enzymes to break down the lignin polymer into a complex assortment of aromatic compounds that can be transported into the cells. To improve our understanding of the utilization of lignin by microorganisms, we characterized the molecular properties of solute binding proteins of ATP-binding cassette transporter proteins that interact with these compounds. A combination of functional screens and structural studies characterized the binding specificity of the solute binding proteins for aromatic compounds derived from lignin such as p-coumarate, 3-phenylpropionic acid and compounds with more complex ring substitutions. A ligand screen based on thermal stabilization identified several binding protein clusters that exhibit preferences based on the size or number of aromatic ring substituents. Multiple X-ray crystal structures of protein-ligand complexes for these clusters identified the molecular basis of the binding specificity for the lignin-derived aromatic compounds. The screens and structural data provide new functional assignments for these solute-binding proteins which can be used to infer their transport specificity. This knowledge of the functional roles and molecular binding specificity of these proteins will support the identification of the specific enzymes and regulatory proteins of peripheral pathways that funnel these compounds to central metabolic pathways and will improve the predictive power of sequence-based functional annotation methods for this family of proteins. PMID:23606130
A traveling salesman approach for predicting protein functions.
Johnson, Olin; Liu, Jing
2006-10-12
Protein-protein interaction information can be used to predict unknown protein functions and to help study biological pathways. Here we present a new approach utilizing the classic Traveling Salesman Problem to study the protein-protein interactions and to predict protein functions in budding yeast Saccharomyces cerevisiae. We apply the global optimization tool from combinatorial optimization algorithms to cluster the yeast proteins based on the global protein interaction information. We then use this clustering information to help us predict protein functions. We use our algorithm together with the direct neighbor algorithm [1] on characterized proteins and compare the prediction accuracy of the two methods. We show our algorithm can produce better predictions than the direct neighbor algorithm, which only considers the immediate neighbors of the query protein. Our method is a promising general tool for predicting the functions of uncharacterized proteins and a successful example of applying computer science knowledge and algorithms to biological problems.
A traveling salesman approach for predicting protein functions
Johnson, Olin; Liu, Jing
2006-01-01
Background: Protein-protein interaction information can be used to predict unknown protein functions and to help study biological pathways. Results: Here we present a new approach utilizing the classic Traveling Salesman Problem to study the protein-protein interactions and to predict protein functions in budding yeast Saccharomyces cerevisiae. We apply the global optimization tool from combinatorial optimization algorithms to cluster the yeast proteins based on the global protein interaction information. We then use this clustering information to help us predict protein functions. We use our algorithm together with the direct neighbor algorithm [1] on characterized proteins and compare the prediction accuracy of the two methods. We show our algorithm can produce better predictions than the direct neighbor algorithm, which only considers the immediate neighbors of the query protein. Conclusion: Our method is a promising general tool for predicting the functions of uncharacterized proteins and a successful example of applying computer science knowledge and algorithms to biological problems. PMID:17147783
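The direct neighbor baseline that the abstract compares against considers only the immediate interaction partners of the query protein. A minimal sketch of that idea, assuming a simple majority vote over the annotated neighbors (the voting rule, the alphabetical tie-break, and the toy protein names are assumptions made for illustration, not details taken from the paper):

```python
from collections import Counter

def predict_by_direct_neighbors(protein, interactions, known_functions):
    """Return the most common function among annotated direct neighbors.

    interactions: iterable of undirected (protein_a, protein_b) pairs.
    known_functions: dict mapping a protein to a list of its known functions.
    Returns None when no neighbor carries an annotation.
    """
    # Collect the immediate interaction partners of the query protein.
    neighbors = set()
    for a, b in interactions:
        if a == protein:
            neighbors.add(b)
        elif b == protein:
            neighbors.add(a)

    # Each annotated neighbor votes once for each of its functions.
    votes = Counter()
    for neighbor in neighbors:
        for function in known_functions.get(neighbor, ()):
            votes[function] += 1
    if not votes:
        return None

    # Break ties alphabetically so the result is deterministic.
    top = max(votes.values())
    return sorted(f for f, count in votes.items() if count == top)[0]

# Toy network with hypothetical protein names and annotations.
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
known = {"P2": ["transport"], "P3": ["transport"], "P4": ["kinase"]}
prediction = predict_by_direct_neighbors("P1", interactions, known)
print(prediction)
```

Because only first-degree neighbors vote, a protein whose partners are all unannotated gets no prediction, which is the kind of limitation a clustering over the global network is meant to address.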
Li, Hongdong; Zhang, Yang; Guan, Yuanfang; Menon, Rajasree; Omenn, Gilbert S
2017-01-01
Tens of thousands of splice isoforms of proteins have been catalogued as predicted sequences from transcripts in humans and other species. Relatively few have been characterized biochemically or structurally. With the extensive development of protein bioinformatics, the characterization and modeling of isoform features, isoform functions, and isoform-level networks have advanced notably. Here we present applications of the I-TASSER family of algorithms for folding and functional predictions and the IsoFunc, MIsoMine, and Hisonet data resources for isoform-level analyses of network and pathway-based functional predictions and protein-protein interactions. Hopefully, predictions and insights from protein bioinformatics will stimulate many experimental validation studies.
Wu, Chia-Chou; Lin, Che
2015-01-01
The induction of stem cells toward a desired differentiation direction is required for the advancement of stem cell-based therapies. Despite successful demonstrations of the control of differentiation direction, the effective use of stem cell-based therapies suffers from a lack of systematic knowledge regarding the mechanisms underlying directed differentiation. Using dynamic modeling and the temporal microarray data of three differentiation stages, three dynamic protein-protein interaction networks were constructed. The interaction difference networks derived from the constructed networks systematically delineated the evolution of interaction variations and the underlying mechanisms. A proposed relevance score identified the essential components in the directed differentiation. Inspection of well-known proteins and functional modules in the directed differentiation showed the plausibility of the proposed relevance score, with the higher scores of several proteins and function modules indicating their essential roles in the directed differentiation. During the differentiation process, the proteins and functional modules with higher relevance scores also became more specific to the neuronal identity. Ultimately, the essential components revealed by the relevance scores may play a role in controlling the direction of differentiation. In addition, these components may serve as a starting point for understanding the systematic mechanisms of directed differentiation and for increasing the efficiency of stem cell-based therapies. PMID:25977693
Prediction of protein-protein interaction network using a multi-objective optimization approach.
Chowdhury, Archana; Rakshit, Pratyusha; Konar, Amit
2016-06-01
Protein-Protein Interactions (PPIs) are very important as they coordinate almost all cellular processes. This paper attempts to formulate PPI prediction problem in a multi-objective optimization framework. The scoring functions for the trial solution deal with simultaneous maximization of functional similarity, strength of the domain interaction profiles, and the number of common neighbors of the proteins predicted to be interacting. The above optimization problem is solved using the proposed Firefly Algorithm with Nondominated Sorting. Experiments undertaken reveal that the proposed PPI prediction technique outperforms existing methods, including gene ontology-based Relative Specific Similarity, multi-domain-based Domain Cohesion Coupling method, domain-based Random Decision Forest method, Bagging with REP Tree, and evolutionary/swarm algorithm-based approaches, with respect to sensitivity, specificity, and F1 score.
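The nondominated sorting step used alongside the firefly search can be sketched independently of the search itself. The example below assumes all three objectives (functional similarity, domain interaction strength, number of common neighbors) are maximized, as the abstract states, and uses made-up score tuples; it does not implement the Firefly Algorithm.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def nondominated_fronts(scores):
    """Split candidate indices into successive Pareto fronts (best front first)."""
    remaining = set(range(len(scores)))
    fronts = []
    while remaining:
        # A candidate belongs to the current front if nothing left dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts

# Hypothetical (functional similarity, domain strength, common neighbors) tuples.
candidates = [(0.9, 0.2, 3), (0.5, 0.8, 1), (0.4, 0.1, 1), (0.9, 0.8, 3)]
fronts = nondominated_fronts(candidates)
print(fronts)
```

Candidates in the first front represent trade-offs that no other candidate beats on all three objectives at once, so no single weighted score has to be chosen in advance.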
Hensen, Ulf; Meyer, Tim; Haas, Jürgen; Rex, René; Vriend, Gert; Grubmüller, Helmut
2012-01-01
Proteins are usually described and classified according to amino acid sequence, structure or function. Here, we develop a minimally biased scheme to compare and classify proteins according to their internal mobility patterns. This approach is based on the notion that proteins not only fold into recurring structural motifs but might also be carrying out only a limited set of recurring mobility motifs. The complete set of these patterns, which we tentatively call the dynasome, spans a multi-dimensional space with axes, the dynasome descriptors, characterizing different aspects of protein dynamics. The unique dynamic fingerprint of each protein is represented as a vector in the dynasome space. The difference between any two vectors, consequently, gives a reliable measure of the difference between the corresponding protein dynamics. We characterize the properties of the dynasome by comparing the dynamics fingerprints obtained from molecular dynamics simulations of 112 proteins, but our approach is, in principle, not restricted to any specific source of data of protein dynamics. We conclude that (1) the dynasome consists of a continuum of proteins, rather than well-separated classes; (2) for the majority of proteins we observe strong correlations between structure and dynamics; and (3) proteins with similar function carry out similar dynamics, which suggests a new method to improve protein function annotation based on protein dynamics. PMID:22606222
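Representing each protein as a vector of descriptors means that comparing dynamics reduces to comparing vectors. A minimal sketch, assuming a Euclidean distance and three made-up descriptor values per protein (the abstract does not state which metric or how many descriptors are used, so both are assumptions):

```python
import math

def fingerprint_distance(u, v):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(u) != len(v):
        raise ValueError("fingerprints must have the same number of descriptors")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_protein(name, fingerprints):
    """Name of the protein whose fingerprint is closest to the given one."""
    others = (other for other in fingerprints if other != name)
    return min(
        others,
        key=lambda other: fingerprint_distance(fingerprints[name], fingerprints[other]),
    )

# Hypothetical fingerprints; real descriptors would come from simulations.
fingerprints = {
    "protA": [0.0, 3.0, 1.0],
    "protB": [4.0, 0.0, 1.0],
    "protC": [0.0, 3.5, 1.0],
}
closest = nearest_protein("protA", fingerprints)
print(closest)
```

A nearest-neighbor lookup of this kind is one way the third conclusion could be put to use: an unannotated protein might tentatively inherit the function of the protein with the most similar dynamics.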
Prior knowledge based mining functional modules from Yeast PPI networks with gene ontology
2010-01-01
Background: In the literature, there are many fruitful algorithmic approaches for identifying functional modules in protein-protein interaction (PPI) networks. Because large-scale interaction data are accumulating for multiple organisms while some interactions remain unrecorded in existing PPI databases, there is still an urgent need to design novel computational techniques that can analyze interaction data sets correctly and scalably. Indeed, there are a number of large-scale biological data sets providing indirect evidence for protein-protein interaction relationships. Results: The main aim of this paper is to present a prior-knowledge-based mining strategy to identify functional modules from PPI networks with the aid of Gene Ontology (GO). A higher similarity value in Gene Ontology means that two gene products are more functionally related to each other, so it is better to group such gene products into one functional module. We study how (i) to encode the functional pairs into the existing PPI networks and (ii) to use these functional pairs as pairwise constraints to supervise the existing functional module identification algorithms. A topology-based modularity metric and complex annotation in MIPS are used to evaluate the functional modules identified by these two approaches. Conclusions: The experimental results on yeast PPI networks and GO show that the prior-knowledge-based learning methods perform better than the existing algorithms. PMID:21172053
Shahbaaz, Mohd; Ahmad, Faizan; Imtaiyaz Hassan, Md
2015-06-01
Haemophilus influenzae is a small pleomorphic Gram-negative bacterium which causes several chronic diseases, including bacteremia, meningitis, cellulitis, epiglottitis, septic arthritis, pneumonia, and empyema. Here we extensively analyzed the sequenced genome of H. influenzae strain Rd KW20 using protein family databases, protein structure prediction, pathways and genome context methods to assign a precise function to proteins whose functions are unknown. These proteins are termed hypothetical proteins (HPs), for which no experimental information is available. Function prediction of these proteins would support a precise understanding of the biochemical pathways and mechanism of pathogenesis of Haemophilus influenzae. During the extensive analysis of the H. influenzae genome, we found eight HPs showing lyase activity. Subsequently, we modeled and analyzed the three-dimensional structures of all these HPs to determine their functions more precisely. We found that these HPs possess cystathionine-β-synthase, cyclase, carboxymuconolactone decarboxylase, pseudouridine synthase A and C, D-tagatose-1,6-bisphosphate aldolase and aminodeoxychorismate lyase-like features, indicating their corresponding functions in H. influenzae. Lyases are actively involved in the regulation of biosynthesis of various hormones, metabolic pathways, signal transduction, and DNA repair. Lyases are also considered key players in various biological processes. These enzymes are essential for the survival and pathogenesis of H. influenzae and, therefore, may be considered potential targets for structure-based rational drug design. Our structure-function relationship analysis will be useful to search and design potential lead molecules based on the structure of these lyases, for drug design and discovery.
Enzymatically Active Microgels from Self-Assembling Protein Nanofibrils for Microflow Chemistry.
Zhou, Xiao-Ming; Shimanovich, Ulyana; Herling, Therese W; Wu, Si; Dobson, Christopher M; Knowles, Tuomas P J; Perrett, Sarah
2015-06-23
Amyloid fibrils represent a generic class of protein structure associated with both pathological states and with naturally occurring functional materials. This class of protein nanostructure has recently also emerged as an excellent foundation for sophisticated functional biocompatible materials including scaffolds and carriers for biologically active molecules. Protein-based materials offer the potential advantage that additional functions can be directly incorporated via gene fusion producing a single chimeric polypeptide that will both self-assemble and display the desired activity. To succeed, a chimeric protein system must self-assemble without the need for harsh triggering conditions which would damage the appended functional protein molecule. However, the micrometer to nanoscale patterning and morphological control of protein-based nanomaterials has remained challenging. This study demonstrates a general approach for overcoming these limitations through the microfluidic generation of enzymatically active microgels that are stabilized by amyloid nanofibrils. The use of scaffolds formed from biomaterials that self-assemble under mild conditions enables the formation of catalytic microgels while maintaining the integrity of the encapsulated enzyme. The enzymatically active microgel particles show robust material properties and their porous architecture allows diffusion in and out of reactants and products. In combination with microfluidic droplet trapping approaches, enzymatically active microgels illustrate the potential of self-assembling materials for enzyme immobilization and recycling, and for biological flow-chemistry. These design principles can be adopted to create countless other bioactive amyloid-based materials with diverse functions.
Pan, Li; Iliuk, Anton; Yu, Shuai; Geahlen, Robert L.; Tao, W. Andy
2012-01-01
We report here for the first time the multiplexed quantitation of phosphorylation and protein expression based on a functionalized soluble nanopolymer. The soluble nanopolymer, pIMAGO, is functionalized with Ti (IV) ions for chelating phosphoproteins in high specificity, and with infrared fluorescent tags for direct, multiplexed assays. The nanopolymer allows for direct competition for epitopes on proteins of interest, thus facilitating simultaneous detection of phosphorylation by pIMAGO and total protein amount by protein antibody in the same well of microplates. The new strategy has a great potential to measure cell signaling events by clearly distinguishing actual phosphorylation signals from protein expression changes, thus providing a powerful tool to accurately profile cellular signal transduction in healthy and disease cells. We anticipate broad applications of this new strategy in monitoring cellular signaling pathways and discovering new signaling events. PMID:23088311
Protein-based hydrogels for tissue engineering
Schloss, Ashley C.; Williams, Danielle M.; Regan, Lynne J.
2017-01-01
The tunable mechanical and structural properties of protein-based hydrogels make them excellent scaffolds for tissue engineering and repair. Moreover, using protein-based components provides the option to insert sequences associated with promoting both cellular adhesion to the substrate and overall cell growth. Protein-based hydrogel components are appealing for their structural designability, specific biological functionality, and stimuli-responsiveness. Here we present highlights in the field of protein-based hydrogels for tissue engineering applications including design requirements, components, and gel types. PMID:27677513
Park, Hahnbeom; Lee, Gyu Rie; Heo, Lim; Seok, Chaok
2014-01-01
Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.
Cheng, Yiming; Perocchi, Fabiana
2015-07-01
ProtPhylo is a web-based tool to identify proteins that are functionally linked to either a phenotype or a protein of interest based on co-evolution. ProtPhylo infers functional associations by comparing protein phylogenetic profiles (co-occurrence patterns of orthology relationships) for more than 9.7 million non-redundant protein sequences from all three domains of life. Users can query any of 2048 fully sequenced organisms, including 1678 bacteria, 255 eukaryotes and 115 archaea. In addition, they can tailor ProtPhylo to a particular kind of biological question by choosing among four main orthology inference methods based either on pair-wise sequence comparisons (One-way Best Hits and Best Reciprocal Hits) or clustering of orthologous proteins across multiple species (OrthoMCL and eggNOG). Next, ProtPhylo ranks phylogenetic neighbors of query proteins or phenotypic properties using the Hamming distance as a measure of similarity between pairs of phylogenetic profiles. Candidate hits can be easily and flexibly prioritized by complementary clues on subcellular localization, known protein-protein interactions, membrane spanning regions and protein domains. The resulting protein list can be quickly exported into a csv text file for further analyses. ProtPhylo is freely available at http://www.protphylo.org. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
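The ranking step described in this abstract can be sketched as follows. This is a minimal illustration, not ProtPhylo's actual code: it assumes each phylogenetic profile is a binary presence/absence string over the same ordered set of organisms, and the names `hamming_distance`, `rank_neighbors`, and the toy profiles are hypothetical.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same organisms")
    return sum(x != y for x, y in zip(a, b))


def rank_neighbors(query: str, profiles: Dict[str, str]) -> List[Tuple[str, int]]:
    """Rank all other proteins by Hamming distance to the query profile.

    Smaller distance means more similar co-occurrence patterns; ties are
    broken alphabetically so the ordering is deterministic.
    """
    q = profiles[query]
    scored = [
        (name, hamming_distance(q, p))
        for name, p in profiles.items()
        if name != query
    ]
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Hypothetical toy data: '1' = ortholog present in that organism, '0' = absent.
profiles = {
    "protA": "110101",
    "protB": "110100",
    "protC": "001010",
    "protD": "110111",
}
ranking = rank_neighbors("protA", profiles)
```

Here `protB` and `protD` each differ from `protA` in one organism and rank first, while `protC`, with the complementary profile, ranks last.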
You, Ronghui; Huang, Xiaodi; Zhu, Shanfeng
2018-06-06
As of April 2018, UniProtKB has collected more than 115 million protein sequences. Less than 0.15% of these proteins, however, have been associated with experimental GO annotations. As such, the use of automatic protein function prediction (AFP) to reduce this huge gap becomes increasingly important. Previous studies have concluded that sequence homology-based methods are highly effective in AFP. In addition, mining motif, domain, and functional information from protein sequences has been found very helpful for AFP. Alternative information sources other than sequences, such as text, may be useful for AFP as well. Instead of using the bag-of-words (BOW) representation of traditional text-based AFP, we propose a new method called DeepText2GO that relies on deep semantic text representation, together with different kinds of available protein information such as sequence homology, families, domains, and motifs, to improve large-scale AFP. Furthermore, DeepText2GO integrates text-based methods with sequence-based ones by means of a consensus approach. Extensive experiments on the benchmark dataset extracted from UniProt/SwissProt have demonstrated that DeepText2GO significantly outperformed both text-based and sequence-based methods, validating its superiority. Copyright © 2018 Elsevier Inc. All rights reserved.
SIFTER search: a web server for accurate phylogeny-based protein function prediction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sahraeian, Sayed M.; Luo, Kevin R.; Brenner, Steven E.
2015-05-15
We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. Lastly, the SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.
Monitoring the function of membrane transport proteins in detergent-solubilized form
Quick, Matthias; Javitch, Jonathan A.
2007-01-01
Transport proteins constitute ≈10% of most proteomes and play vital roles in the translocation of solutes across membranes of all organisms. Their (dys)function is implicated in many disorders, making them frequent targets for pharmacotherapy. The identification of substrates for members of this large protein family, still replete with many orphans of unknown function, has proven difficult, in part because high-throughput screening is greatly complicated by endogenous transporters present in many expression systems. In addition, direct structural studies require that transporters be extracted from the membrane with detergent, thereby precluding transport measurements because of the lack of a vectorial environment and necessitating reconstitution into proteoliposomes for activity measurements. Here, we describe a direct scintillation proximity-based radioligand-binding assay for determining transport protein function in crude cell extracts and in purified form. This rapid and universally applicable assay with advantages over cell-based platforms will greatly facilitate the identification of substrates for many orphan transporters and allows monitoring the function of transport proteins in a nonmembranous environment. PMID:17360689
Pan, Joshua; Meyers, Robin M; Michel, Brittany C; Mashtalir, Nazar; Sizemore, Ann E; Wells, Jonathan N; Cassel, Seth H; Vazquez, Francisca; Weir, Barbara A; Hahn, William C; Marsh, Joseph A; Tsherniak, Aviad; Kadoch, Cigall
2018-05-23
Protein complexes are assemblies of subunits that have co-evolved to execute one or many coordinated functions in the cellular environment. Functional annotation of mammalian protein complexes is critical to understanding biological processes, as well as disease mechanisms. Here, we used genetic co-essentiality derived from genome-scale RNAi- and CRISPR-Cas9-based fitness screens performed across hundreds of human cancer cell lines to assign measures of functional similarity. From these measures, we systematically built and characterized functional similarity networks that recapitulate known structural and functional features of well-studied protein complexes and resolve novel functional modules within complexes lacking structural resolution, such as the mammalian SWI/SNF complex. Finally, by integrating functional networks with large protein-protein interaction networks, we discovered novel protein complexes involving recently evolved genes of unknown function. Taken together, these findings demonstrate the utility of genetic perturbation screens alone, and in combination with large-scale biophysical data, to enhance our understanding of mammalian protein complexes in normal and disease states. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Giron, Maria D.; Salto, Rafael
2011-01-01
Structure-function relationship studies in proteins are essential in modern Cell Biology. Laboratory exercises that allow students to familiarize themselves with basic mutagenesis techniques are essential in all Genetic Engineering courses to teach the relevance of protein structure. We have implemented a laboratory course based on the…
Phylogeny-Based Systematization of Arabidopsis Proteins with Histone H1 Globular Domain
Knizewski, Lukasz; Schmidt, Anja; Ginalski, Krzysztof
2017-01-01
H1 (or linker) histones are basic nuclear proteins that possess an evolutionarily conserved nucleosome-binding globular domain, GH1. They perform critical functions in determining the accessibility of chromatin DNA to trans-acting factors. In most metazoan species studied so far, linker histones are highly heterogenous, with numerous nonallelic variants cooccurring in the same cells. The phylogenetic relationships among these variants as well as their structural and functional properties have been relatively well established. This contrasts markedly with the rather limited knowledge concerning the phylogeny and structural and functional roles of an unusually diverse group of GH1-containing proteins in plants. The dearth of information and the lack of a coherent phylogeny-based nomenclature of these proteins can lead to misunderstandings regarding their identity and possible relationships, thereby hampering plant chromatin research. Based on published data and our in silico and high-throughput analyses, we propose a systematization and coherent nomenclature of GH1-containing proteins of Arabidopsis (Arabidopsis thaliana [L.] Heynh) that will be useful for both the identification and structural and functional characterization of homologous proteins from other plant species. PMID:28298478
Phylogeny-Based Systematization of Arabidopsis Proteins with Histone H1 Globular Domain.
Kotliński, Maciej; Knizewski, Lukasz; Muszewska, Anna; Rutowicz, Kinga; Lirski, Maciej; Schmidt, Anja; Baroux, Célia; Ginalski, Krzysztof; Jerzmanowski, Andrzej
2017-05-01
H1 (or linker) histones are basic nuclear proteins that possess an evolutionarily conserved nucleosome-binding globular domain, GH1. They perform critical functions in determining the accessibility of chromatin DNA to trans-acting factors. In most metazoan species studied so far, linker histones are highly heterogenous, with numerous nonallelic variants cooccurring in the same cells. The phylogenetic relationships among these variants as well as their structural and functional properties have been relatively well established. This contrasts markedly with the rather limited knowledge concerning the phylogeny and structural and functional roles of an unusually diverse group of GH1-containing proteins in plants. The dearth of information and the lack of a coherent phylogeny-based nomenclature of these proteins can lead to misunderstandings regarding their identity and possible relationships, thereby hampering plant chromatin research. Based on published data and our in silico and high-throughput analyses, we propose a systematization and coherent nomenclature of GH1-containing proteins of Arabidopsis (Arabidopsis thaliana [L.] Heynh) that will be useful for both the identification and structural and functional characterization of homologous proteins from other plant species. © 2017 American Society of Plant Biologists. All Rights Reserved.
Proteomics-Based Analysis of Protein Complexes in Pluripotent Stem Cells and Cancer Biology.
Sudhir, Putty-Reddy; Chen, Chung-Hsuan
2016-03-22
A protein complex consists of two or more proteins that are linked together through protein-protein interactions. The proteins show stable/transient and direct/indirect interactions within a protein complex or between protein complexes. Protein complexes are involved in the regulation of most cellular processes and molecular functions. The delineation of protein complexes is important to expand our knowledge of proteins' functional roles in physiological and pathological conditions. The genetic yeast-2-hybrid method has been extensively used to characterize protein-protein interactions. Alternatively, a biochemical approach, affinity purification coupled with mass spectrometry (AP-MS), has been widely used to characterize protein complexes. In the AP-MS method, a protein complex of a target protein of interest is purified using a specific antibody or an affinity tag (e.g., DYKDDDDK peptide (FLAG) and polyhistidine (His)) and is subsequently analyzed by means of MS. Tandem affinity purification, a two-step purification system, coupled with MS has been widely used, mainly to reduce contaminants. We review here a general principle for AP-MS-based characterization of protein complexes and we explore several protein complexes identified in pluripotent stem cell biology and cancer biology as examples.
2013-01-01
Background: SNPs&GO is a method for the prediction of deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM) and for a given protein, its input comprises: the sequence and/or its three-dimensional structure (when available), a set of target variations and its functional Gene Ontology (GO) terms. The output of the server provides, for each protein variation, the probabilities to be associated to human diseases. Results: The server consists of two main components, including updated versions of the sequence-based SNPs&GO (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d programs. Sequence and structure based algorithms are extensively tested on a large set of annotated variations extracted from the SwissVar database. Selecting a balanced dataset with more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, 0.61 correlation coefficient and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. For the subset of ~6,600 variations mapped on protein structures available at the Protein Data Bank (PDB), the structure-based method scores with 84% overall accuracy, 0.68 correlation coefficient, and 0.91 AUC. When tested on a new blind set of variations, the results of the server are 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively. Conclusions: WS-SNPs&GO is a valuable tool that includes in a unique framework information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go. PMID:23819482
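For readers unfamiliar with the performance measures quoted in this abstract, the sketch below computes overall accuracy and a correlation coefficient from a binary confusion matrix. It assumes the "correlation coefficient" is the Matthews correlation coefficient, which the abstract does not state explicitly, and the counts used are invented for illustration rather than taken from the SwissVar benchmark.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all variations classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a 2x2 confusion matrix.

    Returns 0.0 when any marginal total is zero (coefficient undefined).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical balanced test set: 100 disease-related, 100 neutral variations.
tp, tn, fp, fn = 80, 82, 18, 20
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
```

On a balanced set such as this one, an accuracy near 0.81 goes together with a correlation coefficient near 0.62, the same order as the figures reported above.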
Protein Bricks: 2D and 3D Bio-Nanostructures with Shape and Function on Demand.
Jiang, Jianjuan; Zhang, Shaoqing; Qian, Zhigang; Qin, Nan; Song, Wenwen; Sun, Long; Zhou, Zhitao; Shi, Zhifeng; Chen, Liang; Li, Xinxin; Mao, Ying; Kaplan, David L; Gilbert Corder, Stephanie N; Chen, Xinzhong; Liu, Mengkun; Omenetto, Fiorenzo G; Xia, Xiaoxia; Tao, Tiger H
2018-05-01
Precise patterning of polymer-based biomaterials for functional bio-nanostructures has extensive applications including biosensing, tissue engineering, and regenerative medicine. Remarkable progress has been made in both top-down (based on lithographic methods) and bottom-up (via self-assembly) approaches with natural and synthetic biopolymers. However, most methods only yield 2D and pseudo-3D structures with restricted geometries and functionalities. Here, precise nanostructuring of genetically engineered spider silk is reported, achieved by accurately directing ion and electron beam interactions with the protein's matrix at the nanoscale to create well-defined 2D bionanopatterns and further assemble 3D bionanoarchitectures with shape and function on demand, termed "Protein Bricks." The added control over protein sequence and molecular weight of recombinant spider silk via genetic engineering provides unprecedented lithographic resolution (approaching the molecular limit), sharpness, and biological functions compared to natural proteins. This approach provides a facile method for patterning and immobilizing functional molecules within nanoscopic, hierarchical protein structures, which sheds light on a wide range of biomedical applications such as structure-enhanced fluorescence and biomimetic microenvironments for controlling cell fate. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Astrophysics Data System (ADS)
Xu, Xianjin; Yan, Chengfei; Zou, Xiaoqin
2017-08-01
The growing number of protein-ligand complex structures in the Protein Data Bank, particularly the structures of proteins co-bound with different ligands, helps us tackle two major challenges in molecular docking studies: protein flexibility and the scoring function. Here, we introduce a systematic strategy that uses the information embedded in known protein-ligand complex structures to improve both binding mode and binding affinity predictions. Specifically, a ligand similarity calculation method was employed to search for a receptor structure whose bound ligand is highly similar to the query ligand, and that structure was then used for docking. The strategy was applied to the two datasets (HSP90 and MAP4K4) in the recent D3R Grand Challenge 2015. In addition, for the HSP90 dataset, a system-specific scoring function (ITScore2_hsp90) was generated by recalibrating our statistical potential-based scoring function (ITScore2) using the known protein-ligand complex structures and the statistical mechanics-based iterative method. For the HSP90 dataset, better performance was achieved for both binding mode and binding affinity predictions compared with the original ITScore2 and with ensemble docking. For the MAP4K4 dataset, although there were only eight known protein-ligand complex structures, our docking strategy achieved performance comparable to ensemble docking. Our method for receptor conformational selection and our iterative method for developing system-specific statistical potential-based scoring functions can be easily applied to other protein targets that have a number of protein-ligand complex structures available to improve predictions on binding.
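The receptor-selection idea in this abstract can be illustrated with a small sketch. The abstract does not say which ligand similarity measure the authors used, so the Tanimoto coefficient over feature sets is an assumption here, and the function names, complex identifiers, and feature indices are all hypothetical.

```python
from typing import Dict, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two ligand feature sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def select_receptor(query: Set[int], known: Dict[str, Set[int]]) -> Tuple[str, float]:
    """Pick the known complex whose bound ligand is most similar to the query.

    `known` maps a complex identifier to the feature set of its bound ligand.
    Ties are broken by identifier so the choice is deterministic.
    """
    best_id = min(known, key=lambda cid: (-tanimoto(query, known[cid]), cid))
    return best_id, tanimoto(query, known[best_id])


# Hypothetical ligand feature sets (e.g. indices of set fingerprint bits).
query_ligand = {1, 2, 3, 4}
known_complexes = {
    "complex_1": {1, 2, 3, 5},
    "complex_2": {1, 6},
    "complex_3": {1, 2, 3, 4, 7},
}
chosen, score = select_receptor(query_ligand, known_complexes)
```

The receptor conformation from the chosen complex would then serve as the docking target for the query ligand.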
Activity-based protein profiling: from enzyme chemistry to proteomic chemistry.
Cravatt, Benjamin F; Wright, Aaron T; Kozarich, John W
2008-01-01
Genome sequencing projects have provided researchers with a complete inventory of the predicted proteins produced by eukaryotic and prokaryotic organisms. Assignment of functions to these proteins represents one of the principal challenges for the field of proteomics. Activity-based protein profiling (ABPP) has emerged as a powerful chemical proteomic strategy to characterize enzyme function directly in native biological systems on a global scale. Here, we review the basic technology of ABPP, the enzyme classes addressable by this method, and the biological discoveries attributable to its application.
Proteomics-Based Analysis of Protein Complexes in Pluripotent Stem Cells and Cancer Biology
Sudhir, Putty-Reddy; Chen, Chung-Hsuan
2016-01-01
A protein complex consists of two or more proteins that are linked together through protein–protein interactions. The proteins show stable/transient and direct/indirect interactions within a protein complex or between protein complexes. Protein complexes are involved in the regulation of most cellular processes and molecular functions. The delineation of protein complexes is important to expand our knowledge of proteins' functional roles in physiological and pathological conditions. The genetic yeast-2-hybrid method has been extensively used to characterize protein-protein interactions. Alternatively, a biochemical approach, affinity purification coupled with mass spectrometry (AP-MS), has been widely used to characterize protein complexes. In the AP-MS method, a protein complex of a target protein of interest is purified using a specific antibody or an affinity tag (e.g., DYKDDDDK peptide (FLAG) and polyhistidine (His)) and is subsequently analyzed by means of MS. Tandem affinity purification, a two-step purification system, coupled with MS has been widely used, mainly to reduce contaminants. We review here a general principle for AP-MS-based characterization of protein complexes and we explore several protein complexes identified in pluripotent stem cell biology and cancer biology as examples. PMID:27011181
Popescu, Sorina C.; Popescu, George V.; Bachan, Shawn; Zhang, Zimei; Seay, Montrell; Gerstein, Mark; Snyder, Michael; Dinesh-Kumar, S. P.
2007-01-01
Calmodulins (CaMs) are the most ubiquitous calcium sensors in eukaryotes. A number of CaM-binding proteins have been identified through classical methods, and many proteins have been predicted to bind CaMs based on their structural homology with known targets. However, multicellular organisms typically contain many CaM-like (CML) proteins, and a global identification of their targets and specificity of interaction is lacking. In an effort to develop a platform for large-scale analysis of proteins in plants we have developed a protein microarray and used it to study the global analysis of CaM/CML interactions. An Arabidopsis thaliana expression collection containing 1,133 ORFs was generated and used to produce proteins with an optimized medium-throughput plant-based expression system. Protein microarrays were prepared and screened with several CaMs/CMLs. A large number of previously known and novel CaM/CML targets were identified, including transcription factors, receptor and intracellular protein kinases, F-box proteins, RNA-binding proteins, and proteins of unknown function. Multiple CaM/CML proteins bound many binding partners, but the majority of targets were specific to one or a few CaMs/CMLs indicating that different CaM family members function through different targets. Based on our analyses, the emergent CaM/CML interactome is more extensive than previously predicted. Our results suggest that calcium functions through distinct CaM/CML proteins to regulate a wide range of targets and cellular activities. PMID:17360592
Towards fully automated structure-based function prediction in structural genomics: a case study.
Watson, James D; Sanderson, Steve; Ezersky, Alexandra; Savchenko, Alexei; Edwards, Aled; Orengo, Christine; Joachimiak, Andrzej; Laskowski, Roman A; Thornton, Janet M
2007-04-13
As the global Structural Genomics projects have picked up pace, the number of structures annotated in the Protein Data Bank as hypothetical protein or unknown function has grown significantly. A major challenge now involves the development of computational methods to assign functions to these proteins accurately and automatically. As part of the Midwest Center for Structural Genomics (MCSG) we have developed a fully automated functional analysis server, ProFunc, which performs a battery of analyses on a submitted structure. The analyses combine a number of sequence-based and structure-based methods to identify functional clues. After the first stage of the Protein Structure Initiative (PSI), we review the success of the pipeline and the importance of structure-based function prediction. As a dataset, we have chosen all structures solved by the MCSG during the 5 years of the first PSI. Our analysis suggests that two of the structure-based methods are particularly successful and provide examples of local similarity that is difficult to identify using current sequence-based methods. No one method is successful in all cases, so, through the use of a number of complementary sequence and structural approaches, the ProFunc server increases the chances that at least one method will find a significant hit that can help elucidate function. Manual assessment of the results is a time-consuming process and subject to individual interpretation and human error. We present a method based on the Gene Ontology (GO) schema using GO-slims that can allow the automated assessment of hits with a success rate approaching that of expert manual assessment.
A large-scale evaluation of computational protein function prediction
Radivojac, Predrag; Clark, Wyatt T; Ronnen Oron, Tal; Schnoes, Alexandra M; Wittkop, Tobias; Sokolov, Artem; Graim, Kiley; Funk, Christopher; Verspoor, Karin; Ben-Hur, Asa; Pandey, Gaurav; Yunes, Jeffrey M; Talwalkar, Ameet S; Repo, Susanna; Souza, Michael L; Piovesan, Damiano; Casadio, Rita; Wang, Zheng; Cheng, Jianlin; Fang, Hai; Gough, Julian; Koskinen, Patrik; Törönen, Petri; Nokso-Koivisto, Jussi; Holm, Liisa; Cozzetto, Domenico; Buchan, Daniel W A; Bryson, Kevin; Jones, David T; Limaye, Bhakti; Inamdar, Harshal; Datta, Avik; Manjari, Sunitha K; Joshi, Rajendra; Chitale, Meghana; Kihara, Daisuke; Lisewski, Andreas M; Erdin, Serkan; Venner, Eric; Lichtarge, Olivier; Rentzsch, Robert; Yang, Haixuan; Romero, Alfonso E; Bhat, Prajwal; Paccanaro, Alberto; Hamp, Tobias; Kassner, Rebecca; Seemayer, Stefan; Vicedo, Esmeralda; Schaefer, Christian; Achten, Dominik; Auer, Florian; Böhm, Ariane; Braun, Tatjana; Hecht, Maximilian; Heron, Mark; Hönigschmid, Peter; Hopf, Thomas; Kaufmann, Stefanie; Kiening, Michael; Krompass, Denis; Landerer, Cedric; Mahlich, Yannick; Roos, Manfred; Björne, Jari; Salakoski, Tapio; Wong, Andrew; Shatkay, Hagit; Gatzmann, Fanny; Sommer, Ingolf; Wass, Mark N; Sternberg, Michael J E; Škunca, Nives; Supek, Fran; Bošnjak, Matko; Panov, Panče; Džeroski, Sašo; Šmuc, Tomislav; Kourmpetis, Yiannis A I; van Dijk, Aalt D J; ter Braak, Cajo J F; Zhou, Yuanpeng; Gong, Qingtian; Dong, Xinran; Tian, Weidong; Falda, Marco; Fontana, Paolo; Lavezzo, Enrico; Di Camillo, Barbara; Toppo, Stefano; Lan, Liang; Djuric, Nemanja; Guo, Yuhong; Vucetic, Slobodan; Bairoch, Amos; Linial, Michal; Babbitt, Patricia C; Brenner, Steven E; Orengo, Christine; Rost, Burkhard; Mooney, Sean D; Friedberg, Iddo
2013-01-01
Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based Critical Assessment of protein Function Annotation (CAFA) experiment. Fifty-four methods representing the state-of-the-art for protein function prediction were evaluated on a target set of 866 proteins from eleven organisms. Two findings stand out: (i) today’s best protein function prediction algorithms significantly outperformed widely-used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is significant need for improvement of currently available tools. PMID:23353650
Improving protein complex classification accuracy using amino acid composition profile.
Huang, Chien-Hung; Chou, Szu-Yu; Ng, Ka-Lok
2013-09-01
Protein complex prediction approaches are based on the assumptions that complexes have dense protein-protein interactions and high functional similarity between their subunits. We investigated those assumptions by studying the subunits' interaction topology, sequence similarity and molecular function for human and yeast protein complexes. Inclusion of amino acids' physicochemical properties can provide better understanding of protein complex properties. Principal component analysis is carried out to determine the major features. Adopting amino acid composition profile information with the SVM classifier serves as an effective post-processing step for complexes classification. Improvement is based on primary sequence information only, which is easy to obtain. Copyright © 2013 Elsevier Ltd. All rights reserved.
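An amino acid composition profile of the kind mentioned in this abstract can be computed from primary sequence alone. The sketch below is illustrative only: averaging subunit profiles to describe a whole complex is an assumption, not necessarily the authors' feature definition, and the function names and sequences are hypothetical.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids, in a fixed order that defines the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence: str) -> List[float]:
    """Return the fraction of each standard amino acid in a sequence."""
    counts = Counter(c for c in sequence.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def complex_profile(subunit_sequences: List[str]) -> List[float]:
    """Average the composition vectors of all subunits in a complex."""
    vectors = [composition(s) for s in subunit_sequences]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


# Hypothetical two-subunit complex with very short toy sequences.
profile = complex_profile(["AAC", "CC"])
```

The resulting 20-dimensional vector could then be passed to a classifier such as an SVM as a post-processing feature.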
Gold, Nicola D; Jackson, Richard M
2006-02-03
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that, given a known protein-ligand binding site, allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.
Ruan, Peiying; Hayashida, Morihiro; Maruyama, Osamu; Akutsu, Tatsuya
2013-01-01
Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many prediction methods for protein complexes from protein-protein interactions have been developed such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt with only complexes with size of more than three because the methods often are based on some density of subgraphs. However, heterodimeric protein complexes that consist of two distinct proteins occupy a large part according to several comprehensive databases of known complexes. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on reliability. Furthermore, we make use of prior knowledge on protein domains to develop feature space mappings, domain composition kernel and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. These results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes. PMID:23776458
Wang, Bing; Westerhoff, Lance M.; Merz, Kenneth M.
2008-01-01
We have generated docking poses for the FKBP-GPI complex using eight docking programs, and compared their scoring functions with scoring based on NMR chemical shift perturbations (NMRScore). Because the chemical shift perturbation (CSP) is exquisitely sensitive to the orientation of the ligand inside the binding pocket, NMRScore offers an accurate and straightforward approach to score different poses. All scoring functions were assessed by their ability to rank native-like structures highly and to separate them from decoy poses generated for a protein-ligand complex. The overall performance of NMRScore is much better than that of energy-based scoring functions associated with docking programs in both aspects. In summary, we find that the combination of docking programs with NMRScore results in an approach that can robustly determine the binding site structure for a protein-ligand complex, thereby providing a new tool facilitating the structure-based drug discovery process. PMID:17867664
NASA Astrophysics Data System (ADS)
Palla, Gergely; Derenyi, Imre; Farkas, Illes J.; Vicsek, Tamas
2006-03-01
Most tasks in a cell are performed not by individual proteins, but by functional groups of proteins (either physically interacting with each other or associated in other ways). In gene (protein) association networks these groups show up as sets of densely connected nodes. In the yeast Saccharomyces cerevisiae, known physically interacting groups of proteins (called protein complexes) strongly overlap: the total number of proteins contained in these complexes is far smaller than the sum of their sizes (2750 vs. 8932). Thus, most functional groups of proteins, both physically interacting and other, are likely to share many of their members with other groups. However, current algorithms searching for dense groups of nodes in networks usually exclude overlaps. With the aim of discovering both novel functions of individual proteins and novel protein functional groups, we combine in protein association networks (i) a search for overlapping dense subgraphs based on the Clique Percolation Method (CPM) (Palla, G., et al. Nature 435, 814-818 (2005), http://angel.elte.hu/clustering), which explicitly allows for overlaps among the groups, and (ii) a verification and characterization of the identified groups of nodes (proteins) with the help of standard annotation databases listing known functions.
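The clique-percolation idea in the abstract above can be illustrated with a minimal sketch. It assumes the usual formulation of CPM (two k-cliques are adjacent when they share k-1 nodes, and a community is the union of a chain of adjacent k-cliques). The function name, the toy edge list, and the brute-force clique enumeration are illustrative assumptions and are not taken from the authors' software.

```python
from itertools import combinations

def k_clique_communities(edges, k):
    """Return overlapping communities (node sets) by clique percolation.

    Brute-force sketch for toy graphs only: every k-subset of nodes is tested.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = sorted(adj)
    # All k-cliques: k-subsets in which every pair of nodes is connected.
    cliques = [frozenset(c) for c in combinations(nodes, k)
               if all(b in adj[a] for a, b in combinations(c, 2))]
    # Union-find over cliques; cliques sharing k-1 nodes are merged.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(groups.values(), key=sorted)

# Hypothetical toy network: two triangles sharing an edge, plus a third
# triangle that touches them only at node "C".
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "F"), ("C", "F"),
         ("C", "D"), ("C", "E"), ("D", "E")]
communities = k_clique_communities(EDGES, 3)
summed_sizes = sum(len(c) for c in communities)
distinct_members = len(set().union(*communities))
```

In this toy network node "C" belongs to both communities, so the summed community sizes exceed the number of distinct members, which is the same kind of gap the abstract reports for yeast complexes (2750 vs. 8932).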
Baqader, Noor O.; Radulovic, Marko; Crawford, Mark; Stoeber, Kai; Godovac-Zimmermann, Jasminka
2014-01-01
We have used a subcellular spatial razor approach based on LC–MS/MS-based proteomics with SILAC isotope labeling to determine changes in protein abundances in the nuclear and cytoplasmic compartments of human IMR90 fibroblasts subjected to mild oxidative stress. We show that response to mild tert-butyl hydrogen peroxide treatment includes redistribution between the nucleus and cytoplasm of numerous proteins not previously associated with oxidative stress. The 121 proteins with the most significant changes encompass proteins with known functions in a wide variety of subcellular locations and of cellular functional processes (transcription, signal transduction, autophagy, iron metabolism, TCA cycle, ATP synthesis) and are consistent with functional networks that are spatially dispersed across the cell. Both nuclear respiratory factor 2 and the proline regulatory axis appear to contribute to the cellular metabolic response. Proteins involved in iron metabolism or with iron/heme as a cofactor as well as mitochondrial proteins are prominent in the response. Evidence suggesting that nuclear import/export and vesicle-mediated protein transport contribute to the cellular response was obtained. We suggest that measurements of global changes in total cellular protein abundances need to be complemented with measurements of the dynamic subcellular spatial redistribution of proteins to obtain comprehensive pictures of cellular function. PMID:25133973
Engineering cholesterol-based fibers for antibody immobilization and cell capture
NASA Astrophysics Data System (ADS)
Cohn, Celine
In 2015, the United States is expected to have nearly 600,000 deaths attributed to cancer. Of these 600,000 deaths, 90% will be a direct result of cancer metastasis, the spread of cancer throughout the body. During cancer metastasis, circulating tumor cells (CTCs) are shed from primary tumors and migrate through bodily fluids, establishing secondary cancer sites. As cancer metastasis is incredibly lethal, there is a growing emphasis on developing "liquid biopsies" that can screen peripheral blood, search for and identify CTCs. One popular method for capturing CTCs is the use of a detection platform with antibodies specifically suited to recognize and capture cancer cells. These antibodies are immobilized onto the platform and can then bind and capture cells of interest. However, current means to immobilize antibodies often leave them with drastically reduced function. The antibodies are left poorly suited for cell capture, resulting in low cell capture efficiencies. This body of work investigates the use of lipid-based fibers to immobilize proteins in a way that retains protein function, ultimately leading to increased cell capture efficiencies. The resulting increased efficiencies are thought to arise from the retained three-dimensional structure of the protein as well as having a complete coating of the material surface with antibodies that are capable of interacting with their antigens. It is possible to electrospin cholesterol-based fibers that are similar in design to the natural cell membrane, providing proteins a more natural setting during immobilization. Such fibers have been produced from cholesterol-based cholesteryl succinyl silane (CSS). These fibers have previously illustrated a keen aptitude for retaining protein function and increasing cell capture. Herein the work focuses on three key concepts. First, a model is developed to understand the immobilization mechanism used by electrospun CSS fibers. 
The antibody immobilization and cell capturing abilities of the CSS fibers were compared to that of hydrophobic polycaprolactone (PCL) fibers and hydrophilic plasma-treated PCL fibers. Electrospun CSS fibers were found to immobilize equivalent amounts of protein as hydrophobically immobilized proteins. However, these proteins captured 6 times more cells, indicative of retained protein function. The second key concept was the design and fabrication of a hybridized lipid fiber. Lipid fibers provide improved protein function but fabrication difficulties have limited their adoption. Thus, we sought to fabricate a lipid-polymer hybrid that is easily fabricated while maintaining protein function. The hybrid fiber consists of a PCL backbone with conjugated CSS. The hybrid lipid fibers showed improved protein function. In addition, higher lipid concentrations were directly correlated to higher cell capture efficiencies. The third key concept was on the development of dually functionalized lipid fibers and understanding the resulting cell capture efficiencies. Many platforms are unable to simultaneously search for heterogeneous populations of CTCs -- the ability to dually functionalize cell-capturing platforms would address this technological weakness. Studies indicated that dually functionalizing the lipid fibers did not compromise the platforms' abilities to capture the cells of interest. Such dually functionalized fibers allow for a single cell-capture platform to successfully detect heterogeneous populations of CTCs. The body of work encompassed herein describes the use of lipid fibers for antibody immobilization and cell capture. Data from various projects indicate that the use of cholesterol-based fibers produced from electrospun CSS are well suited for protein immobilization. The CSS fibers are able to immobilize equivalent amounts of protein as compared to other immobilization techniques. 
However, the benefit of these fibers is illustrated by the strong cell-capturing efficiencies, indicating that the immobilized proteins are able to retain their function and selectively target cells of interest. The successful immobilization of proteins and their retained function allows for the development of increasingly sensitive cancer diagnostic tools that are able to screen for CTCs early on in the cancer disease cycle.
Gouran, Hossein; Chakraborty, Sandeep; Rao, Basuthkar J.; Asgeirsson, Bjarni; Dandekar, Abhaya
2014-01-01
Duplication of genes is one of the preferred ways for natural selection to add advantageous functionality to the genome without having to reinvent the wheel with respect to catalytic efficiency and protein stability. The duplicated secretory virulence factors of Xylella fastidiosa (LesA, LesB and LesC), implicated in Pierce's disease of grape and citrus variegated chlorosis of citrus species, epitomize the positive selection pressures exerted on advantageous genes in such pathogens. A deeper insight into the evolution of these lipases/esterases is essential to develop resistance mechanisms in transgenic plants. Directed evolution, an attempt to accelerate the evolutionary steps in the laboratory, is inherently simple when targeted for loss of function. A bigger challenge is to specify mutations that endow a new function, such as a lost functionality in a duplicated gene. Previously, we have proposed a method for enumerating candidates for mutations intended to transfer the functionality of one protein into another related protein based on the spatial and electrostatic properties of the active site residues (DECAAF). In the current work, we present in vivo validation of DECAAF by inducing tributyrin hydrolysis in LesB based on the active site similarity to LesA. The structures of these proteins have been modeled using RaptorX based on the closely related LipA protein from Xanthomonas oryzae. These mutations replicate the spatial and electrostatic conformation of LesA in the modeled structure of the mutant LesB as well, providing in silico validation before proceeding to the laborious in vivo work. Such focused mutations allow one to dissect the relevance of the duplicated genes in finer detail as compared to gene knockouts, since they do not interfere with other moonlighting functions, protein expression levels or protein-protein interaction. PMID:25717364
Proteome-wide Prediction of Self-interacting Proteins Based on Multiple Properties
Liu, Zhongyang; Guo, Feifei; Zhang, Jiyang; Wang, Jian; Lu, Liang; Li, Dong; He, Fuchu
2013-01-01
Self-interacting proteins, whose two or more copies can interact with each other, play important roles in cellular functions and the evolution of protein interaction networks (PINs). Knowing whether a protein can self-interact can contribute to and sometimes is crucial for the elucidation of its functions. Previous related research has mainly focused on the structures and functions of specific self-interacting proteins, whereas knowledge on their overall properties is limited. Meanwhile, the two current most common high throughput protein interaction assays have limited ability to detect self-interactions because of biological artifacts and design limitations, whereas the bioinformatic prediction method of self-interacting proteins is lacking. This study aims to systematically study and predict self-interacting proteins from an overall perspective. We find that compared with other proteins the self-interacting proteins in the structural aspect contain more domains; in the evolutionary aspect they tend to be conserved and ancient; in the functional aspect they are significantly enriched with enzyme genes, housekeeping genes, and drug targets, and in the topological aspect tend to occupy important positions in PINs. Furthermore, based on these features, after feature selection, we use logistic regression to integrate six representative features, including Gene Ontology term, domain, paralogous interactor, enzyme, model organism self-interacting protein, and betweenness centrality in the PIN, to develop a proteome-wide prediction model of self-interacting proteins. Using 5-fold cross-validation and an independent test, this model shows good performance. Finally, the prediction model is developed into a user-friendly web service SLIPPER (SeLf-Interacting Protein PrEdictoR). Users may submit a list of proteins, and then SLIPPER will return the probability_scores measuring their possibility to be self-interacting proteins and various related annotation information. 
This work helps us understand the role self-interacting proteins play in cellular functions from an overall perspective, and the constructed prediction model may contribute to the high throughput finding of self-interacting proteins and provide clues for elucidating their functions. PMID:23422585
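As a rough illustration of how a logistic regression model turns the six features named in the abstract above into a probability, the sketch below applies the standard logistic function to a weighted sum. The feature keys, the weights, and the intercept are hypothetical placeholders chosen for illustration; they are not SLIPPER's fitted parameters, and the abstract does not state how each feature is encoded.

```python
import math

# Hypothetical coefficients, for illustration only (not SLIPPER's values).
INTERCEPT = -2.0
WEIGHTS = {
    "go_term": 0.8,
    "domain": 0.6,
    "paralogous_interactor": 0.9,
    "enzyme": 0.5,
    "model_organism_self_interactor": 1.2,
    "betweenness_centrality": 0.4,
}

def self_interaction_probability(features):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum_i w_i * x_i))).

    `features` maps feature names to numeric values; missing features count as 0.
    """
    z = INTERCEPT + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))

# A protein with no supporting evidence versus one with every feature set to 1.
baseline = self_interaction_probability({})
all_present = self_interaction_probability({name: 1.0 for name in WEIGHTS})
```

With positive weights, each additional supporting feature raises the predicted probability, which is the sense in which the model integrates heterogeneous evidence into a single score.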
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davis, Ryan W.; Brozik, James A.; Brozik, Susan Marie
2007-03-01
The introduction of functional transmembrane proteins into supported bilayer-based biomimetic systems presents a significant challenge for biophysics. Among the various methods for producing supported bilayers, liposomal fusion offers a versatile method for the introduction of membrane proteins into supported bilayers on a variety of substrates. In this study, the properties of protein-containing unilamellar phosphocholine lipid bilayers on nanoporous silica microspheres are investigated. The effects of the silica substrate, pore structure, and the substrate curvature on the stability of the membrane and the functionality of the membrane protein are determined. Supported bilayers on porous silica microspheres show a significant increase in surface area on surfaces with structures in excess of 10 nm as well as an overall decrease in stability resulting from increasing pore size and curvature. The liposomal and detergent-mediated introduction of purified bacteriorhodopsin (bR) and the human type 3 serotonin receptor (5HT3R) are compared, focusing on the resulting protein function, diffusion, orientation, and incorporation efficiency. In both cases, functional proteins are observed; however, the reconstitution efficiency and orientation selectivity are significantly enhanced through detergent-mediated protein reconstitution. The results of these experiments provide a basis for bulk ionic and fluorescent dye-based compartmentalization assays as well as single-molecule optical and single-channel electrochemical interrogation of transmembrane proteins in a biomimetic platform.
Motomura, Kenta; Nakamura, Morikazu; Otaki, Joji M.
2013-01-01
Protein structure and function information is coded in amino acid sequences. However, the relationship between primary sequences and three-dimensional structures and functions remains enigmatic. Our approach to this fundamental biochemistry problem is based on the frequencies of short constituent sequences (SCSs) or words. A protein amino acid sequence is considered analogous to an English sentence, where SCSs are equivalent to words. Availability scores, which are defined as real SCS frequencies in the non-redundant amino acid database relative to their probabilistically expected frequencies, demonstrate the biological usage bias of SCSs. As a result, this frequency-based linguistic approach is expected to have diverse applications, such as secondary structure specifications by structure-specific SCSs and immunological adjuvants with rare or non-existent SCSs. Linguistic similarities (e.g., wide ranges of scale-free distributions) and dissimilarities (e.g., behaviors of low-rank samples) between proteins and the natural English language have been revealed in the rank-frequency relationships of SCSs or words. We have developed a web server, the SCS Package, which contains five applications for analyzing protein sequences based on the linguistic concept. These tools have the potential to assist researchers in deciphering structurally and functionally important protein sites, species-specific sequences, and functional relationships between SCSs. The SCS Package also provides researchers with a tool to construct amino acid sequences de novo based on the idiomatic usage of SCSs. PMID:24688703
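The availability score described above (observed SCS frequency relative to its probabilistically expected frequency) can be sketched as follows. This is a minimal illustration that assumes the expected frequency of a word is the product of its single-residue frequencies; the abstract does not spell out the exact null model used by the SCS Package, and the function name and toy sequences are hypothetical.

```python
from collections import Counter

def availability_scores(sequences, k):
    """Observed / expected count for every length-k word in the sequences.

    Expected counts assume independent residues (an illustrative null model).
    """
    residue_counts = Counter()
    word_counts = Counter()
    for seq in sequences:
        residue_counts.update(seq)
        word_counts.update(seq[i:i + k] for i in range(len(seq) - k + 1))
    total_residues = sum(residue_counts.values())
    total_words = sum(word_counts.values())
    scores = {}
    for word, observed in word_counts.items():
        probability = 1.0
        for residue in word:
            probability *= residue_counts[residue] / total_residues
        scores[word] = observed / (probability * total_words)
    return scores

# Toy "database" of two short sequences (hypothetical data).
scores = availability_scores(["AAAA", "ACAC"], k=2)
```

A score above 1 marks a word that occurs more often than its residue composition alone would predict, and a score below 1 marks an under-used word; rare or absent words are the ones the abstract associates with potential adjuvant applications.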
McDonald, Shelley R; Porter Starr, Kathryn N; Mauceri, Luisa; Orenduff, Melissa; Granville, Esther; Ocampo, Christine; Payne, Martha E; Pieper, Carl F; Bales, Connie W
2015-01-01
Obese older adults with even modest functional limitations are at a disadvantage for maintaining their independence into late life. However, there is no established intervention for obesity in older individuals. The Measuring Eating, Activity, and Strength: Understanding the Response - Using Protein (MEASUR-UP) trial is a randomized controlled pilot study of obese women and men aged ≥60 years with mild to moderate functional impairments. Changes in body composition (lean and fat mass) and function (Short Physical Performance Battery) in an enhanced protein weight reduction (Protein) arm will be compared to those in a traditional weight loss (Control) arm. The Protein intervention is based on evidence that older adults achieve optimal rates of muscle protein synthesis when consuming about 25-30 g of high quality protein per meal; these participants will consume ~30 g of animal protein at each meal via a combination of provided protein (beef) servings and diet counseling. This trial will provide information on the feasibility and efficacy of enhancing protein quantity and quality in the context of a weight reduction regimen and determine the impact of this intervention on body weight, functional status, and lean muscle mass. We hypothesize that the enhancement of protein quantity and quality in the Protein arm will result in better outcomes for function and/or lean muscle mass than in the Control arm. Ultimately, we hope our findings will help identify a safe weight loss approach that can delay or prevent late life disability by changing the trajectory of age-associated functional impairment associated with obesity. Copyright © 2014 Elsevier Inc. All rights reserved.
Chemical Component and Proteomic Study of the Amphibalanus (= Balanus) amphitrite Shell
Zhang, Gen; He, Li-sheng; Wong, Yue-Him; Xu, Ying; Zhang, Yu; Qian, Pei-yuan
2015-01-01
As typical biofoulers, barnacles possess hard shells and cause serious biofouling problems. In this study, we analyzed the protein component of the barnacle Amphibalanus (= Balanus) amphitrite shell using gel-based proteomics. The results revealed 52 proteins in the A. amphitrite shell. Among them, 40 proteins were categorized into 11 functional groups based on the KOG database, and the remaining 12 proteins were unknown. Besides the known proteins in barnacle shell (SIPC, carbonic anhydrase and acidic acid matrix protein), we also identified chorion peroxidase, C-type lectin-like domains, serine proteases and proteinase inhibitor proteins in the A. amphitrite shell. The sequences of these proteins were characterized and their potential functions were discussed. Histology and DAPI staining revealed living cells in the shell, which might secrete the shell proteins identified in this study. PMID:26222041
Enzymatically Active Microgels from Self-Assembling Protein Nanofibrils for Microflow Chemistry
2015-01-01
Amyloid fibrils represent a generic class of protein structure associated with both pathological states and with naturally occurring functional materials. This class of protein nanostructure has recently also emerged as an excellent foundation for sophisticated functional biocompatible materials including scaffolds and carriers for biologically active molecules. Protein-based materials offer the potential advantage that additional functions can be directly incorporated via gene fusion producing a single chimeric polypeptide that will both self-assemble and display the desired activity. To succeed, a chimeric protein system must self-assemble without the need for harsh triggering conditions which would damage the appended functional protein molecule. However, the micrometer to nanoscale patterning and morphological control of protein-based nanomaterials has remained challenging. This study demonstrates a general approach for overcoming these limitations through the microfluidic generation of enzymatically active microgels that are stabilized by amyloid nanofibrils. The use of scaffolds formed from biomaterials that self-assemble under mild conditions enables the formation of catalytic microgels while maintaining the integrity of the encapsulated enzyme. The enzymatically active microgel particles show robust material properties and their porous architecture allows diffusion in and out of reactants and products. In combination with microfluidic droplet trapping approaches, enzymatically active microgels illustrate the potential of self-assembling materials for enzyme immobilization and recycling, and for biological flow-chemistry. These design principles can be adopted to create countless other bioactive amyloid-based materials with diverse functions. PMID:26030507
Evaluating Functional Annotations of Enzymes Using the Gene Ontology.
Holliday, Gemma L; Davidson, Rebecca; Akiva, Eyal; Babbitt, Patricia C
2017-01-01
The Gene Ontology (GO) (Ashburner et al., Nat Genet 25(1):25-29, 2000) is a powerful tool in the informatics arsenal of methods for evaluating annotations in a protein dataset. Identifying the nearest well-annotated homologue of a protein of interest, predicting where misannotation has occurred, and knowing how confident you can be in the annotations assigned to those proteins are all critical. In this chapter we explore what makes an enzyme unique and how we can use GO to infer aspects of protein function based on sequence similarity. These can range from identification of misannotation or other errors in a predicted function to accurate function prediction for an enzyme of entirely unknown function. Although GO annotation applies to any gene product, we focus here on describing our approach for hierarchical classification of enzymes in the Structure-Function Linkage Database (SFLD) (Akiva et al., Nucleic Acids Res 42(Database issue):D521-530, 2014) as a guide for informed utilisation of annotation transfer based on GO terms.
Multifunctional recombinant phycobiliprotein-based fluorescent constructs and phycobilisome display
Glazer, Alexander N.; Cai, Yuping
2007-01-30
The invention provides multifunctional fusion constructs which are rapidly incorporated into a macromolecular structure such as a phycobilisome such that the fusion proteins are separated from one another and unable to self-associate. The invention provides methods and compositions for displaying a functional polypeptide domain on an oligomeric phycobiliprotein, including fusion proteins comprising a functional displayed domain and a functional phycobiliprotein domain incorporated in a functional oligomeric phycobiliprotein. The fusion proteins provide novel specific labeling reagents.
Multifunctional recombinant phycobiliprotein-based fluorescent constructs and phycobilisome display
Glazer, Alexander N.; Cai, Yuping
2007-02-13
The invention provides multifunctional fusion constructs which are rapidly incorporated into a macromolecular structure such as a phycobilisome such that the fusion proteins are separated from one another and unable to self-associate. The invention provides methods and compositions for displaying a functional polypeptide domain on an oligomeric phycobiliprotein, including fusion proteins comprising a functional displayed domain and a functional phycobiliprotein domain incorporated in a functional oligomeric phycobiliprotein. The fusion proteins provide novel specific labeling reagents.
Multifunctional recombinant phycobiliprotein-based fluorescent constructs and phycobilisome display
Glazer, Alexander N.; Cai, Yuping
2003-11-18
The invention provides multifunctional fusion constructs which are rapidly incorporated into a macromolecular structure such as a phycobilisome such that the fusion proteins are separated from one another and unable to self-associate. The invention provides methods and compositions for displaying a functional polypeptide domain on an oligomeric phycobiliprotein, including fusion proteins comprising a functional displayed domain and a functional phycobiliprotein domain incorporated in a functional oligomeric phycobiliprotein. The fusion proteins provide novel specific labeling reagents.
Naqvi, Ahmad Abu Turab; Ahmad, Faizan; Hassan, Md Imtaiyaz
2015-01-01
Mycobacterium leprae is an intracellular obligate parasite that causes leprosy in humans, and it leads to the destruction of peripheral nerves and skin deformation. Here, we report an extensive analysis of the hypothetical proteins (HPs) from M. leprae strain Br4923, assigning their functions to better understand the mechanism of pathogenesis and to search for potential therapeutic interventions. The genome of M. leprae encodes 1604 proteins, of which the functions of 632 are not known (HPs). In this paper, we predicted the probable functions of 312 HPs. First, we classified all HPs into families and subfamilies on the basis of sequence similarity, followed by domain assignment, which provides many clues for their possible function. However, the functions of 320 proteins were not predicted because of low sequence similarity with proteins of known function. Annotated HPs were categorized into enzymes, binding proteins, transporters, and proteins involved in cellular processes. We found several novel proteins whose functions were unknown for M. leprae. These proteins have a requisite association with bacterial virulence and pathogenicity. Finally, our sequence-based analysis will be helpful for further validation and the search for potential drug targets while developing effective drugs to cure leprosy.
Graph pyramids for protein function prediction
Sandhan, Tushar; Yoo, Youngjun; Choi, Jin; Kim, Sun
2015-01-01
Background: Uncovering the hidden organizational characteristics and regularities among biological sequences is the key issue for detailed understanding of an underlying biological phenomenon. Thus pattern recognition from nucleic acid sequences is an important affair for protein function prediction. As proteins from the same family exhibit similar characteristics, homology based approaches predict protein functions via protein classification. But conventional classification approaches mostly rely on the global features by considering only strong protein similarity matches. This leads to significant loss of prediction accuracy. Methods: Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers the local as well as the global features, by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via the graph pyramid. Different underlying properties of the protein families are uncovered by operating the proposed graph based features at various pyramid levels. Results: Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using graph pyramid helps to improve computational efficiency as well as the protein classification accuracy. Quantitatively, among 14,086 test sequences, on average the proposed method misclassified only 21.1 sequences whereas the baseline BLAST score based global feature matching method misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. Thus it has achieved more than 96% protein classification accuracy using only 20% per class training data. PMID:26044522
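The misclassification counts quoted above are easier to compare as rates. The short calculation below only converts the figures stated in the abstract (14,086 test sequences, 21.1 and 362.9 average misclassifications) into percentages and a ratio; the variable and function names are illustrative.

```python
TEST_SEQUENCES = 14086          # size of the test set, from the abstract
PROPOSED_ERRORS = 21.1          # average misclassifications, graph-pyramid method
BLAST_ERRORS = 362.9            # average misclassifications, BLAST baseline

def error_rate_percent(misclassified, total):
    """Misclassified sequences as a percentage of the test set."""
    return 100.0 * misclassified / total

proposed_rate = error_rate_percent(PROPOSED_ERRORS, TEST_SEQUENCES)
blast_rate = error_rate_percent(BLAST_ERRORS, TEST_SEQUENCES)
error_ratio = BLAST_ERRORS / PROPOSED_ERRORS

print(f"proposed: {proposed_rate:.2f}%  BLAST: {blast_rate:.2f}%  ratio: {error_ratio:.1f}x")
```

On these figures the proposed method's error rate is about 0.15% against about 2.58% for the baseline, roughly a 17-fold reduction. The separate "more than 96%" accuracy figure in the abstract refers to training with only 20% of the data per class and is not derived from these counts.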
Eisenberg, David; Marcotte, Edward M.; Pellegrini, Matteo; Thompson, Michael J.; Yeates, Todd O.
2002-10-15
A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein, indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
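The phylogenetic-profile idea described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the genome contents, the protein labels `P1` to `P4`, and the use of Hamming distance with a zero-mismatch threshold are all hypothetical choices, not details taken from the abstract.

```python
from itertools import combinations

# Hypothetical toy data: genome name -> protein families with a homolog in that genome.
GENOMES = {
    "genome_1": {"P1", "P2", "P3"},
    "genome_2": {"P1", "P2"},
    "genome_3": {"P3", "P4"},
    "genome_4": {"P1", "P2", "P4"},
}

def phylogenetic_profile(protein, genomes):
    """Return a tuple of 0/1 flags marking presence of `protein` in each genome."""
    return tuple(1 if protein in members else 0 for members in genomes.values())

def hamming(a, b):
    """Count the positions at which two equal-length profiles differ."""
    return sum(x != y for x, y in zip(a, b))

def linked_pairs(proteins, genomes, max_distance=0):
    """Propose functional links between proteins whose profiles (nearly) match."""
    profiles = {p: phylogenetic_profile(p, genomes) for p in proteins}
    return [
        (a, b)
        for a, b in combinations(sorted(proteins), 2)
        if hamming(profiles[a], profiles[b]) <= max_distance
    ]

PAIRS = linked_pairs({"P1", "P2", "P3", "P4"}, GENOMES)
```

In this toy data set only `P1` and `P2` are present and absent in exactly the same genomes, so they are the only pair proposed as functionally linked.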
Development and Application of Functionalized Protein Binders in Multicellular Organisms.
Bieli, D; Alborelli, I; Harmansa, S; Matsuda, S; Caussinus, E; Affolter, M
2016-01-01
Protein-protein interactions are crucial for almost all biological processes. Studying such interactions in their native environment is critical but not easy to perform. Recently developed genetically encoded protein binders were shown to function inside living cells. These molecules offer a new, direct way to assess protein function, distribution and dynamics in vivo. One widely used class of protein binder scaffolds is the so-called nanobodies, which are derived from the variable domain of camelid heavy-chain antibodies. Another commonly used scaffold, the DARPins, is based on Ankyrin repeats. In this review, we highlight how these binders can be functionalized in order to study proteins in vivo during the development of multicellular organisms. It is to be anticipated that many more applications for such synthetic protein binders will be developed in the near future. Copyright © 2016 Elsevier Inc. All rights reserved.
Assigning protein functions by comparative genome analysis protein phylogenetic profiles
Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.
2003-05-13
A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein, indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
Sailem, Heba Z.; Kümper, Sandra; Tape, Christopher J.; McCully, Ryan R.; Paul, Angela; Anjomani-Virmouni, Sara; Jørgensen, Claus; Poulogiannis, George; Marshall, Christopher J.
2017-01-01
Localisation and protein function are intimately linked in eukaryotes, as proteins are localised to specific compartments where they come into proximity of other functionally relevant proteins. Significant co-localisation of two proteins can therefore be indicative of their functional association. We here present COLA, a proteomics based strategy coupled with a bioinformatics framework to detect protein–protein co-localisations on a global scale. COLA reveals functional interactions by matching proteins with significant similarity in their subcellular localisation signatures. The rapid nature of COLA allows mapping of interactome dynamics across different conditions or treatments with high precision. PMID:27824369
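As a rough illustration of matching proteins by the similarity of their localisation signatures, the sketch below compares toy abundance profiles across subcellular fractions. The abstract does not state which similarity measure COLA uses; Pearson correlation, the protein names, and the fraction values here are assumptions made purely for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical signatures: relative abundance of each protein in four fractions.
SIGNATURES = {
    "prot_a": [0.7, 0.2, 0.1, 0.0],
    "prot_b": [0.6, 0.3, 0.1, 0.0],
    "prot_c": [0.0, 0.1, 0.2, 0.7],
}

R_AB = pearson(SIGNATURES["prot_a"], SIGNATURES["prot_b"])  # similar distribution
R_AC = pearson(SIGNATURES["prot_a"], SIGNATURES["prot_c"])  # opposite distribution
```

A real analysis would also need a significance estimate for each similarity score; that step is omitted here because the abstract gives no detail on how it is computed.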
Structure-based conformational preferences of amino acids
Koehl, Patrice; Levitt, Michael
1999-01-01
Proteins can be very tolerant to amino acid substitution, even within their core. Understanding the factors responsible for this behavior is of critical importance for protein engineering and design. Mutations in proteins have been quantified in terms of the changes in stability they induce. For example, guest residues in specific secondary structures have been used as probes of conformational preferences of amino acids, yielding propensity scales. Predicting these amino acid propensities would be a good test of any new potential energy functions used to mimic protein stability. We have recently developed a protein design procedure that optimizes whole sequences for a given target conformation based on the knowledge of the template backbone and on a semiempirical potential energy function. This energy function is purely physical, including steric interactions based on a Lennard-Jones potential, electrostatics based on a Coulomb potential, and hydrophobicity in the form of an environment free energy based on accessible surface area and interatomic contact areas. Sequences designed by this procedure for 10 different proteins were analyzed to extract conformational preferences for amino acids. The resulting structure-based propensity scales show significant agreements with experimental propensity scale values, both for α-helices and β-sheets. These results indicate that amino acid conformational preferences are a natural consequence of the potential energy we use. This confirms the accuracy of our potential and indicates that such preferences should not be added as a design criterion. PMID:10535955
Predicting protein functions from redundancies in large-scale protein interaction networks
NASA Technical Reports Server (NTRS)
Samanta, Manoj Pratim; Liang, Shoudan
2003-01-01
Interpreting data from large-scale protein interaction experiments has been a challenging task because of the widespread presence of random false positives. Here, we present a network-based statistical algorithm that overcomes this difficulty and allows us to derive functions of unannotated proteins from large-scale interaction data. Our algorithm uses the insight that if two proteins share a significantly larger number of common interaction partners than expected at random, they have close functional associations. Analysis of publicly available data from Saccharomyces cerevisiae reveals >2,800 reliable functional associations, 29% of which involve at least one unannotated protein. By further analyzing these associations, we derive tentative functions for 81 unannotated proteins with high certainty. Our method is not overly sensitive to the false positives present in the data. Even after adding 50% randomly generated interactions to the measured data set, we are able to recover almost all (approximately 89%) of the original associations.
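The "more shared partners than random" test can be illustrated with a hypergeometric tail probability. The abstract does not give the authors' exact formula, so this is a stand-in under that assumption; the function name and the example numbers are hypothetical.

```python
from math import comb

def shared_partner_pvalue(n_total, n1, n2, m):
    """Probability of at least m shared partners by chance.

    Assumes protein 1 has n1 partners and protein 2 has n2 partners,
    each drawn uniformly at random from n_total proteins (hypergeometric model).
    """
    total = comb(n_total, n2)
    tail = sum(
        comb(n1, k) * comb(n_total - n1, n2 - k)
        for k in range(m, min(n1, n2) + 1)
    )
    return tail / total

# Toy numbers: 10 proteins in the network, both proteins have 3 partners,
# and all 3 partners are shared.
P_EXAMPLE = shared_partner_pvalue(10, 3, 3, 3)
```

A small value indicates that the overlap is unlikely to arise by chance, which is the sense in which a pair would be called functionally associated in this sketch.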
Random heteropolymers preserve protein function in foreign environments
NASA Astrophysics Data System (ADS)
Panganiban, Brian; Qiao, Baofu; Jiang, Tao; DelRe, Christopher; Obadia, Mona M.; Nguyen, Trung Dac; Smith, Anton A. A.; Hall, Aaron; Sit, Izaac; Crosby, Marquise G.; Dennis, Patrick B.; Drockenmuller, Eric; Olvera de la Cruz, Monica; Xu, Ting
2018-03-01
The successful incorporation of active proteins into synthetic polymers could lead to a new class of materials with functions found only in living systems. However, proteins rarely function under the conditions suitable for polymer processing. On the basis of an analysis of trends in protein sequences and characteristic chemical patterns on protein surfaces, we designed four-monomer random heteropolymers to mimic intrinsically disordered proteins for protein solubilization and stabilization in non-native environments. The heteropolymers, with optimized composition and statistical monomer distribution, enable cell-free synthesis of membrane proteins with proper protein folding for transport and enzyme-containing plastics for toxin bioremediation. Controlling the statistical monomer distribution in a heteropolymer, rather than the specific monomer sequence, affords a new strategy to interface with biological systems for protein-based biomaterials.
Shi, Xiaohe; Lu, Wen-Cong; Cai, Yu-Dong; Chou, Kuo-Chen
2011-01-01
Background: With the huge number of uncharacterized protein sequences generated in the post-genomic age, it is highly desirable to develop effective computational methods for quickly and accurately predicting their functions. The information thus obtained would be very useful for both basic research and drug development in a timely manner. Methodology/Principal Findings: Although many efforts have been made in this regard, most of them were based on either sequence similarity or protein-protein interaction (PPI) information. However, the former often fails if a query protein has little or no sequence similarity to any protein of known function, while the latter has a similar problem if the relevant PPI information is not available. In view of this, a new approach is proposed that hybridizes the PPI information with the biochemical/physicochemical features of protein sequences. The overall first-order success rates of the new predictor for the functions of mouse proteins on the training set and test set were 69.1% and 70.2%, respectively, and the success rate covered by the results of the top-4 order from a total of 24 orders was 65.2%. Conclusions/Significance: The results indicate that the new approach is quite promising and may open a new avenue for addressing this difficult and complicated problem. PMID:21283518
Hati, Sanchita; Bhattacharyya, Sudeep
2016-01-01
A project-based biophysical chemistry laboratory course, which is offered to the biochemistry and molecular biology majors in their senior year, is described. In this course, the classroom study of the structure-function of biomolecules is integrated with the discovery-guided laboratory study of these molecules using computer modeling and simulations. In particular, modern computational tools are employed to elucidate the relationship between structure, dynamics, and function in proteins. Computer-based laboratory protocols that we introduced in three modules allow students to visualize the secondary, super-secondary, and tertiary structures of proteins, analyze non-covalent interactions in protein-ligand complexes, develop three-dimensional structural models (homology model) for new protein sequences and evaluate their structural qualities, and study proteins' intrinsic dynamics to understand their functions. In the fourth module, students are assigned to an authentic research problem, where they apply their laboratory skills (acquired in modules 1-3) to answer conceptual biophysical questions. Through this process, students gain in-depth understanding of protein dynamics-the missing link between structure and function. Additionally, the requirement of term papers sharpens students' writing and communication skills. Finally, these projects result in new findings that are communicated in peer-reviewed journals. © 2016 The International Union of Biochemistry and Molecular Biology.
Scale-space measures for graph topology link protein network architecture to function.
Hulsman, Marc; Dimitrakopoulos, Christos; de Ridder, Jeroen
2014-06-15
The network architecture of physical protein interactions is an important determinant for the molecular functions that are carried out within each cell. To study this relation, the network architecture can be characterized by graph topological characteristics such as shortest paths and network hubs. These characteristics have an important shortcoming: they do not take into account that interactions occur across different scales. This is important because some cellular functions may involve a single direct protein interaction (small scale), whereas others require more and/or indirect interactions, such as protein complexes (medium scale) and interactions between large modules of proteins (large scale). In this work, we derive generalized scale-aware versions of known graph topological measures based on diffusion kernels. We apply these to characterize the topology of networks across all scales simultaneously, generating a so-called graph topological scale-space. The comprehensive physical interaction network in yeast is used to show that scale-space based measures consistently give superior performance when distinguishing protein functional categories and three major types of functional interactions: genetic interaction, co-expression and perturbation interactions. Moreover, we demonstrate that graph topological scale spaces capture biologically meaningful features that provide new insights into the link between function and protein network architecture. Matlab(TM) code to calculate the scale-aware topological measures (STMs) is available at http://bioinformatics.tudelft.nl/TSSA © The Author 2014. Published by Oxford University Press.
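To make the role of scale concrete, the sketch below computes a standard graph diffusion kernel, K = exp(-beta * L), where L is the graph Laplacian and beta acts as the scale parameter. It does not reproduce the authors' scale-aware topological measures or their Matlab code; the three-node path graph, the Taylor-series approximation, and all names are assumptions chosen for a small dependency-free example.

```python
def laplacian(n, edges):
    """Unnormalized graph Laplacian (degree matrix minus adjacency matrix)."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def diffusion_kernel(L, beta, terms=60):
    """Approximate K = exp(-beta * L) with a truncated Taylor series."""
    n = len(L)
    K = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in K]            # current series term, starts at identity
    M = [[-beta * x for x in row] for row in L]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]  # M^k / k!
        K = [[K[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return K

# Hypothetical path graph 0 - 1 - 2: nodes 0 and 2 interact only indirectly.
L_PATH = laplacian(3, [(0, 1), (1, 2)])
K_SMALL = diffusion_kernel(L_PATH, beta=0.1)  # small scale: mostly direct neighbours
K_LARGE = diffusion_kernel(L_PATH, beta=2.0)  # large scale: indirect links gain weight
```

At small beta the kernel entry between the two end nodes is close to zero, while at larger beta it grows, which is the sense in which one kernel family can describe direct and indirect interactions together.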
Montané, M H; Kloppstech, K
2000-11-27
Light-harvesting complex proteins (LHCs) and early light-induced proteins (ELIPs) are essential pigment-binding components of the thylakoid membrane and are encoded by one of the largest and most complex higher plant gene families. The functional diversification of these proteins corresponded to the transition from extrinsic (phycobilisome-based) to intrinsic (LHC-based) light-harvesting antenna systems during the evolution of chloroplasts from cyanobacteria, yet the functional basis of this diversification has been elusive. Here, we propose that the original function of LHCs and ELIPs was not to collect light and to transfer its energy content to the reaction centers but to disperse the absorbed energy of light in the form of heat or fluorescence. These energy-dispersing proteins are believed to have originated in cyanobacteria as one-helix, highly light-inducible proteins (HLIPs) that later acquired four helices through two successive gene duplication steps. We suggest that the ELIPs arose first in this succession, with a primary function in energy dispersion for protection of photosynthetic pigments from photo-oxidation. We consider the LHC I and II families as more recent and very successful evolutionary additions to this family that ultimately attained a new function, thereby replacing the ancestral extrinsic light-harvesting system. Our model accounts for the non-photochemical quenching role recently shown for higher plant psbS proteins.
Pantazes, Robert J; Saraf, Manish C; Maranas, Costas D
2007-08-01
In this paper, we introduce and test two new sequence-based protein scoring systems (i.e. S1, S2) for assessing the likelihood that a given protein hybrid will be functional. By binning together amino acids with similar properties (i.e. volume, hydrophobicity and charge) the scoring systems S1 and S2 allow for the quantification of the severity of mismatched interactions in the hybrids. The S2 scoring system is found to be able to significantly functionally enrich a cytochrome P450 library over other scoring methods. Given this scoring base, we subsequently constructed two separate optimization formulations (i.e. OPTCOMB and OPTOLIGO) for optimally designing protein combinatorial libraries involving recombination or mutations, respectively. Notably, two separate versions of OPTCOMB are generated (i.e. model M1, M2) with the latter allowing for position-dependent parental fragment skipping. Computational benchmarking results demonstrate the efficacy of models OPTCOMB and OPTOLIGO to generate high scoring libraries of a prespecified size.
Protein complexes are assemblies of subunits that have co-evolved to execute one or many coordinated functions in the cellular environment. Functional annotation of mammalian protein complexes is critical to understanding biological processes, as well as disease mechanisms. Here, we used genetic co-essentiality derived from genome-scale RNAi- and CRISPR-Cas9-based fitness screens performed across hundreds of human cancer cell lines to assign measures of functional similarity.
The evolution of function within the Nudix homology clan
Srouji, John R.; Xu, Anting; Park, Annsea; Kirsch, Jack F.
2017-01-01
ABSTRACT The Nudix homology clan encompasses over 80,000 protein domains from all three domains of life, defined by homology to each other. Proteins with a domain from this clan fall into four general functional classes: pyrophosphohydrolases, isopentenyl diphosphate isomerases (IDIs), adenine/guanine mismatch‐specific adenine glycosylases (A/G‐specific adenine glycosylases), and nonenzymatic activities such as protein/protein interaction and transcriptional regulation. The largest group, pyrophosphohydrolases, encompasses more than 100 distinct hydrolase specificities. To understand the evolution of this vast number of activities, we assembled and analyzed experimental and structural data for 205 Nudix proteins collected from the literature. We corrected erroneous functions or provided more appropriate descriptions for 53 annotations described in the Gene Ontology Annotation database in this family, and propose 275 new experimentally‐based annotations. We manually constructed a structure‐guided sequence alignment of 78 Nudix proteins. Using the structural alignment as a seed, we then made an alignment of 347 “select” Nudix homology domains, curated from structurally determined, functionally characterized, or phylogenetically important Nudix domains. Based on our review of Nudix pyrophosphohydrolase structures and specificities, we further analyzed a loop region downstream of the Nudix hydrolase motif previously shown to contact the substrate molecule and possess known functional motifs. This loop region provides a potential structural basis for the functional radiation and evolution of substrate specificity within the hydrolase family. Finally, phylogenetic analyses of the 347 select protein domains and of the complete Nudix homology clan revealed general monophyly with regard to function and a few instances of probable homoplasy. Proteins 2017; 85:775–811. © 2016 Wiley Periodicals, Inc. PMID:27936487
Verma, Amit K; Diwan, Danish; Raut, Sandeep; Dobriyal, Neha; Brown, Rebecca E; Gowda, Vinita; Hines, Justin K; Sahi, Chandan
2017-06-07
Heat shock proteins of 70 kDa (Hsp70s) partner with structurally diverse Hsp40s (J proteins), generating distinct chaperone networks in various cellular compartments that perform myriad housekeeping and stress-associated functions in all organisms. Plants, being sessile, need to constantly maintain their cellular proteostasis in response to external environmental cues. In these situations, the Hsp70:J protein machines may play an important role in fine-tuning cellular protein quality control. Although ubiquitous, the functional specificity and complexity of the plant Hsp70:J protein network has not been studied. Here, we analyzed the J protein network in the cytosol of Arabidopsis thaliana and, using yeast genetics, show that the functional specificities of most plant J proteins in fundamental chaperone functions are conserved across long evolutionary timescales. Detailed phylogenetic and functional analysis revealed that increased number, regulatory differences, and neofunctionalization in J proteins together contribute to the emerging functional diversity and complexity in the Hsp70:J protein network in higher plants. Based on the data presented, we propose that higher plants have orchestrated their "chaperome," especially their J protein complement, according to their specialized cellular and physiological stipulations. Copyright © 2017 Verma et al.
Super-resolution links vinculin localization to function in focal adhesions.
Giannone, Grégory
2015-07-01
Integrin-based focal adhesions integrate biochemical and biomechanical signals from the extracellular matrix and the actin cytoskeleton. The combination of three-dimensional super-resolution imaging and loss- or gain-of-function protein mutants now links the nanoscale dynamic localization of proteins to their activation and function within focal adhesions.
iDBPs: a web server for the identification of DNA binding proteins.
Nimrod, Guy; Schushan, Maya; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir
2010-03-01
The iDBPs server uses the three-dimensional (3D) structure of a query protein to predict whether it binds DNA. First, the algorithm predicts the functional region of the protein based on its evolutionary profile; the assumption is that large clusters of conserved residues are good markers of functional regions. Next, various characteristics of the predicted functional region as well as global features of the protein are calculated, such as the average surface electrostatic potential, the dipole moment and cluster-based amino acid conservation patterns. Finally, a random forests classifier is used to predict whether the query protein is likely to bind DNA and to estimate the prediction confidence. We have trained and tested the classifier on various datasets and shown that it outperformed related methods. On a dataset that reflects the fraction of DNA binding proteins (DBPs) in a proteome, the area under the ROC curve was 0.90. The application of the server to an updated version of the N-Func database, which contains proteins of unknown function with solved 3D-structure, suggested new putative DBPs for experimental studies. http://idbps.tau.ac.il/
Sequential Release of Proteins from Structured Multishell Microcapsules.
Shimanovich, Ulyana; Michaels, Thomas C T; De Genst, Erwin; Matak-Vinkovic, Dijana; Dobson, Christopher M; Knowles, Tuomas P J
2017-10-09
In nature, a wide range of functional materials is based on proteins. Increasing attention is also turning to the use of proteins as artificial biomaterials in the form of films, gels, particles, and fibrils that offer great potential for applications in areas ranging from molecular medicine to materials science. To date, however, most such applications have been limited to single component materials despite the fact that their natural analogues are composed of multiple types of proteins with a variety of functionalities that are coassembled in a highly organized manner on the micrometer scale, a process that is currently challenging to achieve in the laboratory. Here, we demonstrate the fabrication of multicomponent protein microcapsules where the different components are positioned in a controlled manner. We use molecular self-assembly to generate multicomponent structures on the nanometer scale and droplet microfluidics to bring together the different components on the micrometer scale. Using this approach, we synthesize a wide range of multiprotein microcapsules containing three well-characterized proteins: glucagon, insulin, and lysozyme. The localization of each protein component in multishell microcapsules has been detected by labeling protein molecules with different fluorophores, and the final three-dimensional microcapsule structure has been resolved by using confocal microscopy together with image analysis techniques. In addition, we show that these structures can be used to tailor the release of such functional proteins in a sequential manner. Moreover, our observations demonstrate that the protein release mechanism from multishell capsules is driven by the kinetic control of mass transport of the cargo and by the dissolution of the shells. 
The ability to generate artificial materials that incorporate a variety of different proteins with distinct functionalities increases the breadth of the potential applications of artificial protein-based materials and provides opportunities to design more refined functional protein delivery systems.
Elegbede, Jennifer L; Li, Min; Jones, Owen G; Campanella, Osvaldo H; Ferruzzi, Mario G
2018-05-01
With growing interest in formulating new food products with added protein and flavonoid-rich ingredients for health benefits, direct interactions between these ingredient classes become critical insofar as they may impact protein functionality, product quality, and flavonoid bioavailability. In this study, sodium caseinate (SCN)-based model products (foams and emulsions) were formulated with grape seed extract (GSE, rich in galloylated flavonoids) and green tea extract (GTE, rich in nongalloylated flavonoids), respectively, to assess changes in functional properties of SCN and impacts on flavonoid bioaccessibility. Experiments with pure flavonoids suggested that galloylated flavonoids reduced air-water interfacial tension of 0.01% SCN dispersions more significantly than nongalloylated flavonoids at high concentrations (>50 μg/mL). This observation was supported by changes in stability of 5% SCN foam, which showed that foam stability was increased at high levels of GSE (≥50 μg/mL, P < 0.05) but was not affected by GTE. However, flavonoid extracts had modest effects on SCN emulsion. In addition, galloylated flavonoids had higher bioaccessibility in both SCN foam and emulsion. These results suggest that SCN-flavonoid binding interactions can modulate protein functionality, leading to differences in performance and flavonoid bioaccessibility of protein-based products. As information on the beneficial health effects of flavonoids expands, it is likely that usage of these ingredients in consumer foods will increase. However, the levels necessary to provide such benefits may exceed those that begin to impact functionality of macronutrients such as proteins. Flavonoid inclusion within protein matrices may modulate protein functionality in a food system and modify critical consumer traits or delivery of these beneficial plant-derived components. 
The product matrices utilized in this study offer relevant model systems to evaluate how fortification with flavonoid-rich extracts allows for differing effects on formability and stability of the protein-based systems, and on bioaccessibility of fortified flavonoid extracts. © 2018 Institute of Food Technologists®.
The Functional Human C-Terminome
Hedden, Michael; Lyon, Kenneth F.; Brooks, Steven B.; David, Roxanne P.; Limtong, Justin; Newsome, Jacklyn M.; Novakovic, Nemanja; Rajasekaran, Sanguthevar; Thapar, Vishal; Williams, Sean R.; Schiller, Martin R.
2016-01-01
All translated proteins end with a carboxylic acid commonly called the C-terminus. Many short functional sequences (minimotifs) are located on or immediately proximal to the C-terminus. However, information about the function of protein C-termini has not been consolidated into a single source. Here, we built a new “C-terminome” database and web system focused on human proteins. Approximately 3,600 C-termini in the human proteome have a minimotif with an established molecular function. To help evaluate the function of the remaining C-termini in the human proteome, we inferred minimotifs identified by experimentation in rodent cells, predicted minimotifs based upon consensus sequence matches, and predicted novel highly repetitive sequences in C-termini. Predictions can be ranked by enrichment scores or Gene Evolutionary Rate Profiling (GERP) scores, a measurement of evolutionary constraint. By searching for new anchored sequences on the last 10 amino acids of proteins in the human proteome with lengths between 3–10 residues and up to 5 degenerate positions in the consensus sequences, we have identified new consensus sequences that predict instances in the majority of human genes. All of this information is consolidated into a database that can be accessed through a C-terminome web system with search and browse functions for minimotifs and human proteins. A known consensus sequence-based predicted function is assigned to nearly half the proteins in the human proteome. Weblink: http://cterminome.bio-toolkit.com. PMID:27050421
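A consensus search anchored on the last 10 residues can be sketched with a regular expression. The pattern below, `[ST].[VIL]` at the extreme C-terminus, resembles the well-known class I PDZ-binding consensus and is used only as an illustration; it is not taken from the C-terminome database, and the toy sequences and function name are hypothetical.

```python
import re

# Illustrative consensus only: S or T, any residue, then V, I or L at the C-terminus.
CONSENSUS = re.compile(r"[ST].[VIL]$")

def c_terminal_hits(proteins, window=10):
    """Return {name: matched motif} for proteins whose last `window` residues match."""
    hits = {}
    for name, seq in proteins.items():
        tail = seq[-window:]          # only the C-terminal window is searched
        match = CONSENSUS.search(tail)
        if match:
            hits[name] = match.group()
    return hits

# Hypothetical toy sequences (one-letter amino acid codes).
PROTEINS = {
    "toy_a": "MKTAYIAKQRQISFVKSHFSRQLEETSV",
    "toy_b": "MSDNGPQNQRNAPRITFGGP",
}

HITS = c_terminal_hits(PROTEINS)
```

The `$` anchor ties the match to the final residues, mirroring the idea of anchored C-terminal sequences; longer consensus patterns or more degenerate positions would be expressed by changing the pattern.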
Lee, Kyung-Ho; Kim, Dong-Myung
2013-11-01
Synthetic biology is built on the synthesis, engineering, and assembly of biological parts. Proteins are the first components considered for the construction of systems with designed biological functions because proteins carry out most of the biological functions and chemical reactions inside cells. Protein synthesis is considered to comprise the most basic levels of the hierarchical structure of synthetic biology. Cell-free protein synthesis has emerged as a powerful technology that can potentially transform the concept of bioprocesses. With the ability to harness the synthetic power of biology without many of the constraints of cell-based systems, cell-free protein synthesis enables the rapid creation of protein molecules from diverse sources of genetic information. Cell-free protein synthesis is virtually free from the intrinsic constraints of cell-based methods and offers greater flexibility in system design and manipulability of biological synthetic machinery. Among its potential applications, cell-free protein synthesis can be combined with various man-made devices for rapid functional analysis of genomic sequences. This review covers recent efforts to integrate cell-free protein synthesis with various reaction devices and analytical platforms. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Density functional study of molecular interactions in secondary structures of proteins.
Takano, Yu; Kusaka, Ayumi; Nakamura, Haruki
2016-01-01
Proteins play diverse and vital roles in biology, which are dominated by their three-dimensional structures. The three-dimensional structure of a protein determines its functions and chemical properties. Protein secondary structures, including α-helices and β-sheets, are key components of the protein architecture. Molecular interactions, in particular hydrogen bonds, play significant roles in the formation of protein secondary structures. Precise and quantitative estimations of these interactions are required to understand the principles underlying the formation of three-dimensional protein structures. In the present study, we have investigated the molecular interactions in α-helices and β-sheets, using ab initio wave function-based methods, the Hartree-Fock method (HF) and the second-order Møller-Plesset perturbation theory (MP2), density functional theory, and molecular mechanics. The characteristic interactions essential for forming the secondary structures are discussed quantitatively.
The flavivirus capsid protein: Structure, function and perspectives towards drug design.
Oliveira, Edson R A; Mohana-Borges, Ronaldo; de Alencastro, Ricardo B; Horta, Bruno A C
2017-01-02
Flaviviruses, such as dengue and Zika viruses, are etiologic agents transmitted to humans mainly by arthropods and are of great epidemiological interest. The flavivirus capsid protein is a structural element required for viral nucleocapsid assembly and has the classical function of sheltering the viral genome. After decades of research, many reports have shown its different functionalities and its influence on normal cell functioning. The subcellular distribution of this protein, which involves accumulation around lipid droplets and nuclear localization, is also consistent with its multifunctional character. As flavivirus diseases are still in need of global control, and in view of the possible key functionalities of the capsid protein in flavivirus biology, novel considerations arise for anti-flavivirus drug research. This review covers the main structural and functional features of the flavivirus C protein, ultimately highlighting prospects in drug discovery based on this viral target. Copyright © 2016 Elsevier B.V. All rights reserved.
Understand protein functions by comparing the similarity of local structural environments.
Chen, Jiawen; Xie, Zhong-Ru; Wu, Yinghao
2017-02-01
The three-dimensional structures of proteins play an essential role in regulating binding between proteins and their partners, offering a direct relationship between structures and functions of proteins. It is widely accepted that the function of a protein can be determined if its structure is similar to other proteins whose functions are known. However, it is also observed that proteins with similar global structures do not necessarily correspond to the same function, while proteins with very different folds can share similar functions. This indicates that function similarity originates from the local structural information of proteins instead of their global shapes. We assume that proteins with similar local environments prefer binding to similar types of molecular targets. In order to test this assumption, we designed a new structural indicator to define the similarity of local environment between residues in different proteins. This indicator was further used to calculate the probability that a given residue binds to a specific type of structural neighbor, including DNA, RNA, small molecules and proteins. After applying the method to a large-scale non-redundant database of proteins, we show that the positive signal of binding probability calculated from the local structural indicator is statistically meaningful. In summary, our studies suggest that the local environment of residues in a protein is a good indicator for recognizing specific binding partners of the protein. The new method could be a potential addition to a suite of existing template-based approaches for protein function prediction. Copyright © 2016 Elsevier B.V. All rights reserved.
Sequence-similar, structure-dissimilar protein pairs in the PDB.
Kosloff, Mickey; Kolodny, Rachel
2008-05-01
It is often assumed that in the Protein Data Bank (PDB), two proteins with similar sequences will also have similar structures. Accordingly, it has proved useful to develop subsets of the PDB from which "redundant" structures have been removed, based on a sequence-based criterion for similarity. Similarly, when predicting protein structure using homology modeling, if a template structure for modeling a target sequence is selected by sequence alone, this implicitly assumes that all sequence-similar templates are equivalent. Here, we show that this assumption is often not correct and that standard approaches to create subsets of the PDB can lead to the loss of structurally and functionally important information. We have carried out sequence-based structural superpositions and geometry-based structural alignments of a large number of protein pairs to determine the extent to which sequence similarity ensures structural similarity. We find many examples where two proteins that are similar in sequence have structures that differ significantly from one another. The source of the structural differences usually has a functional basis. The number of such proteins pairs that are identified and the magnitude of the dissimilarity depend on the approach that is used to calculate the differences; in particular sequence-based structure superpositioning will identify a larger number of structurally dissimilar pairs than geometry-based structural alignments. When two sequences can be aligned in a statistically meaningful way, sequence-based structural superpositioning provides a meaningful measure of structural differences. This approach and geometry-based structure alignments reveal somewhat different information and one or the other might be preferable in a given application. Our results suggest that in some cases, notably homology modeling, the common use of nonredundant datasets, culled from the PDB based on sequence, may mask important structural and functional information. 
We have established a database of sequence-similar, structurally dissimilar protein pairs that will help address this problem (http://luna.bioc.columbia.edu/rachel/seqsimstrdiff.htm).
Stein, Matthias; Pilli, Manohar; Bernauer, Sabine; Habermann, Bianca H.; Zerial, Marino; Wade, Rebecca C.
2012-01-01
Background Rab GTPases constitute the largest subfamily of the Ras protein superfamily. Rab proteins regulate organelle biogenesis and transport, and display distinct binding preferences for effector and activator proteins, many of which have not been elucidated yet. The underlying molecular recognition motifs, binding partner preferences and selectivities are not well understood. Methodology/Principal Findings Comparative analysis of the amino acid sequences and the three-dimensional electrostatic and hydrophobic molecular interaction fields of 62 human Rab proteins revealed a wide range of binding properties with large differences between some Rab proteins. This analysis assists the functional annotation of Rab proteins 12, 14, 26, 37 and 41 and provided an explanation for the shared function of Rab3 and 27. Rab7a and 7b have very different electrostatic potentials, indicating that they may bind to different effector proteins and thus, exert different functions. The subfamily V Rab GTPases which are associated with endosome differ subtly in the interaction properties of their switch regions, and this may explain exchange factor specificity and exchange kinetics. Conclusions/Significance We have analysed conservation of sequence and of molecular interaction fields to cluster and annotate the human Rab proteins. The analysis of three dimensional molecular interaction fields provides detailed insight that is not available from a sequence-based approach alone. Based on our results, we predict novel functions for some Rab proteins and provide insights into their divergent functions and the determinants of their binding partner selectivity. PMID:22523562
Loo, Lit-Hsin; Laksameethanasan, Danai; Tung, Yi-Ling
2014-03-01
Protein subcellular localization is a major determinant of protein function. However, this important protein feature is often described in terms of discrete and qualitative categories of subcellular compartments, and therefore it has limited applications in quantitative protein function analyses. Here, we present Protein Localization Analysis and Search Tools (PLAST), an automated analysis framework for constructing and comparing quantitative signatures of protein subcellular localization patterns based on microscopy images. PLAST produces human-interpretable protein localization maps that quantitatively describe the similarities in the localization patterns of proteins and major subcellular compartments, without requiring manual assignment or supervised learning of these compartments. Using the budding yeast Saccharomyces cerevisiae as a model system, we show that PLAST is more accurate than existing, qualitative protein localization annotations in identifying known co-localized proteins. Furthermore, we demonstrate that PLAST can reveal protein localization-function relationships that are not obvious from these annotations. First, we identified proteins that have similar localization patterns and participate in closely-related biological processes, but do not necessarily form stable complexes with each other or localize at the same organelles. Second, we found an association between spatial and functional divergences of proteins during evolution. Surprisingly, as proteins with common ancestors evolve, they tend to develop more diverged subcellular localization patterns, but still occupy similar numbers of compartments. This suggests that divergence of protein localization might be more frequently due to the development of more specific localization patterns over ancestral compartments than the occupation of new compartments. 
PLAST enables systematic and quantitative analyses of protein localization-function relationships, and will be useful to elucidate protein functions and how these functions were acquired in cells from different organisms or species. A public web interface of PLAST is available at http://plast.bii.a-star.edu.sg. PMID:24603469
An improved method for functional similarity analysis of genes based on Gene Ontology.
Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia
2016-12-23
Measures of gene functional similarity are essential tools for gene clustering, gene function prediction, evaluation of protein-protein interaction, disease gene prioritization and other applications. In recent years, many gene functional similarity methods have been proposed based on the semantic similarity of GO terms. However, these leading approaches may make error-prone judgments, especially when they measure the specificity of GO terms as well as the information content (IC) of a term set. Therefore, reliably estimating gene functional similarity is still a challenging problem. We propose WIS, an effective method to measure gene functional similarity. First, WIS computes the IC of a term from its depth, the number of its ancestors and the topology of its descendants in the GO graph. Second, WIS calculates the IC of a term set by considering the weighted inherited semantics of terms. Finally, WIS estimates gene functional similarity based on the IC overlap ratio of term sets. WIS is superior to some other representative measures in experiments on functional classification of genes in a biological pathway, collaborative evaluation of GO-based semantic similarity measures, protein-protein interaction prediction and correlation with gene expression. Further analysis suggests that WIS fully takes into account the specificity of terms and the weighted inherited semantics between GO terms. The proposed WIS method is an effective and reliable way to compare gene function. The web service of WIS is freely available at http://nclab.hit.edu.cn/WIS/ .
Zhao, Kailou; Yang, Li; Wang, Xuejiao; Bai, Quan; Yang, Fan; Wang, Fei
2012-08-30
We have explored a novel dual-function stationary phase which combines both strong cation exchange (SCX) and hydrophobic interaction chromatography (HIC) characteristics. The novel dual-function stationary phase is based on porous and spherical silica gel functionalized with ligand containing sulfonic and benzyl groups capable of electrostatic and hydrophobic interaction functionalities, which displays HIC character in a high salt concentration, and IEC character in a low salt concentration in mobile phase employed. As a result, it can be employed to separate proteins with SCX and HIC modes, respectively. The resolution and selectivity of the dual-function stationary phase were evaluated under both HIC and SCX modes with standard proteins and can be comparable to that of conventional IEC and HIC columns. More than 96% of mass and bioactivity recoveries of proteins can be achieved in both HIC and SCX modes, respectively. The results indicated that the novel dual-function column could replace two individual SCX and HIC columns for protein separation. Mixed retention mechanism of proteins on this dual-function column based on stoichiometric displacement theory (SDT) in LC was investigated to find the optimal balance of the magnitude of electrostatic and hydrophobic interactions between protein and the ligand on the silica surface in order to obtain high resolution and selectivity for protein separation. In addition, the effects of the hydrophobicity of the ligand of the dual-function packings and pH of the mobile phase used on protein separation were also investigated in detail. The results show that the ligand with suitable hydrophobicity to match the electrostatic interaction is very important to prepare the dual-function stationary phase, and a better resolution and selectivity can be obtained at pH 6.5 in SCX mode. 
Therefore, the dual-function column can replace two individual SCX and HIC columns for protein separation and be used to set up two-dimensional liquid chromatography with a single column (2DLC-1C), which can also be employed to separate three kinds of active proteins completely, such as lysozyme, ovotransferrin and ovalbumin from egg white. The result is very important not only to the development of new 2DLC technology with a single column for proteomics, but also to recombinant protein drug production for saving column expense and simplifying the process in biotechnology. Copyright © 2012 Elsevier B.V. All rights reserved.
L-GRAAL: Lagrangian graphlet-based network aligner.
Malod-Dognin, Noël; Pržulj, Nataša
2015-07-01
Discovering and understanding patterns in networks of protein-protein interactions (PPIs) is a central problem in systems biology. Alignments between these networks aid functional understanding as they uncover important information, such as evolutionary conserved pathways, protein complexes and functional orthologs. A few methods have been proposed for global PPI network alignments, but because of NP-completeness of underlying sub-graph isomorphism problem, producing topologically and biologically accurate alignments remains a challenge. We introduce a novel global network alignment tool, Lagrangian GRAphlet-based ALigner (L-GRAAL), which directly optimizes both the protein and the interaction functional conservations, using a novel alignment search heuristic based on integer programming and Lagrangian relaxation. We compare L-GRAAL with the state-of-the-art network aligners on the largest available PPI networks from BioGRID and observe that L-GRAAL uncovers the largest common sub-graphs between the networks, as measured by edge-correctness and symmetric sub-structures scores, which allow transferring more functional information across networks. We assess the biological quality of the protein mappings using the semantic similarity of their Gene Ontology annotations and observe that L-GRAAL best uncovers functionally conserved proteins. Furthermore, we introduce for the first time a measure of the semantic similarity of the mapped interactions and show that L-GRAAL also uncovers best functionally conserved interactions. In addition, we illustrate on the PPI networks of baker's yeast and human the ability of L-GRAAL to predict new PPIs. Finally, L-GRAAL's results are the first to show that topological information is more important than sequence information for uncovering functionally conserved interactions. L-GRAAL is coded in C++. Software is available at: http://bio-nets.doc.ic.ac.uk/L-GRAAL/. 
Contact: n.malod-dognin@imperial.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
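The edge-correctness score mentioned in the L-GRAAL abstract can be illustrated with a minimal Python sketch. It assumes the commonly used definition (the fraction of edges in the smaller network that the alignment maps onto edges of the larger network) and an injective node mapping. The function name, the toy networks and the mapping are hypothetical and are not taken from L-GRAAL or BioGRID.

```python
def edge_correctness(edges_small, edges_large, mapping):
    """Fraction of edges in the smaller network that are conserved by the alignment.

    edges_small, edges_large: iterables of (node, node) pairs, treated as undirected.
    mapping: dict sending each node of the smaller network to a node of the larger
             one (assumed injective for this sketch).
    """
    small = {frozenset(edge) for edge in edges_small}
    large = {frozenset(edge) for edge in edges_large}
    if not small:
        return 0.0
    # An edge is conserved when the images of its endpoints are also connected.
    conserved = sum(
        1 for edge in small
        if frozenset(mapping[node] for node in edge) in large
    )
    return conserved / len(small)


# Toy example (hypothetical data): a triangle aligned onto a path of four nodes.
SMALL_EDGES = [("a", "b"), ("b", "c"), ("a", "c")]
LARGE_EDGES = [(1, 2), (2, 3), (3, 4)]
ALIGNMENT = {"a": 1, "b": 2, "c": 3}

score = edge_correctness(SMALL_EDGES, LARGE_EDGES, ALIGNMENT)
```

In this toy case two of the three triangle edges land on path edges, so the score is 2/3. Real aligners report further measures alongside this one, such as the symmetric sub-structure score named in the abstract.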
Tactile Teaching: Exploring Protein Structure/Function Using Physical Models
ERIC Educational Resources Information Center
Herman, Tim; Morris, Jennifer; Colton, Shannon; Batiza, Ann; Patrick, Michael; Franzen, Margaret; Goodsell, David S.
2006-01-01
The technology now exists to construct physical models of proteins based on atomic coordinates of solved structures. We review here our recent experiences in using physical models to teach concepts of protein structure and function at both the high school and the undergraduate levels. At the high school level, physical models are used in a…
ERIC Educational Resources Information Center
Terrell, Cassidy R.; Listenberger, Laura L.
2017-01-01
Recognizing that undergraduate students can benefit from analysis of 3D protein structure and function, we have developed a multiweek, inquiry-based molecular visualization project for Biochemistry I students. This project uses a virtual model of cyclooxygenase-1 (COX-1) to guide students through multiple levels of protein structure analysis. The…
Andersen, Tonni Grube; Nintemann, Sebastian J.; Marek, Magdalena; Halkier, Barbara A.; Schulz, Alexander; Burow, Meike
2016-01-01
When investigating interactions between two proteins with complementary reporter tags in yeast two-hybrid or split GFP assays, it remains troublesome to discriminate true- from false-negative results and challenging to compare the level of interaction across experiments. This leads to decreased sensitivity and renders analysis of weak or transient interactions difficult to perform. In this work, we describe the development of reporters that can be chemically induced to dimerize independently of the investigated interactions and thus alleviate these issues. We incorporated our reporters into the widely used split ubiquitin-, bimolecular fluorescence complementation (BiFC)- and Förster resonance energy transfer (FRET)- based methods and investigated different protein-protein interactions in yeast and plants. We demonstrate the functionality of this concept by the analysis of weakly interacting proteins from specialized metabolism in the model plant Arabidopsis thaliana. Our results illustrate that chemically induced dimerization can function as a built-in control for split-based systems that is easily implemented and allows for direct evaluation of functionality. PMID:27282591
The secret life of kinases: insights into non-catalytic signalling functions from pseudokinases.
Jacobsen, Annette V; Murphy, James M
2017-06-15
Over the past decade, our understanding of the mechanisms by which pseudokinases, which comprise ∼10% of the human and mouse kinomes, mediate signal transduction has advanced rapidly with increasing structural, biochemical, cellular and genetic studies. Pseudokinases are the catalytically defective counterparts of conventional, active protein kinases and have been attributed functions as protein interaction domains acting variously as allosteric modulators of conventional protein kinases and other enzymes, as regulators of protein trafficking or localisation, as hubs to nucleate assembly of signalling complexes, and as transmembrane effectors of such functions. Here, by categorising mammalian pseudokinases based on their known functions, we illustrate the mechanistic diversity among these proteins, which can be viewed as a window into understanding the non-catalytic functions that can be exerted by conventional protein kinases. © 2017 The Author(s); published by Portland Press Limited on behalf of the Biochemical Society.
Quality assessment of protein model-structures based on structural and functional similarities.
Konopka, Bogumil M; Nebel, Jean-Christophe; Kotulska, Malgorzata
2012-09-21
Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. A gap between number of protein structures deposited in the World Wide Protein Data Bank and the number of sequenced proteins constantly broadens. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model it from a sequence or partial structural information, e.g. contact maps. Consequently, biologists have the ability to generate automatically a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. GOBA--Gene Ontology-Based Assessment is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. The validation shows that the method is able to discriminate between good and bad model-structures. The best of tested GOBA scores achieved 0.74 and 0.8 as a mean Pearson correlation to the observed quality of models in our CASP8 and CASP9-based validation sets. 
GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models.
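The validation figure quoted in the GOBA abstract, a mean Pearson correlation between predicted scores and observed model quality, can be illustrated with a short sketch. This is only the generic statistic, not GOBA itself. The function names and the toy per-target data are hypothetical and are not CASP results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def mean_pearson(targets):
    """Average the per-target correlation over (predicted, observed) pairs."""
    return sum(pearson(pred, obs) for pred, obs in targets) / len(targets)


# Hypothetical data: one (predicted scores, observed quality) pair per target.
TOY_TARGETS = [
    ([1, 2, 3], [2, 4, 6]),  # perfectly correlated
    ([1, 2, 3], [3, 2, 1]),  # perfectly anti-correlated
]
toy_mean = mean_pearson(TOY_TARGETS)
```

Averaging per target, rather than pooling all models, keeps easy and hard targets from dominating one another. Whether GOBA's validation averaged in exactly this way is an assumption of the sketch.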
Protein subcellular localization prediction using artificial intelligence technology.
Nair, Rajesh; Rost, Burkhard
2008-01-01
Proteins perform many important tasks in living organisms, such as catalysis of biochemical reactions, transport of nutrients, and recognition and transmission of signals. The plethora of aspects of the role of any particular protein is referred to as its "function." One aspect of protein function that has been the target of intensive research by computational biologists is its subcellular localization. Proteins must be localized in the same subcellular compartment to cooperate toward a common physiological function. Aberrant subcellular localization of proteins can result in several diseases, including kidney stones, cancer, and Alzheimer's disease. To date, sequence homology remains the most widely used method for inferring the function of a protein. However, the application of advanced artificial intelligence (AI)-based techniques in recent years has resulted in significant improvements in our ability to predict the subcellular localization of a protein. The prediction accuracy has risen steadily over the years, in large part due to the application of AI-based methods such as hidden Markov models (HMMs), neural networks (NNs), and support vector machines (SVMs), although the availability of larger experimental datasets has also played a role. Automatic methods that mine textual information from the biological literature and molecular biology databases have considerably sped up the process of annotation for proteins for which some information regarding function is available in the literature. State-of-the-art methods based on NNs and HMMs can predict the presence of N-terminal sorting signals extremely accurately. Ab initio methods that predict subcellular localization for any protein sequence using only the native amino acid sequence and features predicted from the native sequence have shown the most remarkable improvements. The prediction accuracy of these methods has increased by over 30% in the past decade. 
The accuracy of these methods is now on par with high-throughput methods for predicting localization, and they are beginning to play an important role in directing experimental research. In this chapter, we review some of the most important methods for the prediction of subcellular localization.
Zhang, Changsheng; Tang, Bo; Wang, Qian; Lai, Luhua
2014-10-01
Target structure-based virtual screening, which employs protein-small molecule docking to identify potential ligands, has been widely used in small-molecule drug discovery. In the present study, we used a protein-protein docking program to identify proteins that bind to a specific target protein. In the testing phase, an all-to-all protein-protein docking run on a large dataset was performed. The three-dimensional rigid docking program SDOCK was used to examine protein-protein docking on all protein pairs in the dataset. Both the binding affinity and features of the binding energy landscape were considered in the scoring function in order to distinguish positive binding pairs from negative binding pairs. Thus, the lowest docking score, the average Z-score, and convergency of the low-score solutions were incorporated in the analysis. The hybrid scoring function was optimized in the all-to-all docking test. The docking method and the hybrid scoring function were then used to screen for proteins that bind to tumor necrosis factor-α (TNFα), which is a well-known therapeutic target for rheumatoid arthritis and other autoimmune diseases. A protein library containing 677 proteins was used for the screen. Proteins with scores among the top 20% were further examined. Sixteen proteins from the top-ranking 67 proteins were selected for experimental study. Two of these proteins showed significant binding to TNFα in an in vitro binding study. The results of the present study demonstrate the power and potential application of protein-protein docking for the discovery of novel binding proteins for specific protein targets. © 2014 Wiley Periodicals, Inc.
A General, Adaptive, Roadmap-Based Algorithm for Protein Motion Computation.
Molloy, Kevin; Shehu, Amarda
2016-03-01
Precious information on protein function can be extracted from a detailed characterization of protein equilibrium dynamics. This remains elusive in wet and dry laboratories, as function-modulating transitions of a protein between functionally-relevant, thermodynamically-stable and meta-stable structural states often span disparate time scales. In this paper we propose a novel, robotics-inspired algorithm that circumvents time-scale challenges by drawing analogies between protein motion and robot motion. The algorithm adapts the popular roadmap-based framework in robot motion computation to handle the more complex protein conformation space and its underlying rugged energy surface. Given known structures representing stable and meta-stable states of a protein, the algorithm yields a time- and energy-prioritized list of transition paths between the structures, with each path represented as a series of conformations. The algorithm balances computational resources between a global search aimed at obtaining a global view of the network of protein conformations and their connectivity and a detailed local search focused on realizing such connections with physically-realistic models. Promising results are presented on a variety of proteins that demonstrate the general utility of the algorithm and its capability to improve the state of the art without employing system-specific insight.
Determining protein function and interaction from genome analysis
Eisenberg, David; Marcotte, Edward M.; Thompson, Michael J.; Pellegrini, Matteo; Yeates, Todd O.
2004-08-03
A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
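The phylogenetic-profile idea described above can be sketched in a few lines of Python. The sketch assumes that homolog detection has already been done and that each protein comes with the set of genomes in which a homolog was found. The genome labels, protein names, function names and the mismatch threshold are all hypothetical and are not part of the patented method.

```python
from itertools import combinations


def phylogenetic_profiles(presence, genomes):
    """Turn 'protein -> set of genomes with a homolog' into 0/1 profile tuples."""
    return {
        protein: tuple(int(genome in found) for genome in genomes)
        for protein, found in presence.items()
    }


def linked_pairs(profiles, max_mismatches=0):
    """Return protein pairs whose profiles differ in at most max_mismatches genomes."""
    links = []
    for a, b in combinations(sorted(profiles), 2):
        mismatches = sum(x != y for x, y in zip(profiles[a], profiles[b]))
        if mismatches <= max_mismatches:
            links.append((a, b))
    return links


# Hypothetical input: four genomes and four proteins.
GENOMES = ["g1", "g2", "g3", "g4"]
PRESENCE = {
    "P1": {"g1", "g3"},
    "P2": {"g1", "g3"},
    "P3": {"g2", "g4"},
    "P4": {"g1", "g2", "g3"},
}

profiles = phylogenetic_profiles(PRESENCE, GENOMES)
exact_links = linked_pairs(profiles)        # identical profiles only
loose_links = linked_pairs(profiles, 1)     # allow one differing genome
```

Here P1 and P2 share the profile (1, 0, 1, 0) and are reported as a candidate functional link, while P4 joins them only when one mismatch is tolerated. A practical implementation would also need a statistical test, since identical profiles can arise by chance when few genomes are compared.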
Disease gene classification with metagraph representations.
Kircali Ata, Sezin; Fang, Yuan; Wu, Min; Li, Xiao-Li; Xiao, Xiaokui
2017-12-01
Protein-protein interaction (PPI) networks play an important role in studying the functional roles of proteins, including their association with diseases. However, protein interaction networks are not sufficient without the support of additional biological knowledge about proteins, such as their molecular functions and biological processes. To complement and enrich PPI networks, we propose to exploit biological properties of individual proteins. More specifically, we integrate keywords describing protein properties into the PPI network, and construct a novel PPI-Keywords (PPIK) network consisting of both proteins and keywords as two different types of nodes. As disease proteins tend to have similar topological characteristics on the PPIK network, we further propose to represent proteins with metagraphs. Unlike a traditional network motif or subgraph, a metagraph can capture a particular topological arrangement involving the interactions/associations between both proteins and keywords. Based on the novel metagraph representations for proteins, we further build classifiers for disease protein classification through supervised learning. Our experiments on three different PPI databases demonstrate that the proposed method consistently improves disease protein prediction across various classifiers, by 15.3% in AUC on average. It outperforms the baselines including the diffusion-based methods (e.g., RWR) and the module-based methods by 13.8-32.9% for overall disease protein prediction. For predicting breast cancer genes, it outperforms RWR, PRINCE and the module-based baselines by 6.6-14.2%. Finally, our predictions also turn out to have better correlations with literature findings from PubMed. Copyright © 2017 Elsevier Inc. All rights reserved.
Biopores/membrane proteins in synthetic polymer membranes.
Garni, Martina; Thamboo, Sagana; Schoenenberger, Cora-Ann; Palivan, Cornelia G
2017-04-01
Mimicking cell membranes by simple models based on the reconstitution of membrane proteins in lipid bilayers represents a straightforward approach to understand biological function of these proteins. This biomimetic strategy has been extended to synthetic membranes that have advantages in terms of chemical and mechanical stability, thus providing more robust hybrid membranes. We present here how membrane proteins and biopores have been inserted both in the membrane of nanosized and microsized compartments, and in planar membranes under various conditions. Such bio-hybrid membranes have new properties (as for example, permeability to ions/molecules), and functionality depending on the specificity of the inserted biomolecules. Interestingly, membrane proteins can be functionally inserted in synthetic membranes provided these have appropriate properties to overcome the high hydrophobic mismatch between the size of the biomolecule and the membrane thickness. Functional insertion of membrane proteins and biopores in synthetic membranes of compartments or in planar membranes is possible by an appropriate selection of the amphiphilic copolymers, and conditions of the self-assembly process. These hybrid membranes have new properties and functionality based on the specificity of the biomolecules and the nature of the synthetic membranes. Bio-hybrid membranes represent new solutions for the development of nanoreactors, artificial organelles or active surfaces/membranes that, by further gaining in complexity and functionality, will promote translational applications. This article is part of a Special Issue entitled: Lipid order/lipid defects and lipid-control of protein activity edited by Dirk Schneider. Copyright © 2016. Published by Elsevier B.V.
Peterson, Lenna X; Shin, Woong-Hee; Kim, Hyungrae; Kihara, Daisuke
2018-03-01
We report our group's performance for protein-protein complex structure prediction and scoring in Round 37 of the Critical Assessment of PRediction of Interactions (CAPRI), an objective assessment of protein-protein complex modeling. We demonstrated noticeable improvement in both prediction and scoring compared to previous rounds of CAPRI, with our human predictor group near the top of the rankings and our server scorer group at the top. This is the first time in CAPRI that a server has been the top scorer group. To predict protein-protein complex structures, we used both multi-chain template-based modeling (TBM) and our protein-protein docking program, LZerD. LZerD represents protein surfaces using 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. Because 3DZD are a soft representation of the protein surface, LZerD is tolerant to small conformational changes, making it well suited to docking unbound and TBM structures. The key to our improved performance in CAPRI Round 37 was to combine multi-chain TBM and docking. As opposed to our previous strategy of performing docking for all target complexes, we used TBM when multi-chain templates were available and docking otherwise. We also describe the combination of multiple scoring functions used by our server scorer group, which achieved the top rank for the scorer phase. © 2017 Wiley Periodicals, Inc.
High-throughput screening based on label-free detection of small molecule microarrays
NASA Astrophysics Data System (ADS)
Zhu, Chenggang; Fei, Yiyan; Zhu, Xiangdong
2017-02-01
Based on small-molecule microarrays (SMMs) and an oblique-incidence reflectivity difference (OI-RD) scanner, we have developed a novel high-throughput preliminary drug screening platform based on label-free monitoring of direct interactions between target proteins and immobilized small molecules. The screening platform is especially attractive for screening compounds against targets of unknown function and/or structure that are not compatible with functional assay development. In this screening platform, the OI-RD scanner serves as a label-free detection instrument that is able to monitor about 15,000 biomolecular interactions in a single experiment without the need to label any biomolecule. In addition, SMMs serve as a novel format for high-throughput screening by immobilizing tens of thousands of different compounds on a single phenyl-isocyanate functionalized glass slide. Using this platform, we sequentially screened five target proteins (purified target proteins or cell lysate containing target protein) in high-throughput and label-free mode. We found hits for each target protein, and the inhibitory effects of some hits were confirmed by subsequent functional assays. Compared to traditional high-throughput screening assays, the platform has many advantages, including minimal sample consumption, minimal distortion of interactions through label-free detection, and multi-target screening analysis, and it has great potential as a complementary screening platform in the field of drug discovery.
Zhang, Xu; Zhu, Qing; Tian, Tian; Zhao, Changlong; Zang, Jianye; Xue, Ting; Sun, Baolin
2015-05-15
It has been widely recognized that small RNAs (sRNAs) play important roles in physiology and virulence control in bacteria. In Staphylococcus aureus, many sRNAs have been identified and some of them have been functionally studied. Since it is difficult to identify RNA-binding proteins (RBPs), very little is known about the RBPs in S. aureus, especially those associated with sRNAs. Here we adopted a tRNA scaffold streptavidin aptamer (tRSA) based pull-down assay to identify RBPs in S. aureus. The tethered RNA was successfully captured by the streptavidin magnetic beads, and proteins binding to RNAIII were isolated and analyzed by mass spectrometry. We identified 81 proteins and heterologously expressed 9 of them in Escherichia coli. The binding ability of the recombinant proteins with RNAIII was further analyzed by electrophoretic mobility shift assay, and the results indicate that proteins CshA, RNase J2, Era, Hu, WalR, Pyk, and FtsZ can bind to RNAIII. This study suggests that some proteins can bind to RNAIII in S. aureus and may be involved in RNAIII function. The tRSA-based pull-down assay is an effective method to search for RBPs in bacteria, which should facilitate the identification and functional study of RBPs in diverse bacterial species.
Baker, Max O D G; Shanmugam, Nirukshan; Pham, Chi L L; Strange, Merryn; Steain, Megan; Sunde, Margaret
2018-05-05
The Receptor-interacting protein kinase Homotypic Interaction Motif (RHIM) is an amino acid sequence that mediates multiple protein:protein interactions in the mammalian programmed cell death pathway known as necroptosis. At least one key RHIM-based complex has been shown to have a functional amyloid fibril structure, which provides a stable hetero-oligomeric platform for downstream signaling. RHIMs and related motifs are present in immunity-related proteins across nature, from viruses to fungi to metazoans. Necroptosis is a hallmark feature of cellular clearance of infection. For this reason, numerous pathogens, including viruses and bacteria, have developed varied methods to modulate necroptosis, focusing on inhibiting RHIM:RHIM interactions, and thus their downstream cell death effects. This review will discuss current understanding of RHIM:RHIM interactions in normal cellular activation of necroptosis, from a structural and cell biology perspective. It will compare the mechanisms by which pathogens subvert these interactions in order to maintain their replicative and infective cycles and consider the similarities between RHIMs and other functional amyloid-forming proteins associated with cell death and innate immunity. It will discuss the implications of the heteromeric nature and structure of RHIM-based amyloid complexes in the context of other functional amyloids. Copyright © 2018. Published by Elsevier Ltd.
Narayan, Vikram; Halada, Petr; Hernychová, Lenka; Chong, Yuh Ping; Žáková, Jitka; Hupp, Ted R; Vojtesek, Borivoj; Ball, Kathryn L
2011-04-22
The interferon-regulated transcription factor and tumor suppressor protein IRF-1 is predicted to be largely disordered outside of the DNA-binding domain. One of the advantages of intrinsically disordered protein domains is thought to be their ability to take part in multiple, specific but low affinity protein interactions; however, relatively few IRF-1-interacting proteins have been described. The recent identification of a functional binding interface for the E3-ubiquitin ligase CHIP within the major disordered domain of IRF-1 led us to ask whether this region might be employed more widely by regulators of IRF-1 function. Here we describe the use of peptide aptamer-based affinity chromatography coupled with mass spectrometry to define a multiprotein binding interface on IRF-1 (Mf2 domain; amino acids 106-140) and to identify Mf2-binding proteins from A375 cells. Based on their function as known transcriptional regulators, a selection of the Mf2 domain-binding proteins (NPM1, TRIM28, and YB-1) have been validated using in vitro and cell-based assays. Interestingly, although NPM1, TRIM28, and YB-1 all bind to the Mf2 domain, they have differing amino acid specificities, demonstrating the degree of combinatorial diversity and specificity available through linear interaction motifs.
Niescierowicz, Katarzyna; Caro, Lydia; Cherezov, Vadim; Vivaudou, Michel; Moreau, Christophe J
2014-01-07
Structural studies of G protein-coupled receptors (GPCRs) extensively use the insertion of globular soluble protein domains to facilitate their crystallization. However, when inserted in the third intracellular loop (i3 loop), the soluble protein domain disrupts their coupling to G proteins and impedes functional characterization of the GPCRs by standard G protein-based assays. Therefore, activity tests of crystallization-optimized GPCRs are essentially limited to their ligand binding properties using radioligand binding assays. Functional characterization of additional thermostabilizing mutations requires the insertion of similar mutations in the wild-type receptor to allow G protein-activation tests. We demonstrate that ion channel-coupled receptor technology is a complementary approach for a comprehensive functional characterization of crystallization-optimized GPCRs and potentially of any engineered GPCR. Ligand-induced conformational changes of the GPCRs are translated into an electrical signal and detected by simple current recordings, even though binding of G proteins is sterically blocked by the added soluble protein domain. Copyright © 2014 Elsevier Ltd. All rights reserved.
2012-01-01
Background The NCBI Conserved Domain Database (CDD) consists of a collection of multiple sequence alignments of protein domains that are at various stages of being manually curated into evolutionary hierarchies based on conserved and divergent sequence and structural features. These domain models are annotated to provide insights into the relationships between sequence, structure and function via web-based BLAST searches. Results Here we automate the generation of conserved domain (CD) hierarchies using a combination of heuristic and Markov chain Monte Carlo (MCMC) sampling procedures and starting from a (typically very large) multiple sequence alignment. This procedure relies on statistical criteria to define each hierarchy based on the conserved and divergent sequence patterns associated with protein functional-specialization. At the same time this facilitates the sequence and structural annotation of residues that are functionally important. These statistical criteria also provide a means to objectively assess the quality of CD hierarchies, a non-trivial task considering that the protein subgroups are often very distantly related—a situation in which standard phylogenetic methods can be unreliable. Our aim here is to automatically generate (typically sub-optimal) hierarchies that, based on statistical criteria and visual comparisons, are comparable to manually curated hierarchies; this serves as the first step toward the ultimate goal of obtaining optimal hierarchical classifications. A plot of runtimes for the most time-intensive (non-parallelizable) part of the algorithm indicates a nearly linear time complexity so that, even for the extremely large Rossmann fold protein class, results were obtained in about a day. Conclusions This approach automates the rapid creation of protein domain hierarchies and thus will eliminate one of the most time consuming aspects of conserved domain database curation. 
At the same time, it also facilitates protein domain annotation by identifying those pattern residues that most distinguish each protein domain subgroup from other related subgroups. PMID:22726767
PatchSurfers: Two methods for local molecular property-based binding ligand prediction.
Shin, Woong-Hee; Bures, Mark Gregory; Kihara, Daisuke
2016-01-15
Protein function prediction is an active area of research in computational biology. Function prediction can help biologists make hypotheses for characterization of genes and help interpret biological assays, and thus is a productive area for collaboration between experimental and computational biologists. Among various function prediction methods, predicting binding ligand molecules for a target protein is an important class because ligand binding events for a protein are usually closely intertwined with the proteins' biological function, and also because predicted binding ligands can often be directly tested by biochemical assays. Binding ligand prediction methods can be classified into two types: those which are based on protein-protein (or pocket-pocket) comparison, and those that compare a target pocket directly to ligands. Recently, our group proposed two computational binding ligand prediction methods, Patch-Surfer, which is a pocket-pocket comparison method, and PL-PatchSurfer, which compares a pocket to ligand molecules. The two programs apply surface patch-based descriptions to calculate similarity or complementarity between molecules. A surface patch is characterized by physicochemical properties such as shape, hydrophobicity, and electrostatic potentials. These properties on the surface are represented using three-dimensional Zernike descriptors (3DZD), which are based on a series expansion of a 3 dimensional function. Utilizing 3DZD for describing the physicochemical properties has two main advantages: (1) rotational invariance and (2) fast comparison. Here, we introduce Patch-Surfer and PL-PatchSurfer with an emphasis on PL-PatchSurfer, which is more recently developed. Illustrative examples of PL-PatchSurfer performance on binding ligand prediction as well as virtual drug screening are also provided. Copyright © 2015 Elsevier Inc. All rights reserved.
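The "fast comparison" advantage mentioned above can be illustrated with a minimal sketch. It assumes each surface patch has already been reduced to a fixed-length vector of 3DZD invariants; the vectors below are invented numbers, and the function names and the choice of Euclidean distance are illustrative assumptions, not the published Patch-Surfer or PL-PatchSurfer scoring.

```python
import math


def descriptor_distance(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query, library):
    """Return library names sorted from most to least similar to the query."""
    return sorted(library, key=lambda name: descriptor_distance(query, library[name]))


# Hypothetical descriptor vectors (real 3DZD vectors are much longer).
query_patch = [0.9, 0.1, 0.4]
patch_library = {
    "patch_A": [0.8, 0.2, 0.4],
    "patch_B": [0.1, 0.9, 0.7],
    "patch_C": [0.9, 0.1, 0.5],
}
ranking = rank_by_similarity(query_patch, patch_library)
```

Because the descriptors are rotation invariant, no structural superposition is needed before comparing two patches, which is why a plain vector distance is enough in this sketch.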
Huang, Wenwen; Ebrahimi, Davoud; Dinjaski, Nina; Tarakanova, Anna; Buehler, Markus J; Wong, Joyce Y; Kaplan, David L
2017-04-18
Tailored biomaterials with tunable functional properties are crucial for a variety of task-specific applications ranging from healthcare to sustainable, novel bio-nanodevices. To generate polymeric materials with predictive functional outcomes, exploiting designs from nature while morphing them toward non-natural systems offers an important strategy. Silks are Nature's building blocks and are produced by arthropods for a variety of uses that are essential for their survival. Due to the genetic control of encoded protein sequence, mechanical properties, biocompatibility, and biodegradability, silk proteins have been selected as prototype models to emulate for the tunable designs of biomaterial systems. The bottom-up strategy of material design opens important opportunities to create predictive functional outcomes, following the exquisite polymeric templates inspired by silks. Recombinant DNA technology provides a systematic approach to recapitulate, vary, and evaluate the core structure peptide motifs in silks and then biosynthesize silk-based polymers by design. Post-biosynthesis processing allows for another dimension of material design by controlled or assisted assembly. Multiscale modeling, from the theoretical perspective, provides strategies to explore interactions at different length scales, leading to selective material properties. Synergy among experimental and modeling approaches can provide new and more rapid insights into the most appropriate structure-function relationships to pursue while also furthering our understanding in terms of the range of silk-based systems that can be generated. This approach utilizes nature as a blueprint for initial polymer designs with useful functions (e.g., silk fibers) but also employs modeling-guided experiments to expand the initial polymer designs into new domains of functional materials that do not exist in nature. 
The overall path to these new functional outcomes is greatly accelerated via the integration of modeling with experiment. In this Account, we summarize recent advances in understanding and functionalization of silk-based protein systems, with a focus on the integration of simulation and experiment for biopolymer design. Spider silk was selected as an exemplary protein to address the fundamental challenges in polymer designs, including specific insights into the role of molecular weight, hydrophobic/hydrophilic partitioning, and shear stress for silk fiber formation. To expand current silk designs toward biointerfaces and stimuli responsive materials, peptide modules from other natural proteins were added to silk designs to introduce new functions, exploiting the modular nature of silk proteins and fibrous proteins in general. The integrated approaches explored suggest that protein folding, silk volume fraction, and protein amino acid sequence changes (e.g., mutations) are critical factors for functional biomaterial designs. In summary, the integrated modeling-experimental approach described in this Account suggests a more rationally directed and more rapid method for the design of polymeric materials. It is expected that this combined use of experimental and computational approaches has a broad applicability not only for silk-based systems, but also for other polymer and composite materials.
PredictProtein—an open resource for online prediction of protein structural and functional features
Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard
2014-01-01
PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431
Swanson, Jon; Audie, Joseph
2018-01-01
A fundamental and unsolved problem in biophysical chemistry is the development of a computationally simple, physically intuitive, and generally applicable method for accurately predicting and physically explaining protein-protein binding affinities from protein-protein interaction (PPI) complex coordinates. Here, we propose that the simplification of a previously described six-term PPI scoring function to a four term function results in a simple expression of all physically and statistically meaningful terms that can be used to accurately predict and explain binding affinities for a well-defined subset of PPIs that are characterized by (1) crystallographic coordinates, (2) rigid-body association, (3) normal interface size, and hydrophobicity and hydrophilicity, and (4) high quality experimental binding affinity measurements. We further propose that the four-term scoring function could be regarded as a core expression for future development into a more general PPI scoring function. Our work has clear implications for PPI modeling and structure-based drug design.
Optimal network alignment with graphlet degree vectors.
Milenković, Tijana; Ng, Weng Leong; Hayes, Wayne; Przulj, Natasa
2010-06-30
Important biological information is encoded in the topology of biological networks. Comparative analyses of biological networks are proving to be valuable, as they can lead to transfer of knowledge between species and give deeper insights into biological function, disease, and evolution. We introduce a new method that uses the Hungarian algorithm to produce optimal global alignment between two networks using any cost function. We design a cost function based solely on network topology and use it in our network alignment. Our method can be applied to any two networks, not just biological ones, since it is based only on network topology. We use our new method to align protein-protein interaction networks of two eukaryotic species and demonstrate that our alignment exposes large and topologically complex regions of network similarity. At the same time, our alignment is biologically valid, since many of the aligned protein pairs perform the same biological function. From the alignment, we predict function of yet unannotated proteins, many of which we validate in the literature. Also, we apply our method to find topological similarities between metabolic networks of different species and build phylogenetic trees based on our network alignment score. The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments. Our method detects topologically similar regions in large networks that are statistically significant. It does this independent of protein sequence or any other information external to network topology.
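As a rough illustration of the alignment step, the sketch below runs the Hungarian algorithm on a cost matrix derived only from network topology. The cost used here (absolute difference in node degree) is a deliberately simplified stand-in for the graphlet-degree-based cost named in the title, and the two toy networks and all function names are hypothetical.

```python
def hungarian(cost):
    """Minimum-cost assignment of rows to columns (requires rows <= columns).

    Returns a list where entry i is the column index assigned to row i.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0] * (n + 1)      # row potentials
    v = [0] * (m + 1)      # column potentials
    p = [0] * (m + 1)      # p[j] = row (1-based) matched to column j
    way = [0] * (m + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:              # flip matches along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    return assignment


def degree_cost(adj_a, adj_b):
    """Toy topological cost: absolute difference in node degree."""
    nodes_a, nodes_b = sorted(adj_a), sorted(adj_b)
    cost = [[abs(len(adj_a[a]) - len(adj_b[b])) for b in nodes_b] for a in nodes_a]
    return nodes_a, nodes_b, cost


# Two hypothetical three-node networks given as adjacency sets.
net_a = {"a1": {"a2"}, "a2": {"a1", "a3"}, "a3": {"a2"}}
net_b = {"b1": {"b2", "b3"}, "b2": {"b1"}, "b3": {"b1"}}
nodes_a, nodes_b, cost = degree_cost(net_a, net_b)
columns = hungarian(cost)
alignment = {nodes_a[i]: nodes_b[j] for i, j in enumerate(columns)}
total_cost = sum(cost[i][j] for i, j in enumerate(columns))
```

Any other cost matrix can be passed to `hungarian` unchanged, which mirrors the abstract's point that the alignment is optimal for whatever cost function is chosen.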
2013-01-01
Despite its prominence for characterization of complex mixtures, LC–MS/MS frequently fails to identify many proteins. Network-based analysis methods, based on protein–protein interaction networks (PPINs), biological pathways, and protein complexes, are useful for recovering non-detected proteins, thereby enhancing analytical resolution. However, network-based analysis methods do come in varied flavors for which the respective efficacies are largely unknown. We compare the recovery performance and functional insights from three distinct instances of PPIN-based approaches, viz., Proteomics Expansion Pipeline (PEP), Functional Class Scoring (FCS), and Maxlink, in a test scenario of valproic acid (VPA)-treated mice. We find that the most comprehensive functional insights, as well as best non-detected protein recovery performance, are derived from FCS utilizing real biological complexes. This outstrips other network-based methods such as Maxlink or Proteomics Expansion Pipeline (PEP). From FCS, we identified known biological complexes involved in epigenetic modifications, neuronal system development, and cytoskeletal rearrangements. This is congruent with the observed phenotype where adult mice showed an increase in dendritic branching to allow the rewiring of visual cortical circuitry and an improvement in their visual acuity when tested behaviorally. In addition, PEP also identified a novel complex, comprising YWHAB, NR1, NR2B, ACTB, and TJP1, which is functionally related to the observed phenotype. Although our results suggest different network analysis methods can produce different results, on the whole, the findings are mutually supportive. More critically, the non-overlapping information each provides can provide greater holistic understanding of complex phenotypes. PMID:23557376
Genomewide Function Conservation and Phylogeny in the Herpesviridae
Albà, M. Mar; Das, Rhiju; Orengo, Christine A.; Kellam, Paul
2001-01-01
The Herpesviridae are a large group of well-characterized double-stranded DNA viruses for which many complete genome sequences have been determined. We have extracted protein sequences from all predicted open reading frames of 19 herpesvirus genomes. Sequence comparison and protein sequence clustering methods have been used to construct herpesvirus protein homologous families. This resulted in 1692 proteins being clustered into 243 multiprotein families and 196 singleton proteins. Predicted functions were assigned to each homologous family based on genome annotation and published data and each family classified into seven broad functional groups. Phylogenetic profiles were constructed for each herpesvirus from the homologous protein families and used to determine conserved functions and genomewide phylogenetic trees. These trees agreed with molecular-sequence-derived trees and allowed greater insight into the phylogeny of ungulate and murine gammaherpesviruses. PMID:11156614
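The profile construction described above can be sketched as follows. The family and genome names are invented, and the Jaccard distance is an assumed choice of profile distance, not necessarily the measure used in the study; a tree-building method would then take the resulting distance table as input.

```python
def build_profiles(family_members):
    """Map each genome to the set of protein families it contains."""
    profiles = {}
    for family, genomes in family_members.items():
        for genome in genomes:
            profiles.setdefault(genome, set()).add(family)
    return profiles


def jaccard_distance(a, b):
    """1 - |intersection| / |union| for two sets of family names."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def distance_table(profiles):
    """Pairwise profile distances between all genomes."""
    names = sorted(profiles)
    return {(x, y): jaccard_distance(profiles[x], profiles[y])
            for x in names for y in names}


def conserved_families(family_members, genomes):
    """Families present in every listed genome."""
    return sorted(f for f, present in family_members.items()
                  if set(genomes) <= present)


# Hypothetical homologous families and the genomes that encode them.
families = {
    "fam_polymerase": {"virus_1", "virus_2", "virus_3"},
    "fam_capsid": {"virus_1", "virus_2", "virus_3"},
    "fam_x": {"virus_1", "virus_2"},
    "fam_y": {"virus_3"},
}
profiles = build_profiles(families)
distances = distance_table(profiles)
core = conserved_families(families, profiles)
```

In this toy input, the two genomes with identical family content are at distance zero, while the third differs in one family each way.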
Roles for text mining in protein function prediction.
Verspoor, Karin M
2014-01-01
The Human Genome Project has provided science with a hugely valuable resource: the blueprints for life; the specification of all of the genes that make up a human. While the genes have all been identified and deciphered, it is proteins that are the workhorses of the human body: they are essential to virtually all cell functions and are the primary mechanism through which biological function is carried out. Hence in order to fully understand what happens at a molecular level in biological organisms, and eventually to enable development of treatments for diseases where some aspect of a biological system goes awry, we must understand the functions of proteins. However, experimental characterization of protein function cannot scale to the vast amount of DNA sequence data now available. Computational protein function prediction has therefore emerged as a problem at the forefront of modern biology (Radivojac et al., Nat Methods 10(13):221-227, 2013). Within the varied approaches to computational protein function prediction that have been explored, there are several that make use of biomedical literature mining. These methods take advantage of information in the published literature to associate specific proteins with specific protein functions. In this chapter, we introduce two main strategies for doing this: association of function terms, represented as Gene Ontology terms (Ashburner et al., Nat Genet 25(1):25-29, 2000), to proteins based on information in published articles, and a paradigm called LEAP-FS (Literature-Enhanced Automated Prediction of Functional Sites) in which literature mining is used to validate the predictions of an orthogonal computational protein function prediction method.
Sasse, Alexander; de Vries, Sjoerd J; Schindler, Christina E M; de Beauchêne, Isaure Chauvot; Zacharias, Martin
2017-01-01
Protein-protein docking protocols aim to predict the structures of protein-protein complexes based on the structure of individual partners. Docking protocols usually include several steps of sampling, clustering, refinement and re-scoring. The scoring step is one of the bottlenecks in the performance of many state-of-the-art protocols. The performance of scoring functions depends on the quality of the generated structures and its coupling to the sampling algorithm. A tool kit, GRADSCOPT (GRid Accelerated Directly SCoring OPTimizing), was designed to allow rapid development and optimization of different knowledge-based scoring potentials for specific objectives in protein-protein docking. Different atomistic and coarse-grained potentials can be created by a grid-accelerated directly scoring dependent Monte-Carlo annealing or by a linear regression optimization. We demonstrate that the scoring functions generated by our approach are similar to or even outperform state-of-the-art scoring functions for predicting near-native solutions. Of additional importance, we find that potentials specifically trained to identify the native bound complex perform rather poorly on identifying acceptable or medium quality (near-native) solutions. In contrast, atomistic long-range contact potentials can increase the average fraction of near-native poses by up to a factor 2.5 in the best scored 1% decoys (compared to existing scoring), emphasizing the need of specific docking potentials for different steps in the docking protocol.
Computational protein design-the next generation tool to expand synthetic biology applications.
Gainza-Cirauqui, Pablo; Correia, Bruno Emanuel
2018-05-02
One powerful approach to engineer synthetic biology pathways is the assembly of proteins sourced from one or more natural organisms. However, synthetic pathways often require custom functions or biophysical properties not displayed by natural proteins, limitations that could be overcome through modern protein engineering techniques. Structure-based computational protein design is a powerful tool to engineer new functional capabilities in proteins, and it is beginning to have a profound impact in synthetic biology. Here, we review efforts to increase the capabilities of synthetic biology using computational protein design. We focus primarily on computationally designed proteins not only validated in vitro, but also shown to modulate different activities in living cells. Efforts made to validate computational designs in cells can illustrate both the challenges and opportunities in the intersection of protein design and synthetic biology. We also highlight protein design approaches, which although not validated as conveyors of new cellular function in situ, may have rapid and innovative applications in synthetic biology. We foresee that in the near-future, computational protein design will vastly expand the functional capabilities of synthetic cells. Copyright © 2018. Published by Elsevier Ltd.
Du, Pufeng; Wang, Lusheng
2014-01-01
One of the fundamental tasks in biology is to identify the functions of all proteins to reveal the primary machinery of a cell. Knowledge of the subcellular locations of proteins will provide key hints to reveal their functions and to understand the intricate pathways that regulate biological processes at the cellular level. Protein subcellular location prediction has been extensively studied in the past two decades. Many methods have been developed based on protein primary sequences as well as on the protein-protein interaction network. In this paper, we propose to use the protein-protein interaction network as an infrastructure to integrate existing sequence-based predictors. When predicting the subcellular locations of a given protein, not only the protein itself but also all its interacting partners were considered. Unlike existing methods, our method requires neither comprehensive knowledge of the protein-protein interaction network nor experimentally annotated subcellular locations for most proteins in the network. In addition, our method can be used as a framework to integrate multiple predictors. Our method achieved an absolute-true rate of 56% on the human proteome, which is higher than that of state-of-the-art methods. PMID:24466278
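The idea of considering a protein together with its interacting partners can be sketched as below. This is only one plausible way to combine scores, under stated assumptions: the weighted average, the `self_weight` parameter, the single-label output, and all protein names and scores are hypothetical and are not taken from the paper.

```python
def combine_scores(protein, predictions, partners, self_weight=0.5):
    """Blend a protein's own location scores with the mean scores of its partners.

    predictions: protein -> {location: score} from a sequence-based predictor.
    partners:    protein -> list of interacting proteins.
    """
    own = predictions[protein]
    neighbours = [predictions[p] for p in partners.get(protein, ()) if p in predictions]
    if not neighbours:
        return dict(own)  # no network evidence: fall back to the sequence score
    combined = {}
    for location, score in own.items():
        mean = sum(n.get(location, 0.0) for n in neighbours) / len(neighbours)
        combined[location] = self_weight * score + (1.0 - self_weight) * mean
    return combined


# Hypothetical predictor output and interaction list.
predictions = {
    "P1": {"nucleus": 0.45, "cytoplasm": 0.55},
    "P2": {"nucleus": 0.90, "cytoplasm": 0.10},
    "P3": {"nucleus": 0.80, "cytoplasm": 0.20},
}
partners = {"P1": ["P2", "P3"]}

combined = combine_scores("P1", predictions, partners)
best_location = max(combined, key=combined.get)
```

In this toy case the sequence score alone slightly favours the cytoplasm for P1, but its partners shift the combined call to the nucleus. Scores from several predictors could be blended the same way by calling the function once per predictor and averaging the results.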
Nicholson, Judith; Scherl, Alex; Way, Luke; Blackburn, Elizabeth A; Walkinshaw, Malcolm D; Ball, Kathryn L; Hupp, Ted R
2014-06-01
Linear motifs mediate protein-protein interactions (PPI) that allow expansion of a target protein interactome at a systems level. This study uses a proteomics approach and linear motif sub-stratifications to expand on PPIs of MDM2. MDM2 is a multi-functional protein with over one hundred known binding partners not stratified by hierarchy or function. A new linear motif based on a MDM2 interaction consensus is used to select novel MDM2 interactors based on Nutlin-3 responsiveness in a cell-based proteomics screen. MDM2 binds a subset of peptide motifs corresponding to real proteins with a range of allosteric responses to MDM2 ligands. We validate cyclophilin B as a novel protein with a consensus MDM2 binding motif that is stabilised by Nutlin-3 in vivo, thus identifying one of the few known interactors of MDM2 that is stabilised by Nutlin-3. These data invoke two modes of peptide binding at the MDM2 N-terminus that rely on a consensus core motif to control the equilibrium between MDM2 binding proteins. This approach stratifies MDM2 interacting proteins based on the linear motif feature and provides a new biomarker assay to define clinically relevant Nutlin-3 responsive MDM2 interactors. Copyright © 2014 Elsevier Inc. All rights reserved.
Porter Starr, Kathryn N; Pieper, Carl F; Orenduff, Melissa C; McDonald, Shelley R; McClure, Luisa B; Zhou, Run; Payne, Martha E; Bales, Connie W
2016-10-01
Obesity is a significant cause of functional limitations in older adults; yet, concerns that weight reduction could diminish muscle along with fat mass have impeded progress toward an intervention. Meal-based enhancement of protein intake could protect function and/or lean mass but has not been studied during geriatric obesity reduction. In this 6-month randomized controlled trial, 67 obese (body mass index ≥30kg/m(2)) older (≥60 years) adults with a Short Physical Performance Battery score of 4-10 were randomly assigned to a traditional (Control) weight loss regimen or one with higher protein intake (>30g) at each meal (Protein). All participants were prescribed a hypo-caloric diet, and weighed and provided dietary guidance weekly. Physical function (Short Physical Performance Battery) and lean mass (BOD POD), along with secondary measures, were assessed at 0, 3, and 6 months. At the 6-month endpoint, there was significant (p < .001) weight loss in both the Control (-7.5±6.2kg) and Protein (-8.7±7.4kg) groups. Both groups also improved function but the increase in the Protein (+2.4±1.7 units; p < .001) was greater than in the Control (+0.9±1.7 units; p < .01) group (p = .02). Obese, functionally limited older adults undergoing a 6-month weight loss intervention with a meal-based enhancement of protein quantity and quality lost similar amounts of weight but had greater functional improvements relative to the Control group. If confirmed, this dietary approach could have important implications for improving the functional status of this vulnerable population (ClinicalTrials.gov identifier: NCT01715753). © The Author 2016. Published by Oxford University Press on behalf of The Gerontological Society of America.
Pieper, Carl F.; Orenduff, Melissa C.; McDonald, Shelley R.; McClure, Luisa B.; Zhou, Run; Payne, Martha E.; Bales, Connie W.
2016-01-01
Abstract Background: Obesity is a significant cause of functional limitations in older adults; yet, concerns that weight reduction could diminish muscle along with fat mass have impeded progress toward an intervention. Meal-based enhancement of protein intake could protect function and/or lean mass but has not been studied during geriatric obesity reduction. Methods: In this 6-month randomized controlled trial, 67 obese (body mass index ≥30kg/m2) older (≥60 years) adults with a Short Physical Performance Battery score of 4–10 were randomly assigned to a traditional (Control) weight loss regimen or one with higher protein intake (>30g) at each meal (Protein). All participants were prescribed a hypo-caloric diet, and weighed and provided dietary guidance weekly. Physical function (Short Physical Performance Battery) and lean mass (BOD POD), along with secondary measures, were assessed at 0, 3, and 6 months. Results: At the 6-month endpoint, there was significant (p < .001) weight loss in both the Control (−7.5±6.2kg) and Protein (−8.7±7.4kg) groups. Both groups also improved function but the increase in the Protein (+2.4±1.7 units; p < .001) was greater than in the Control (+0.9±1.7 units; p < .01) group (p = .02). Conclusion: Obese, functionally limited older adults undergoing a 6-month weight loss intervention with a meal-based enhancement of protein quantity and quality lost similar amounts of weight but had greater functional improvements relative to the Control group. If confirmed, this dietary approach could have important implications for improving the functional status of this vulnerable population (ClinicalTrials.gov identifier: NCT01715753). PMID:26786203
Karmakar, Shilpita; Saha, Sutapa; Banerjee, Debasis; Chakrabarti, Abhijit
2015-01-01
Harris platelet syndrome (HPS), also known as asymptomatic constitutional macrothrombocytopenia (ACMT), is an autosomal dominant platelet disorder characterized by mild-to-severe thrombocytopenia and giant platelets with normal platelet aggregation and absence of bleeding symptoms. We have attempted a comparative proteomics study for profiling of platelet proteins in healthy vs. pathological states to discover characteristic protein expression changes in macrothrombocytes and decipher the factors responsible for the functionally active yet morphologically distinct platelets. We have used 2-D gel-based protein separation techniques coupled with MALDI-ToF/ToF-based mass spectrometric identification and characterization of the proteins to investigate the differential proteome profiling of platelet proteins isolated from the peripheral blood samples of patients and normal volunteers. Our study revealed altered levels of actin-binding proteins such as myosin light chain, coactosin-like protein, actin-related protein 2/3 complex, and transgelin2 that hint toward the cytoskeletal changes necessary to maintain the structural and functional integrity of macrothrombocytes. We have also observed overexpression of peroxiredoxin2, which signifies the prevailing oxidative stress in these cells. Additionally, altered levels of protein disulfide isomerase and transthyretin provide insights into the measures adopted by the macrothrombocytes to maintain their normal functional activity. This first proteomics study of platelets from ACMT may provide an understanding of the structural stability and normal functioning of these platelets in spite of their large size. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Rebelling for a Reason: Protein Structural “Outliers”
Arumugam, Gandhimathi; Nair, Anu G.; Hariharaputran, Sridhar; Ramanathan, Sowdhamini
2013-01-01
Analysis of structural variation in domain superfamilies can reveal constraints in protein evolution which aids protein structure prediction and classification. Structure-based sequence alignment of distantly related proteins, organized in PASS2 database, provides clues about structurally conserved regions among different functional families. Some superfamily members show large structural differences which are functionally relevant. This paper analyses the impact of structural divergence on function for multi-member superfamilies, selected from the PASS2 superfamily alignment database. Functional annotations within superfamilies, with structural outliers or ‘rebels’, are discussed in the context of structural variations. Overall, these data reinforce the idea that functional similarities cannot be extrapolated from mere structural conservation. The implication for fold-function prediction is that the functional annotations can only be inherited with very careful consideration, especially at low sequence identities. PMID:24073209
Functionalization of 3D scaffolds with protein-releasing biomaterials for intracellular delivery.
Seras-Franzoso, Joaquin; Steurer, Christoph; Roldán, Mònica; Vendrell, Meritxell; Vidaurre-Agut, Carla; Tarruella, Anna; Saldaña, Laura; Vilaboa, Nuria; Parera, Marc; Elizondo, Elisa; Ratera, Imma; Ventosa, Nora; Veciana, Jaume; Campillo-Fernández, Alberto J; García-Fruitós, Elena; Vázquez, Esther; Villaverde, Antonio
2013-10-10
Appropriate combinations of mechanical and biological stimuli are required to promote proper colonization of substrate materials in regenerative medicine. In this context, 3D scaffolds formed by compatible and biodegradable materials are under continuous development in an attempt to mimic the extracellular environment of mammalian cells. We have here explored how novel 3D porous scaffolds constructed by polylactic acid, polycaprolactone or chitosan can be decorated with bacterial inclusion bodies, submicron protein particles formed by releasable functional proteins. A simple dipping-based decoration method tested here specifically favors the penetration of the functional particles deeper than 300μm from the materials' surface. The functionalized surfaces support the intracellular delivery of biologically active proteins to up to more than 80% of the colonizing cells, a process that is slightly influenced by the chemical nature of the scaffold. The combination of 3D soft scaffolds and protein-based sustained release systems (Bioscaffolds) offers promise in the fabrication of bio-inspired hybrid matrices for multifactorial control of cell proliferation in tissue engineering under complex architectonic setting-ups. © 2013.
iDBPs: a web server for the identification of DNA binding proteins
Nimrod, Guy; Schushan, Maya; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir
2010-01-01
Summary: The iDBPs server uses the three-dimensional (3D) structure of a query protein to predict whether it binds DNA. First, the algorithm predicts the functional region of the protein based on its evolutionary profile; the assumption is that large clusters of conserved residues are good markers of functional regions. Next, various characteristics of the predicted functional region as well as global features of the protein are calculated, such as the average surface electrostatic potential, the dipole moment and cluster-based amino acid conservation patterns. Finally, a random forests classifier is used to predict whether the query protein is likely to bind DNA and to estimate the prediction confidence. We have trained and tested the classifier on various datasets and shown that it outperformed related methods. On a dataset that reflects the fraction of DNA binding proteins (DBPs) in a proteome, the area under the ROC curve was 0.90. The application of the server to an updated version of the N-Func database, which contains proteins of unknown function with solved 3D-structure, suggested new putative DBPs for experimental studies. Availability: http://idbps.tau.ac.il/ Contact: NirB@tauex.tau.ac.il Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20089514
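The area under the ROC curve quoted above can be read as the probability that a randomly chosen DNA-binding protein receives a higher score than a randomly chosen non-binding protein. The following is a minimal sketch of that calculation, not the iDBPs code; the function name `roc_auc` and the toy scores and labels are assumptions made for illustration.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    scores: classifier confidence per protein (higher = more likely to bind DNA)
    labels: 1 for a DNA-binding protein, 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical scores for six proteins (made-up data, not from the paper).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of proteins, which is acceptable for a small illustration; a rank-based implementation would be preferable on proteome-scale data.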
CORUM: the comprehensive resource of mammalian protein complexes
Ruepp, Andreas; Brauner, Barbara; Dunger-Kaltenbach, Irmtraud; Frishman, Goar; Montrone, Corinna; Stransky, Michael; Waegele, Brigitte; Schmidt, Thorsten; Doudieu, Octave Noubibou; Stümpflen, Volker; Mewes, H. Werner
2008-01-01
Protein complexes are key molecular entities that integrate multiple gene products to perform cellular functions. The CORUM (http://mips.gsf.de/genre/proj/corum/index.html) database is a collection of experimentally verified mammalian protein complexes. Information is manually derived by expert annotators through critical reading of the scientific literature. Information about protein complexes includes protein complex names, subunits, literature references as well as the function of the complexes. For functional annotation, we use the FunCat catalogue, which enables the protein complex space to be organized into biologically meaningful subsets. The database contains more than 1750 protein complexes that are built from 2400 different genes, thus representing 12% of the protein-coding genes in human. A web-based system is available to query, view and download the data. CORUM provides a comprehensive dataset of protein complexes for discoveries in systems biology, analyses of protein networks and protein complex-associated diseases. Comparable to the MIPS reference dataset of protein complexes from yeast, CORUM intends to serve as a reference for mammalian protein complexes. PMID:17965090
α-Crystallins Are Small Heat Shock Proteins: Functional and Structural Properties.
Tikhomirova, T S; Selivanova, O M; Galzitskaya, O V
2017-02-01
During its life cycle, a cell can be subjected to various external negative effects. Many proteins provide cell protection, including small heat shock proteins (sHsp) that have chaperone-like activity. These proteins have several important functions involving prevention of apoptosis and retention of cytoskeletal integrity; also, sHsp take part in the recovery of enzyme activity. The action mechanism of sHsp is based on the binding of hydrophobic regions exposed to the surface of a molten globule. α-Crystallins presented in chordate cells as two αA- and αB-isoforms are the most studied small heat shock proteins. In this review, we describe the main functions of α-crystallins, features of their secondary and tertiary structures, and examples of their partners in protein-protein interactions.
Enzyme Functionalized AuNPs and Glucometer-based Protein Detection
NASA Astrophysics Data System (ADS)
Dai, Tao; Fang, Jie; Yu, Wen; Xie, Guoming
2017-12-01
We developed a novel method for protein detection that uses protein aptamer-functionalized magnetic beads for protein recognition and invertase-functionalized gold nanoparticles (AuNPs) that catalyze the conversion of sucrose to glucose, which can be detected by a glucometer. First, invertase and DNA probe P2 are immobilized onto the gold nanoparticles (I/P2@AuNPs). Next, protein aptamer P1 is immobilized onto streptavidin-coated magnetic beads (P1@MB). P1 and P2 are complementary and hybridize to form double-stranded DNA. When the target protein is present, P1 binds the target and releases I/P2@AuNPs. After magnetic separation, the supernatant is collected and sucrose is added; after a period of reaction, the glucose concentration is measured with a glucometer, thus achieving sensitive and selective detection of the target protein.
Crystal growth of enzymes in low gravity (L-5)
NASA Technical Reports Server (NTRS)
Morita, Yuhei
1993-01-01
Recent developments in protein engineering have expanded the possibilities of studies of enzymes and other proteins. Now such studies are not limited to the elucidation of the relationship between the structure and function of the protein. They also aim at the production of proteins with new and practical functions, based on results obtained during investigation of structure and function. For continuing research in this field, investigation of the tertiary structure of proteins is important. X-ray diffraction of single crystals of protein is usually used for this purpose. The main difficulty is the preparation of the crystals. The theme of the research is to prepare such crystals at very low gravity, with the main purpose being to obtain large single crystals of proteins suitable for x-ray diffraction studies.
Design of Bioinorganic Materials at the Interface of Coordination and Biosupramolecular Chemistry.
Maity, Basudev; Ueno, Takafumi
2017-04-01
Protein assemblies have recently become known as potential molecular scaffolds for applications in materials science and bio-nanotechnology. Efforts to design protein assemblies for construction of protein-based hybrid materials with metal ions, metal complexes, nanomaterials and proteins now represent a growing field with a common aim of providing novel functions and mimicking natural functions. However, the important roles of protein assemblies in coordination and biosupramolecular chemistry have not been systematically investigated and characterized. In this personal account, we focus on our recent progress in rational design of protein assemblies using bioinorganic chemistry for (1) exploration of unnatural reactions, (2) construction of functional protein architectures, and (3) in vivo applications. © 2017 The Chemical Society of Japan & Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
CoMoDo: identifying dynamic protein domains based on covariances of motion.
Wieninger, Silke A; Ullmann, G Matthias
2015-06-09
Most large proteins are built of several domains, compact units which enable functional protein motions. Different domain assignment approaches exist, which mostly rely on concepts of stability, folding, and evolution. We describe the automatic assignment method CoMoDo, which identifies domains based on protein dynamics. Covariances of atomic fluctuations, here calculated by an Elastic Network Model, are used to group residues into domains of different hierarchical levels. The so-called dynamic domains facilitate the study of functional protein motions involved in biological processes like ligand binding and signal transduction. By applying CoMoDo to a large number of proteins, we demonstrate that dynamic domains exhibit features absent in the commonly assigned structural domains, which can deliver insight into the interactions between domains and between subunits of multimeric proteins. CoMoDo is distributed as free open source software at www.bisb.uni-bayreuth.de/CoMoDo.html .
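The idea of grouping residues by the covariance of their motions can be illustrated with a deliberately simplified sketch. It is not the CoMoDo algorithm: it takes a precomputed covariance matrix as given (no Elastic Network Model is built), produces a single level rather than a hierarchy, and merges residues by a fixed correlation threshold. The function names, the threshold of 0.5 and the 4-residue matrix are all assumptions.

```python
import math


def correlation_matrix(cov):
    """Normalize a covariance matrix so that every diagonal entry becomes 1."""
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]


def group_residues(cov, threshold=0.5):
    """Group residues whose motions are correlated at or above `threshold`.

    Residues are linked pairwise and the connected components are returned
    as sorted lists of residue indices (single-linkage grouping).
    """
    corr = correlation_matrix(cov)
    n = len(corr)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if corr[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up covariances: residues 0-1 move together, as do residues 2-3.
cov = [
    [1.0, 0.8, -0.3, -0.2],
    [0.8, 1.0, -0.1, -0.4],
    [-0.3, -0.1, 1.0, 0.9],
    [-0.2, -0.4, 0.9, 1.0],
]
domains = group_residues(cov)
print(domains)
```

Lowering the threshold merges groups into larger units, which hints at how different hierarchical levels could arise, although the published method may define them differently.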
Kagale, Sateesh; Uzuhashi, Shihomi; Wigness, Merek; Bender, Tricia; Yang, Wen; Borhan, M. Hossein; Rozwadowski, Kevin
2012-01-01
Plant viral expression vectors are advantageous for high-throughput functional characterization studies of genes due to their capability for rapid, high-level transient expression of proteins. We have constructed a series of tobacco mosaic virus (TMV) based vectors that are compatible with Gateway technology to enable rapid assembly of expression constructs and exploitation of ORFeome collections. In addition to the potential of producing recombinant protein at grams per kilogram FW of leaf tissue, these vectors facilitate either N- or C-terminal fusions to a broad series of epitope tag(s) and fluorescent proteins. We demonstrate the utility of these vectors in affinity purification, immunodetection and subcellular localisation studies. We also apply the vectors to characterize protein-protein interactions and demonstrate their utility in screening plant pathogen effectors. Given its broad utility in defining protein properties, this vector series will serve as a useful resource to expedite gene characterization efforts. PMID:23166857
TrypsNetDB: An integrated framework for the functional characterization of trypanosomatid proteins
Gazestani, Vahid H.; Yip, Chun Wai; Nikpour, Najmeh; Berghuis, Natasha
2017-01-01
Trypanosomatid parasites cause serious infections in humans and production losses in livestock. Due to the high divergence from other eukaryotes, such as humans and model organisms, the functional roles of many trypanosomatid proteins cannot be predicted by homology-based methods, rendering a significant portion of their proteins as uncharacterized. Recent technological advances have led to the availability of multiple systematic and genome-wide datasets on trypanosomatid parasites that are informative regarding the biological role(s) of their proteins. Here, we report TrypsNetDB (http://trypsNetDB.org), a web-based resource for the functional annotation of 16 different species/strains of trypanosomatid parasites. The database not only visualizes the network context of the queried protein(s) in an intuitive way but also examines the response of the represented network in more than 50 different biological contexts and its enrichment for various biological terms and pathways, protein sequence signatures, and potential RNA regulatory elements. The interactome core of the database, as of Jan 23, 2017, contains 101,187 interactions among 13,395 trypanosomatid proteins inferred from 97 genome-wide and focused studies on the interactome of these organisms. PMID:28158179
Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L
2011-06-02
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
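Quantifying residue frequencies at corresponding positions can be sketched as follows. This is an illustrative simplification, not the StralSV implementation: it assumes the structure-based residue-residue correspondences have already been reduced to equal-length aligned strings, and the function names, the 0.2 cutoff and the toy fragments are hypothetical.

```python
from collections import Counter


def residue_frequencies(aligned):
    """Per-position residue frequencies for equal-length aligned fragments.

    Gap characters ('-') are ignored when computing frequencies.
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all aligned fragments must have the same length")
    profile = []
    for pos in range(length):
        counts = Counter(s[pos] for s in aligned if s[pos] != "-")
        total = sum(counts.values())
        profile.append({r: c / total for r, c in counts.items()} if total else {})
    return profile


def unusual_positions(reference, profile, max_freq=0.2):
    """Positions where the reference residue is rare among the aligned hits."""
    return [i for i, (res, freqs) in enumerate(zip(reference, profile))
            if freqs.get(res, 0.0) <= max_freq]


# Made-up fragments aligned to a made-up 4-residue reference segment.
reference = "GNDA"
fragments = ["GDDA", "GDDS", "GDEA", "GDDA", "GDDA"]
profile = residue_frequencies(fragments)
unusual = unusual_positions(reference, profile)
print(unusual)
```

Positions with a single dominant residue suggest conservation, while a reference residue that is rare or absent among the hits marks the kind of unexpected residue the abstract mentions.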
The Classification of Protein Domains.
Dawson, Natalie; Sillitoe, Ian; Marsden, Russell L; Orengo, Christine A
2017-01-01
The significant expansion in protein sequence and structure data that we are now witnessing brings with it a pressing need to bring order to the protein world. Such order enables us to gain insights into the evolution of proteins, their function and the extent to which the functional repertoire can vary across the three kingdoms of life. This has led to the creation of a wide range of protein family classifications that aim to group proteins based upon their evolutionary relationships. In this chapter we discuss the approaches and methods that are frequently used in the classification of proteins, with a specific emphasis on the classification of protein domains. The construction of both domain sequence and domain structure databases is considered and we show how the use of domain family annotations to assign structural and functional information is enhancing our understanding of genomes.
The Structure and Function of Non-Collagenous Bone Proteins
NASA Technical Reports Server (NTRS)
Hook, Magnus; McQuillan, David J.
1997-01-01
The research done under the cooperative research agreement for the project titled 'The structure and function of non-collagenous bone proteins' represented the first phase of an ongoing program to define the structural and functional relationships of the principal noncollagenous proteins in bone. An ultimate goal of this research is to enable design and execution of useful pharmacological compounds that will have a beneficial effect in treatment of osteoporosis, both land-based and induced by long-duration space travel. The goals of the now complete first phase were as follows: 1. Establish and/or develop powerful recombinant protein expression systems; 2. Develop and refine isolation and purification of recombinant proteins; 3. Express wild-type non-collagenous bone proteins; 4. Express site-specific mutant proteins and domains of wild-type proteins to enhance likelihood of crystal formation for subsequent solution of structure.
Walia, Rasna R; Caragea, Cornelia; Lewis, Benjamin A; Towfic, Fadi; Terribilini, Michael; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant
2012-05-10
RNA molecules play diverse functional and structural roles in cells. They function as messengers for transferring genetic information from DNA to proteins, as the primary genetic material in many viruses, as catalysts (ribozymes) important for protein synthesis and RNA processing, and as essential and ubiquitous regulators of gene expression in living organisms. Many of these functions depend on precisely orchestrated interactions between RNA molecules and specific proteins in cells. Understanding the molecular mechanisms by which proteins recognize and bind RNA is essential for comprehending the functional implications of these interactions, but the recognition 'code' that mediates interactions between proteins and RNA is not yet understood. Success in deciphering this code would dramatically impact the development of new therapeutic strategies for intervening in devastating diseases such as AIDS and cancer. Because of the high cost of experimental determination of protein-RNA interfaces, there is an increasing reliance on statistical machine learning methods for training predictors of RNA-binding residues in proteins. However, because of differences in the choice of datasets, performance measures, and data representations used, it has been difficult to obtain an accurate assessment of the current state of the art in protein-RNA interface prediction. We provide a review of published approaches for predicting RNA-binding residues in proteins and a systematic comparison and critical assessment of protein-RNA interface residue predictors trained using these approaches on three carefully curated non-redundant datasets. We directly compare two widely used machine learning algorithms (Naïve Bayes (NB) and Support Vector Machine (SVM)) using three different data representations in which features are encoded using either sequence- or structure-based windows. 
Our results show that (i) Sequence-based classifiers that use a position-specific scoring matrix (PSSM)-based representation (PSSMSeq) outperform those that use an amino acid identity based representation (IDSeq) or a smoothed PSSM (SmoPSSMSeq); (ii) Structure-based classifiers that use smoothed PSSM representation (SmoPSSMStr) outperform those that use PSSM (PSSMStr) as well as sequence identity based representation (IDStr). PSSMSeq classifiers, when tested on an independent test set of 44 proteins, achieve performance that is comparable to that of three state-of-the-art structure-based predictors (including those that exploit geometric features) in terms of Matthews Correlation Coefficient (MCC), although the structure-based methods achieve substantially higher Specificity (albeit at the expense of Sensitivity) compared to sequence-based methods. We also find that the expected performance of the classifiers on a residue level can be markedly different from that on a protein level. Our experiments show that the classifiers trained on three different non-redundant protein-RNA interface datasets achieve comparable cross-validation performance. However, we find that the results are significantly affected by differences in the distance threshold used to define interface residues. Our results demonstrate that protein-RNA interface residue predictors that use a PSSM-based encoding of sequence windows outperform classifiers that use other encodings of sequence windows. While structure-based methods that exploit geometric features can yield significant increases in the Specificity of protein-RNA interface residue predictions, such increases are offset by decreases in Sensitivity. 
These results underscore the importance of comparing alternative methods using rigorous statistical procedures, multiple performance measures, and datasets that are constructed based on several alternative definitions of interface residues and redundancy cutoffs as well as including evaluations on independent test sets into the comparisons.
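The performance measures named above can be computed from a confusion matrix of per-residue predictions. The sketch below is a generic illustration rather than the authors' evaluation code. The abstract does not define Specificity, so the standard definition TN/(TN+FP) is assumed here; some interface-prediction studies instead use TP/(TP+FP). The counts are made up.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Sensitivity, Specificity and Matthews Correlation Coefficient (MCC).

    tp/fn: interface residues predicted as interface / non-interface
    tn/fp: non-interface residues predicted as non-interface / interface
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumption: Specificity = TN / (TN + FP); see the note above.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


# Hypothetical residue-level counts for one predictor.
metrics = confusion_metrics(tp=40, fp=10, tn=130, fn=20)
print({name: round(value, 3) for name, value in metrics.items()})
```

Because MCC uses all four cells of the confusion matrix, it is less easily inflated than either Sensitivity or Specificity alone when interface residues are a small minority, which is one reason to report several measures side by side.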
Activation of different split functionalities upon re-association of RNA-DNA hybrids
Afonin, Kirill A.; Viard, Mathias; Martins, Angelica N.; Lockett, Stephen J.; Maciag, Anna E.; Freed, Eric O.; Heldman, Eliahu; Jaeger, Luc; Blumenthal, Robert; Shapiro, Bruce A.
2013-01-01
Split-protein systems, an approach that relies on fragmentation of proteins with their further conditional re-association to form functional complexes, are increasingly used for various biomedical applications. This approach offers tight control of the protein functions and improved detection sensitivity. Here we show a similar technique based on a pair of RNA-DNA hybrids that can be generally used for triggering different split functionalities. Individually, each hybrid is inactive but when two cognate hybrids re-associate, different functionalities are triggered inside mammalian cells. As a proof of concept this work is mainly focused on activation of RNA interference; however the release of other functionalities (resonance energy transfer and RNA aptamer) is also shown. Furthermore, in vivo studies demonstrate a significant uptake of the hybrids by tumors together with specific gene silencing. This split-functionality approach presents a new route in the development of “smart” nucleic acids based nanoparticles and switches for various biomedical applications. PMID:23542902
Stomberski, Colin T; Hess, Douglas T; Stamler, Jonathan S
2018-01-10
Protein S-nitrosylation, the oxidative modification of cysteine by nitric oxide (NO) to form protein S-nitrosothiols (SNOs), mediates redox-based signaling that conveys, in large part, the ubiquitous influence of NO on cellular function. S-nitrosylation regulates protein activity, stability, localization, and protein-protein interactions across myriad physiological processes, and aberrant S-nitrosylation is associated with diverse pathophysiologies. Recent Advances: It is recently recognized that S-nitrosylation endows S-nitroso-protein (SNO-proteins) with S-nitrosylase activity, that is, the potential to trans-S-nitrosylate additional proteins, thereby propagating SNO-based signals, analogous to kinase-mediated signaling cascades. In addition, it is increasingly appreciated that cellular S-nitrosylation is governed by dynamically coupled equilibria between SNO-proteins and low-molecular-weight SNOs, which are controlled by a growing set of enzymatic denitrosylases comprising two main classes (high and low molecular weight). S-nitrosylases and denitrosylases, which together control steady-state SNO levels, may be identified with distinct physiology and pathophysiology ranging from cardiovascular and respiratory disorders to neurodegeneration and cancer. The target specificity of protein S-nitrosylation and the stability and reactivity of protein SNOs are determined substantially by enzymatic machinery comprising highly conserved transnitrosylases and denitrosylases. Understanding the differential functionality of SNO-regulatory enzymes is essential, and is amenable to genetic and pharmacological analyses, read out as perturbation of specific equilibria within the SNO circuitry. The emerging picture of NO biology entails equilibria among potentially thousands of different SNOs, governed by denitrosylases and nitrosylases. 
Thus, to elucidate the operation and consequences of S-nitrosylation in cellular contexts, studies should consider the roles of SNO-proteins as both targets and transducers of S-nitrosylation, functioning according to enzymatically governed equilibria. Antioxid. Redox Signal. 00, 000-000.
Sequence patterns mediating functions of disordered proteins.
Exarchos, Konstantinos P; Kourou, Konstantina; Exarchos, Themis P; Papaloukas, Costas; Karamouzis, Michalis V; Fotiadis, Dimitrios I
2015-01-01
Disordered proteins lack specific 3D structure in their native state and have been implicated with numerous cellular functions as well as with the induction of severe diseases, e.g., cardiovascular and neurodegenerative diseases as well as diabetes. Due to their conformational flexibility they are often found to interact with a multitude of protein molecules; this one-to-many interaction which is vital for their versatile functioning involves short consensus protein sequences, which are normally detected using slow and cumbersome experimental procedures. In this work we exploit information from disorder-oriented protein interaction networks focused specifically on humans, in order to assemble, by means of overrepresentation, a set of sequence patterns that mediate the functioning of disordered proteins; hence, we are able to identify how a single protein achieves such functional promiscuity. Next, we study the sequential characteristics of the extracted patterns, which exhibit a striking preference towards a very limited subset of amino acids; specifically, residues leucine, glutamic acid, and serine are particularly frequent among the extracted patterns, and we also observe a nontrivial propensity towards alanine and glycine. Furthermore, based on the extracted patterns we set off to infer potential functional implications in order to verify our findings and potentially further extrapolate our knowledge regarding the functioning of disordered proteins. We observe that the extracted patterns are primarily involved with regulation, binding and posttranslational modifications, which constitute the most prominent functions of disordered proteins.
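Selecting patterns by overrepresentation can be sketched as a comparison of how often each short sequence pattern occurs in a set of interest versus a background set. The abstract does not state which statistic was used, so the simple frequency ratio below is an assumption, as are the function names, the pattern length and the toy sequences; a real analysis would normally add a significance test.

```python
from collections import Counter


def kmers(seq, k):
    """Set of distinct length-k substrings in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def enriched_patterns(foreground, background, k=4, min_ratio=2.0):
    """Patterns present in a larger fraction of foreground than background.

    Returns (pattern, ratio) pairs sorted by decreasing ratio.
    """
    def fraction_containing(seqs):
        counts = Counter()
        for s in seqs:
            counts.update(kmers(s, k))  # count each pattern once per sequence
        return {p: c / len(seqs) for p, c in counts.items()}

    fg = fraction_containing(foreground)
    bg = fraction_containing(background)
    # Floor for patterns never seen in the background (avoids division by zero).
    floor = 1.0 / (len(background) + 1)
    ratios = {p: f / max(bg.get(p, 0.0), floor) for p, f in fg.items()}
    return sorted(((p, r) for p, r in ratios.items() if r >= min_ratio),
                  key=lambda item: (-item[1], item[0]))


# Made-up sequences: "LEES" recurs in the foreground set.
foreground = ["MSLEESA", "GLEESKK", "ALEESGG"]
background = ["MKVLAAG", "GAVLKKP", "PLEESAA", "TTGGKKA"]
hits = enriched_patterns(foreground, background)
print(hits)
```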
Shi, Ruijia; Xu, Cunshuan
2011-06-01
The study of rat proteins is an indispensable task in experimental medicine and drug development. The function of a rat protein is closely related to its subcellular location. Based on this concept, we construct a benchmark rat protein dataset and develop a combined approach for predicting the subcellular localization of rat proteins. From the protein primary sequence, multiple sequential features are obtained by using discrete Fourier analysis, a position conservation scoring function and the increment of diversity, and these sequential features are selected as input parameters of the support vector machine. By the jackknife test, the overall success rate of prediction is 95.6% on the rat protein dataset. When our method is applied to the apoptosis protein dataset and the Gram-negative bacterial protein dataset with the jackknife test, the overall success rates are 89.9% and 96.4%, respectively. The above results indicate that our proposed method is quite promising and may play a complementary role to the existing predictors in this area.
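The jackknife test holds out each protein in turn, trains on all the others, and reports the fraction of held-out proteins predicted correctly. The sketch below shows only that evaluation loop. To stay self-contained it swaps the paper's support vector machine and sequence features for a nearest-neighbour rule on amino acid composition, and the sequences and location labels are made up.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Fraction of each of the 20 standard amino acids in a sequence."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def nearest_label(query, training):
    """Label of the training sample closest to `query` (squared Euclidean)."""
    best = min(training,
               key=lambda item: sum((q - t) ** 2 for q, t in zip(query, item[0])))
    return best[1]


def jackknife_success_rate(samples):
    """Leave-one-out evaluation: each sample is predicted from all the others."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_label(features, rest) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical toy dataset of (sequence, subcellular location) pairs.
dataset = [
    ("KKRKRKKA", "nucleus"),
    ("RKKRKRKG", "nucleus"),
    ("LLVVIILA", "membrane"),
    ("IVLLVILG", "membrane"),
]
samples = [(composition(seq), loc) for seq, loc in dataset]
rate = jackknife_success_rate(samples)
print(rate)
```

Because every protein is tested exactly once and never appears in its own training set, the jackknife gives the same result on every run, unlike a random train/test split.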
Rapid comparison of properties on protein surface
Sael, Lee; La, David; Li, Bin; Rustamov, Raif; Kihara, Daisuke
2008-01-01
The mapping of physicochemical characteristics onto the surface of a protein provides crucial insights into its function and evolution. This information can be further used in the characterization and identification of similarities within protein surface regions. We propose a novel method which quantitatively compares global and local properties on the protein surface. We have tested the method on comparison of electrostatic potentials and hydrophobicity. The method is based on 3D Zernike descriptors, which provide a compact representation of a given property defined on a protein surface. Compactness and rotational invariance of this descriptor enable fast comparison suitable for database searches. The usefulness of this method is exemplified by studying several protein families including globins, thermophilic and mesophilic proteins, and active sites of TIM β/α barrel proteins. In all the cases studied, the descriptor is able to cluster proteins into functionally relevant groups. The proposed approach can also be easily extended to other surface properties. This protein surface-based approach will add a new way of viewing and comparing proteins to conventional methods, which compare proteins in terms of their primary sequence or tertiary structure. PMID:18618695
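The database-search idea in the abstract above can be sketched as follows: once each surface is reduced to a fixed-length, rotation-invariant descriptor vector, comparing two proteins is just a vector distance. This sketch does not compute 3D Zernike descriptors; the vectors, the entry names, and the choice of Euclidean distance are placeholders assumed for illustration.

```python
import math


def descriptor_distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_database(query, database):
    """Return (name, distance) pairs sorted from most to least similar.

    `database` maps a hypothetical entry name to its precomputed descriptor.
    """
    scored = [
        (name, descriptor_distance(query, vector))
        for name, vector in database.items()
    ]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Toy descriptor values, invented purely for the example.
    toy_database = {
        "entry_1": [0.90, 0.10, 0.30],
        "entry_2": [0.20, 0.80, 0.50],
        "entry_3": [0.85, 0.15, 0.35],
    }
    for name, dist in rank_database([0.88, 0.12, 0.31], toy_database):
        print(name, round(dist, 3))
```

Because the descriptors are compact and need no prior structural superposition, a linear scan like this stays cheap even for large collections.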
Rapid comparison of properties on protein surface.
Sael, Lee; La, David; Li, Bin; Rustamov, Raif; Kihara, Daisuke
2008-10-01
The mapping of physicochemical characteristics onto the surface of a protein provides crucial insights into its function and evolution. This information can be further used in the characterization and identification of similarities within protein surface regions. We propose a novel method which quantitatively compares global and local properties on the protein surface. We have tested the method on comparison of electrostatic potentials and hydrophobicity. The method is based on 3D Zernike descriptors, which provide a compact representation of a given property defined on a protein surface. Compactness and rotational invariance of this descriptor enable fast comparison suitable for database searches. The usefulness of this method is exemplified by studying several protein families including globins, thermophilic and mesophilic proteins, and active sites of TIM beta/alpha barrel proteins. In all the cases studied, the descriptor is able to cluster proteins into functionally relevant groups. The proposed approach can also be easily extended to other surface properties. This protein surface-based approach will add a new way of viewing and comparing proteins to conventional methods, which compare proteins in terms of their primary sequence or tertiary structure.
Ołdziej, S; Czaplewski, C; Liwo, A; Chinchio, M; Nanias, M; Vila, J A; Khalili, M; Arnautova, Y A; Jagielska, A; Makowski, M; Schafroth, H D; Kaźmierkiewicz, R; Ripoll, D R; Pillardy, J; Saunders, J A; Kang, Y K; Gibson, K D; Scheraga, H A
2005-05-24
Recent improvements in the protein-structure prediction method developed in our laboratory, based on the thermodynamic hypothesis, are described. The conformational space is searched extensively at the united-residue level by using our physics-based UNRES energy function and the conformational space annealing method of global optimization. The lowest-energy coarse-grained structures are then converted to an all-atom representation and energy-minimized with the ECEPP/3 force field. The procedure was assessed in two recent blind tests of protein-structure prediction. During the first blind test, we predicted large fragments of alpha and alpha+beta proteins [60-70 residues with C(alpha) rms deviation (rmsd) <6 A]. However, for alpha+beta proteins, significant topological errors occurred despite low rmsd values. In the second exercise, we predicted whole structures of five proteins (two alpha and three alpha+beta, with sizes of 53-235 residues) with remarkably good accuracy. In particular, for the genomic target TM0487 (a 102-residue alpha+beta protein from Thermotoga maritima), we predicted the complete, topologically correct structure with 7.3-A C(alpha) rmsd. So far this protein is the largest alpha+beta protein predicted based solely on the amino acid sequence and a physics-based potential-energy function and search procedure. For target T0198, a phosphate transport system regulator PhoU from T. maritima (a 235-residue mainly alpha-helical protein), we predicted the topology of the whole six-helix bundle correctly within 8 A rmsd, except the 32 C-terminal residues, most of which form a beta-hairpin. These and other examples described in this work demonstrate significant progress in physics-based protein-structure prediction.
2013-01-01
Background: Protein-protein interactions (PPIs) play a key role in understanding the mechanisms of cellular processes. The availability of interactome data has catalyzed the development of computational approaches to elucidate functional behaviors of proteins on a system level. Gene Ontology (GO) and its annotations are a significant resource for functional characterization of proteins. Because of wide coverage, GO data have often been adopted as a benchmark for protein function prediction on the genomic scale. Results: We propose a computational approach, called M-Finder, for functional association pattern mining. This method employs semantic analytics to integrate the genome-wide PPIs with GO data. We also introduce an interactive web application tool that visualizes a functional association network linked to a protein specified by a user. The proposed approach comprises two major components. First, the PPIs that have been generated by high-throughput methods are weighted in terms of their functional consistency using GO and its annotations. We assess two advanced semantic similarity metrics which quantify the functional association level of each interacting protein pair. We demonstrate that these measures outperform the other existing methods by evaluating their agreement to other biological features, such as sequence similarity, the presence of common Pfam domains, and core PPIs. Second, the information flow-based algorithm is employed to discover a set of proteins functionally associated with the protein in a query and their links efficiently. This algorithm reconstructs a functional association network of the query protein. The output network size can be flexibly determined by parameters. Conclusions: M-Finder provides a useful framework to investigate functional association patterns with any protein. This software will also allow users to perform further systematic analysis of a set of proteins for any specific function. 
It is available online at http://bionet.ecs.baylor.edu/mfinder PMID:24565382
Biological Chemistry and Functionality of Protein Sulfenic Acids and Related Thiol Modifications
Devarie-Baez, Nelmi O.; Silva Lopez, Elsa I.; Furdui, Cristina M.
2016-01-01
Selective modification of proteins at cysteine residues by reactive oxygen, nitrogen or sulfur species formed under physiological and pathological states is emerging as a critical regulator of protein activity impacting cellular function. This review focuses primarily on protein sulfenylation (-SOH), a metastable reversible modification connecting reduced cysteine thiols to many products of cysteine oxidation. An overview is first provided on the chemistry principles underlining synthesis, stability and reactivity of sulfenic acids in model compounds and proteins, followed by a brief description of analytical methods currently employed to characterize these oxidative species. The following chapters present a selection of redox-regulated proteins for which the -SOH formation was experimentally confirmed and linked to protein function. These chapters are organized based on the participation of these proteins in the regulation of signaling, metabolism and epigenetics. The last chapter discusses the therapeutic implications of altered redox microenvironment and protein oxidation in disease. PMID:26340608
Mutsuddi, Mousumi; Mukherjee, Ashim; Shen, Baohe; Manley, James L; Nambu, John R
2010-01-01
The Drosophila Dichaete gene encodes a member of the Sox family of high mobility group (HMG) domain proteins that have crucial gene regulatory functions in diverse developmental processes. The subcellular localization and transcriptional regulatory activities of Sox proteins can be regulated by several post-translational modifications. To identify genes that functionally interact with Dichaete, we undertook a genetic modifier screen based on a Dichaete gain-of-function phenotype in the adult eye. Mutations in several genes, including decapentaplegic, engrailed and pelle, behaved as dominant modifiers of this eye phenotype. Further analysis of pelle mutants revealed that loss of pelle function results in alterations in the distinctive cytoplasmic distribution of Dichaete protein within the developing oocyte, as well as defects in the elaboration of individual egg chambers. The death domain-containing region of the Pelle protein kinase was found to associate with both Dichaete and mouse Sox2 proteins, and Pelle can phosphorylate Dichaete protein in vitro. Overall, these findings reveal that maternal functions of pelle are essential for proper localization of Dichaete protein in the oocyte and normal egg chamber formation. Dichaete appears to be a novel phosphorylation substrate for Pelle and may function in a Pelle-dependent signaling pathway during oogenesis.
Versatile multi-functionalization of protein nanofibrils for biosensor applications
NASA Astrophysics Data System (ADS)
Sasso, L.; Suei, S.; Domigan, L.; Healy, J.; Nock, V.; Williams, M. A. K.; Gerrard, J. A.
2014-01-01
Protein nanofibrils offer advantages over other nanostructures due to the ease in their self-assembly and the versatility of surface chemistry available. Yet, an efficient and general methodology for their post-assembly functionalization remains a significant challenge. We introduce a generic approach, based on biotinylation and thiolation, for the multi-functionalization of protein nanofibrils self-assembled from whey proteins. Biochemical characterization shows the effects of the functionalization onto the nanofibrils' surface, giving insights into the changes in surface chemistry of the nanostructures. We show how these methods can be used to decorate whey protein nanofibrils with several components such as fluorescent quantum dots, enzymes, and metal nanoparticles. A multi-functionalization approach is used, as a proof of principle, for the development of a glucose biosensor platform, where the protein nanofibrils act as nanoscaffolds for glucose oxidase. Biotinylation is used for enzyme attachment and thiolation for nanoscaffold anchoring onto a gold electrode surface. Characterization via cyclic voltammetry shows an increase in glucose-oxidase mediated current response due to thiol-metal interactions with the gold electrode. The presented approach for protein nanofibril multi-functionalization is novel and has the potential of being applied to other protein nanostructures with similar surface chemistry. Electronic supplementary information (ESI) available: Cyclic voltammetry characterization of biosensor platforms including bare Au electrodes (Fig. S1), biosensor response to various glucose concentrations (Fig. S2), and AFM roughness measurements due to WPNF modifications (Fig. S3). See DOI: 10.1039/c3nr05752f
FunSimMat: a comprehensive functional similarity database
Schlicker, Andreas; Albrecht, Mario
2008-01-01
Functional similarity based on Gene Ontology (GO) annotation is used in diverse applications like gene clustering, gene expression data analysis, protein interaction prediction and evaluation. However, there exists no comprehensive resource of functional similarity values although such a database would facilitate the use of functional similarity measures in different applications. Here, we describe FunSimMat (Functional Similarity Matrix, http://funsimmat.bioinf.mpi-inf.mpg.de/), a large new database that provides several different semantic similarity measures for GO terms. It offers various precomputed functional similarity values for proteins contained in UniProtKB and for protein families in Pfam and SMART. The web interface allows users to efficiently perform both semantic similarity searches with GO terms and functional similarity searches with proteins or protein families. All results can be downloaded in tab-delimited files for use with other tools. An additional XML–RPC interface gives automatic online access to FunSimMat for programs and remote services. PMID:17932054
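To make the notion of a semantic similarity measure for GO terms more concrete, here is a minimal sketch of two widely used information-content measures, Resnik and Lin, on a tiny made-up ontology. The term names, the parent links, and the annotation frequencies are all invented for illustration; the sketch does not reproduce the FunSimMat database or its interfaces, and the abstract does not state which measures it stores.

```python
import math

# Toy ontology (hypothetical terms): child -> list of parents.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
}

# Assumed fraction of annotated proteins carrying the term or a descendant.
FREQUENCY = {
    "root": 1.0,
    "binding": 0.5,
    "catalysis": 0.5,
    "dna_binding": 0.2,
    "rna_binding": 0.3,
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    found = {term}
    for parent in PARENTS[term]:
        found |= ancestors(parent)
    return found


def information_content(term):
    """Rarer terms are more informative: IC = -log(frequency)."""
    return -math.log(FREQUENCY[term])


def resnik(a, b):
    """IC of the most informative ancestor shared by both terms."""
    shared = ancestors(a) & ancestors(b)
    return max(information_content(t) for t in shared)


def lin(a, b):
    """Resnik similarity normalised by the terms' own IC values."""
    denominator = information_content(a) + information_content(b)
    if denominator == 0:
        return 1.0  # both terms are the root
    return 2 * resnik(a, b) / denominator


if __name__ == "__main__":
    print(resnik("dna_binding", "rna_binding"))
    print(lin("dna_binding", "rna_binding"))
```

Precomputing such values, as the abstract describes, avoids repeating the ancestor search for every pair of terms at query time.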
Rice proteome analysis: a step toward functional analysis of the rice genome.
Komatsu, Setsuko; Tanaka, Naoki
2005-03-01
The technique of proteome analysis using 2-DE has the power to monitor global changes that occur in the protein complement of tissues and subcellular compartments. In this review, we describe construction of the rice proteome database, the cataloging of rice proteins, and the functional characterization of some of the proteins identified. Initially, proteins extracted from various tissues and organelles were separated by 2-DE and an image analyzer was used to construct a display or reference map of the proteins. The rice proteome database currently contains 23 reference maps based on 2-DE of proteins from different rice tissues and subcellular compartments. These reference maps comprise 13 129 rice proteins, and the amino acid sequences of 5092 of these proteins are entered in the database. Major proteins involved in growth or stress responses have been identified by using a proteomics approach and some of these proteins have unique functions. Furthermore, initial work has also begun on analyzing the phosphoproteome and protein-protein interactions in rice. The information obtained from the rice proteome database will aid in the molecular cloning of rice genes and in predicting the function of unknown proteins.
Nutritive and Bioactive Proteins in Breastmilk.
Haschke, Ferdinand; Haiden, Nadja; Thakkar, Sagar K
2016-01-01
Protein ingested with breast milk provides indispensable amino acids which are necessary for new protein synthesis for growth and replacement of losses via urine, feces, and the skin. Protein gain in the body of an infant is highest during the first months when protein concentrations in breast milk are higher than during later stages of lactation. Low-birth-weight infants have higher protein needs than term infants and need protein supplements during feeding with breastmilk. Based on our better understanding of protein evolution in breastmilk during the stages of lactation, new infant formulas with lower protein concentration but better protein quality have been created, successfully tested, and are now available in many countries. Besides providing indispensable amino acids, bioactive protein in breast milk can be broadly classified into 4 major functions, that is, providing protection from microbial insults and immune protection, aiding in digestive functions, gut development, and being carriers for other nutrients. Individual proteins and their proposed bioactivities are summarized in this paper in brief. Indeed, some proteins like lactoferrin and sIgA have been extensively studied for their biological functions, whereas others may require more data in support to further validate their proposed functions. © 2017 S. Karger AG, Basel.
Hutchins, James R. A.
2014-01-01
The genomic era has enabled research projects that use approaches including genome-scale screens, microarray analysis, next-generation sequencing, and mass spectrometry–based proteomics to discover genes and proteins involved in biological processes. Such methods generate data sets of gene, transcript, or protein hits that researchers wish to explore to understand their properties and functions and thus their possible roles in biological systems of interest. Recent years have seen a profusion of Internet-based resources to aid this process. This review takes the viewpoint of the curious biologist wishing to explore the properties of protein-coding genes and their products, identified using genome-based technologies. Ten key questions are asked about each hit, addressing functions, phenotypes, expression, evolutionary conservation, disease association, protein structure, interactors, posttranslational modifications, and inhibitors. Answers are provided by presenting the latest publicly available resources, together with methods for hit-specific and data set–wide information retrieval, suited to any genome-based analytical technique and experimental species. The utility of these resources is demonstrated for 20 factors regulating cell proliferation. Results obtained using some of these are discussed in more depth using the p53 tumor suppressor as an example. This flexible and universally applicable approach for characterizing experimental hits helps researchers to maximize the potential of their projects for biological discovery. PMID:24723265
Zhang, Juanni; Tian, Jianniao; He, Yanlong; Chen, Sheng; Jiang, Yixuan; Zhao, Yanchun; Zhao, Shulin
2013-09-07
We report a fluorescence polarization platform for H1N1 detection based on the construction of a DNA functional QD fluorescence polarization probe and a bi-functional protein binding aptamer (Apt-DNA). The assay has a linear range from 10 nM to 100 nM with a detection limit of 3.45 nM and is selective over the mismatched bases.
Lam, Winnie W M; Chan, Keith C C
2012-04-01
Protein molecules interact with each other in protein complexes to perform many vital functions, and different computational techniques have been developed to identify protein complexes in protein-protein interaction (PPI) networks. These techniques are developed to search for subgraphs of high connectivity in PPI networks under the assumption that the proteins in a protein complex are highly interconnected. While these techniques have been shown to be quite effective, the matching rate between the protein complexes they discover and those previously determined experimentally can be relatively low, and the "false-alarm" rate can be relatively high. This is especially the case when the assumption that proteins in protein complexes are more highly interconnected does not hold. To increase the matching rate and reduce the false-alarm rate, we have developed a technique that can work effectively without having to make this assumption. The technique, called protein complex identification by discovering functional interdependence (PCIFI), searches for protein complexes in PPI networks by taking into consideration both the functional interdependence relationship between protein molecules and the topology of the network. PCIFI works in several steps. The first step is to construct a multiple-function protein network graph by labeling each vertex with one or more of the molecular functions it performs. The second step is to filter out protein interactions between protein pairs that are not functionally interdependent in the statistical sense. The third step is to use an information-theoretic measure to determine the strength of the functional interdependence between all remaining interacting protein pairs. The last step is to form protein complexes based on the measure of the strength of functional interdependence and the connectivity between proteins. 
For performance evaluation, PCIFI was used to identify protein complexes in real PPI network data and the protein complexes it found were matched against those that were previously known in MIPS. The results show that PCIFI can be an effective technique for the identification of protein complexes. The protein complexes it found can match more known protein complexes with a smaller false-alarm rate and can provide useful insights into the understanding of the functional interdependence relationships between proteins in protein complexes.
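The abstract above does not name the information-theoretic measure used in the third step. As one plausible stand-in, the sketch below computes the mutual information between the function labels found at the two ends of each interaction. Using mutual information, giving each protein a single label (the abstract allows several), and the names `mutual_information` and `function_of` are all assumptions made for this illustration.

```python
import math
from collections import Counter


def mutual_information(edges, function_of):
    """Mutual information (in bits) between function labels across edges.

    edges: iterable of (protein_u, protein_v) interactions.
    function_of: dict mapping each protein to one function label
                 (a simplification assumed for this sketch).
    """
    pairs = []
    for u, v in edges:
        # Interactions are undirected, so count each edge in both directions.
        pairs.append((function_of[u], function_of[v]))
        pairs.append((function_of[v], function_of[u]))
    if not pairs:
        return 0.0

    total = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)

    score = 0.0
    for (a, b), count in joint.items():
        p_ab = count / total
        p_a = left[a] / total
        p_b = right[b] / total
        score += p_ab * math.log2(p_ab / (p_a * p_b))
    return score


if __name__ == "__main__":
    # Hypothetical proteins and labels.
    labels = {"p1": "kinase", "p2": "phosphatase",
              "p3": "kinase", "p4": "phosphatase"}
    print(mutual_information([("p1", "p2"), ("p3", "p4")], labels))
```

A high value means that knowing the function of one partner says a lot about the function of the other, which is the intuition behind functional interdependence; a value near zero suggests the pairing of functions is no better than chance.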
Efficient conformational space exploration in ab initio protein folding simulation.
Ullah, Ahammed; Ahmed, Nasif; Pappu, Subrata Dey; Shatabda, Swakkhar; Ullah, A Z M Dayem; Rahman, M Sohel
2015-08-01
Ab initio protein folding simulation largely depends on knowledge-based energy functions that are derived from known protein structures using statistical methods. These knowledge-based energy functions provide us with a good approximation of real protein energetics. However, these energy functions are not very informative for search algorithms and fail to distinguish the types of amino acid interactions that contribute largely to the energy function from those that do not. As a result, search algorithms frequently get trapped into the local minima. On the other hand, the hydrophobic-polar (HP) model considers hydrophobic interactions only. The simplified nature of HP energy function makes it limited only to a low-resolution model. In this paper, we present a strategy to derive a non-uniform scaled version of the real 20×20 pairwise energy function. The non-uniform scaling helps tackle the difficulty faced by a real energy function, whereas the integration of 20×20 pairwise information overcomes the limitations faced by the HP energy function. Here, we have applied a derived energy function with a genetic algorithm on discrete lattices. On a standard set of benchmark protein sequences, our approach significantly outperforms the state-of-the-art methods for similar models. Our approach has been able to explore regions of the conformational space which all the previous methods have failed to explore. Effectiveness of the derived energy function is presented by showing qualitative differences and similarities of the sampled structures to the native structures. Number of objective function evaluation in a single run of the algorithm is used as a comparison metric to demonstrate efficiency.
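For readers unfamiliar with the hydrophobic-polar (HP) model mentioned in the abstract above, the following sketch evaluates the standard HP energy of a conformation on a 2D square lattice: every pair of hydrophobic residues that are lattice neighbours but not adjacent in the chain contributes -1. It illustrates only the simplified HP baseline, not the scaled 20×20 energy function the authors derive, and the function name and the choice of a 2D lattice are assumptions for the example.

```python
def hp_energy(sequence, coords):
    """Return the HP-model energy of a self-avoiding 2D lattice conformation.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    coords:   list of (x, y) lattice positions, one per residue, in chain
              order. Chain connectivity is assumed valid and is not checked.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")

    index_at = {position: i for i, position in enumerate(coords)}
    if len(index_at) != len(coords):
        raise ValueError("conformation is not self-avoiding")

    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts each neighbouring pair once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


if __name__ == "__main__":
    # A four-residue chain folded into a square: the two end residues touch.
    print(hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)]))
```

Because the energy only changes when an H-H contact forms or breaks, many distinct conformations share the same value, which is one reason the abstract describes the HP model as low-resolution.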
PASS2: an automated database of protein alignments organised as structural superfamilies.
Bhaduri, Anirban; Pugalenthi, Ganesan; Sowdhamini, Ramanathan
2004-04-02
The functional selection and three-dimensional structural constraints of proteins in nature often relate to the retention of significant sequence similarity between proteins of similar fold and function despite poor sequence identity. Organization of structure-based sequence alignments for distantly related proteins provides a map of the conserved and critical regions of the protein universe that is useful for the analysis of folding principles, for the evolutionary unification of protein families and for maximizing the information return from experimental structure determination. The Protein Alignment organised as Structural Superfamily (PASS2) database represents continuously updated structural alignments for evolutionarily related, sequentially distant proteins. An automated and updated version of PASS2 is in direct correspondence with SCOP 1.63 and consists of sequences having identity below 40% among themselves. Protein domains have been grouped into 628 multi-member superfamilies and 566 single-member superfamilies. Structure-based sequence alignments for the superfamilies have been obtained using COMPARER, while initial equivalencies have been derived from a preliminary superposition using LSQMAN or STAMP 4.0. The final sequence alignments have been annotated for structural features using JOY4.0. The database is supplemented with sequence relatives belonging to different genomes, conserved spatially interacting and structural motifs, probabilistic hidden Markov models of superfamilies based on the alignments and useful links to other databases. Probabilistic models and sensitive position-specific profiles obtained from reliable superfamily alignments aid annotation of remote homologues and are useful tools in structural and functional genomics. PASS2 presents the phylogeny of its members based on both sequence and structural dissimilarities. Clustering of members allows us to understand diversification of the family members. 
The search engine has been improved for simpler browsing of the database. The database resolves alignments among the structural domains consisting of evolutionarily diverged set of sequences. Availability of reliable sequence alignments of distantly related proteins despite poor sequence identity and single-member superfamilies permit better sampling of structures in libraries for fold recognition of new sequences and for the understanding of protein structure-function relationships of individual superfamilies. PASS2 is accessible at http://www.ncbs.res.in/~faculty/mini/campass/pass2.html
Fang, Chun; Noguchi, Tamotsu; Yamana, Hayato
2014-10-01
Evolutionary conservation information included in position-specific scoring matrices (PSSMs) has been widely adopted by sequence-based methods for identifying protein functional sites, because all functional sites, whether in ordered or disordered proteins, are found to be conserved to some extent. However, different functional sites have different conservation patterns: some are linear contextual, some are mingled with highly variable residues, and others seem to be conserved independently. Every value in a PSSM is calculated independently of the others, without carrying the contextual information of residues in the sequence. Therefore, adopting the direct output of a PSSM for prediction fails to consider the relationship between conservation patterns of residues and the distribution of conservation scores in PSSMs. In order to demonstrate the importance of combining PSSMs with the specific conservation patterns of functional sites for prediction, three different PSSM-based methods for identifying three kinds of functional sites have been analyzed. Results suggest that different PSSM-based methods differ in their capability to identify different patterns of functional sites, and that better combining PSSMs with the specific conservation patterns of residues would largely facilitate the prediction.
Evaluating a variety of text-mined features for automatic protein function prediction with GOstruct.
Funk, Christopher S; Kahanda, Indika; Ben-Hur, Asa; Verspoor, Karin M
2015-01-01
Most computational methods that predict protein function do not take advantage of the large amount of information contained in the biomedical literature. In this work we evaluate both ontology term co-mention and bag-of-words features mined from the biomedical literature and analyze their impact in the context of a structured output support vector machine model, GOstruct. We find that even simple literature based features are useful for predicting human protein function (F-max: Molecular Function =0.408, Biological Process =0.461, Cellular Component =0.608). One advantage of using literature features is their ability to offer easy verification of automated predictions. We find through manual inspection of misclassifications that some false positive predictions could be biologically valid predictions based upon support extracted from the literature. Additionally, we present a "medium-throughput" pipeline that was used to annotate a large subset of co-mentions; we suggest that this strategy could help to speed up the rate at which proteins are curated.
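The F-max values quoted in the abstract above are a protein-centric summary score commonly used in function-prediction benchmarks: the best F-measure obtained over all score thresholds. The sketch below follows the usual convention, in which precision is averaged over proteins with at least one prediction at the threshold and recall over all proteins; whether the authors used exactly this variant is an assumption, and the protein and term identifiers are placeholders.

```python
def f_max(predictions, truth, thresholds=None):
    """Return the maximum F-measure over score thresholds.

    predictions: {protein: {term: score in [0, 1]}}
    truth:       {protein: non-empty set of true terms}
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]

    best = 0.0
    for threshold in thresholds:
        precisions = []
        recalls = []
        for protein, true_terms in truth.items():
            scores = predictions.get(protein, {})
            predicted = {t for t, s in scores.items() if s >= threshold}
            hits = len(predicted & true_terms)
            # Precision is only defined for proteins with predictions.
            if predicted:
                precisions.append(hits / len(predicted))
            recalls.append(hits / len(true_terms))
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = sum(recalls) / len(recalls)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


if __name__ == "__main__":
    # Placeholder identifiers, not real annotations.
    preds = {"P1": {"term_a": 0.9, "term_b": 0.8}}
    gold = {"P1": {"term_a", "term_c"}}
    print(round(f_max(preds, gold), 3))
```

Taking the maximum over thresholds lets methods with differently calibrated scores be compared on equal footing, at the cost of hiding how sensitive each method is to the threshold choice.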
Protein-Based Drug-Delivery Materials
Jao, Dave; Xue, Ye; Medina, Jethro; Hu, Xiao
2017-01-01
There is a pressing need for long-term, controlled drug release for sustained treatment of chronic or persistent medical conditions and diseases. Guided drug delivery is difficult because therapeutic compounds need to survive numerous transport barriers and binding targets throughout the body. Nanoscale protein-based polymers are increasingly used for drug and vaccine delivery to cross these biological barriers and travel through blood circulation to their molecular site of action. Compared to synthetic polymers, protein-based polymers have the advantages of good biocompatibility, biodegradability, environmental sustainability, cost effectiveness and availability. This review addresses the sources of protein-based polymers, compares their similarities and differences, and highlights characteristic properties and functionality of these protein materials for sustained and controlled drug release. Targeted drug delivery using highly functional multicomponent protein composites to guide active drugs to the site of interest will also be discussed. A systematic elucidation of drug-delivery efficiency in terms of molecular weight, particle size, shape, morphology, and porosity of materials will then be presented to achieve increased drug absorption. Finally, several important biomedical applications of protein-based materials with drug-delivery function (including bone healing, antibiotic release, wound healing, and corneal regeneration, as well as diabetes, neuroinflammation and cancer treatments) are summarized at the end of this review. PMID:28772877
Prediction and redesign of protein–protein interactions
Lua, Rhonald C.; Marciano, David C.; Katsonis, Panagiotis; Adikesavan, Anbu K.; Wilkins, Angela D.; Lichtarge, Olivier
2014-01-01
Understanding the molecular basis of protein function remains a central goal of biology, with the hope to elucidate the role of human genes in health and in disease, and to rationally design therapies through targeted molecular perturbations. We review here some of the computational techniques and resources available for characterizing a critical aspect of protein function – those mediated by protein–protein interactions (PPI). We describe several applications and recent successes of the Evolutionary Trace (ET) in identifying molecular events and shapes that underlie protein function and specificity in both eukaryotes and prokaryotes. ET is a part of analytical approaches based on the successes and failures of evolution that enable the rational control of PPI. PMID:24878423
Bahramali, Golnaz; Goliaei, Bahram; Minuchehr, Zarrin; Marashi, Sayed-Amir
2017-02-01
Chameleon proteins are proteins which include sequences that can adopt α-helix-β-strand (HE-chameleon), α-helix-coil (HC-chameleon), or β-strand-coil (CE-chameleon) structures to carry out their crucial biological functions. In this study, using a network-based approach, we examined the chameleon proteins to gain better insight into these proteins. We focused on proteins with identical chameleon sequences of seven or more residues in different PDB entries, which adopt HE-chameleon, HC-chameleon, and CE-chameleon structures in the same protein. One hundred and ninety-one human chameleon proteins were identified via our in-house program. Then, protein-protein interaction (PPI) networks, Gene Ontology (GO) enrichment, disease network, and pathway enrichment analyses were performed for our derived data set. We discovered that there are chameleon sequences which reside in protein-protein interaction regions between two proteins critical for their dual function. Analysis of the PPI networks for chameleon proteins introduced five hub proteins, namely TP53, EGFR, HSP90AA1, PPARA, and HIF1A, which were presented in four PPI clusters. The outcomes demonstrate that the chameleon regions are in critical domains of these proteins and are important in the development and treatment of human cancers. The present report is the first network-based functional study of chameleon proteins using computational approaches and might provide a new perspective for understanding the mechanisms of diseases, helping us to develop new medical therapies and to discover new proteins with chameleon properties, which are highly important in cancer.
McDermott, Jason E.; Bruillard, Paul; Overall, Christopher C.; ...
2015-03-09
There are many examples of groups of proteins that have similar function, but the determinants of functional specificity may be hidden by lack of sequence similarity, or by large groups of similar sequences with different functions. Transporters are one such protein group in that the general function, transport, can be easily inferred from the sequence, but the substrate specificity can be impossible to predict from sequence with current methods. In this paper we describe a linguistic-based approach to identify functional patterns from groups of unaligned protein sequences and its application to predict multi-drug resistance transporters (MDRs) from bacteria. We first show that our method can recreate known patterns from PROSITE for several motifs from unaligned sequences. We then show that the method, MDRpred, can predict MDRs with greater accuracy and positive predictive value than a collection of currently available family-based models from the Pfam database. Finally, we apply MDRpred to a large collection of protein sequences from an environmental microbiome study to make novel predictions about drug resistance in a potential environmental reservoir.
Du, Yushen; Wu, Nicholas C.; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting
2016-01-01
Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. PMID:27803181
Silk Materials Functionalized via Genetic Engineering for Biomedical Applications.
Deptuch, Tomasz; Dams-Kozlowska, Hanna
2017-12-12
The great mechanical properties, biocompatibility and biodegradability of silk-based materials make them applicable to the biomedical field. Genetic engineering enables the construction of synthetic equivalents of natural silks. Knowledge about the relationship between the structure and function of silk proteins enables the design of bioengineered silks that can serve as the foundation of new biomaterials. Furthermore, in order to better address the needs of modern biomedicine, genetic engineering can be used to obtain silk-based materials with new functionalities. Sequences encoding new peptides or domains can be added to the sequences encoding the silk proteins. The expression of one cDNA fragment indicates that each silk molecule is related to a functional fragment. This review summarizes the proposed genetic functionalization of silk-based materials that can be potentially useful for biomedical applications.
Ghadie, Mohamed Ali; Lambourne, Luke; Vidal, Marc; Xia, Yu
2017-08-01
Alternative splicing is known to remodel protein-protein interaction networks ("interactomes"), yet large-scale determination of isoform-specific interactions remains challenging. We present a domain-based method to predict the isoform interactome from the reference interactome. First, we construct the domain-resolved reference interactome by mapping known domain-domain interactions onto experimentally-determined interactions between reference proteins. Then, we construct the isoform interactome by predicting that an isoform loses an interaction if it loses the domain mediating the interaction. Our prediction framework is of high-quality when assessed by experimental data. The predicted human isoform interactome reveals extensive network remodeling by alternative splicing. Protein pairs interacting with different isoforms of the same gene tend to be more divergent in biological function, tissue expression, and disease phenotype than protein pairs interacting with the same isoforms. Our prediction method complements experimental efforts, and demonstrates that integrating structural domain information with interactomes provides insights into the functional impact of alternative splicing. PMID:28846689
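The domain-loss rule described in the abstract above (an isoform loses an interaction if it loses the domain mediating it) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name, the data layout, and the toy domain and partner names are all hypothetical, and the sketch assumes that an interaction is retained when at least one of its mediating domains survives in the isoform.

```python
from typing import Dict, Set


def predict_isoform_partners(
    mediating_domains: Dict[str, Set[str]],
    isoform_domains: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Predict which reference interactions each isoform retains.

    mediating_domains: partner protein -> domains on the reference protein
        that mediate the interaction with that partner.
    isoform_domains: isoform name -> domains the isoform still contains.
    Assumption (not from the source): an interaction is kept if at least
    one mediating domain is still present in the isoform.
    """
    predicted: Dict[str, Set[str]] = {}
    for isoform, domains in isoform_domains.items():
        predicted[isoform] = {
            partner
            for partner, needed in mediating_domains.items()
            if needed & domains  # non-empty overlap: a mediating domain remains
        }
    return predicted


# Hypothetical toy data, not taken from the study.
mediating = {
    "partner_A": {"SH3"},
    "partner_B": {"kinase"},
    "partner_C": {"SH2", "SH3"},
}
isoforms = {
    "reference": {"SH2", "SH3", "kinase"},
    "short_isoform": {"kinase"},
}

predicted = predict_isoform_partners(mediating, isoforms)
for name in sorted(predicted):
    print(name, sorted(predicted[name]))
```

In this toy case the short isoform keeps only the kinase-mediated interaction, which is the kind of network remodeling the abstract attributes to alternative splicing.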
Fricker, Lloyd D.
2010-01-01
Peptides are known to play many important physiological roles in signaling. A large number of peptides have been detected in mouse brain extracts using mass spectrometry-based peptidomics studies, and 850 peptides have been identified. Half of these peptides are derived from secretory pathway proteins and many are known bioactive neuropeptides which activate G protein-coupled receptors; these are termed “classical neuropeptides.” In addition, 427 peptides were identified that are derived from non-secretory pathway proteins; the majority are cytosolic, and the remainder are mitochondrial, nuclear, lysosomal, or membrane proteins. Many of these peptides represent the N- or C-terminus of the protein, rather than internal fragments, raising the possibility that they are formed by selective processing rather than protein degradation. In addition to consideration of the cleavage site required to generate the intracellular peptides, their potential functions are discussed. Several of the cytosolic peptides were previously found to interact with receptors and/or otherwise influence cellular activity; examples include hemorphins, hemopressins, diazepam binding inhibitor, and hippocampal cholinergic neurostimulating peptide. The possibility that these peptides are secreted from cells and function in cell-cell signaling is discussed. If these intracellular peptides can be shown to be secreted in levels sufficient to produce a biological effect, they would appropriately be called “non-classical neuropeptides” by analogy with non-classical neurotransmitters such as nitric oxide and anandamide. It is also possible that intracellular peptides function as “microproteins” and modulate protein-protein interactions; evidence for this function is discussed, along with future directions that are needed to establish this and other possible functions for peptides. PMID:20428524
Su, Ji Guo; Qi, Li Sheng; Li, Chun Hua; Zhu, Yan Ying; Du, Hui Jing; Hou, Yan Xue; Hao, Rui; Wang, Ji Hua
2014-08-01
Allostery is a rapid and efficient way in many biological processes to regulate protein functions, where binding of an effector at the allosteric site alters the activity and function at a distant active site. Allosteric regulation of protein biological functions provides a promising strategy for novel drug design. However, how to effectively identify the allosteric sites remains one of the major challenges for allosteric drug design. In the present work, a thermodynamic method based on the elastic network model was proposed to predict the allosteric sites on the protein surface. In our method, the thermodynamic coupling between the allosteric and active sites was considered, and then the allosteric sites were identified as those where the binding of an effector molecule induces a large change in the binding free energy of the protein with its ligand. Using the proposed method, two proteins, i.e., the 70 kD heat shock protein (Hsp70) and GluA2 alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid (AMPA) receptor, were studied and the allosteric sites on the protein surface were successfully identified. The predicted results are consistent with the available experimental data, which indicates that our method is a simple yet effective approach for the identification of allosteric sites on proteins.
Doppelt-Azeroual, Olivia; Delfaud, François; Moriaud, Fabrice; de Brevern, Alexandre G
2010-01-01
Ligand–protein interactions are essential for biological processes, and precise characterization of protein binding sites is crucial to understand protein functions. MED-SuMo is a powerful technology to localize similar local regions on protein surfaces. Its heuristic is based on a 3D representation of macromolecules using specific surface chemical features associating chemical characteristics with geometrical properties. MED-SMA is an automated and fast method to classify binding sites. It is based on MED-SuMo technology, which builds a similarity graph, and it uses the Markov Clustering algorithm. Purine binding sites are well studied as drug targets. Here, purine binding sites of the Protein DataBank (PDB) are classified. Proteins potentially inhibited or activated through the same mechanism are gathered. Results are analyzed according to PROSITE annotations and to carefully refined functional annotations extracted from the PDB. As expected, binding sites associated with related mechanisms are gathered, for example, the Small GTPases. Nevertheless, protein kinases from different Kinome families are also found together, for example, Aurora-A and CDK2 proteins which are inhibited by the same drugs. Representative examples of different clusters are presented. The effectiveness of the MED-SMA approach is demonstrated as it gathers binding sites of proteins with similar structure-activity relationships. Moreover, an efficient new protocol associates structures absent of cocrystallized ligands to the purine clusters enabling those structures to be associated with a specific binding mechanism. Applications of this classification by binding mode similarity include target-based drug design and prediction of cross-reactivity and therefore potential toxic side effects. PMID:20162627
An Evolution-Based Approach to De Novo Protein Design and Case Study on Mycobacterium tuberculosis
Brender, Jeffrey R.; Czajka, Jeff; Marsh, David; Gray, Felicia; Cierpicki, Tomasz; Zhang, Yang
2013-01-01
Computational protein design is a reverse procedure of protein folding and structure prediction, where constructing structures from evolutionarily related proteins has been demonstrated to be the most reliable method for protein 3-dimensional structure prediction. Following this spirit, we developed a novel method to design new protein sequences based on evolutionarily related protein families. For a given target structure, a set of proteins having similar fold are identified from the PDB library by structural alignments. A structural profile is then constructed from the protein templates and used to guide the conformational search of amino acid sequence space, where physicochemical packing is accommodated by single-sequence based solvation, torsion angle, and secondary structure predictions. The method was tested on a computational folding experiment based on a large set of 87 protein structures covering different fold classes, which showed that the evolution-based design significantly enhances the foldability and biological functionality of the designed sequences compared to the traditional physics-based force field methods. Without using homologous proteins, the designed sequences can be folded with an average root-mean-square-deviation of 2.1 Å to the target. As a case study, the method is extended to redesign all 243 structurally resolved proteins in the pathogenic bacteria Mycobacterium tuberculosis, which is the second leading cause of death from infectious disease. On a smaller scale, five sequences were randomly selected from the design pool and subjected to experimental validation. The results showed that all the designed proteins are soluble with distinct secondary structure and three have well ordered tertiary structure, as demonstrated by circular dichroism and NMR spectroscopy. 
Together, these results demonstrate a new avenue in computational protein design that uses knowledge of evolutionary conservation from protein structural families to engineer new protein molecules of improved fold stability and biological functionality. PMID:24204234
NASA Astrophysics Data System (ADS)
Wang, Wei; Sun, Yeqing; Zhao, Qian; Han, Lu
2016-07-01
Highly ionizing radiation (HZE) in space is considered the main factor causing biological effects. Because radiobiological studies during space flights are unrepeatable owing to the variable space radiation environment, ground-based ion irradiation is usually performed to simulate the biological effects of space radiation. Spaceflights present a low-dose-rate (0.1–0.3 mGy/day) radiation environment inside spacecraft, while ground-based ion irradiation presents a much higher dose rate (100–500 mGy/min). Whether ground-based ion radiation can reflect the effects of space radiation is worth evaluating. In this research, we compared the functional proteomic profiles of rice plants between on-ground simulated HZE particle radiation and spaceflight treatments. Three independent ground-based seed ionizing radiation experiments with different cumulative doses (dose range: 2–20000 mGy) and different linear energy transfer (LET) values (13.3–500 keV/μm) and two independent seed spaceflight experiments onboard the Chinese 20th satellite and the SZ-6 spacecraft were carried out. Alterations in the proteome were analyzed by two-dimensional difference gel electrophoresis (2-D DIGE) with MALDI-TOF/TOF mass spectrometry identifications. 45 and 59 proteins showed significant (p<0.05) and reproducible quantitative differences in ground-based ion radiation and spaceflight experiments, respectively. The ground-based radiation and spaceflight responsive proteins were both involved in a wide range of biological processes. Gene Ontology enrichment analysis further revealed that ground-based radiation responsive proteins were mainly involved in removal of superoxide radicals, defense response to stimulus and photosynthesis, while spaceflight responsive proteins mainly participate in nucleoside metabolic process, protein folding and phosphorylation.
The results implied that ground-based radiation cannot truly reflect the effects of spaceflight radiation: ground-based radiation had an indirect effect on rice, causing oxidation and metabolism stresses, whereas space radiation had a direct effect, leading to macromolecule (DNA and protein) damage and signal pathway disorders. This functional proteomic analysis might provide a new evaluation method for further on-ground simulated HZE radiation experiments.
Kong, Xianming; Yu, Qian; Lv, Zhongpeng; Du, Xuezhong
2013-10-11
Tandem assays of protein and glucose in combination with mannose-functionalized Fe3O4@SiO2 and Ag@SiO2 tag particles have promising potential in effective magnetic separation and highly sensitive and selective SERS assays of biomaterials. For the first time, a tandem assay of glucose is developed using SERS based on the Con A-sandwiched microstructures between the functionalized magnetic and tag particles. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A human functional protein interaction network and its application to cancer data analysis
2010-01-01
Background: One challenge facing biologists is to tease out useful information from massive data sets for further analysis. A pathway-based analysis may shed light by projecting candidate genes onto protein functional relationship networks. We are building such a pathway-based analysis system. Results: We have constructed a protein functional interaction network by extending curated pathways with non-curated sources of information, including protein-protein interactions, gene coexpression, protein domain interaction, Gene Ontology (GO) annotations and text-mined protein interactions, which cover close to 50% of the human proteome. By applying this network to two glioblastoma multiforme (GBM) data sets and projecting cancer candidate genes onto the network, we found that the majority of GBM candidate genes form a cluster and are closer than expected by chance, and the majority of GBM samples have sequence-altered genes in two network modules, one mainly comprising genes whose products are localized in the cytoplasm and plasma membrane, and another comprising gene products in the nucleus. Both modules are highly enriched in known oncogenes, tumor suppressors and genes involved in signal transduction. Similar network patterns were also found in breast, colorectal and pancreatic cancers. Conclusions: We have built a highly reliable functional interaction network upon expert-curated pathways and applied this network to the analysis of two genome-wide GBM and several other cancer data sets. The network patterns revealed from our results suggest common mechanisms in the cancer biology. Our system should provide a foundation for a network or pathway-based analysis platform for cancer and other diseases. PMID:20482850
Computational gene network study on antibiotic resistance genes of Acinetobacter baumannii.
Anitha, P; Anbarasu, Anand; Ramaiah, Sudha
2014-05-01
Multi Drug Resistance (MDR) in Acinetobacter baumannii is one of the major threats for emerging nosocomial infections in hospital environment. Multidrug-resistance in A. baumannii may be due to the implementation of multi-combination resistance mechanisms such as β-lactamase synthesis, Penicillin-Binding Proteins (PBPs) changes, alteration in porin proteins and in efflux pumps against various existing classes of antibiotics. Multiple antibiotic resistance genes are involved in MDR. These resistance genes are transferred through plasmids, which are responsible for the dissemination of antibiotic resistance among Acinetobacter spp. In addition, these resistance genes may also have a tendency to interact with each other or with their gene products. Therefore, it becomes necessary to understand the impact of these interactions in antibiotic resistance mechanism. Hence, our study focuses on protein and gene network analysis on various resistance genes, to elucidate the role of the interacting proteins and to study their functional contribution towards antibiotic resistance. From the search tool for the retrieval of interacting gene/protein (STRING), a total of 168 functional partners for 15 resistance genes were extracted based on the confidence scoring system. The network study was then followed up with functional clustering of associated partners using molecular complex detection (MCODE). Later, we selected eight efficient clusters based on score. Interestingly, the associated protein we identified from the network possessed greater functional similarity with known resistance genes. This network-based approach on resistance genes of A. baumannii could help in identifying new genes/proteins and provide clues on their association in antibiotic resistance. Copyright © 2014 Elsevier Ltd. All rights reserved.
Computational approaches for drug discovery.
Hung, Che-Lun; Chen, Chi-Chun
2014-09-01
Cellular proteins are the mediators of multiple organism functions being involved in physiological mechanisms and disease. By discovering lead compounds that affect the function of target proteins, the target diseases or physiological mechanisms can be modulated. Based on knowledge of the ligand-receptor interaction, the chemical structures of leads can be modified to improve efficacy, selectivity and reduce side effects. One rational drug design technology, which enables drug discovery based on knowledge of target structures, functional properties and mechanisms, is computer-aided drug design (CADD). The application of CADD can be cost-effective using experiments to compare predicted and actual drug activity, the results from which can be used iteratively to improve compound properties. The two major CADD-based approaches are structure-based drug design, where protein structures are required, and ligand-based drug design, where ligand and ligand activities can be used to design compounds interacting with the protein structure. Approaches in structure-based drug design include docking, de novo design, fragment-based drug discovery and structure-based pharmacophore modeling. Approaches in ligand-based drug design include quantitative structure-affinity relationship and pharmacophore modeling based on ligand properties. Based on whether the structure of the receptor and its interaction with the ligand are known, different design strategies can be used. After lead compounds are generated, the rule of five can be used to assess whether these have drug-like properties. Several quality validation methods, such as cost function analysis, Fisher's cross-validation analysis and goodness of hit test, can be used to estimate the metrics of different drug design strategies. To further improve CADD performance, multi-computers and graphics processing units may be applied to reduce costs. © 2014 Wiley Periodicals, Inc.
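The rule-of-five check mentioned in the abstract above can be sketched in a few lines of Python. The thresholds are the commonly cited ones for Lipinski's rule (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The `Compound` class, the tolerance of one violation, and the two example compounds with their property values are illustrative assumptions, not data from the review.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Precomputed descriptors for one lead compound (hypothetical container)."""
    name: str
    mol_weight: float       # molecular weight in daltons
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = [
        c.mol_weight > 500,
        c.logp > 5,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ]
    return sum(checks)


def is_drug_like(c: Compound, max_violations: int = 1) -> bool:
    """Assumed convention: tolerate at most `max_violations` violations."""
    return rule_of_five_violations(c) <= max_violations


# Hypothetical leads with made-up descriptor values.
lead_a = Compound("lead_A", mol_weight=350.4, logp=2.1,
                  h_bond_donors=2, h_bond_acceptors=5)
lead_b = Compound("lead_B", mol_weight=720.0, logp=6.3,
                  h_bond_donors=6, h_bond_acceptors=12)

for lead in (lead_a, lead_b):
    print(lead.name, rule_of_five_violations(lead), is_drug_like(lead))
```

In practice the descriptors would come from a cheminformatics toolkit rather than being typed in by hand; the sketch only shows the filtering logic.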
Arraying proteins by cell-free synthesis.
He, Mingyue; Wang, Ming-Wei
2007-10-01
Recent advances in life science have led to great motivation for the development of protein arrays to study functions of genome-encoded proteins. While traditional cell-based methods have been commonly used for generating protein arrays, they are usually a time-consuming process with a number of technical challenges. Cell-free protein synthesis offers an attractive system for making protein arrays: it not only rapidly converts the genetic information into functional proteins without the need for DNA cloning, but also presents a flexible environment amenable to production of folded proteins or proteins with defined modifications. Recent advancements have made it possible to rapidly generate protein arrays from PCR DNA templates through parallel on-chip protein synthesis. This article reviews current cell-free protein array technologies and their proteomic applications.
Expanded microbial genome coverage and improved protein family annotation in the COG database
Galperin, Michael Y.; Makarova, Kira S.; Wolf, Yuri I.; Koonin, Eugene V.
2015-01-01
Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign to them tentative functions. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on complete microbial genomes, which allowed reliable assignment of orthologs and paralogs for most genes; (ii) orthology-based approach, which used the function(s) of the characterized member(s) of the protein family (COG) to assign function(s) to the entire set of carefully identified orthologs and describe the range of potential functions when there were more than one; and (iii) careful manual curation of the annotation of the COGs, aimed at detailed prediction of the biological function(s) for each COG while avoiding annotation errors and overprediction. Here we present an update of the COGs, the first since 2003, and a comprehensive revision of the COG annotations and expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level. This re-analysis of the COGs shows that the original COG assignments had an error rate below 0.5% and allows an assessment of the progress in functional genomics in the past 12 years. During this time, functions of many previously uncharacterized COGs have been elucidated and tentative functional assignments of many COGs have been validated, either by targeted experiments or through the use of high-throughput methods. A particularly important development is the assignment of functions to several widespread, conserved proteins many of which turned out to participate in translation, in particular rRNA maturation and tRNA modification. 
The new version of the COGs is expected to become an important tool for microbial genomics. PMID:25428365
A three-way approach for protein function classification
Ur Rehman, Hafeez; Azam, Nouman; Yao, JingTao; Benso, Alfredo
2017-01-01
The knowledge of protein functions plays an essential role in understanding biological cells and has a significant impact on human life in areas such as personalized medicine, better crops and improved therapeutic interventions. Due to expense and inherent difficulty of biological experiments, intelligent methods are generally relied upon for automatic assignment of functions to proteins. The technological advancements in the field of biology are improving our understanding of biological processes and are regularly resulting in new features and characteristics that better describe the role of proteins. It is inevitable to neglect and overlook these anticipated features in designing more effective classification techniques. A key issue in this context, that is not being sufficiently addressed, is how to build effective classification models and approaches for protein function prediction by incorporating and taking advantage from the ever evolving biological information. In this article, we propose a three-way decision making approach which provides provisions for seeking and incorporating future information. We considered probabilistic rough sets based models such as Game-Theoretic Rough Sets (GTRS) and Information-Theoretic Rough Sets (ITRS) for inducing three-way decisions. An architecture of protein functions classification with probabilistic rough sets based three-way decisions is proposed and explained. Experiments are carried out on Saccharomyces cerevisiae species dataset obtained from Uniprot database with the corresponding functional classes extracted from the Gene Ontology (GO) database. The results indicate that as the level of biological information increases, the number of deferred cases are reduced while maintaining similar level of accuracy. PMID:28234929
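The three-way decision idea in the abstract above (accept, reject, or defer a functional assignment until more information is available) can be illustrated with a small Python sketch of the usual probabilistic rough set rule: accept when the estimated probability is at least an upper threshold, reject when it is at most a lower threshold, and defer otherwise. The function name, the fixed thresholds `alpha = 0.7` and `beta = 0.3`, and the example probabilities are assumptions for illustration only; in the paper the thresholds are induced by GTRS or ITRS models, which are not reproduced here.

```python
from typing import Dict


def three_way_decision(prob: float, alpha: float = 0.7, beta: float = 0.3) -> str:
    """Classify one case from its estimated membership probability.

    alpha and beta are hypothetical fixed thresholds with beta < alpha.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if prob >= alpha:
        return "accept"   # positive region: assign the function
    if prob <= beta:
        return "reject"   # negative region: do not assign the function
    return "defer"        # boundary region: wait for more information


# Made-up probabilities that each protein belongs to a given functional class.
estimated: Dict[str, float] = {
    "protein_1": 0.92,
    "protein_2": 0.55,
    "protein_3": 0.10,
}

decisions = {name: three_way_decision(p) for name, p in estimated.items()}
for name in sorted(decisions):
    print(name, decisions[name])
```

Under this rule, richer evidence that pushes probabilities toward 0 or 1 moves cases out of the boundary region, which matches the abstract's observation that deferred cases shrink as biological information increases.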
Mudgal, Richa; Srinivasan, Narayanaswamy; Chandra, Nagasuma
2017-07-01
Functional annotation is seldom straightforward, with complexities arising due to functional divergence in protein families or functional convergence between non-homologous protein families, leading to mis-annotations. An enzyme may contain multiple domains and not all domains may be involved in a given function, adding to the complexity in function annotation. To address this, we use binding site information from bound cognate ligands and catalytic residues, since it can help in resolving fold-function relationships at a finer level and with higher confidence. A comprehensive database of 2,020 fold-function-binding site relationships has been systematically generated. A network-based approach is employed to capture the complexity in these relationships, from which different types of associations are deciphered: versatile protein folds performing diverse functions, the same function associated with multiple folds, and one-to-one relationships. Binding site similarity networks integrated with fold, function, and ligand similarity information are generated to understand the depth of these relationships. Apart from the observed continuity in the functional site space, network properties of these revealed versatile families with topologically different or dissimilar binding sites and structural families that perform very similar functions. As a case study, subtle changes in the active site of a set of evolutionarily related superfamilies are studied using these networks. Tracing of such similarities in evolutionarily related proteins provides clues into the transition and evolution of protein functions. Insights from this study will be helpful in accurate and reliable functional annotations of uncharacterized proteins, poly-pharmacology, and designing enzymes with new functional capabilities. Proteins 2017; 85:1319-1335. © 2017 Wiley Periodicals, Inc.
Scoring ligand similarity in structure-based virtual screening.
Zavodszky, Maria I; Rohatgi, Anjali; Van Voorst, Jeffrey R; Yan, Honggao; Kuhn, Leslie A
2009-01-01
Scoring to identify high-affinity compounds remains a challenge in virtual screening. On one hand, protein-ligand scoring focuses on weighting favorable and unfavorable interactions between the two molecules. Ligand-based scoring, on the other hand, focuses on how well the shape and chemistry of each ligand candidate overlay on a three-dimensional reference ligand. Our hypothesis is that a hybrid approach, using ligand-based scoring to rank dockings selected by protein-ligand scoring, can ensure that high-ranking molecules mimic the shape and chemistry of a known ligand while also complementing the binding site. Results from applying this approach to screen nearly 70 000 National Cancer Institute (NCI) compounds for thrombin inhibitors tend to support the hypothesis. EON ligand-based ranking of docked molecules yielded the majority (4/5) of newly discovered, low to mid-micromolar inhibitors from a panel of 27 assayed compounds, whereas ranking docked compounds by protein-ligand scoring alone resulted in one new inhibitor. Since the results depend on the choice of scoring function, an analysis of properties was performed on the top-scoring docked compounds according to five different protein-ligand scoring functions, plus EON scoring using three different reference compounds. The results indicate that the choice of scoring function, even among scoring functions measuring the same types of interactions, can have an unexpectedly large effect on which compounds are chosen from screening. Furthermore, there was almost no overlap between the top-scoring compounds from protein-ligand versus ligand-based scoring, indicating the two approaches provide complementary information. Matchprint analysis, a new addition to the SLIDE (Screening Ligands by Induced-fit Docking, Efficiently) screening toolset, facilitated comparison of docked molecules' interactions with those of known inhibitors. 
The majority of interactions conserved among top-scoring compounds for a given scoring function, and from the different scoring functions, proved to be conserved interactions in known inhibitors. This was particularly true in the S1 pocket, which was occupied by all the docked compounds. (c) 2009 John Wiley & Sons, Ltd.
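The hybrid scheme described above (select dockings by protein-ligand scoring, then rank the selected ones by ligand-based similarity) can be sketched as a two-stage sort. This is a minimal illustration, not the SLIDE or EON workflow: the dictionary keys, the score conventions (lower docking score is better, higher similarity is better) and all values are assumptions made for the example.

```python
def hybrid_rank(compounds, keep_top):
    """Select dockings by protein-ligand score, then rank them by ligand similarity.

    Each compound is a dict with hypothetical keys:
      "docking_score": lower is better (assumed convention)
      "ligand_similarity": higher is better (assumed convention)
    """
    # Stage 1: keep the best-scoring dockings according to protein-ligand scoring.
    selected = sorted(compounds, key=lambda c: c["docking_score"])[:keep_top]
    # Stage 2: order the survivors by similarity to the reference ligand.
    return sorted(selected, key=lambda c: c["ligand_similarity"], reverse=True)


# Made-up values for illustration only.
library = [
    {"name": "cpd_a", "docking_score": -9.1, "ligand_similarity": 0.40},
    {"name": "cpd_b", "docking_score": -8.7, "ligand_similarity": 0.85},
    {"name": "cpd_c", "docking_score": -5.2, "ligand_similarity": 0.95},
    {"name": "cpd_d", "docking_score": -8.9, "ligand_similarity": 0.60},
]
ranked_names = [c["name"] for c in hybrid_rank(library, keep_top=3)]
```

In this toy library, `cpd_c` resembles the reference ligand most closely but is dropped in the first stage because it docks poorly, which is the kind of complementarity between the two scoring families that the abstract reports.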
Narayan, Vikram; Halada, Petr; Hernychová, Lenka; Chong, Yuh Ping; Žáková, Jitka; Hupp, Ted R.; Vojtesek, Borivoj; Ball, Kathryn L.
2011-01-01
The interferon-regulated transcription factor and tumor suppressor protein IRF-1 is predicted to be largely disordered outside of the DNA-binding domain. One of the advantages of intrinsically disordered protein domains is thought to be their ability to take part in multiple, specific but low affinity protein interactions; however, relatively few IRF-1-interacting proteins have been described. The recent identification of a functional binding interface for the E3-ubiquitin ligase CHIP within the major disordered domain of IRF-1 led us to ask whether this region might be employed more widely by regulators of IRF-1 function. Here we describe the use of peptide aptamer-based affinity chromatography coupled with mass spectrometry to define a multiprotein binding interface on IRF-1 (Mf2 domain; amino acids 106–140) and to identify Mf2-binding proteins from A375 cells. Based on their function as known transcriptional regulators, a selection of the Mf2 domain-binding proteins (NPM1, TRIM28, and YB-1) have been validated using in vitro and cell-based assays. Interestingly, although NPM1, TRIM28, and YB-1 all bind to the Mf2 domain, they have differing amino acid specificities, demonstrating the degree of combinatorial diversity and specificity available through linear interaction motifs. PMID:21245151
A high throughput mutagenic analysis of yeast sumo structure and function
Newman, Heather A.; Lu, Jian; Carson, Caryn; Boeke, Jef D.
2017-01-01
Sumoylation regulates a wide range of essential cellular functions through diverse mechanisms that remain to be fully understood. Using S. cerevisiae, a model organism with a single essential SUMO gene (SMT3), we developed a library of >250 mutant strains with single or multiple amino acid substitutions of surface or core residues in the Smt3 protein. By screening this library using plate-based assays, we have generated a comprehensive structure-function based map of Smt3, revealing essential amino acid residues and residues critical for function under a variety of genotoxic and proteotoxic stress conditions. Functionally important residues mapped to surfaces affecting Smt3 precursor processing and deconjugation from protein substrates, covalent conjugation to protein substrates, and non-covalent interactions with E3 ligases and downstream effector proteins containing SUMO-interacting motifs. Lysine residues potentially involved in formation of polymeric chains were also investigated, revealing critical roles for polymeric chains, but redundancy in specific chain linkages. Collectively, our findings provide important insights into the molecular basis of signaling through sumoylation. Moreover, the library of Smt3 mutants represents a valuable resource for further exploring the functions of sumoylation in cellular stress response and other SUMO-dependent pathways. PMID:28166236
Genetic code expansion for multiprotein complex engineering.
Koehler, Christine; Sauter, Paul F; Wawryszyn, Mirella; Girona, Gemma Estrada; Gupta, Kapil; Landry, Jonathan J M; Fritz, Markus Hsi-Yang; Radic, Ksenija; Hoffmann, Jan-Erik; Chen, Zhuo A; Zou, Juan; Tan, Piau Siong; Galik, Bence; Junttila, Sini; Stolt-Bergner, Peggy; Pruneri, Giancarlo; Gyenesei, Attila; Schultz, Carsten; Biskup, Moritz Bosse; Besir, Hueseyin; Benes, Vladimir; Rappsilber, Juri; Jechlinger, Martin; Korbel, Jan O; Berger, Imre; Braese, Stefan; Lemke, Edward A
2016-12-01
We present a baculovirus-based protein engineering method that enables site-specific introduction of unique functionalities in a eukaryotic protein complex recombinantly produced in insect cells. We demonstrate the versatility of this efficient and robust protein production platform, 'MultiBacTAG', (i) for the fluorescent labeling of target proteins and biologics using click chemistries, (ii) for glycoengineering of antibodies, and (iii) for structure-function studies of novel eukaryotic complexes using single-molecule Förster resonance energy transfer as well as site-specific crosslinking strategies.
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.
Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin
2017-08-31
Cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a Cytoscape plugin integrating six clustering algorithms: HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), and IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), together with the BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks
Li, Min; Li, Dongyan; Tang, Yu; Wang, Jianxin
2017-01-01
Cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a Cytoscape plugin integrating six clustering algorithms: HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), and IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), together with the BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster. PMID:28858211
Handfield, Louis-François; Chong, Yolanda T.; Simmons, Jibril; Andrews, Brenda J.; Moses, Alan M.
2013-01-01
Protein subcellular localization has been systematically characterized in budding yeast using fluorescently tagged proteins. Based on the fluorescence microscopy images, subcellular localization of many proteins can be classified automatically using supervised machine learning approaches that have been trained to recognize predefined image classes based on statistical features. Here, we present an unsupervised analysis of protein expression patterns in a set of high-resolution, high-throughput microscope images. Our analysis is based on 7 biologically interpretable features which are evaluated on automatically identified cells, and whose cell-stage dependency is captured by a continuous model for cell growth. We show that it is possible to identify most previously identified localization patterns in a cluster analysis based on these features and that similarities between the inferred expression patterns contain more information about protein function than can be explained by a previous manual categorization of subcellular localization. Furthermore, the inferred cell-stage associated to each fluorescence measurement allows us to visualize large groups of proteins entering the bud at specific stages of bud growth. These correspond to proteins localized to organelles, revealing that the organelles must be entering the bud in a stereotypical order. We also identify and organize a smaller group of proteins that show subtle differences in the way they move around the bud during growth. Our results suggest that biologically interpretable features based on explicit models of cell morphology will yield unprecedented power for pattern discovery in high-resolution, high-throughput microscopy images. PMID:23785265
Komatsu, Setsuko; Wang, Xin; Yin, Xiaojian; Nanjo, Yohei; Ohyanagi, Hajime; Sakata, Katsumi
2017-06-23
The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max 'Enrei'). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. The Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. 
It accounts for approximately 80% of all predicted proteins from genome sequences, though some proteins overlap. Based on the demonstrated application of data stored in the database for functional analyses, it is suggested that these data will be useful for analyses of biological mechanisms in soybean. Furthermore, coupled with recent advances in information and communication technology, the usefulness of this database would increase in the analyses of biological mechanisms. Copyright © 2017 Elsevier B.V. All rights reserved.
Mass Spectrometry Analysis of Spatial Protein Networks by Colocalization Analysis (COLA).
Mardakheh, Faraz K
2017-01-01
A major challenge in systems biology is comprehensive mapping of protein interaction networks. Crucially, such interactions are often dynamic in nature, necessitating methods that can rapidly mine the interactome across varied conditions and treatments to reveal change in the interaction networks. Recently, we described a fast mass spectrometry-based method to reveal functional interactions in mammalian cells on a global scale, by revealing spatial colocalizations between proteins (COLA) (Mardakheh et al., Mol Biosyst 13:92-105, 2017). As protein localization and function are inherently linked, significant colocalization between two proteins is a strong indication for their functional interaction. COLA uses rapid complete subcellular fractionation, coupled with quantitative proteomics to generate a subcellular localization profile for each protein quantified by the mass spectrometer. Robust clustering is then applied to reveal significant similarities in protein localization profiles, indicative of colocalization.
Ribosomal proteins: functions beyond the ribosome.
Zhou, Xiang; Liao, Wen-Juan; Liao, Jun-Ming; Liao, Peng; Lu, Hua
2015-04-01
Although ribosomal proteins are known for playing an essential role in ribosome assembly and protein translation, their ribosome-independent functions have also been greatly appreciated. Over the past decade, more than a dozen ribosomal proteins have been found to activate the tumor suppressor p53 pathway in response to ribosomal stress. In addition, these ribosomal proteins are involved in various physiological and pathological processes. This review summarizes the current understanding of how ribosomal stress provokes the accumulation of ribosome-free ribosomal proteins, as well as the ribosome-independent functions of ribosomal proteins in tumorigenesis, immune signaling, and development. We also propose the potential of applying this knowledge to the development of ribosomal stress-based cancer therapeutics. © The Author (2015). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. All rights reserved.
Evidence for functional pre-coupled complexes of receptor heteromers and adenylyl cyclase.
Navarro, Gemma; Cordomí, Arnau; Casadó-Anguera, Verónica; Moreno, Estefanía; Cai, Ning-Sheng; Cortés, Antoni; Canela, Enric I; Dessauer, Carmen W; Casadó, Vicent; Pardo, Leonardo; Lluís, Carme; Ferré, Sergi
2018-03-28
G protein-coupled receptors (GPCRs), G proteins and adenylyl cyclase (AC) comprise one of the most studied transmembrane cell signaling pathways. However, it is unknown whether the ligand-dependent interactions between these signaling molecules are based on random collisions or the rearrangement of pre-coupled elements in a macromolecular complex. Furthermore, it remains controversial whether a GPCR homodimer coupled to a single heterotrimeric G protein constitutes a common functional unit. Using a peptide-based approach, we here report evidence for the existence of functional pre-coupled complexes of heteromers of adenosine A2A receptor and dopamine D2 receptor homodimers coupled to their cognate Gs and Gi proteins and to subtype 5 AC. We also demonstrate that this macromolecular complex provides the necessary frame for the canonical Gs-Gi interactions at the AC level, sustaining the ability of a Gi-coupled GPCR to counteract AC activation mediated by a Gs-coupled GPCR.
ORCAN-a web-based meta-server for real-time detection and functional annotation of orthologs.
Zielezinski, Andrzej; Dziubek, Michal; Sliski, Jan; Karlowski, Wojciech M
2017-04-15
ORCAN (ORtholog sCANner) is a web-based meta-server for one-click evolutionary and functional annotation of protein sequences. The server combines information from the most popular orthology-prediction resources, including four tools and four online databases. Functional annotation utilizes five additional comparisons between the query and identified homologs, including: sequence similarity, protein domain architectures, functional motifs, Gene Ontology term assignments and a list of associated articles. Furthermore, the server uses a plurality-based rating system to evaluate the orthology relationships and to rank the reference proteins by their evolutionary and functional relevance to the query. Using a dataset of ∼1 million true yeast orthologs as a sample reference set, we show that combining multiple orthology-prediction tools in ORCAN increases the sensitivity and precision by 1-2 percentage points. The service is available for free at http://www.combio.pl/orcan/. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved.
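A plurality-based rating of the kind mentioned above can be pictured as a simple vote count across resources. The sketch below is an assumption about the general idea only, not ORCAN's actual scoring scheme: the function name, the resource names and the protein identifiers are all hypothetical.

```python
from collections import Counter


def rank_by_support(predictions):
    """Rank candidate orthologs by how many resources report them.

    predictions maps a resource name to the set of candidates it reports.
    Ties are broken alphabetically so the output is deterministic.
    """
    votes = Counter()
    for candidates in predictions.values():
        votes.update(set(candidates))  # each resource votes once per candidate
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical resource names and protein identifiers.
predictions = {
    "tool_1": {"YFG1", "YFG2"},
    "tool_2": {"YFG1"},
    "database_1": {"YFG1", "YFG3"},
    "database_2": {"YFG2"},
}
ranking = rank_by_support(predictions)
```

Here `YFG1` is supported by three of the four made-up resources and is ranked first; a candidate reported by a single resource ends up at the bottom of the list.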
Acuity of a Cryptochrome and Vision-Based Magnetoreception System in Birds
Solov'yov, Ilia A.; Mouritsen, Henrik; Schulten, Klaus
2010-01-01
The magnetic compass of birds is embedded in the visual system and it has been hypothesized that the primary sensory mechanism is based on a radical pair reaction. Previous models of magnetoreception have assumed that the radical pair-forming molecules are rigidly fixed in space, and this assumption has been a major objection to the suggested hypothesis. In this article, we investigate theoretically how much disorder is permitted for the radical pair-forming, protein-based magnetic compass in the eye to remain functional. Our study shows that only one rotational degree of freedom of the radical pair-forming protein needs to be partially constrained, while the other two rotational degrees of freedom do not impact the magnetoreceptive properties of the protein. The result implies that any membrane-associated protein is sufficiently restricted in its motion to function as a radical pair-based magnetoreceptor. We relate our theoretical findings to the cryptochromes, currently considered the likeliest candidate to furnish radical pair-based magnetoreception. PMID:20655831
HMPAS: Human Membrane Protein Analysis System
2013-01-01
Background: Membrane proteins perform essential roles in diverse cellular functions and are regarded as major pharmaceutical targets. The significance of membrane proteins has led to the development of dozens of resources related to membrane proteins. However, most of these resources are built for specific well-known membrane protein groups, making it difficult to find common and specific features of various membrane protein groups. Methods: We collected human membrane proteins from the dispersed resources and predicted novel membrane protein candidates by using ortholog information and our membrane protein classifiers. The membrane proteins were classified according to the type of interaction with the membrane, subcellular localization, and molecular function. We also built a new feature dataset to characterize the membrane proteins in various aspects including membrane protein topology, domain, biological process, disease, and drug. Moreover, protein structure and ICD-10-CM based integrated disease and drug information was newly included. To analyze the comprehensive information of membrane proteins, we implemented analysis tools to identify novel sequence and functional features of the classified membrane protein groups and to extract features from protein sequences. Results: We constructed HMPAS with 28,509 collected known membrane proteins and 8,076 newly predicted candidates. This system provides integrated information of human membrane proteins individually and in groups organized by 45 subcellular locations and 1,401 molecular functions. As a case study, we identified associations between the membrane proteins and diseases and show that membrane proteins are promising targets for diseases related to the nervous system and the circulatory system. A web-based interface of this system was constructed to help researchers not only retrieve organized information on individual proteins but also use the tools to analyze the membrane proteins.
Conclusions: HMPAS provides comprehensive information about human membrane proteins including specific features of certain membrane protein groups. In this system, users can obtain information on individual proteins and specified groups, focused on their conserved sequence features, involved cellular processes, and diseases. HMPAS may contribute as a valuable resource for the inference of novel cellular mechanisms and pharmaceutical targets associated with the human membrane proteins. HMPAS is freely available at http://fcode.kaist.ac.kr/hmpas. PMID:24564858
Water-Rich Fluid Material Containing Orderly Condensed Proteins.
Nojima, Tatsuya; Iyoda, Tomokazu
2017-01-24
A fluid material with high protein content (120-310 mg mL⁻¹) was formed through the ordered self-assembly of native proteins segregated from water. This material is instantly prepared by the simple mixing of a protein solution with anionic and cationic surfactants. By changing the ratio of the surfactants based on the electrostatic characteristics of the target protein, we observed that the surfactants could function as a versatile molecular glue for protein assembly. Moreover, these protein assemblies could be disassembled back into an aqueous solution depending on the salt conditions. Owing to the water-retaining properties of the hydrophilic part of surfactants, the proteins in this material are in a water-rich environment, which maintains their native structure and function. The inclusion of water also provides functional extensibility to this material, as demonstrated by the preparation of an enzymatically active gel. We anticipate that the unique features of this material will permit the use of proteins not only in solution but also as elements of integrated functionalized materials. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Quality assessment of protein model-structures based on structural and functional similarities
2012-01-01
Background: Experimental determination of protein 3D structures is expensive, time consuming and sometimes impossible. The gap between the number of protein structures deposited in the Worldwide Protein Data Bank and the number of sequenced proteins is constantly widening. Computational modeling is deemed to be one of the ways to deal with the problem. Although protein 3D structure prediction is a difficult task, many tools are available. These tools can model a structure from a sequence or from partial structural information, e.g. contact maps. Consequently, biologists are able to automatically generate a putative 3D structure model of any protein. However, the main issue becomes evaluation of the model quality, which is one of the most important challenges of structural biology. Results: GOBA (Gene Ontology-Based Assessment) is a novel Protein Model Quality Assessment Program. It estimates the compatibility between a model-structure and its expected function. GOBA is based on the assumption that a high quality model is expected to be structurally similar to proteins functionally similar to the prediction target. Whereas DALI is used to measure structure similarity, protein functional similarity is quantified using the standardized and hierarchical description of proteins provided by Gene Ontology combined with Wang's algorithm for calculating semantic similarity. Two approaches are proposed to express the quality of protein model-structures. One is a single model quality assessment method, the other is its modification, which provides a relative measure of model quality. Exhaustive evaluation is performed on data sets of model-structures submitted to the CASP8 and CASP9 contests. Conclusions: The validation shows that the method is able to discriminate between good and bad model-structures. The best of the tested GOBA scores achieved mean Pearson correlations of 0.74 and 0.8 with the observed quality of models in our CASP8- and CASP9-based validation sets, respectively.
GOBA also obtained the best result for two targets of CASP8, and one of CASP9, compared to the contest participants. Consequently, GOBA offers a novel single model quality assessment program that addresses the practical needs of biologists. In conjunction with other Model Quality Assessment Programs (MQAPs), it would prove useful for the evaluation of single protein models. PMID:22998498
The alphabet of intrinsic disorder
Uversky, Vladimir N
2013-01-01
The ability of a protein to fold into a unique functional state or to stay intrinsically disordered is encoded in its amino acid sequence. Both ordered and intrinsically disordered proteins (IDPs) are natural polypeptides that use the same arsenal of 20 proteinogenic amino acid residues as their major building blocks. The exceptional structural plasticity of IDPs, their capability to exist as heterogeneous structural ensembles and their wide array of important disorder-based biological functions that complement the functional repertoire of ordered proteins are all rooted within the peculiar differential usage of these building blocks by ordered proteins and IDPs. In fact, some residues (so-called disorder-promoting residues) are noticeably more common in IDPs than in sequences of ordered proteins, which, in turn, are enriched in several order-promoting residues. Furthermore, residues can be arranged according to their “disorder promoting potencies,” which are evaluated based on the relative abundances of various amino acids in ordered and disordered proteins. This review continues a series of publications on the roles of different amino acids in defining the phenomenon of protein intrinsic disorder and concerns glutamic acid, which is the second most disorder-promoting residue. PMID:28516010
Purschke, Benedict; Tanzmeister, Helene; Meinlschmidt, Pia; Baumgartner, Sabine; Lauter, Kathrin; Jäger, Henry
2018-04-01
Edible insects have emerged as an alternative source of high-quality proteins. Therefore, the effect of an extraction procedure for the recovery of migratory locust (Locusta migratoria) protein concentrate (MLPC) on the compositional characteristics and techno-functional properties was studied. The influence of pH value (2-10) and salt concentration (0, 1 and 3% w/v) on techno-functional properties was evaluated. Proteins were identified and characterized by RP-HPLC, SDS-PAGE and LC-MS/MS. The initial crude protein content of the whole locusts (65.9% on a dry basis) could be enhanced to 82.3% (MLPC). Solubility profiles of MLPC showed maximum solubility at pH 9 (100%). Promising functionality comparable to egg white protein in terms of emulsifying activity at pH 5, foamability at pH 3 and 3% NaCl, and foam stability at pH 9 was found. Consequently, MLPC offers a nutritious protein source with good functional properties under certain conditions, which could be used as a food ingredient in a variety of food systems. Copyright © 2018 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zemla, A; Lang, D; Kostova, T
2010-11-29
Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory; still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could overcome these difficulties and facilitate the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV, a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus and demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique or that share structural similarity with structures that are distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein.
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anna Johnston, SNL 9215
2002-09-01
PDB to AMPL Conversion was written to convert protein data base files to AMPL files. The protein data bases on the internet contain a wealth of information about the structure and makeup of proteins. Each file contains information derived from one or more experiments, including how the experiment was performed, the amino acid building blocks of each chain, and often the three-dimensional structure of the protein extracted from the experiments. The way a protein folds determines much about its function. Thus, studying the three-dimensional structure of the protein is of great interest. Analysing the contact maps is one way to examine the structure. A contact map is a graph which has a linear backbone of amino acids for nodes (i.e., adjacent amino acids are always connected) and edges between non-adjacent nodes if they are close enough to be considered in contact. If the graphs are similar then the folds of the proteins and their functions should also be similar. This software extracts the contact maps from a protein data base file and puts them into AMPL data format. This format is designed for use in AMPL, a programming language for simplifying linear programming formulations.
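The contact-map definition above can be illustrated with a minimal Python sketch. It is not the PDB to AMPL Conversion software: the function name `contact_map`, the 8 Å cutoff, and the use of one coordinate per residue are assumptions made only for illustration, and writing the result in AMPL data format is left out.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Return the edge set of a contact map for one protein chain.

    coords: list of (x, y, z) tuples, one per residue, in chain order.
    cutoff: contact distance threshold (assumed value, in angstroms).
    """
    n = len(coords)
    edges = set()
    # Backbone edges: adjacent amino acids are always connected.
    for i in range(n - 1):
        edges.add((i, i + 1))
    # Contact edges: non-adjacent residues that are close enough in space.
    for i in range(n):
        for j in range(i + 2, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                edges.add((i, j))
    return edges


# Toy coordinates, not taken from any real structure.
toy_chain = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (0, 5, 0)]
print(sorted(contact_map(toy_chain)))
```

Two chains can then be compared by comparing their edge sets, which is the graph-similarity idea the abstract describes.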
A bacterial type III secretion-based protein delivery tool for broad applications in cell biology.
Ittig, Simon J; Schmutz, Christoph; Kasper, Christoph A; Amstutz, Marlise; Schmidt, Alexander; Sauteur, Loïc; Vigano, M Alessandra; Low, Shyan Huey; Affolter, Markus; Cornelis, Guy R; Nigg, Erich A; Arrieumerlou, Cécile
2015-11-23
Methods enabling the delivery of proteins into eukaryotic cells are essential to address protein functions. Here we propose broad applications to cell biology for a protein delivery tool based on bacterial type III secretion (T3S). We show that bacterial, viral, and human proteins, fused to the N-terminal fragment of the Yersinia enterocolitica T3S substrate YopE, are effectively delivered into target cells in a fast and controllable manner via the injectisome of extracellular bacteria. This method enables functional interaction studies by the simultaneous injection of multiple proteins and allows the targeting of proteins to different subcellular locations by use of nanobody-fusion proteins. After delivery, proteins can be freed from the YopE fragment by a T3S-translocated viral protease or fusion to ubiquitin and cleavage by endogenous ubiquitin proteases. Finally, we show that this delivery tool is suitable to inject proteins in living animals and combine it with phosphoproteomics to characterize the systems-level impact of proapoptotic human truncated BID on the cellular network. © 2015 Ittig et al.
Small Cofactors May Assist Protein Emergence from RNA World: Clues from RNA-Protein Complexes
Shen, Liang; Ji, Hong-Fang
2011-01-01
It is now widely accepted that at an early stage in the evolution of life an RNA world arose, in which RNAs both served as the genetic material and catalyzed diverse biochemical reactions. Over time, proteins gradually replaced RNAs because of their superior catalytic properties. Therefore, it is important to investigate how primitive functional proteins emerged from the RNA world, which can shed light on the evolutionary pathway of life from the RNA world to the modern world. In this work, based on the analysis of RNA-protein complexes, we propose that the emergence of most primitive functional proteins was assisted by early primitive nucleotide cofactors, while only a minority were induced directly by RNAs. Furthermore, the present findings have significant implications for exploring the composition of primitive RNA, i.e., the adenine base as the principal building block. PMID:21789260
Rice proteome database: a step toward functional analysis of the rice genome.
Komatsu, Setsuko
2005-09-01
The technique of proteome analysis using two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) has the power to monitor global changes that occur in the protein complement of tissues and subcellular compartments. In this study, the proteins of rice were cataloged, a rice proteome database was constructed, and a functional characterization of some of the identified proteins was undertaken. Proteins extracted from various tissues and subcellular compartments in rice were separated by 2D-PAGE and an image analyzer was used to construct a display of the proteins. The Rice Proteome Database contains 23 reference maps based on 2D-PAGE of proteins from various rice tissues and subcellular compartments. These reference maps comprise 13129 identified proteins, and the amino acid sequences of 5092 proteins are entered in the database. Major proteins involved in growth or stress responses were identified using the proteome approach. Some of these proteins, including a beta-tubulin, calreticulin, and ribulose-1,5-bisphosphate carboxylase/oxygenase activase in rice, have unexpected functions. The information obtained from the Rice Proteome Database will aid in cloning the genes for and predicting the function of unknown proteins.
Classifying proteins into functional groups based on all-versus-all BLAST of 10 million proteins.
Kolker, Natali; Higdon, Roger; Broomall, William; Stanberry, Larissa; Welch, Dean; Lu, Wei; Haynes, Winston; Barga, Roger; Kolker, Eugene
2011-01-01
To address the monumental challenge of assigning function to millions of sequenced proteins, we completed the first-of-a-kind all-versus-all sequence alignment using BLAST for 9.9 million proteins in the UniRef100 database. Microsoft Windows Azure produced over 3 billion filtered records in 6 days using 475 eight-core virtual machines. Protein classification into functional groups was then performed using Hive and custom jars implemented on top of Apache Hadoop utilizing the MapReduce paradigm. First, using the Clusters of Orthologous Genes (COG) database, a length normalized bit score (LNBS) was determined to be the best similarity measure for classification of proteins. LNBS achieved sensitivity and specificity of 98% each. Second, out of 5.1 million bacterial proteins, about two-thirds were assigned to significantly extended COG groups, encompassing 30 times more assigned proteins. Third, the remaining proteins were classified into protein functional groups using an innovative implementation of a single-linkage algorithm on an in-house Hadoop compute cluster. This implementation significantly reduces the run time for nonindexed queries and optimizes efficient clustering on a large scale. The performance was also verified on Amazon Elastic MapReduce. This clustering assigned nearly 2 million proteins to approximately half a million different functional groups. A similar approach was applied to classify 2.8 million eukaryotic sequences, resulting in over 1 million proteins being assigned to existing KOG groups and the remainder clustered into 100,000 functional groups.
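The single-linkage step can be sketched on a small scale. The following Python example is only an in-memory illustration, not the Hadoop implementation described above. It assumes that pairwise similarity scores (for example, LNBS values, whose exact formula the abstract does not give) are already available and that a fixed threshold defines a link. Under that assumption, single-linkage clusters are the connected components of the thresholded similarity graph, found here with a union-find structure. The function name and the toy scores are hypothetical.

```python
def single_linkage(pairs, threshold):
    """Group items into single-linkage clusters at a fixed threshold.

    pairs: iterable of (item_a, item_b, score) tuples.
    threshold: minimum score for two items to be linked (assumed cutoff).
    """
    parent = {}

    def find(x):
        # Register unseen items, then walk to the root with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in pairs:
        root_a, root_b = find(a), find(b)
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b  # merge the two clusters

    clusters = {}
    for item in list(parent):
        clusters.setdefault(find(item), set()).add(item)
    return sorted(clusters.values(), key=sorted)


# Toy similarity scores, invented for illustration only.
toy_pairs = [("A", "B", 0.9), ("B", "C", 0.8), ("C", "D", 0.2), ("D", "E", 0.95)]
print(single_linkage(toy_pairs, threshold=0.5))
```

Because A links to B and B links to C, all three land in one group even though A and C were never compared directly, which is the defining behaviour of single linkage.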
Improving prediction of heterodimeric protein complexes using combination with pairwise kernel.
Ruan, Peiying; Hayashida, Morihiro; Akutsu, Tatsuya; Vert, Jean-Philippe
2018-02-19
Since many proteins become functional only after they interact with their partner proteins and form protein complexes, it is essential to identify the sets of proteins that form complexes. Therefore, several computational methods have been proposed to predict complexes from the topology and structure of experimental protein-protein interaction (PPI) networks. These methods work well to predict complexes involving at least three proteins, but generally fail at identifying complexes involving only two different proteins, called heterodimeric complexes or heterodimers. There is however an urgent need for efficient methods to predict heterodimers, since the majority of known protein complexes are precisely heterodimers. In this paper, we use three promising kernel functions, Min kernel and two pairwise kernels, which are Metric Learning Pairwise Kernel (MLPK) and Tensor Product Pairwise Kernel (TPPK). We also consider the normalization forms of Min kernel. Then, we combine Min kernel or its normalization form and one of the pairwise kernels by plugging. We applied kernels based on PPI, domain, phylogenetic profile, and subcellular localization properties to predicting heterodimers. Then, we evaluate our method by employing C-Support Vector Classification (C-SVC), carrying out 10-fold cross-validation, and calculating the average F-measures. The results suggest that the combination of normalized-Min-kernel and MLPK leads to the best F-measure and improved the performance of our previous work, which had been the best existing method so far. We propose new methods to predict heterodimers, using a machine learning-based approach. We train a support vector machine (SVM) to discriminate interacting vs non-interacting protein pairs, based on information extracted from PPI, domain, phylogenetic profiles and subcellular localization. We evaluate in detail new kernel functions to encode these data, and report prediction performance that outperforms the state-of-the-art.
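The kernels named above can be written out as a brief Python sketch. It uses the commonly cited textbook forms of the Min kernel, cosine-style kernel normalization, TPPK and MLPK. Treating "plugging" as passing the Min kernel (or its normalized form) in as the base kernel of a pairwise kernel is one plausible reading, not a confirmed description of the authors' implementation. The feature vectors are invented, and the SVM training step is not shown.

```python
import math


def min_kernel(x, y):
    """Min kernel on non-negative feature vectors: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))


def normalized(kernel):
    """Wrap a kernel so that k(x, x) == 1 (cosine-style normalization)."""
    def k(x, y):
        denom = math.sqrt(kernel(x, x) * kernel(y, y))
        return kernel(x, y) / denom if denom else 0.0
    return k


def tppk(k, pair1, pair2):
    """Tensor Product Pairwise Kernel between protein pairs (a, b) and (c, d)."""
    (a, b), (c, d) = pair1, pair2
    return k(a, c) * k(b, d) + k(a, d) * k(b, c)


def mlpk(k, pair1, pair2):
    """Metric Learning Pairwise Kernel between protein pairs (a, b) and (c, d)."""
    (a, b), (c, d) = pair1, pair2
    return (k(a, c) - k(a, d) - k(b, c) + k(b, d)) ** 2


# Hypothetical feature vectors for four proteins (illustration only).
a, b, c, d = [1, 2], [2, 1], [1, 1], [3, 0]
print(tppk(min_kernel, (a, b), (c, d)))
print(mlpk(normalized(min_kernel), (a, b), (c, d)))
```

Both pairwise kernels are symmetric in the order of proteins within a pair, which matters because a heterodimer (a, b) is the same complex as (b, a).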
Semantic integration to identify overlapping functional modules in protein interaction networks
Cho, Young-Rae; Hwang, Woochang; Ramanathan, Murali; Zhang, Aidong
2007-01-01
Background: The systematic analysis of protein-protein interactions can enable a better understanding of cellular organization, processes and functions. Functional modules can be identified from the protein interaction networks derived from experimental data sets. However, these analyses are challenging because of the presence of unreliable interactions and the complex connectivity of the network. The integration of protein-protein interactions with the data from other sources can be leveraged for improving the effectiveness of functional module detection algorithms. Results: We have developed novel metrics, called semantic similarity and semantic interactivity, which use Gene Ontology (GO) annotations to measure the reliability of protein-protein interactions. The protein interaction networks can be converted into a weighted graph representation by assigning the reliability values to each interaction as a weight. We presented a flow-based modularization algorithm to efficiently identify overlapping modules in the weighted interaction networks. The experimental results show that the semantic similarity and semantic interactivity of interacting pairs were positively correlated with functional co-occurrence. The effectiveness of the algorithm for identifying modules was evaluated using functional categories from the MIPS database. We demonstrated that our algorithm had higher accuracy compared to other competing approaches. Conclusion: The integration of protein interaction networks with GO annotation data and the capability of detecting overlapping modules substantially improve the accuracy of module identification. PMID:17650343
Canola Proteins for Human Consumption: Extraction, Profile, and Functional Properties
Tan, Siong H; Mailer, Rodney J; Blanchard, Christopher L; Agboola, Samson O
2011-01-01
Canola protein isolate has been suggested as an alternative to other proteins for human food use due to a balanced amino acid profile and potential functional properties such as emulsifying, foaming, and gelling abilities. This is, therefore, a review of the studies on the utilization of canola protein in human food, comprising the extraction processes for protein isolates and fractions, the molecular character of the extracted proteins, as well as their food functional properties. A majority of studies were based on proteins extracted from the meal using alkaline solution, presumably due to its high nitrogen yield, followed by those utilizing salt extraction combined with ultrafiltration. Characteristics of canola and its predecessor rapeseed protein fractions such as nitrogen yield, molecular weight profile, isoelectric point, solubility, and thermal properties have been reported and were found to be largely related to the extraction methods. However, very little research has been carried out on the hydrophobicity and structure profiles of the protein extracts that are highly relevant to a proper understanding of food functional properties. Alkaline extracts were generally not very suitable as functional ingredients and contradictory results about many of the measured properties of canola proteins, especially their emulsification tendencies, have also been documented. Further research into improved extraction methods is recommended, as is a more systematic approach to the measurement of desired food functional properties for valid comparison between studies. PMID:21535703
Mizukami, Shin; Hori, Yuichiro; Kikuchi, Kazuya
2014-01-21
The use of genetic engineering techniques allows researchers to combine functional proteins with fluorescent proteins (FPs) to produce fusion proteins that can be visualized in living cells, tissues, and animals. However, several limitations of FPs, such as slow maturation kinetics or issues with photostability under laser illumination, have led researchers to examine new technologies beyond FP-based imaging. Recently, new protein-labeling technologies using protein/peptide tags and tag-specific probes have attracted increasing attention. Although several protein-labeling systems are commercially available, researchers continue to work on addressing some of the limitations of this technology. To reduce the level of background fluorescence from unlabeled probes, researchers have pursued fluorogenic labeling, in which the labeling probes do not fluoresce until the target proteins are labeled. In this Account, we review two different fluorogenic protein-labeling systems that we have recently developed. First we give a brief history of protein labeling technologies and describe the challenges involved in protein labeling. In the second section, we discuss a fluorogenic labeling system based on a noncatalytic mutant of β-lactamase, which forms specific covalent bonds with β-lactam antibiotics such as ampicillin or cephalosporin. Based on fluorescence (or Förster) resonance energy transfer and other physicochemical principles, we have developed several types of fluorogenic labeling probes. To extend the utility of this labeling system, we took advantage of a hydrophobic β-lactam prodrug structure to achieve intracellular protein labeling. We also describe a small protein tag, photoactive yellow protein (PYP)-tag, and its probes. By utilizing a quenching mechanism based on close intramolecular contact, we incorporated a turn-on switch into the probes for fluorogenic protein labeling. One of these probes allowed us to rapidly image a protein while avoiding washout.
In the future, we expect that protein-labeling systems with finely designed probes will lead to novel methodologies that allow researchers to image biomolecules and to perturb protein functions.
Biophysics of protein evolution and evolutionary protein biophysics
Sikosek, Tobias; Chan, Hue Sun
2014-01-01
The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence–structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by ‘hidden’ conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution. PMID:25165599
Hu, Zhaolong; Ho, James C S; Nallani, Madhavan
2017-08-01
A plethora of polymer-based scaffolds have been designed to facilitate biochemical and biophysical investigation of membrane proteins, with a common goal to stabilize and present them in a functional format. In this review, an up-to-date account of such polymer-based supports and incorporation methodologies is presented. Furthermore, conceptual and imminent technological advances, with associated technical challenges, are proposed. Copyright © 2017 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wiedner, Susan D.; Burnum, Kristin E.; Pederson, Leeanna M.
2012-08-03
Environmental and metabolic adaptability is critical for survival of the fungal human pathogen Aspergillus fumigatus in the immunocompromised lung. We employed an activity-based protein profiling (ABPP) approach utilizing a new aryl vinyl sulfonate probe and a serine hydrolase probe combined with quantitative LC-MS based accurate mass and time (AMT) tag proteomics for the identification of functional pathway adaptation of A. fumigatus to environmental variability relevant to pulmonary Invasive Aspergillosis. When the fungal pathogen was grown with human serum, metabolism and energy processes were markedly decreased compared to no serum culture. Additionally, functional pathways associated with amino acid and protein biosynthesis were limited as the fungus scavenged from the serum to obtain essential nutrients. Our approach revealed significant metabolic adaptation by A. fumigatus, and provides direct insight into this pathogen’s ability to survive and proliferate.
Lang, Tiange; Yin, Kangquan; Liu, Jinyu; Cao, Kunfang; Cannon, Charles H; Du, Fang K
2014-01-01
Predicting protein domains is essential for understanding a protein's function at the molecular level. However, up till now, there has been no direct and straightforward method for predicting protein domains in species without a reference genome sequence. In this study, we developed a functionality with a set of programs that can predict protein domains directly from genomic sequence data without a reference genome. Using whole genome sequence data, the programming functionality mainly comprised DNA assembly in combination with next-generation sequencing (NGS) assembly methods and traditional methods, peptide prediction and protein domain prediction. The proposed new functionality avoids problems associated with de novo assembly due to micro reads and small single repeats. Furthermore, we applied our functionality for the prediction of leucine rich repeat (LRR) domains in four species of Ficus with no reference genome, based on NGS genomic data. We found that the LRRNT_2 and LRR_8 domains are related to plant transpiration efficiency, as indicated by the stomata index, in the four species of Ficus. The programming functionality established in this study provides new insights for protein domain prediction, which is particularly timely in the current age of NGS data expansion.
Structure and function of homeodomain-leucine zipper (HD-Zip) proteins.
Elhiti, Mohamed; Stasolla, Claudio
2009-02-01
Homeodomain-leucine zipper (HD-Zip) proteins are transcription factors unique to plants and are encoded by more than 25 genes in Arabidopsis thaliana. Based on sequence analyses these proteins have been classified into four distinct groups: HD-Zip I-IV. HD-Zip proteins are characterized by the presence of two functional domains; a homeodomain (HD) responsible for DNA binding and a leucine zipper domain (Zip) located immediately C-terminal to the homeodomain and involved in protein-protein interaction. Despite sequence similarities HD-ZIP proteins participate in a variety of processes during plant growth and development. HD-Zip I proteins are generally involved in responses related to abiotic stress, abscisic acid (ABA), blue light, de-etiolation and embryogenesis. HD-Zip II proteins participate in light response, shade avoidance and auxin signalling. Members of the third group (HD-Zip III) control embryogenesis, leaf polarity, lateral organ initiation and meristem function. HD-Zip IV proteins play significant roles during anthocyanin accumulation, differentiation of epidermal cells, trichome formation and root development.
De novo inference of protein function from coarse-grained dynamics.
Bhadra, Pratiti; Pal, Debnath
2014-10-01
Inference of molecular function of proteins is the fundamental task in the quest for understanding cellular processes. The task is getting increasingly difficult with thousands of new proteins discovered each day. The difficulty arises primarily due to the lack of a high-throughput experimental technique for assessing protein molecular function, a lacuna that computational approaches are trying hard to fill. The latter, too, face a major bottleneck in the absence of clear evidence based on evolutionary information. Here we propose a de novo approach to annotate protein molecular function through structural dynamics match for a pair of segments from two dissimilar proteins, which may share even <10% sequence identity. To screen these matches, corresponding 1 µs coarse-grained (CG) molecular dynamics trajectories were used to compute normalized root-mean-square-fluctuation graphs and select mobile segments, which were, thereafter, matched for all pairs using unweighted three-dimensional autocorrelation vectors. Our in-house custom-built forcefield (FF), extensively validated against dynamics information obtained from experimental nuclear magnetic resonance data, was specifically used to generate the CG dynamics trajectories. The test for correspondence of dynamics-signature of protein segments and function revealed 87% true positive rate and 93.5% true negative rate, on a dataset of 60 experimentally validated proteins, including moonlighting proteins and those with novel functional motifs. A random test against 315 unique fold/function proteins for a negative test gave >99% true recall. A blind prediction on a novel protein appears consistent with additional evidence retrieved therein. This is the first proof-of-principle of generalized use of structural dynamics for inferring protein molecular function leveraging our custom-made CG FF, useful to all. © 2014 Wiley Periodicals, Inc.
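The first screening step, computing a normalized root-mean-square fluctuation (RMSF) profile and picking mobile segments, can be sketched in Python. The abstract does not state how the RMSF was normalized or how mobile segments were selected, so dividing by the largest value and applying a 0.5 threshold are assumptions made only for this illustration. The trajectory is assumed to be already aligned to a common reference frame. The autocorrelation-vector matching is not shown.

```python
import math


def rmsf(trajectory):
    """Per-site RMSF from a list of frames; each frame is a list of (x, y, z)."""
    n_frames = len(trajectory)
    n_sites = len(trajectory[0])
    profile = []
    for i in range(n_sites):
        # Mean position of site i over all frames.
        mean = [sum(f[i][k] for f in trajectory) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in trajectory
        ) / n_frames
        profile.append(math.sqrt(msd))
    return profile


def normalize(values):
    """Scale values so the largest becomes 1 (assumed normalization)."""
    peak = max(values)
    return [v / peak if peak else 0.0 for v in values]


def mobile_segments(norm_profile, threshold=0.5):
    """Return (start, end) index pairs of contiguous runs at or above threshold."""
    segments, start = [], None
    for i, v in enumerate(norm_profile):
        if v >= threshold and start is None:
            start = i
        elif v < threshold and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(norm_profile) - 1))
    return segments


# Toy two-frame trajectory: site 0 is fixed, site 1 oscillates along x.
toy_traj = [[(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (-1, 0, 0)]]
print(mobile_segments(normalize(rmsf(toy_traj))))
```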
Lewis, Aurélia E.; Sommer, Lilly; Arntzen, Magnus Ø.; Strahm, Yvan; Morrice, Nicholas A.; Divecha, Nullin; D'Santos, Clive S.
2011-01-01
Considerable insight into phosphoinositide-regulated cytoplasmic functions has been gained by identifying phosphoinositide-effector proteins. Phosphoinositide-regulated nuclear functions however are fewer and less clear. To address this, we established a proteomic method based on neomycin extraction of intact nuclei to enrich for nuclear phosphoinositide-effector proteins. We identified 168 proteins harboring phosphoinositide-binding domains. Although the vast majority of these contained lysine/arginine-rich patches with the following motif, K/R-(Xn = 3–7)-K-X-K/R-K/R, we also identified a smaller subset of known phosphoinositide-binding proteins containing pleckstrin homology or plant homeodomain modules. Proteins with no prior history of phosphoinositide interaction were identified, some of which have functional roles in RNA splicing and processing and chromatin assembly. The remaining proteins represent potentially other novel nuclear phosphoinositide-effector proteins and as such strengthen our appreciation of phosphoinositide-regulated nuclear functions. DNA topology was an exemplar among these: Biochemical assays validated our proteomic data supporting a direct interaction between phosphatidylinositol 4,5-bisphosphate and DNA Topoisomerase IIα. In addition, a subset of neomycin-extracted proteins was further validated as phosphatidylinositol 4,5-bisphosphate-interacting proteins by quantitative lipid pull downs. In summary, data sets such as this serve as a resource for a global view of phosphoinositide-regulated nuclear functions. PMID:21048195
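The lysine/arginine-rich motif quoted above, K/R-(Xn = 3–7)-K-X-K/R-K/R, can be searched for with a regular expression. This Python sketch is not the authors' pipeline. It assumes that X stands for any residue written as an uppercase one-letter code and that overlapping matches should be reported; the function name and test sequence are invented.

```python
import re

# K/R, then 3 to 7 arbitrary residues, then K, any residue, K/R, K/R.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=([KR][A-Z]{3,7}K[A-Z][KR][KR]))")


def find_kr_motifs(sequence):
    """Return (start_index, matched_text) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(sequence)]


# Invented sequence for illustration only.
print(find_kr_motifs("AAKAAAAKAKRAA"))
```

For each start position the sketch reports one match, namely the longest spacer that the regular expression engine can satisfy.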
Resilient protein co-expression network in male orbitofrontal cortex layer 2/3 during human aging.
Pabba, Mohan; Scifo, Enzo; Kapadia, Fenika; Nikolova, Yuliya S; Ma, Tianzhou; Mechawar, Naguib; Tseng, George C; Sibille, Etienne
2017-10-01
The orbitofrontal cortex (OFC) is vulnerable to normal and pathologic aging. Currently, layer resolution large-scale proteomic studies describing "normal" age-related alterations at OFC are not available. Here, we performed a large-scale exploratory high-throughput mass spectrometry-based protein analysis on OFC layer 2/3 from 15 "young" (15-43 years) and 18 "old" (62-88 years) human male subjects. We detected 4193 proteins and identified 127 differentially expressed (DE) proteins (p-value ≤0.05; effect size >20%), including 65 up- and 62 downregulated proteins (e.g., GFAP, CALB1). Using a previously described categorization of biological aging based on somatic tissues, that is, peripheral "hallmarks of aging," and considering overlap in protein function, we show the highest representation of altered cell-cell communication (54%), deregulated nutrient sensing (39%), and loss of proteostasis (35%) in the set of OFC layer 2/3 DE proteins. DE proteins also showed a significant association with several neurologic disorders; for example, Alzheimer's disease and schizophrenia. Notably, despite age-related changes in individual protein levels, protein co-expression modules were remarkably conserved across age groups, suggesting robust functional homeostasis. Collectively, these results provide biological insight into aging and associated homeostatic mechanisms that maintain normal brain function with advancing age. Copyright © 2017 Elsevier Inc. All rights reserved.
Protein expression, characterization and activity comparisons of wild type and mutant DUSP5 proteins
Nayak, Jaladhi; Gastonguay, Adam J.; Talipov, Marat R.; ...
2014-12-18
Background: The mitogen-activated protein kinases (MAPKs) pathway is critical for cellular signaling, and proteins such as phosphatases that regulate this pathway are important for normal tissue development. Based on our previous work on dual specificity phosphatase-5 (DUSP5), and its role in embryonic vascular development and disease, we hypothesized that mutations in DUSP5 will affect its function. Results: In this study, we tested this hypothesis by generating full-length glutathione-S-transferase-tagged DUSP5 and serine 147 proline mutant (S147P) proteins from bacteria. Light scattering analysis, circular dichroism, enzymatic assays and molecular modeling approaches have been performed to extensively characterize the protein form and function. We demonstrate that both proteins are active and, interestingly, the S147P protein is hypoactive as compared to the DUSP5 WT protein in two distinct biochemical substrate assays. Furthermore, due to the novel positioning of the S147P mutation, we utilize computational modeling to reconstruct full-length DUSP5 and S147P to predict a possible mechanism for the reduced activity of S147P. Conclusion: Taken together, this is the first evidence of the generation and characterization of an active, full-length, mutant DUSP5 protein which will facilitate future structure-function and drug development-based studies.
Hyeon, Jeong Eun; Jeon, Sang Duck; Han, Sung Ok
2013-11-01
The cellulosome is one of nature's most elegant and elaborate nanomachines and a key biological and biotechnological macromolecule that can be used as a multi-functional protein complex tool. Each protein module in the cellulosome system is potentially useful in an advanced biotechnology application. The high-affinity interactions between the cohesin and dockerin domains can be used in protein-based biosensors to improve both sensitivity and selectivity. The scaffolding protein includes a carbohydrate-binding module (CBM) that attaches strongly to cellulose substrates and facilitates the purification of proteins fused with the dockerin module through a one-step CBM purification method. Although the surface layer homology (SLH) domain of CbpA is not present in other strains, replacement of the cell surface anchoring domain allows a foreign protein to be displayed on the surface of other strains. The development of a hydrolysis enzyme complex is a useful strategy for consolidated bioprocessing (CBP), enabling microorganisms with biomass hydrolysis activity. Thus, the development of various configurations of multi-functional protein complexes for use as tools in whole-cell biocatalyst systems has drawn considerable attention as an attractive strategy for bioprocess applications. This review provides a detailed summary of the current achievements in Clostridium-derived multi-functional complex development and the impact of these complexes in various areas of biotechnology. Copyright © 2013 Elsevier Inc. All rights reserved.
Protocol for sortase-mediated construction of DNA-protein hybrids and functional nanostructures.
Koussa, Mounir A; Sotomayor, Marcos; Wong, Wesley P
2014-05-15
Recent methods in DNA nanotechnology are enabling the creation of intricate nanostructures through the use of programmable, bottom-up self-assembly. However, structures consisting only of DNA are limited in their ability to act on other biomolecules. Proteins, on the other hand, perform a variety of functions on biological materials, but directed control of the self-assembly process remains a challenge. While DNA-protein hybrids have the potential to provide the best-of-both-worlds, they can be difficult to create as many of the conventional techniques for linking proteins to DNA render proteins dysfunctional. We present here a sortase-based protocol for covalently coupling proteins to DNA with minimal disturbance to protein function. To accomplish this we have developed a two-step process. First, a small synthetic peptide is bioorthogonally and covalently coupled to a DNA oligo using click chemistry. Next, the DNA-peptide chimera is covalently linked to a protein of interest under protein-compatible conditions using the enzyme sortase. Our protocol allows for the simple coupling and purification of a functional DNA-protein hybrid. We use this technique to form oligos bearing cadherin-23 and protocadherin-15 protein fragments. Upon incorporation into a linear M13 scaffold, these protein-DNA hybrids serve as the gate to a binary nanoswitch. The outlined protocol is reliable and modular, facilitating the construction of libraries of oligos and proteins that can be combined to form functional DNA-protein nanostructures. These structures will enable a new class of functional nanostructures, which could be used for therapeutic and industrial processes. Copyright © 2014. Published by Elsevier Inc.
Protocol for sortase-mediated construction of DNA-protein hybrids and functional nanostructures
Koussa, Mounir A.; Sotomayor, Marcos; Wong, Wesley P.
2014-01-01
Recent methods in DNA nanotechnology are enabling the creation of intricate nanostructures through the use of programmable, bottom-up self-assembly. However, structures consisting only of DNA are limited in their ability to act on other biomolecules. Proteins, on the other hand, perform a variety of functions on biological materials, but directed control of the self-assembly process remains a challenge. While DNA-protein hybrids have the potential to provide the best-of-both-worlds, they can be difficult to create as many of the conventional techniques for linking proteins to DNA render proteins dysfunctional. We present here a sortase-based protocol for covalently coupling proteins to DNA with minimal disturbance to protein function. To accomplish this we have developed a two-step process. First, a small synthetic peptide is bioorthogonally and covalently coupled to a DNA oligo using click chemistry. Next, the DNA-peptide chimera is covalently linked to a protein of interest under protein-compatible conditions using the enzyme sortase. Our protocol allows for the simple coupling and purification of a functional DNA-protein hybrid. We use this technique to form oligos bearing cadherin-23 and protocadherin-15 protein fragments. Upon incorporation into a linear M13 scaffold, these protein-DNA hybrids serve as the gate to a binary nanoswitch. The outlined protocol is reliable and modular, facilitating the construction of libraries of oligos and proteins that can be combined to form functional DNA-protein nanostructures. These structures will enable a new class of functional nanostructures, which could be used for therapeutic and industrial processes. PMID:24568941
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cong, Yongzheng; Katipamula, Shanta; Trader, Cameron D.
2016-01-01
Characterizing protein-ligand binding dynamics is crucial for understanding protein function and developing new therapeutic agents. We have developed a novel microfluidic platform that features rapid mixing of protein and ligand solutions, variable incubation times, and on-chip electrospray ionization to perform label-free, solution-based monitoring of protein-ligand binding dynamics. This platform offers many advantages including automated processing, rapid mixing, and low sample consumption.
Pasiakos, Stefan M; Lieberman, Harris R; McLellan, Tom M
2014-05-01
Protein supplements are frequently consumed by athletes and recreationally-active individuals, although the decision to purchase and consume protein supplements is often based on marketing claims rather than evidence-based research. The objective was to provide a systematic and comprehensive analysis of the literature examining the hypothesis that protein supplements enhance recovery of muscle function and physical performance by attenuating muscle damage and soreness following a previous bout of exercise. English language articles were searched with PubMed and Google Scholar using protein and supplements together with performance, exercise, competition and muscle, alone or in combination as keywords. Inclusion criteria required studies to recruit healthy adults less than 50 years of age and to evaluate the effects of protein supplements alone or in combination with carbohydrate on performance metrics including time-to-exhaustion, time-trial or isometric or isokinetic muscle strength and markers of muscle damage and soreness. Twenty-seven articles were identified of which 18 dealt exclusively with ingestion of protein supplements to reduce muscle damage and soreness and improve recovery of muscle function following exercise, whereas the remaining 9 articles assessed muscle damage as well as performance metrics during single or repeat bouts of exercise. Papers were evaluated based on experimental design and examined for confounders that explain discrepancies between studies such as dietary control, training state of participants, sample size, direct or surrogate measures of muscle damage, and sensitivity of the performance metric. High quality and consistent data demonstrated there is no apparent relationship between recovery of muscle function and ratings of muscle soreness and surrogate markers of muscle damage when protein supplements are consumed prior to, during or after a bout of endurance or resistance exercise.
There also appears to be insufficient experimental data demonstrating ingestion of a protein supplement following a bout of exercise attenuates muscle soreness and/or lowers markers of muscle damage. However, beneficial effects such as reduced muscle soreness and markers of muscle damage become more evident when supplemental protein is consumed after daily training sessions. Furthermore, the data suggest potential ergogenic effects associated with protein supplementation are greatest if participants are in negative nitrogen and/or energy balance. Small sample numbers and lack of dietary control limited the effectiveness of several investigations. In addition, studies did not measure the effects of protein supplementation on direct indices of muscle damage such as myofibrillar disruption and various measures of protein signaling indicative of a change in rates of protein synthesis and degradation. As a result, the interpretation of the data was often limited. Overwhelmingly, studies have consistently demonstrated the acute benefits of protein supplementation on post-exercise muscle anabolism, which, in theory, may facilitate the recovery of muscle function and performance. However, to date, when protein supplements are provided, acute changes in post-exercise protein synthesis and anabolic intracellular signaling have not resulted in measurable reductions in muscle damage and enhanced recovery of muscle function. Limitations in study designs together with the large variability in surrogate markers of muscle damage reduced the strength of the evidence base.
Characterization and prediction of residues determining protein functional specificity.
Capra, John A; Singh, Mona
2008-07-01
Within a homologous protein family, proteins may be grouped into subtypes that share specific functions that are not common to the entire family. Often, the amino acids present in a small number of sequence positions determine each protein's particular functional specificity. Knowledge of these specificity determining positions (SDPs) aids in protein function prediction, drug design and experimental analysis. A number of sequence-based computational methods have been introduced for identifying SDPs; however, their further development and evaluation have been hindered by the limited number of known experimentally determined SDPs. We combine several bioinformatics resources to automate a process, typically undertaken manually, to build a dataset of SDPs. The resulting large dataset, which consists of SDPs in enzymes, enables us to characterize SDPs in terms of their physicochemical and evolutionary properties. It also facilitates the large-scale evaluation of sequence-based SDP prediction methods. We present a simple sequence-based SDP prediction method, GroupSim, and show that, surprisingly, it is competitive with a representative set of current methods. We also describe ConsWin, a heuristic that considers sequence conservation of neighboring amino acids, and demonstrate that it improves the performance of all methods tested on our large dataset of enzyme SDPs. Datasets and GroupSim code are available online at http://compbio.cs.princeton.edu/specificity/. Supplementary data are available at Bioinformatics online.
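The idea of scoring an alignment column by how well it separates functional subtypes can be sketched in a few lines. The following is only an illustration in the spirit of the abstract, not the published GroupSim scoring: it assumes a plain identity measure instead of an amino acid similarity matrix, and the function names and the "within-group minus between-group" score are assumptions made here for the example.

```python
from itertools import combinations, product


def within_similarity(residues):
    """Fraction of identical residue pairs inside one subtype (identity measure assumed)."""
    pairs = list(combinations(residues, 2))
    if not pairs:
        return 1.0  # a single sequence is trivially self-similar
    return sum(a == b for a, b in pairs) / len(pairs)


def between_similarity(group_a, group_b):
    """Fraction of identical residue pairs drawn from two different subtypes."""
    pairs = list(product(group_a, group_b))
    return sum(a == b for a, b in pairs) / len(pairs)


def specificity_score(column_by_group):
    """Score one alignment column.

    column_by_group: list of strings, one per subtype (at least two subtypes),
    each string holding the residues that subtype's sequences show at this column.
    High scores mean conserved within subtypes but different between them.
    """
    within = [within_similarity(g) for g in column_by_group]
    between = [between_similarity(a, b) for a, b in combinations(column_by_group, 2)]
    return sum(within) / len(within) - sum(between) / len(between)


if __name__ == "__main__":
    # Hypothetical columns: the first separates the two subtypes, the second does not.
    print(specificity_score(["DDD", "KKK"]))
    print(specificity_score(["AAA", "AAA"]))
```

A column that is fully conserved across the whole family scores zero here, which is the behaviour one would want from a specificity measure as opposed to a conservation measure.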
Jeong, Chan-Seok; Kim, Dongsup
2016-02-24
Elucidating the cooperative mechanism of interconnected residues is an important component toward understanding the biological function of a protein. Coevolution analysis has been developed to model the coevolutionary information reflecting structural and functional constraints. Recently, several methods have been developed based on a probabilistic graphical model called the Markov random field (MRF), which have led to significant improvements for coevolution analysis; however, thus far, the performance of these models has mainly been assessed by focusing on the aspect of protein structure. In this study, we built an MRF model whose graphical topology is determined by the residue proximity in the protein structure, and derived a novel positional coevolution estimate utilizing the node weight of the MRF model. This structure-based MRF method was evaluated for three data sets, each of which annotates catalytic site, allosteric site, and comprehensively determined functional site information. We demonstrate that the structure-based MRF architecture can encode the evolutionary information associated with biological function. Furthermore, we show that the node weight can more accurately represent positional coevolution information compared to the edge weight. Lastly, we demonstrate that the structure-based MRF model can be reliably built with only a few aligned sequences in linear time. The results show that adoption of a structure-based architecture could be an acceptable approximation for coevolution modeling with efficient computation complexity.
NASA Astrophysics Data System (ADS)
Akhir, Nor Azurah Mat; Nadzirin, Nurul; Mohamed, Rahmah; Firdaus-Raih, Mohd
2015-09-01
Hypothetical proteins of bacterial pathogens represent a large number of novel biological mechanisms that could belong to essential pathways in the bacteria. They lack functional characterization mainly because sequence homology-based methods cannot detect functional relationships in the absence of detectable sequence similarity. The dataset derived from this study showed 550 candidates that are conserved in genomes that have pathogenicity information and are present only in the order Burkholderiales. The dataset has been narrowed down to taxonomic clusters. Ten proteins were selected for ORF amplification; seven of them were successfully amplified, and only four were successfully expressed. These proteins will be strong candidates for determining true function via structural biology.
Effect of Oxygen-containing Functional Groups on Protein Stability in Ionic Liquid Solutions
NASA Technical Reports Server (NTRS)
Turner, Megan B.; Holbrey, John D.; Spear, Scott K.; Pusey, Marc L.; Rogers, Robin D.
2004-01-01
The ability of functionalized ionic liquids (ILs) to provide an environment of increased stability for biomolecules has been studied. Serum albumin is an inexpensive, widely available protein that contributes to the overall colloid osmotic blood pressure within the vascular system. Albumin is used in the present study as a marker of biomolecular stability in the presence of various ILs in a range of concentrations. The incorporation of hydroxyl functionality into the methylimidazolium-based cation leads to increased protein stability detected by fluorescence spectroscopy and circular dichroic (CD) spectrometry.
Ko, Hyeok-Jin; Park, Eunhye; Song, Joseph; Yang, Taek Ho; Lee, Hee Jong; Kim, Kyoung Heon
2012-01-01
Autotransporters have been employed as the anchoring scaffold for cell surface display by replacing their passenger domains with heterologous proteins to be displayed. We adopted an autotransporter (YfaL) of Escherichia coli for the cell surface display system. The critical regions in YfaL for surface display were identified for the construction of a ligation-independent cloning (LIC)-based display system. The designed system showed no detrimental effect on either the growth of the host cell or the overexpression of heterologous proteins on the cell surface. We functionally displayed monomeric red fluorescent protein (mRFP1) as a reporter protein and diverse agarolytic enzymes from Saccharophagus degradans 2-40, including Aga86C and Aga86E, which previously had failed to be functionally expressed. The system could display different sizes of proteins ranging from 25.3 to 143 kDa. We also attempted controlled release of the displayed proteins by incorporating a tobacco etch virus protease cleavage site into the C termini of the displayed proteins. The maximum level of the displayed protein was 6.1 × 10⁴ molecules per cell, which corresponds to 5.6% of the entire cell surface of actively growing E. coli. PMID:22344647
Viricel, Clément; de Givry, Simon; Schiex, Thomas; Barbe, Sophie
2018-02-20
Accurate and economic methods to predict change in protein binding free energy upon mutation are imperative to accelerate the design of proteins for a wide range of applications. Free energy is defined by enthalpic and entropic contributions. Following the recent progress of Artificial Intelligence-based algorithms for guaranteed NP-hard energy optimization and partition function computation, it becomes possible to quickly compute minimum energy conformations and to reliably estimate the entropic contribution of side-chains in the change of free energy of large protein interfaces. Using guaranteed Cost Function Network algorithms, Rosetta energy functions and Dunbrack's rotamer library, we developed and assessed EasyE and JayZ, two methods for binding affinity estimation that ignore or include conformational entropic contributions on a large benchmark of binding affinity experimental measures. While both approaches outperform most established tools, we observe that side-chain conformational entropy brings little or no improvement on most systems but becomes crucial in some rare cases. Available as open-source Python/C++ code at sourcesup.renater.fr/projects/easy-jayz. Contact: thomas.schiex@inra.fr and sophie.barbe@insa-toulouse.fr. Supplementary data are available at Bioinformatics online.
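The difference between a minimum-energy estimate and a partition-function estimate can be made concrete with a toy calculation. The sketch below is not the EasyE or JayZ code and uses no Cost Function Network solver; it assumes the conformational energies are already available as an explicit list (hypothetical values, in kcal/mol) and applies the standard relations Z = Σ exp(-E/kT), F = -kT ln Z and -TS = F - ⟨E⟩.

```python
import math

KT = 0.593  # kT in kcal/mol near 298 K (assumed units for this illustration)


def boltzmann_weights(energies, kt=KT):
    """Unnormalised Boltzmann weights, shifted by the minimum energy for numerical stability."""
    e_min = min(energies)
    return [math.exp(-(e - e_min) / kt) for e in energies]


def free_energy(energies, kt=KT):
    """F = -kT ln Z over an explicit list of conformation energies."""
    z_shifted = sum(boltzmann_weights(energies, kt))
    return min(energies) - kt * math.log(z_shifted)


def mean_energy(energies, kt=KT):
    """Boltzmann-averaged energy <E>."""
    weights = boltzmann_weights(energies, kt)
    return sum(w * e for w, e in zip(weights, energies)) / sum(weights)


def entropic_term(energies, kt=KT):
    """Conformational entropy contribution -T*S = F - <E> (never positive)."""
    return free_energy(energies, kt) - mean_energy(energies, kt)


if __name__ == "__main__":
    rotamer_energies = [-10.0, -9.8, -9.5, -4.0]  # hypothetical values
    print("minimum energy:", min(rotamer_energies))
    print("free energy:   ", free_energy(rotamer_energies))
    print("-T*S term:     ", entropic_term(rotamer_energies))
```

When one conformation dominates, the free energy is close to the minimum energy and the entropic term is near zero, which matches the abstract's observation that entropy matters only in some cases.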
Van Coillie, Samya; Liang, Lunxi; Zhang, Yao; Wang, Huanbin; Fang, Jing-Yuan; Xu, Jie
2016-04-05
High-throughput methods such as co-immunoprecipitation-mass spectrometry (coIP-MS) and yeast two-hybrid (Y2H) screening have suggested a broad range of unannotated protein-protein interactions (PPIs), and interpretation of these PPIs remains a challenging task. Advances in cancer genomics research allow for the inference of "coactivation pairs" in cancer, which may facilitate the identification of PPIs involved in cancer. Here we present OncoBinder as a tool for the assessment of proteomic interaction data based on the functional synergy of oncoproteins in cancer. This decision tree-based method combines gene mutation, copy number and mRNA expression information to infer the functional status of protein-coding genes. We applied OncoBinder to evaluate the potential binders of EGFR and ERK2 proteins based on the gastric cancer dataset of The Cancer Genome Atlas (TCGA). As a result, OncoBinder identified high confidence interactions (annotated by Kyoto Encyclopedia of Genes and Genomes (KEGG) or validated by low-throughput assays) more efficiently than a co-expression-based method. Taken together, our results suggest that evaluation of gene functional synergy in cancer may facilitate the interpretation of proteomic interaction data. The OncoBinder toolbox for Matlab is freely accessible online.
Tight junction-based epithelial microenvironment and cell proliferation.
Tsukita, S; Yamazaki, Y; Katsuno, T; Tamura, A; Tsukita, S
2008-11-24
Belt-like tight junctions (TJs), referred to as zonula occludens, have long been regarded as a specialized differentiation of epithelial cell membranes. They are required for cell adhesion and paracellular barrier functions, and are now thought to be partly involved in fence functions and in cell polarization. Recently, the molecular bases of TJs have gradually been unveiled. TJs are constructed by TJ strands, whose basic frameworks are composed of integral membrane proteins with four transmembrane domains, designated claudins. The claudin family is supposedly composed of at least 24 members in mice and humans. Other types of integral membrane proteins with four transmembrane domains, namely occludin and tricellulin, as well as the single transmembrane proteins, JAMs (junctional adhesion molecules) and CAR (coxsackie and adenovirus receptor), are associated with TJ strands, and the high-level organization of TJ strands is likely to be established by membrane-anchored scaffolding proteins, such as ZO-1/2. Recent functional analyses of claudins in cell cultures and in mice have suggested that claudin-based TJs may have pivotal functions in the regulation of the epithelial microenvironment, which is critical for various biological functions such as control of cell proliferation. These represent the dawn of 'Barriology' (defined by Shoichiro Tsukita as the science of barriers in multicellular organisms). Taken together with recent reports regarding changes in claudin expression levels, understanding the regulation of the TJ-based microenvironment system will provide new insights into the regulation of polarization with respect to the epithelial microenvironment system and new viewpoints for developing anticancer strategies.
Faggionato, Davide; Serb, Jeanne M
2017-08-01
The rise of high-throughput RNA sequencing (RNA-seq) and de novo transcriptome assembly has had a transformative impact on how we identify and study genes in the phototransduction cascade of non-model organisms. But the advantage provided by the nearly automated annotation of RNA-seq transcriptomes may at the same time hinder the possibility for gene discovery and the discovery of new gene functions. For example, standard functional annotation based on domain homology to known protein families can only confirm group membership, not identify the emergence of new biochemical function. In this study, we show the importance of developing a strategy that circumvents the limitations of semiautomated annotation and apply this workflow to photosensitivity as a means to discover non-opsin photoreceptors. We hypothesize that non-opsin G-protein-coupled receptor (GPCR) proteins may have chromophore-binding lysines in locations that differ from opsin. Here, we provide the first case study describing non-opsin light-sensitive GPCRs based on tissue-specific RNA-seq data of the common bay scallop Argopecten irradians (Lamarck, 1819). Using a combination of sequence analysis and three-dimensional protein modeling, we identified two candidate proteins. We tested their photochemical properties and provide evidence showing that these two proteins incorporate 11-cis and/or all-trans retinal and react to light photochemically. Based on this case study, we demonstrate that there is potential for the discovery of new light-sensitive GPCRs, and we have developed a workflow that starts from RNA-seq assemblies to the discovery of new non-opsin, GPCR-based photopigments.
Gc protein (vitamin D-binding protein): Gc genotyping and GcMAF precursor activity.
Nagasawa, Hideko; Uto, Yoshihiro; Sasaki, Hideyuki; Okamura, Natsuko; Murakami, Aya; Kubo, Shinichi; Kirk, Kenneth L; Hori, Hitoshi
2005-01-01
The Gc protein (human group-specific component (Gc), a vitamin D-binding protein or Gc globulin) has important physiological functions that include involvement in vitamin D transport and storage, scavenging of extracellular G-actin, enhancement of the chemotactic activity of C5a for neutrophils in inflammation and macrophage activation (mediated by a GalNAc-modified Gc protein (GcMAF)). This review focuses on the structure and function of the Gc protein, especially with regard to Gc genotyping and GcMAF precursor activity. A discussion of the research strategy "GcMAF as a target for drug discovery" is included, based on our own research.
New frontiers: discovering cilia-independent functions of cilia proteins.
Vertii, Anastassiia; Bright, Alison; Delaval, Benedicte; Hehnly, Heidi; Doxsey, Stephen
2015-10-01
In most vertebrates, mitotic spindles and primary cilia arise from a common origin, the centrosome. In non-cycling cells, the centrosome is the template for primary cilia assembly and, thus, is crucial for their associated sensory and signaling functions. During mitosis, the duplicated centrosomes mature into spindle poles, which orchestrate mitotic spindle assembly, chromosome segregation, and orientation of the cell division axis. Intriguingly, both cilia and spindle poles are centrosome-based, functionally distinct structures that require the action of microtubule-mediated, motor-driven transport for their assembly. Cilia proteins have been found at non-cilia sites, where they have distinct functions, illustrating a diverse and growing list of cellular processes and structures that utilize cilia proteins for crucial functions. In this review, we discuss cilia-independent functions of cilia proteins and re-evaluate their potential contributions to "cilia" disorders. © 2015 The Authors.
Cano-Garrido, Olivia; Sánchez-Chardi, Alejandro; Parés, Sílvia; Giró, Irene; Tatkiewicz, Witold I; Ferrer-Miralles, Neus; Ratera, Imma; Natalello, Antonino; Cubarsi, Rafael; Veciana, Jaume; Bach, Àlex; Villaverde, Antonio; Arís, Anna; Garcia-Fruitós, Elena
2016-10-01
Inclusion bodies (IBs) are protein-based nanoparticles formed in Escherichia coli through stereospecific aggregation processes during the overexpression of recombinant proteins. In recent years, it has been shown that IBs can be used as nanostructured biomaterials to stimulate mammalian cell attachment, proliferation, and differentiation. In addition, these nanoparticles have also been explored as natural delivery systems for protein replacement therapies. Although the production of these protein-based nanomaterials in E. coli is economically viable, important safety concerns related to the presence of endotoxins in the products derived from this microorganism need to be addressed. Lactic acid bacteria (LAB) are a group of food-grade microorganisms that have been classified as safe by biologically regulatory agencies. In this context, we have demonstrated herein, for the first time, the production of fully functional, IB-like protein nanoparticles in LAB. These nanoparticles have been fully characterized using a wide range of techniques, including field emission scanning electron microscopy (FESEM), transmission electron microscopy (TEM), dynamic light scattering (DLS), Fourier transform infrared (FTIR) spectroscopy, zymography, cytometry, confocal microscopy, and wettability and cell coverage measurements. Our results allow us to conclude that these materials share the main physico-chemical characteristics with IBs from E. coli and moreover are devoid of any harmful endotoxin contaminant. These findings reveal a new platform for the production of protein-based safe products with high pharmaceutical interest. The development of both natural and synthetic biomaterials for biomedical applications is a field in constant development. In this context, E. coli is a bacterium that has been widely studied for its ability to naturally produce functional biomaterials with broad biomedical uses.
Despite being effective, products derived from this species contain membrane residues able to trigger undesired immunogenic responses. Accordingly, exploring alternative bacteria able to synthesize such biomaterials in a safe molecular environment is becoming a challenge. Thus, the present study describes a new type of functional protein-based nanomaterial free of toxic contaminants with a wide range of applications in both human and veterinary medicine. Copyright © 2016 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
Silk Materials Functionalized via Genetic Engineering for Biomedical Applications
Deptuch, Tomasz
2017-01-01
The great mechanical properties, biocompatibility and biodegradability of silk-based materials make them applicable to the biomedical field. Genetic engineering enables the construction of synthetic equivalents of natural silks. Knowledge about the relationship between the structure and function of silk proteins enables the design of bioengineered silks that can serve as the foundation of new biomaterials. Furthermore, in order to better address the needs of modern biomedicine, genetic engineering can be used to obtain silk-based materials with new functionalities. Sequences encoding new peptides or domains can be added to the sequences encoding the silk proteins. The expression of one cDNA fragment indicates that each silk molecule is related to a functional fragment. This review summarizes the proposed genetic functionalization of silk-based materials that can be potentially useful for biomedical applications. PMID:29231863
Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification.
Sinclair, Robert M; Ravantti, Janne J; Bamford, Dennis H
2017-04-15
Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allow them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. 
Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. Copyright © 2017 Sinclair et al. PMID:28122979
Template-based protein structure modeling using the RaptorX web server.
Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo
2012-07-19
A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world. PMID:22814390
Hierarchical Partitioning of Metazoan Protein Conservation Profiles Provides New Functional Insights
Witztum, Jonathan; Persi, Erez; Horn, David; Pasmanik-Chor, Metsada; Chor, Benny
2014-01-01
The availability of many complete, annotated proteomes enables the systematic study of the relationships between protein conservation and functionality. We explore this question based solely on the presence or absence of protein homologues (a.k.a. conservation profiles). We study 18 metazoans, from two distinct points of view: the human's and the fly's. Using the GOrilla gene ontology (GO) analysis tool, we explore functional enrichment of the “universal proteins”, those with homologues in all 17 other species, and of the “non-universal proteins”. A large number of GO terms are strongly enriched in both human and fly universal proteins. Most of these functions are known to be essential. A smaller number of GO terms, exhibiting markedly different properties, are enriched in both human and fly non-universal proteins. We further explore the non-universal proteins, whose conservation profiles are consistent with the “tree of life” (TOL consistent), as well as the TOL inconsistent proteins. Finally, we applied Quantum Clustering to the conservation profiles of the TOL consistent proteins. Each cluster is strongly associated with one or a small number of specific monophyletic clades in the tree of life. The proteins in many of these clusters exhibit strong functional enrichment associated with the “life style” of the related clades. Most previous approaches for studying function and conservation are “bottom up”, studying protein families one by one, and separately assessing the conservation of each. By way of contrast, our approach is “top down”. We globally partition the set of all proteins hierarchically, as described above, and then identify protein families enriched within different subdivisions. While supporting previous findings, our approach also provides a tool for discovering novel relations between protein conservation profiles, functionality, and evolutionary history as represented by the tree of life. PMID:24594619
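A conservation profile as described above is simply a presence/absence vector over the other species, so the first partitioning step is easy to illustrate. The sketch below is a minimal example under stated assumptions: the protein names, the three-species toy panel, and the function name are hypothetical, and it covers only the universal versus non-universal split, not the tree-of-life consistency test or Quantum Clustering.

```python
def split_by_universality(profiles):
    """Partition proteins by conservation profile.

    profiles: dict mapping a protein name to a tuple of 0/1 flags, one per
    other species (1 = a homologue is present in that species).
    Returns (universal, non_universal) as two dicts.
    """
    universal = {}
    non_universal = {}
    for protein, profile in profiles.items():
        # "Universal" means a homologue exists in every other species.
        target = universal if all(profile) else non_universal
        target[protein] = profile
    return universal, non_universal


if __name__ == "__main__":
    # Hypothetical reference proteins scored against three other species.
    toy_profiles = {
        "protA": (1, 1, 1),
        "protB": (1, 0, 1),
        "protC": (0, 0, 0),
    }
    universal, non_universal = split_by_universality(toy_profiles)
    print(sorted(universal))
    print(sorted(non_universal))
```

In the study's setting each profile would have 17 entries, one for each of the other metazoan species, and the two resulting sets would then be passed to a GO enrichment tool.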
Modeling Protein Self Assembly
ERIC Educational Resources Information Center
Baker, William P.; Jones, Carleton Buck; Hull, Elizabeth
2004-01-01
Understanding the structure and function of proteins is an important part of the standards-based science curriculum. Proteins serve vital roles within the cell and malfunctions in protein self assembly are implicated in degenerative diseases. Experience indicates that this topic is a difficult one for many students. We have found that the concept…
Uversky, Vladimir N
2015-03-01
Intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs) are functional proteins or regions that do not have unique 3D structures under functional conditions. Therefore, from the viewpoint of their lack of stable 3D structure, IDPs/IDPRs are inherently unstable. As much as structure and function of normal ordered globular proteins are determined by their amino acid sequences, the lack of unique 3D structure in IDPs/IDPRs and their disorder-based functionality are also encoded in the amino acid sequences. Because of their specific sequence features and distinctive conformational behavior, these intrinsically unstable proteins or regions have several applications in biotechnology. This review introduces some of the most characteristic features of IDPs/IDPRs (such as peculiarities of amino acid sequences of these proteins and regions, their major structural features, and peculiar responses to changes in their environment) and describes how these features can be used in biotechnology, for example for the proteome-wide analysis of the abundance of extended IDPs, for recombinant protein isolation and purification, as polypeptide nanoparticles for drug delivery, as solubilization tools, and as thermally sensitive carriers of active peptides and proteins. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumeta, Masahiro, E-mail: kumeta@lif.kyoto-u.ac.jp; Hirai, Yuya; Yoshimura, Shige H.
2013-12-10
To uncover the molecular composition and dynamics of the functional scaffold for the nucleus, three fractions of biochemically-stable nuclear protein complexes were extracted and used as immunogens to produce a variety of monoclonal antibodies. Many helix-based cytoskeletal proteins were identified as antigens, suggesting their dynamic contribution to nuclear architecture and function. Interestingly, sets of antibodies distinguished distinct subcellular localization of a single isoform of certain cytoskeletal proteins; distinct molecular forms of keratin and actinin were found in the nucleus. Their nuclear shuttling properties were verified by the apparent nuclear accumulations under inhibition of CRM1-dependent nuclear export. Nuclear keratins do not take an obvious filamentous structure, as was revealed by non-filamentous cytoplasmic keratin-specific monoclonal antibody. These results suggest the distinct roles of the helix-based cytoskeletal proteins in the nucleus. - Highlights: • A set of monoclonal antibodies were raised against nuclear scaffold proteins. • Helix-based cytoskeletal proteins were involved in nuclear scaffold. • Many cytoskeletal components shuttle into the nucleus in a CRM1-dependent manner. • Sets of antibodies distinguished distinct subcellular localization of a single isoform. • Nuclear keratin is soluble and does not form an obvious filamentous structure.
Function and regulation of primary cilia and intraflagellar transport proteins in the skeleton.
Yuan, Xue; Serra, Rosa A; Yang, Shuying
2015-01-01
Primary cilia are microtubule-based organelles that project from the cell surface to enable transduction of various developmental signaling pathways. The process of intraflagellar transport (IFT) is crucial for the building and maintenance of primary cilia. Ciliary dysfunction has been found in a range of disorders called ciliopathies, some of which display severe skeletal dysplasias. In recent years, interest has grown in uncovering the function of primary cilia/IFT proteins in bone development, mechanotransduction, and cellular regulation. We summarize recent advances in understanding the function of cilia and IFT proteins in the regulation of cell differentiation in osteoblasts, osteocytes, chondrocytes, and mesenchymal stem cells (MSCs). We also discuss the mechanosensory function of cilia and IFT proteins in bone cells, cilia orientation, and other functions of cilia in chondrocytes. © 2014 New York Academy of Sciences.
Ryou, Sang-Mi; Yeom, Ji-Hyun; Kang, Hyo Jung; Won, Miae; Kim, Jin-Sik; Lee, Boeun; Seong, Maeng-Je; Ha, Nam-Chul; Bae, Jeehyeon; Lee, Kangseok
2014-12-28
Although the delivery of biologically functional protein(s) into mammalian cells could be of tremendous value to biomedical research, the development of such technology has been hindered by the lack of a safe and effective delivery method. Here, we present a simple, efficient, and versatile gold nanoparticle-DNA aptamer conjugate (AuNP-Apt)-based system, with nanoblock-like properties, that allows any recombinant protein to be loaded without additional modifications and delivered into mammalian living systems. The AuNP-Apt-based protein delivery system was able to deliver various proteins into a variety of cell types in vitro without showing cytotoxicity. This AuNP-Apt system was also effective for the local and systemic targeted delivery of proteins in vivo. A local injection of the AuNP-Apt loaded with the apoptosis-inducing BIM protein efficiently inhibited the growth of xenograft tumors in mice. Furthermore, an intravenous injection of AuNP-Apt loaded with both epidermal growth factor (EGF) and BIM resulted in the targeted delivery of BIM into a xenograft tumor derived from EGF receptor-overexpressing cancer cells with no detectable systemic toxicity. Our findings show that this system can serve as an innovative platform for the development of protein-based biomedical applications. Copyright © 2014 Elsevier B.V. All rights reserved.
Mechanism-based Proteomic Screening Identifies Targets of Thioredoxin-like Proteins
Nakao, Lia S.; Everley, Robert A.; Marino, Stefano M.; Lo, Sze M.; de Souza, Luiz E.; Gygi, Steven P.; Gladyshev, Vadim N.
2015-01-01
Thioredoxin (Trx)-fold proteins are protagonists of numerous cellular pathways that are subject to thiol-based redox control. The best characterized regulator of thiols in proteins is Trx1 itself, which together with thioredoxin reductase 1 (TR1) and peroxiredoxins (Prxs) comprises a key redox regulatory system in mammalian cells. However, there are numerous other Trx-like proteins, whose functions and redox interactors are unknown. It is also unclear if the principles of Trx1-based redox control apply to these proteins. Here, we applied a proteomic strategy to four Trx-like proteins containing CXXC motifs, namely Trx1, Rdx12, Trx-like protein 1 (Txnl1) and nucleoredoxin 1 (Nrx1), whose cellular targets were trapped in vivo using mutant Trx-like proteins, under conditions of low endogenous expression of these proteins. Prxs were detected as key redox targets of Trx1, but this approach also supported the detection of TR1, which is the Trx1 reductant, as well as mitochondrial intermembrane proteins AIF and Mia40. In addition, glutathione peroxidase 4 was found to be an Rdx12 redox target. In contrast, no redox targets of Txnl1 and Nrx1 could be detected, suggesting that their CXXC motifs do not engage in mixed disulfides with cellular proteins. For some Trx-like proteins, the method allowed distinguishing redox and non-redox interactions. Parallel, comparative analyses of multiple thiol oxidoreductases revealed differences in the functions of their CXXC motifs, providing important insights into thiol-based redox control of cellular processes. PMID:25561728
A structural-alphabet-based strategy for finding structural motifs across protein families
Wu, Chih Yuan; Chen, Yao Chi; Lim, Carmay
2010-01-01
Proteins with insignificant sequence and overall structure similarity may still share locally conserved contiguous structural segments; i.e. structural/3D motifs. Most methods for finding 3D motifs require a known motif to search for other similar structures or functionally/structurally crucial residues. Here, without requiring a query motif or essential residues, a fully automated method for discovering 3D motifs of various sizes across protein families with different folds based on a 16-letter structural alphabet is presented. It was applied to structurally non-redundant proteins bound to DNA, RNA, obligate/non-obligate proteins as well as free DNA-binding proteins (DBPs) and proteins with known structures but unknown function. Its usefulness was illustrated by analyzing the 3D motifs found in DBPs. A non-specific motif was found with a ‘corner’ architecture that confers a stable scaffold and enables diverse interactions, making it suitable for binding not only DNA but also RNA and proteins. Furthermore, DNA-specific motifs present ‘only’ in DBPs were discovered. The motifs found can provide useful guidelines in detecting binding sites and computational protein redesign. PMID:20525797
NHS-Esters As Versatile Reactivity-Based Probes for Mapping Proteome-Wide Ligandable Hotspots.
Ward, Carl C; Kleinman, Jordan I; Nomura, Daniel K
2017-06-16
Most of the proteome is considered undruggable, oftentimes hindering translational efforts for drug discovery. Identifying previously unknown druggable hotspots in proteins would enable strategies for pharmacologically interrogating these sites with small molecules. Activity-based protein profiling (ABPP) has arisen as a powerful chemoproteomic strategy that uses reactivity-based chemical probes to map reactive, functional, and ligandable hotspots in complex proteomes, which has enabled inhibitor discovery against various therapeutic protein targets. Here, we report an alkyne-functionalized N-hydroxysuccinimide-ester (NHS-ester) as a versatile reactivity-based probe for mapping the reactivity of a wide range of nucleophilic ligandable hotspots, including lysines, serines, threonines, and tyrosines, encompassing active sites, allosteric sites, post-translational modification sites, protein interaction sites, and previously uncharacterized potential binding sites. Surprisingly, we also show that fragment-based NHS-ester ligands can be made to confer selectivity for specific lysine hotspots on specific targets including Dpyd, Aldh2, and Gstt1. We thus put forth NHS-esters as promising reactivity-based probes and chemical scaffolds for covalent ligand discovery.
F2Dock: Fast Fourier Protein-Protein Docking
Bajaj, Chandrajit; Chowdhury, Rezaul; Siddavanahalli, Vinay
2009-01-01
The functions of proteins are often realized through their mutual interactions. Determining a relative transformation for a pair of proteins and their conformations which form a stable complex, reproducible in nature, is known as docking. It is an important step in drug design, structure determination and understanding function and structure relationships. In this paper we extend our non-uniform fast Fourier transform docking algorithm to include an adaptive search phase (both translational and rotational) and thereby speed up its execution. We have also implemented a multithreaded version of the adaptive docking algorithm for even faster execution on multicore machines. We call this protein-protein docking code F2Dock (F2 = Fast Fourier). We have calibrated F2Dock based on an extensive experimental study on a list of benchmark complexes and conclude that F2Dock works very well in practice. Though all docking results reported in this paper use shape complementarity and Coulombic potential based scores only, F2Dock is structured to incorporate Lennard-Jones potential and re-ranking docking solutions based on desolvation energy. PMID:21071796
Predicting disease-related proteins based on clique backbone in protein-protein interaction network.
Yang, Lei; Zhao, Xudong; Tang, Xianglong
2014-01-01
Network biology integrates different kinds of data, including physical or functional networks and disease gene sets, to interpret human disease. A clique (maximal complete subgraph) in a protein-protein interaction network is a topological module and possesses inherent biological significance. A disease-related clique possibly associates with complex diseases. Fully identifying disease components in a clique is conducive to uncovering disease mechanisms. This paper proposes an approach for predicting disease proteins based on cliques in a protein-protein interaction network. To tolerate false positive and negative interactions in protein networks, extending cliques and scoring predicted disease proteins with gene ontology terms are introduced to the clique-based method. The precision of the predicted disease proteins is verified against disease phenotypes and stays consistently above 95%. The predicted disease proteins associated with cliques can partly complement mapping between genotype and phenotype, and provide clues for understanding the pathogenesis of serious diseases.
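Since the abstract defines a clique as a maximal complete subgraph, a short sketch of clique enumeration may help. It uses the basic Bron-Kerbosch recursion without pivoting, which is a standard technique and not necessarily the procedure the authors used; the toy network and protein names are hypothetical, and the clique-extension and GO-based scoring steps are not shown.

```python
def maximal_cliques(adjacency):
    """Enumerate maximal cliques of an undirected graph given as
    {node: set_of_neighbours}, using basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        # No node can extend the clique and none was skipped: it is maximal.
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for node in list(candidates):
            neighbours = adjacency[node]
            expand(current | {node},
                   candidates & neighbours,
                   excluded & neighbours)
            # Move the processed node from candidates to excluded.
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return cliques


# Hypothetical toy interaction network: a triangle P1-P2-P3 plus an edge P3-P4.
network = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P4"},
    "P4": {"P3"},
}
cliques = maximal_cliques(network)
print(sorted(sorted(c) for c in cliques))
```

For networks of realistic size, a pivoting variant or a graph library would likely be preferable; the version above favors readability.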
Improving predicted protein loop structure ranking using a Pareto-optimality consensus method.
Li, Yaohang; Rata, Ionel; Chiu, See-wing; Jakobsson, Eric
2010-07-20
Accurate protein loop structure models are important to understand functions of many proteins. Identifying the native or near-native models by distinguishing them from the misfolded ones is a critical step in protein loop structure prediction. We have developed a Pareto Optimal Consensus (POC) method, which is a consensus model ranking approach to integrate multiple knowledge- or physics-based scoring functions. The procedure of identifying the models of best quality in a model set includes: 1) identifying the models at the Pareto optimal front with respect to a set of scoring functions, and 2) ranking them based on the fuzzy dominance relationship to the rest of the models. We apply the POC method to a large number of decoy sets for loops of 4 to 12 residues in length using a functional space composed of several carefully-selected scoring functions: Rosetta, DOPE, DDFIRE, OPLS-AA, and a triplet backbone dihedral potential developed in our lab. Our computational results show that the sets of Pareto-optimal decoys, which are typically composed of approximately 20% or less of the overall decoys in a set, have a good coverage of the best or near-best decoys in more than 99% of the loop targets. Compared to the individual scoring function yielding best selection accuracy in the decoy sets, the POC method yields 23%, 37%, and 64% fewer false positives in distinguishing the native conformation, identifying a near-native model (RMSD < 0.5 Å from the native) as top-ranked, and selecting at least one near-native model in the top-5-ranked models, respectively. Similar effectiveness of the POC method is also found in the decoy sets from membrane protein loops. Furthermore, the POC method outperforms the other popularly-used consensus strategies in model ranking, such as rank-by-number, rank-by-rank, rank-by-vote, and regression-based methods.
By integrating multiple knowledge- and physics-based scoring functions based on Pareto optimality and fuzzy dominance, the POC method is effective in distinguishing the best loop models from the other ones within a loop model set.
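Step 1 of the procedure above, selecting the models on the Pareto optimal front, can be sketched as follows. This is a minimal illustration assuming every scoring function is energy-like (lower is better); the decoy names and score values are made up, and the fuzzy-dominance ranking of step 2 is not implemented.

```python
def dominates(a, b):
    """True if score vector a is no worse than b on every scoring function
    and strictly better on at least one (assumption: lower is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(scores):
    """Return the names of the models that no other model dominates."""
    return sorted(
        name
        for name, vector in scores.items()
        if not any(dominates(other_vector, vector)
                   for other, other_vector in scores.items()
                   if other != name)
    )


# Hypothetical decoys scored by two hypothetical scoring functions.
decoys = {
    "decoy1": (1.0, 5.0),
    "decoy2": (2.0, 2.0),
    "decoy3": (3.0, 3.0),  # dominated by decoy2 on both scores
    "decoy4": (5.0, 1.0),
}
front = pareto_front(decoys)
print(front)
```

The pairwise comparison is quadratic in the number of decoys, which is usually acceptable for decoy sets of modest size.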
Improving predicted protein loop structure ranking using a Pareto-optimality consensus method
2010-01-01
Background Accurate protein loop structure models are important to understand functions of many proteins. Identifying the native or near-native models by distinguishing them from the misfolded ones is a critical step in protein loop structure prediction. Results We have developed a Pareto Optimal Consensus (POC) method, which is a consensus model ranking approach to integrate multiple knowledge- or physics-based scoring functions. The procedure of identifying the models of best quality in a model set includes: 1) identifying the models at the Pareto optimal front with respect to a set of scoring functions, and 2) ranking them based on the fuzzy dominance relationship to the rest of the models. We apply the POC method to a large number of decoy sets for loops of 4 to 12 residues in length using a functional space composed of several carefully-selected scoring functions: Rosetta, DOPE, DDFIRE, OPLS-AA, and a triplet backbone dihedral potential developed in our lab. Our computational results show that the sets of Pareto-optimal decoys, which are typically composed of ~20% or less of the overall decoys in a set, have a good coverage of the best or near-best decoys in more than 99% of the loop targets. Compared to the individual scoring function yielding best selection accuracy in the decoy sets, the POC method yields 23%, 37%, and 64% fewer false positives in distinguishing the native conformation, identifying a near-native model (RMSD < 0.5 Å from the native) as top-ranked, and selecting at least one near-native model in the top-5-ranked models, respectively. Similar effectiveness of the POC method is also found in the decoy sets from membrane protein loops. Furthermore, the POC method outperforms the other popularly-used consensus strategies in model ranking, such as rank-by-number, rank-by-rank, rank-by-vote, and regression-based methods.
Conclusions By integrating multiple knowledge- and physics-based scoring functions based on Pareto optimality and fuzzy dominance, the POC method is effective in distinguishing the best loop models from the other ones within a loop model set. PMID:20642859
Effect of a High-Protein Diet on Kidney Function in Healthy Adults: Results From the OmniHeart Trial
Juraschek, Stephen P.; Appel, Lawrence J.; Anderson, Cheryl A.M.; Miller, Edgar R.
2013-01-01
Background Consumption of a diet high in protein can cause glomerular hyperfiltration, a potentially maladaptive response, which may accelerate the progression of kidney disease. Study Design An ancillary study of the OmniHeart trial, a randomized 3-period crossover feeding trial testing the effects of partial replacement of carbohydrate with protein on kidney function. Setting & Participants Healthy adults (N=164) with prehypertension or stage 1 hypertension at a community-based research clinic with a metabolic kitchen. Intervention Participants were fed each of 3 diets for 6 weeks. Feeding periods were separated by a 2- to 4-week washout period. Weight was held constant on each diet. The 3 diets emphasized carbohydrate, protein, or unsaturated fat; dietary protein was either 15% (carbohydrate and unsaturated fat diets) or 25% (protein diet) of energy intake. Outcomes Fasting serum creatinine, cystatin C, and β2-microglobulin levels, estimated glomerular filtration rate (eGFR). Measurements Serum creatinine, cystatin C, and β2-microglobulin collected at the end of each feeding period. Results Baseline cystatin C-based eGFR was 92.0±16.3 (SD) mL/min/1.73 m². Compared with the carbohydrate and unsaturated fat diets, the protein diet increased cystatin C-based eGFR by ~4 mL/min/1.73 m² (P < 0.001). The effects of the protein diet on kidney function were independent of changes in blood pressure. There was no significant difference between the carbohydrate and unsaturated fat diets. Limitations Participants did not have kidney disease at baseline. Conclusions A healthy diet rich in protein increased eGFR. Whether long-term consumption of a high-protein diet leads to kidney disease is uncertain. PMID:23219108
Fang, Caiyun; Zhang, Lei; Zhang, Xiaoqin; Lu, Haojie
2015-06-21
Metal binding proteins play many important roles in a broad range of biological processes. Characterization of metal binding proteins is important for understanding their structure and biological functions, thus leading to a clear understanding of metal associated diseases. The present study is the first to investigate the effectiveness of magnetic microspheres functionalized with metal cations (Ca(2+), Cu(2+), Zn(2+) and Fe(3+)) as the absorbent matrix in IMAC technology to enrich metal containing/binding proteins. The putative metal binding proteins in rat liver were then globally characterized by using this strategy which is very easy to handle and can capture a number of metal binding proteins effectively. In total, 185 putative metal binding proteins were identified from rat liver including some known less abundant and membrane-bound metal binding proteins such as Plcg1, Acsl5, etc. The identified proteins are involved in many important processes including binding, catalytic activity, translation elongation factor activity, electron carrier activity, and so on.
Micromotor-based lab-on-chip immunoassays
NASA Astrophysics Data System (ADS)
García, Miguel; Orozco, Jahir; Guix, Maria; Gao, Wei; Sattayasamitsathit, Sirilak; Escarpa, Alberto; Merkoçi, Arben; Wang, Joseph
2013-01-01
Here we describe the first example of using self-propelled antibody-functionalized synthetic catalytic microengines for capturing and transporting target proteins between the different reservoirs of a lab-on-a-chip (LOC) device. A new catalytic polymer/Ni/Pt microtube engine, containing carboxy moieties on its mixed poly(3,4-ethylenedioxythiophene) (PEDOT)/COOH-PEDOT polymeric outermost layer, is further functionalized with the antibody receptor to selectively recognize and capture the target protein. The new motor-based microchip immunoassay operations are carried out without any bulk fluid flow, replacing the common washing steps in antibody-based protein bioassays with the active transport of the captured protein throughout the different reservoirs, where each step of the immunoassay takes place. A first microchip format involving an 'on-the-fly' double-antibody sandwich assay (DASA) is used for demonstrating the selective capture of the target protein, in the presence of excess of non-target proteins. A secondary antibody tagged with a polymeric-sphere tracer allows the direct visualization of the binding events. In a second approach the immuno-nanomotor captures and transports the microsphere-tagged antigen through a microchannel network. An anti-protein-A modified microengine is finally used to demonstrate the selective capture, transport and convenient label-free optical detection of Staphylococcus aureus target bacteria (containing protein A in the cell wall) in the presence of a large excess of non-target (Saccharomyces cerevisiae) cells. The resulting nanomotor-based microchip immunoassay offers considerable potential for diverse applications in clinical diagnostics, environmental and security monitoring fields. Electronic supplementary information (ESI) available. See DOI: 10.1039/c2nr32400h
Predicting nucleic acid binding interfaces from structural models of proteins
Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael
2011-01-01
The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acid binding interface on structural models of proteins. Employing a combined patch approach we show that patches extracted from an ensemble of models better predict the real nucleic acid binding interfaces compared to patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. PMID:22086767
2009-09-01
merely qualitative, so in order to quantify the functional effect of S14 overexpression, NMR based metabolomics was used. The literature reports that...overexpression in DIP medium, even though fatty acids were significantly increased. Due to limitations of NMR based metabolomics, the chain length of the...S14 affects glucose carbon conversion directly into fatty acids. Interestingly, glucose consumption and lactate excretion was identical in either
Automated quantitative assessment of proteins' biological function in protein knowledge bases.
Mayr, Gabriele; Lepperdinger, Günter; Lackner, Peter
2008-01-01
Primary protein sequence data are archived in databases together with information regarding corresponding biological functions. In this respect, UniProt/Swiss-Prot is currently the most comprehensive collection and it is routinely cross-examined when trying to unravel the biological role of hypothetical proteins. Bioscientists frequently extract single entries and further evaluate those on a subjective basis. In lieu of a standardized procedure for scoring the existing knowledge regarding individual proteins, we here report about a computer-assisted method, which we applied to score the present knowledge about any given Swiss-Prot entry. Applying this quantitative score allows the comparison of proteins with respect to their sequence yet highlights the comprehension of functional data. pfs analysis may be also applied for quality control of individual entries or for database management in order to rank entry listings.
Combs, Steven A; Mueller, Benjamin K; Meiler, Jens
2018-05-29
Partial covalent interactions (PCIs) in proteins, which include hydrogen bonds, salt bridges, cation-π, and π-π interactions, contribute to thermodynamic stability and facilitate interactions with other biomolecules. Several score functions have been developed within the Rosetta protein modeling framework that identify and evaluate these PCIs through analyzing the geometry between participating atoms. However, we hypothesize that PCIs can be unified through a simplified electron orbital representation. To test this hypothesis, we have introduced orbital based chemical descriptors for PCIs into Rosetta, called the PCI score function. Optimal geometries for the PCIs are derived from a statistical analysis of high-quality protein structures obtained from the Protein Data Bank (PDB), and the relative orientation of electron deficient hydrogen atoms and electron-rich lone pair or π orbitals are evaluated. We demonstrate that nativelike geometries of hydrogen bonds, salt bridges, cation-π, and π-π interactions are recapitulated during minimization of protein conformation. The packing density of tested protein structures increased from the standard score function from 0.62 to 0.64, closer to the native value of 0.70. Overall, rotamer recovery improved when using the PCI score function (75%) as compared to the standard Rosetta score function (74%). The PCI score function represents an improvement over the standard Rosetta score function for protein model scoring; in addition, it provides a platform for future directions in the analysis of small molecule to protein interactions, which depend on partial covalent interactions.
Designing protein-based biomaterials for medical applications.
Gagner, Jennifer E; Kim, Wookhyun; Chaikof, Elliot L
2014-04-01
Biomaterials produced by nature have been honed through billions of years, evolving exquisitely precise structure-function relationships that scientists strive to emulate. Advances in genetic engineering have facilitated extensive investigations to determine how changes in even a single peptide within a protein sequence can produce biomaterials with unique thermal, mechanical and biological properties. Elastin, a naturally occurring protein polymer, serves as a model protein to determine the relationship between specific structural elements and desirable material characteristics. The modular, repetitive nature of the protein facilitates the formation of well-defined secondary structures with the ability to self-assemble into complex three-dimensional architectures on a variety of length scales. Furthermore, many opportunities exist to incorporate other protein-based motifs and inorganic materials into recombinant protein-based materials, extending the range and usefulness of these materials in potential biomedical applications. Elastin-like polypeptides (ELPs) can be assembled into 3-D architectures with precise control over payload encapsulation, mechanical and thermal properties, as well as unique functionalization opportunities through both genetic and enzymatic means. An overview of current protein-based materials, their properties and uses in biomedicine will be provided, with a focus on the advantages of ELPs. Applications of these biomaterials as imaging and therapeutic delivery agents will be discussed. Finally, broader implications and future directions of these materials as diagnostic and therapeutic systems will be explored. Copyright © 2013 Elsevier Ltd. All rights reserved.
Designing Protein-Based Biomaterials for Medical Applications
Gagner, Jennifer E.; Kim, Wookhyun; Chaikof, Elliot L.
2013-01-01
Biomaterials produced by nature have been honed through billions of years, evolving exquisitely precise structure-function relationships that scientists strive to emulate. Advances in genetic engineering have facilitated extensive investigations to determine how changes in even a single peptide within a protein sequence can produce biomaterials with unique thermal, mechanical and biological properties. Elastin, a naturally occurring protein polymer, serves as a model protein to determine the relationship between specific structural elements and desirable material characteristics. The modular, repetitive nature of the protein facilitates the formation of well-defined secondary structures with the ability to self-assemble into complex three-dimensional architectures on a variety of length scales. Furthermore, many opportunities exist to incorporate other protein-based motifs and inorganic materials into recombinant protein-based materials, extending the range and usefulness of these materials in potential biomedical applications. Elastin-like polypeptides can be assembled into 3D architectures with precise control over payload encapsulation, mechanical and thermal properties, as well as unique functionalization opportunities through both genetic and enzymatic means. An overview of current protein-based materials, their properties and uses in biomedicine will be provided, with a focus on the advantages of elastin-like polypeptides. Applications of these biomaterials as imaging and therapeutic delivery agents will be discussed. Finally, broader implications and future directions of these materials as diagnostic and therapeutic systems will be explored. PMID:24121196
Peterson, Lenna X.; Kim, Hyungrae; Esquivel-Rodriguez, Juan; Roy, Amitava; Han, Xusi; Shin, Woong-Hee; Zhang, Jian; Terashi, Genki; Lee, Matt; Kihara, Daisuke
2016-01-01
We report the performance of protein-protein docking predictions by our group for recent rounds of the Critical Assessment of Prediction of Interactions (CAPRI), a community-wide assessment of state-of-the-art docking methods. Our prediction procedure uses a protein-protein docking program named LZerD developed in our group. LZerD represents a protein surface with 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. The appropriate soft representation of protein surface with 3DZD makes the method more tolerant to conformational change of proteins upon docking, which adds an advantage for unbound docking. Docking was guided by interface residue prediction performed with BindML and cons-PPISP as well as literature information when available. The generated docking models were ranked by a combination of scoring functions, including PRESCO, which evaluates the native-likeness of residues’ spatial environments in structure models. First, we discuss the overall performance of our group in the CAPRI prediction rounds and investigate the reasons for unsuccessful cases. Then, we examine the performance of several knowledge-based scoring functions and their combinations for ranking docking models. It was found that the quality of a pool of docking models generated by LZerD, i.e. whether or not the pool includes near-native models, can be predicted by the correlation of multiple scores. Although the current analysis used docking models generated by LZerD, findings on scoring functions are expected to be universally applicable to other docking methods. PMID:27654025
Family-specific scaling laws in bacterial genomes.
De Lazzari, Eleonora; Grilli, Jacopo; Maslov, Sergei; Cosentino Lagomarsino, Marco
2017-07-27
Among several quantitative invariants found in evolutionary genomics, one of the most striking is the scaling of the overall abundance of proteins, or protein domains, sharing a specific functional annotation across genomes of given size. The sizes of these functional categories change, on average, as power-laws in the total number of protein-coding genes. Here, we show that such regularities are not restricted to the overall behavior of high-level functional categories, but also exist systematically at the level of single evolutionary families of protein domains. Specifically, the number of proteins within each family follows family-specific scaling laws with genome size. Functionally similar sets of families tend to follow similar scaling laws, but this is not always the case. To understand this systematically, we provide a comprehensive classification of families based on their scaling properties. Additionally, we develop a quantitative score for the heterogeneity of the scaling of families belonging to a given category or predefined group. Under the common reasonable assumption that selection is driven solely or mainly by biological function, these findings point to fine-tuned and interdependent functional roles of specific protein domains, beyond our current functional annotations. This analysis provides a deeper view on the links between evolutionary expansion of protein families and the functional constraints shaping the gene repertoire of bacterial genomes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
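As a rough illustration of what a family-specific scaling law means, the sketch below fits a power law (family size ≈ A × genome size^β) by ordinary least squares on log-transformed values. This is not the authors' method: the function name `fit_power_law` and the synthetic data are assumptions made for illustration, and the fit presumes every count is positive.

```python
import math


def fit_power_law(genome_sizes, family_counts):
    """Fit family_counts ~ A * genome_sizes**beta by least squares in log-log space.

    Assumes all values are strictly positive (zeros would need special handling).
    """
    xs = [math.log(s) for s in genome_sizes]
    ys = [math.log(c) for c in family_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                      # scaling exponent (slope in log-log space)
    prefactor = math.exp(mean_y - beta * mean_x)
    return prefactor, beta


# Synthetic, hypothetical data: a family whose size grows quadratically
# with the number of protein-coding genes.
sizes = [1000, 2000, 4000, 8000]
counts = [0.001 * s ** 2 for s in sizes]
prefactor, beta = fit_power_law(sizes, counts)
print(f"prefactor={prefactor:.4g}, exponent={beta:.3f}")
```

Comparing the fitted exponent across families is one simple way to see whether functionally similar families share a scaling behavior.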
Peterson, Lenna X; Kim, Hyungrae; Esquivel-Rodriguez, Juan; Roy, Amitava; Han, Xusi; Shin, Woong-Hee; Zhang, Jian; Terashi, Genki; Lee, Matt; Kihara, Daisuke
2017-03-01
We report the performance of protein-protein docking predictions by our group for recent rounds of the Critical Assessment of Prediction of Interactions (CAPRI), a community-wide assessment of state-of-the-art docking methods. Our prediction procedure uses a protein-protein docking program named LZerD developed in our group. LZerD represents a protein surface with 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. The appropriate soft representation of protein surface with 3DZD makes the method more tolerant to conformational change of proteins upon docking, which adds an advantage for unbound docking. Docking was guided by interface residue prediction performed with BindML and cons-PPISP as well as literature information when available. The generated docking models were ranked by a combination of scoring functions, including PRESCO, which evaluates the native-likeness of residues' spatial environments in structure models. First, we discuss the overall performance of our group in the CAPRI prediction rounds and investigate the reasons for unsuccessful cases. Then, we examine the performance of several knowledge-based scoring functions and their combinations for ranking docking models. It was found that the quality of a pool of docking models generated by LZerD, that is, whether or not the pool includes near-native models, can be predicted by the correlation of multiple scores. Although the current analysis used docking models generated by LZerD, findings on scoring functions are expected to be universally applicable to other docking methods. Proteins 2017; 85:513-527. © 2016 Wiley Periodicals, Inc.
Nitric oxide-based protein modification: formation and site-specificity of protein S-nitrosylation
Kovacs, Izabella; Lindermayr, Christian
2013-01-01
Nitric oxide (NO) is a reactive free radical with pleiotropic functions that participates in diverse biological processes in plants, such as germination, root development, stomatal closing, abiotic stress, and defense responses. It acts mainly through redox-based modification of cysteine residue(s) of target proteins, called protein S-nitrosylation. In this way NO regulates numerous cellular functions and signaling events in plants. Identification of S-nitrosylated substrates and their exact target cysteine residue(s) is very important to reveal the molecular mechanisms and regulatory roles of S-nitrosylation. In addition to the necessity of protein–protein interaction for trans-nitrosylation and denitrosylation reactions, the cellular redox environment and cysteine thiol micro-environment have been proposed as important factors for the specificity of protein S-nitrosylation. Several methods have recently been developed for the proteomic identification of target proteins. However, the specificity of NO-based cysteine modification is still less defined. In this review, we discuss formation and specificity of S-nitrosylation. Special focus will be on potential S-nitrosylation motifs, site-specific proteomic analyses, computational predictions using different algorithms, and on structural analysis of cysteine S-nitrosylation. PMID:23717319
Ghosh, Pritha; Mathew, Oommen K; Sowdhamini, Ramanathan
2016-10-07
RNA-binding proteins (RBPs) interact with their cognate RNA(s) to form large biomolecular assemblies. They are versatile in their functionality and are involved in a myriad of processes inside the cell. RBPs with similar structural features and common biological functions are grouped together into families and superfamilies. It will be useful to obtain an early understanding and association of RNA-binding property of sequences of gene products. Here, we report a web server, RStrucFam, to predict the structure, type of cognate RNA(s) and function(s) of proteins, where possible, from mere sequence information. The web server employs Hidden Markov Model scan (hmmscan) to enable association to a back-end database of structural and sequence families. The database (HMMRBP) comprises 437 HMMs of RBP families of known structure that have been generated using structure-based sequence alignments and 746 sequence-centric RBP family HMMs. The input protein sequence is associated with structural or sequence domain families, if structure or sequence signatures exist. In case of association of the protein with a family of known structures, output features such as a multiple structure-based sequence alignment (MSSA) of the query with all other members of that family are provided. Further, cognate RNA partner(s) for that protein, Gene Ontology (GO) annotations, if any, and a homology model of the protein can be obtained. The users can also browse through the database for details pertaining to each family, protein or RNA and their related information based on keyword search or RNA motif search. RStrucFam is a web server that exploits structurally conserved features of RBPs, derived from known family members and imprinted in mathematical profiles, to predict putative RBPs from sequence information. Proteins that fail to associate with such structure-centric families are further queried against the sequence-centric RBP family HMMs in the HMMRBP database.
Further, all other essential information pertaining to an RBP, such as overall function annotations, is provided. The web server can be accessed at the following link: http://caps.ncbs.res.in/rstrucfam.
Hosseini, Samira; Ibrahim, Fatimah; Djordjevic, Ivan; Koole, Leo H
2014-06-21
Biosensor chips for immune-based assay systems have been investigated for their application in early diagnostics. The development of such systems strongly depends on the effective protein immobilization on polymer substrates. In order to achieve this complex heterogeneous interaction the polymer surface must be functionalized with chemical groups that are reactive towards proteins in a way that surface functional groups (such as carboxyl, -COOH; amine, -NH2; and hydroxyl, -OH) chemically or physically anchor the proteins to the polymer platform. Since the proteins are very sensitive towards their environment and can easily lose their activity when brought in close proximity to the solid surface, effective surface functionalization and high level of control over surface chemistry present the most important steps in the fabrication of biosensors. This paper reviews recent developments in surface functionalization and preparation of polymethacrylates for protein immobilization. Due to their versatility and cost effectiveness, this particular group of plastic polymers is widely used both in research and in industry.
Lymphocyte signaling: beyond knockouts.
Saveliev, Alexander; Tybulewicz, Victor L J
2009-04-01
The analysis of lymphocyte signaling was greatly enhanced by the advent of gene targeting, which allows the selective inactivation of a single gene. Although this gene 'knockout' approach is often informative, in many cases, the phenotype resulting from gene ablation might not provide a complete picture of the function of the corresponding protein. If a protein has multiple functions within a single or several signaling pathways, or stabilizes other proteins in a complex, the phenotypic consequences of a gene knockout may manifest as a combination of several different perturbations. In these cases, gene targeting to 'knock in' subtle point mutations might provide more accurate insight into protein function. However, to be informative, such mutations must be carefully based on structural and biophysical data.
Syn, Genevieve; Blackwell, Jenefer M; Jamieson, Sarra E; Francis, Richard W
2018-01-01
Toxoplasma gondii uses epigenetic mechanisms to regulate both endogenous and host cell gene expression. To identify genes with putative epigenetic functions, we developed an in silico pipeline to interrogate the T. gondii proteome of 8313 proteins. Step 1 employs PredictNLS and NucPred to identify genes predicted to target eukaryotic nuclei. Step 2 uses GOLink to identify proteins of epigenetic function based on Gene Ontology terms. This resulted in 611 putative nuclear localised proteins with predicted epigenetic functions. Step 3 filtered for secretory proteins using SignalP, SecretomeP, and experimental data. This identified 57 of the 611 putative epigenetic proteins as likely to be secreted. The pipeline is freely available online, uses open access tools and software with user-friendly Perl scripts to automate and manage the results, and is readily adaptable to undertake any such in silico search for genes contributing to particular functions.
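The three filtering steps described above can be sketched as successive filters over per-protein prediction flags. This is only an illustration, not the authors' Perl pipeline: the `ProteinCalls` record, its field names, and the toy entries are hypothetical, and the way tool outputs are combined (a protein passes if any tool in a step flags it) is an assumption the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class ProteinCalls:
    """Hypothetical record of precomputed tool calls for one protein."""
    protein_id: str
    predictnls_nuclear: bool
    nucpred_nuclear: bool
    epigenetic_go_term: bool        # e.g. flagged from Gene Ontology terms
    signalp_secreted: bool
    secretomep_secreted: bool
    experimentally_secreted: bool


def run_pipeline(proteins):
    """Apply the three filters in order and return the survivors of each step."""
    # Step 1: predicted nuclear targeting (assumed: either predictor suffices).
    nuclear = [p for p in proteins if p.predictnls_nuclear or p.nucpred_nuclear]
    # Step 2: predicted epigenetic function among nuclear candidates.
    epigenetic = [p for p in nuclear if p.epigenetic_go_term]
    # Step 3: likely secreted (assumed: any predictor or experimental evidence).
    secreted = [
        p for p in epigenetic
        if p.signalp_secreted or p.secretomep_secreted or p.experimentally_secreted
    ]
    return nuclear, epigenetic, secreted


# Toy, made-up entries purely to exercise the filters.
toy = [
    ProteinCalls("P1", True, False, True, True, False, False),
    ProteinCalls("P2", False, True, True, False, False, False),
    ProteinCalls("P3", False, False, True, True, True, True),
    ProteinCalls("P4", True, True, False, False, False, True),
]
nuclear, epigenetic, secreted = run_pipeline(toy)
print(len(nuclear), len(epigenetic), len(secreted))
```

Keeping each step's output separately mirrors how the abstract reports counts at each stage (611 nuclear epigenetic candidates, of which 57 are likely secreted).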
Protein-Based Nanofabrics for Multifunctional Air Filtering
NASA Astrophysics Data System (ADS)
Souzandeh, Hamid
With rapid economic and population growth, air pollution is worsening and has become a great concern worldwide. The release of chemicals, particulates and biological materials into air can lead to various diseases or discomfort to humans and other living organisms, alongside other serious impacts on the environment. Therefore, improving indoor air quality using various air filters is in critical need because people stay inside buildings most of the day. However, current air filters using traditional polymers can only remove particles from the polluted air, and disposing of the huge amount of used air filters can cause serious secondary environmental pollution. Therefore, development of environmentally friendly multi-functional air filter materials is significant. For this purpose, we developed "green" protein-based multifunctional air-filtering materials. The outstanding performance of the green materials in removal of multiple species of pollutants, including particulate matter, toxic chemicals, and biological hazards, simultaneously, will greatly facilitate the development of the next-generation air-filtration systems. First and foremost, we developed high-performance protein-based nanofabric air-filter mats. It was found that the protein-nanofabrics possess high-efficiency multifunctional air-filtering properties for both particles and various species of chemical gases. Then, the high-performance natural protein-based nanofabrics were enhanced both mechanically and functionally by a textured cellulose paper towel. Interestingly, the textured cellulose paper towel acts not only as a flexible mechanical support but also as a type of airflow regulator that can improve the pollutant-nanofilter interactions. Furthermore, the protein-based nanofabrics were crosslinked in order to enhance the environmental stability of the filters.
It was found that crosslinking significantly improves the structural stability of the protein-nanofabrics against different moisture levels and temperatures while maintaining the multifunctional filtration performance. Moreover, it was demonstrated that the crosslinked protein-nanomaterials also possess antibacterial properties against the selected gram-negative and gram-positive bacteria. This provides a cost-effective solution for advanced "green" nanomaterials with excellent performance in both filtration functions and structural stability under varying environmental conditions. This work indicates that protein-based air-filters are promising "green" air-filtering materials for next-generation air-filtration systems.
Carvalho, Henrique F; Barbosa, Arménio J M; Roque, Ana C A; Iranzo, Olga; Branco, Ricardo J F
2017-01-01
Recent advances in de novo protein design have gained considerable insight from the intrinsic dynamics of proteins, through the integration of molecular dynamics simulation protocols into state-of-the-art de novo protein design protocols. With this protocol we illustrate how to set up and run a molecular dynamics simulation followed by a functional protein dynamics analysis. New users will be introduced to some useful open-source computational tools, including the GROMACS molecular dynamics simulation software package and ProDy for protein structural dynamics analysis.
Structural Features of Antiviral APOBEC3 Proteins are Linked to Their Functional Activities
Kitamura, Shingo; Ode, Hirotaka; Iwatani, Yasumasa
2011-01-01
Human APOBEC3 (A3) proteins are cellular cytidine deaminases that potently restrict the replication of retroviruses by hypermutating viral cDNA and/or inhibiting reverse transcription. There are seven members of this family including A3A, B, C, DE, F, G, and H, all encoded in a tandem array on human chromosome 22. A3F and A3G are the most potent inhibitors of HIV-1, but only in the absence of the virus-encoded protein, Vif. HIV-1 utilizes Vif to abrogate A3 functions in the producer cells. More specifically, Vif, serving as a substrate receptor, facilitates ubiquitination of A3 proteins by forming a Cullin5 (Cul5)-based E3 ubiquitin ligase complex, which targets A3 proteins for rapid proteasomal degradation. The specificity of A3 degradation is determined by the ability of Vif to bind to the target. Several lines of evidence have suggested that three distinct regions of A3 proteins are involved in the interaction with Vif. Here, we review the biological functions of A3 family members with special focus on A3G and base our analysis on the available structural information. PMID:22203821
Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes
NASA Astrophysics Data System (ADS)
Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy
2007-01-01
Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.
NASA Astrophysics Data System (ADS)
Eid, Sameh; Saleh, Noureldin; Zalewski, Adam; Vedani, Angelo
2014-12-01
Carbohydrates play a key role in a variety of physiological and pathological processes and, hence, represent a rich source for the development of novel therapeutic agents. Being able to predict binding mode and binding affinity is an essential, yet lacking, aspect of the structure-based design of carbohydrate-based ligands. We assembled a diverse data set comprising 273 carbohydrate-protein crystal structures with known binding affinity and evaluated the prediction accuracy of a large collection of well-established scoring and free-energy functions, as well as combinations thereof. Unfortunately, the tested functions were not capable of reproducing binding affinities in the studied complexes. To simplify the complex free-energy surface of carbohydrate-protein systems, we classified the studied proteins according to the topology and solvent exposure of the carbohydrate-binding site into five distinct categories. A free-energy model based on the proposed classification scheme reproduced binding affinities in the carbohydrate data set with an r² of 0.71 and root-mean-squared-error of 1.25 kcal/mol (N = 236). The improvement in model performance underlines the significance of the differences in the local micro-environments of carbohydrate-binding sites and demonstrates the usefulness of calibrating free-energy functions individually according to binding-site topology and solvent exposure.
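For readers unfamiliar with the two accuracy measures quoted above, the following sketch shows how an r² and a root-mean-squared error could be computed from predicted and measured binding free energies. It assumes r² means the squared Pearson correlation, which the abstract does not specify; the function names and the four toy values (in kcal/mol) are invented for illustration.

```python
import math


def rmse(predicted, observed):
    """Root-mean-squared error between predicted and observed values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def r_squared(predicted, observed):
    """Squared Pearson correlation (an assumed reading of the reported r²)."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)


# Made-up binding free energies in kcal/mol: predictions are uniformly
# 0.5 kcal/mol too negative, so correlation is perfect but the error is not zero.
observed = [-5.0, -6.0, -7.0, -8.0]
predicted = [-5.5, -6.5, -7.5, -8.5]
error = rmse(predicted, observed)
fit = r_squared(predicted, observed)
print(f"RMSE={error:.2f} kcal/mol, r^2={fit:.2f}")
```

The toy case shows why both numbers are reported together: a systematic offset leaves the correlation untouched while still contributing to the RMSE.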
Rahaman, Obaidur; Estrada, Trilce P.; Doren, Douglas J.; Taufer, Michela; Brooks, Charles L.; Armen, Roger S.
2011-01-01
Several two-step scoring approaches for molecular docking were assessed for their ability to predict binding geometries and free energies. Two new scoring functions designed for “step 2 discrimination” were proposed and compared to our CHARMM implementation of the linear interaction energy (LIE) approach using the Generalized-Born with Molecular Volume (GBMV) implicit solvation model. A scoring function S1 was proposed by considering only “interacting” ligand atoms as the “effective size” of the ligand, and extended to an empirical regression-based pair potential S2. The S1 and S2 scoring schemes were trained and five-fold cross validated on a diverse set of 259 protein-ligand complexes from the Ligand Protein Database (LPDB). The regression-based parameters for S1 and S2 also demonstrated reasonable transferability in the CSARdock 2010 benchmark using a new dataset (NRC HiQ) of diverse protein-ligand complexes. The ability of the scoring functions to accurately predict ligand geometry was evaluated by calculating the discriminative power (DP) of the scoring functions to identify native poses. The parameters for the LIE scoring function with the optimal discriminative power (DP) for geometry (step 1 discrimination) were found to be very similar to the best-fit parameters for binding free energy over a large number of protein-ligand complexes (step 2 discrimination). Reasonable performance of the scoring functions in enrichment of active compounds in four different protein target classes established that the parameters for S1 and S2 provided reasonable accuracy and transferability. Additional analysis was performed to definitively separate scoring function performance from molecular weight effects. This analysis included the prediction of ligand binding efficiencies for a subset of the CSARdock NRC HiQ dataset where the number of ligand heavy atoms ranged from 17 to 35.
This range of ligand heavy atoms is where improved accuracy of predicted ligand efficiencies is most relevant to real-world drug design efforts. PMID:21644546
Rahaman, Obaidur; Estrada, Trilce P; Doren, Douglas J; Taufer, Michela; Brooks, Charles L; Armen, Roger S
2011-09-26
The performances of several two-step scoring approaches for molecular docking were assessed for their ability to predict binding geometries and free energies. Two new scoring functions designed for "step 2 discrimination" were proposed and compared to our CHARMM implementation of the linear interaction energy (LIE) approach using the Generalized-Born with Molecular Volume (GBMV) implicit solvation model. A scoring function S1 was proposed by considering only "interacting" ligand atoms as the "effective size" of the ligand and extended to an empirical regression-based pair potential S2. The S1 and S2 scoring schemes were trained and 5-fold cross-validated on a diverse set of 259 protein-ligand complexes from the Ligand Protein Database (LPDB). The regression-based parameters for S1 and S2 also demonstrated reasonable transferability in the CSARdock 2010 benchmark using a new data set (NRC HiQ) of diverse protein-ligand complexes. The ability of the scoring functions to accurately predict ligand geometry was evaluated by calculating the discriminative power (DP) of the scoring functions to identify native poses. The parameters for the LIE scoring function with the optimal discriminative power (DP) for geometry (step 1 discrimination) were found to be very similar to the best-fit parameters for binding free energy over a large number of protein-ligand complexes (step 2 discrimination). Reasonable performance of the scoring functions in enrichment of active compounds in four different protein target classes established that the parameters for S1 and S2 provided reasonable accuracy and transferability. Additional analysis was performed to definitively separate scoring function performance from molecular weight effects. This analysis included the prediction of ligand binding efficiencies for a subset of the CSARdock NRC HiQ data set where the number of ligand heavy atoms ranged from 17 to 35. 
This range of ligand heavy atoms is where improved accuracy of predicted ligand efficiencies is most relevant to real-world drug design efforts.
Gene Ontology annotation of the rice blast fungus, Magnaporthe oryzae
Meng, Shaowu; Brown, Douglas E; Ebbole, Daniel J; Torto-Alalibo, Trudy; Oh, Yeon Yee; Deng, Jixin; Mitchell, Thomas K; Dean, Ralph A
2009-01-01
Background: Magnaporthe oryzae causes blast disease, the most destructive disease of rice worldwide. The genome of this fungal pathogen has been sequenced and an automated annotation has recently been updated to Version 6. However, a comprehensive manual curation remains to be performed. Gene Ontology (GO) annotation is a valuable means of assigning functional information using standardized vocabulary. We report an overview of the GO annotation for Version 5 of M. oryzae genome assembly. Methods: A similarity-based (i.e., computational) GO annotation with manual review was conducted, which was then integrated with a literature-based GO annotation with computational assistance. For similarity-based GO annotation a stringent reciprocal best hits method was used to identify similarity between predicted proteins of M. oryzae and GO proteins from multiple organisms with published associations to GO terms. Significant alignment pairs were manually reviewed. Functional assignments were further cross-validated with manually reviewed data, conserved domains, or data determined by wet lab experiments. Additionally, biological appropriateness of the functional assignments was manually checked. Results: In total, 6,286 proteins received GO term assignment via the homology-based annotation, including 2,870 hypothetical proteins. Literature-based experimental evidence, such as microarray, MPSS, T-DNA insertion mutation, or gene knockout mutation, resulted in 2,810 proteins being annotated with GO terms. Of these, 1,673 proteins were annotated with new terms developed for Plant-Associated Microbe Gene Ontology (PAMGO). In addition, 67 experiment-determined secreted proteins were annotated with PAMGO terms. Integration of the two data sets resulted in 7,412 proteins (57%) being annotated with 1,957 distinct and specific GO terms. Unannotated proteins were assigned to the 3 root terms. The Version 5 GO annotation is publicly queryable via the GO site.
Additionally, the genome of M. oryzae is constantly being refined and updated as new information is incorporated. For the latest GO annotation of Version 6 genome, please visit our website. The preliminary GO annotation of Version 6 genome is placed at a local MySQL database that is publicly queryable via a user-friendly interface, Adhoc Query System. Conclusion: Our analysis provides comprehensive and robust GO annotations of the M. oryzae genome assemblies that will be solid foundations for further functional interrogation of M. oryzae. PMID:19278556
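The reciprocal best hits idea mentioned in the Methods can be sketched as follows. This is a generic illustration, not the authors' implementation: the input format (dictionaries of pairwise alignment scores in each search direction), the function names, and the identifiers are all hypothetical, and real analyses would typically also apply the stringency thresholds the abstract alludes to.

```python
def best_hits(scores):
    """Map each query to its single highest-scoring subject.

    `scores` maps (query, subject) pairs to an alignment score (higher is better).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Made-up scores: "Mo*" stand in for predicted fungal proteins,
# "Go*" for proteins that already carry GO term associations.
mo_vs_go = {("MoA", "GoX"): 200, ("MoA", "GoY"): 50, ("MoB", "GoX"): 120}
go_vs_mo = {("GoX", "MoA"): 210, ("GoX", "MoB"): 100, ("GoY", "MoA"): 60}
pairs = reciprocal_best_hits(mo_vs_go, go_vs_mo)
print(pairs)
```

In this toy case MoB's best hit is GoX, but GoX prefers MoA, so only the MoA/GoX pair survives; that asymmetry check is what makes the method conservative.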
NASA Astrophysics Data System (ADS)
Thomas, Carla; Xu, Liza; Olsen, Bradley
2013-03-01
Self-assembly of globular protein-polymer block copolymers into well-defined nanostructures provides a route towards the manufacture of protein-based materials which maintains protein fold and function. The model material mCherry-b-poly(N-isopropyl acrylamide) forms self-assembled nanostructures from aqueous solutions via solvent evaporation. To improve retention of protein functionality when dehydrated, small molecules such as trehalose and glycerol are added in solution prior to solvent removal. With as little as 10 wt% additive, improvements in retained functionality of 20-60% are observed in the solid-state as compared to samples in which no additive is present. Higher additive levels (up to 50%) continue to show improvement until approximately 100% of the protein function is retained. These large gains are hypothesized to originate from the ability of the additives to replace hydrogen bonds normally fulfilled by water. The addition of trehalose in the bulk material also improves the thermal stability of the protein by 15-20 °C, while glycerol decreases the thermal stability. Materials containing up to 50% additives remain microphase separated, and, upon incorporation of additives, nanostructure domain spacing tends to increase, accompanied by order-order transitions.
Physiological functions of MTA family of proteins.
Sen, Nirmalya; Gui, Bin; Kumar, Rakesh
2014-12-01
Although the functional significance of the metastatic tumor antigen (MTA) family of chromatin remodeling proteins in the pathobiology of cancer is fairly well recognized, the physiological role of MTA proteins continues to be an understudied research area and is just beginning to be recognized. As in cancer cells, MTA1 modulates the expression of target genes in normal cells by acting as either a corepressor or a coactivator. In addition, physiological functions of MTA proteins are likely to be influenced by their differential expression, subcellular localization, and regulation by upstream modulators and extracellular signals. This review summarizes our current understanding of the physiological functions of the MTA proteins in model systems. In particular, we highlight recent advances of the role MTA proteins play in the brain, eye, circadian rhythm, mammary gland biology, spermatogenesis, liver, immunomodulation and inflammation, cellular radio-sensitivity, and hematopoiesis and differentiation. Based on the growth of knowledge regarding the exciting new facets of the MTA family of proteins in biology and medicine, we speculate that the next burst of findings in this field may reveal further molecular regulatory insights of non-redundant functions of MTA coregulators in the normal physiology as well as in pathological conditions outside cancer.
regSNPs-splicing: a tool for prioritizing synonymous single-nucleotide substitution.
Zhang, Xinjun; Li, Meng; Lin, Hai; Rao, Xi; Feng, Weixing; Yang, Yuedong; Mort, Matthew; Cooper, David N; Wang, Yue; Wang, Yadong; Wells, Clark; Zhou, Yaoqi; Liu, Yunlong
2017-09-01
While synonymous single-nucleotide variants (sSNVs) have largely been unstudied, since they do not alter protein sequence, mounting evidence suggests that they may affect RNA conformation, splicing, and the stability of nascent-mRNAs to promote various diseases. Accurately prioritizing deleterious sSNVs from a pool of neutral ones can significantly improve our ability to select functional genetic variants identified from various genome-sequencing projects, and, therefore, advance our understanding of disease etiology. In this study, we develop a computational algorithm to prioritize sSNVs based on their impact on mRNA splicing and protein function. In addition to genomic features that potentially affect splicing regulation, our proposed algorithm also includes dozens of structural features that characterize the functions of alternatively spliced exons on protein function. Our systematic evaluation on thousands of sSNVs suggests that several structural features, including intrinsic disorder protein scores, solvent accessible surface areas, protein secondary structures, and known and predicted protein family domains, show significant differences between disease-causing and neutral sSNVs. Our result suggests that the protein structure features offer an added dimension of information while distinguishing disease-causing and neutral synonymous variants. The inclusion of structural features increases the predictive accuracy for functional sSNV prioritization.
Gorissen, Stefan H M; Witard, Oliver C
2018-02-01
The age-related loss of skeletal muscle mass and function is caused, at least in part, by a reduced muscle protein synthetic response to protein ingestion. The magnitude and duration of the postprandial muscle protein synthetic response to ingested protein is dependent on the quantity and quality of the protein consumed. This review characterises the anabolic properties of animal-derived and plant-based dietary protein sources in older adults. While approximately 60 % of dietary protein consumed worldwide is derived from plant sources, plant-based proteins generally exhibit lower digestibility, lower leucine content and deficiencies in certain essential amino acids such as lysine and methionine, which compromise the availability of a complete amino acid profile required for muscle protein synthesis. Based on currently available scientific evidence, animal-derived proteins may be considered more anabolic than plant-based protein sources. However, the production and consumption of animal-derived protein sources is associated with higher greenhouse gas emissions, while plant-based protein sources may be considered more environmentally sustainable. Theoretically, the lower anabolic capacity of plant-based proteins can be compensated for by ingesting a greater dose of protein or by combining various plant-based proteins to provide a more favourable amino acid profile. In addition, leucine co-ingestion can further augment the postprandial muscle protein synthetic response. Finally, prior exercise or n-3 fatty acid supplementation have been shown to sensitise skeletal muscle to the anabolic properties of dietary protein. Applying one or more of these strategies may support the maintenance of muscle mass with ageing when diets rich in plant-based protein are consumed.
Findeisen, Felix; Campiglio, Marta; Jo, Hyunil; Abderemane-Ali, Fayal; Rumpf, Christine H; Pope, Lianne; Rossen, Nathan D; Flucher, Bernhard E; DeGrado, William F; Minor, Daniel L
2017-06-21
For many voltage-gated ion channels (VGICs), creation of a properly functioning ion channel requires the formation of specific protein-protein interactions between the transmembrane pore-forming subunits and cytoplasmic accessory subunits. Despite the importance of such protein-protein interactions in VGIC function and assembly, their potential as sites for VGIC modulator development has been largely overlooked. Here, we develop meta-xylyl (m-xylyl) stapled peptides that target a prototypic VGIC high affinity protein-protein interaction, the interaction between the voltage-gated calcium channel (CaV) pore-forming subunit α-interaction domain (AID) and cytoplasmic β-subunit (CaVβ). We show using circular dichroism spectroscopy, X-ray crystallography, and isothermal titration calorimetry that the m-xylyl staples enhance AID helix formation, are structurally compatible with native-like AID:CaVβ interactions, and reduce the entropic penalty associated with AID binding to CaVβ. Importantly, electrophysiological studies reveal that stapled AID peptides act as effective inhibitors of the CaVα1:CaVβ interaction that modulate CaV function in a CaVβ isoform-selective manner. Together, our studies provide a proof-of-concept demonstration of the use of protein-protein interaction inhibitors to control VGIC function and point to strategies for improved AID-based CaV modulator design.
Wilton, Brianne A.; Campbell, Stephanie; Van Buuren, Nicholas; Garneau, Robyn; Furukawa, Manabu; Xiong, Yue; Barry, Michele
2008-01-01
Cellular proteins containing BTB and kelch domains have been shown to function as adapters for the recruitment of substrates to cullin-3-based ubiquitin ligases. Poxviruses are the only family of viruses known to encode multiple BTB/kelch proteins, suggesting that poxviruses may modulate the ubiquitin pathway through interaction with cullin-3. Ectromelia virus encodes four BTB/kelch proteins and one BTB-only protein. Here we demonstrate that two of the ectromelia virus encoded BTB/kelch proteins, EVM150 and EVM167, interacted with cullin-3. Similar to cellular BTB proteins, the BTB domain of EVM150 and EVM167 was necessary and sufficient for cullin-3 interaction. During infection, EVM150 and EVM167 localized to discrete cytoplasmic regions, which co-localized with cullin-3. Furthermore, EVM150 and EVM167 co-localized and interacted with conjugated ubiquitin, as demonstrated by confocal microscopy and co-immunoprecipitation. Our findings suggest that the ectromelia virus encoded BTB/kelch proteins, EVM150 and EVM167, interact with cullin-3 potentially functioning to recruit unidentified substrates for ubiquitination. PMID:18221766
Molecular requirements for actin-based lamella formation in Drosophila S2 cells
Rogers, Stephen L.; Wiedemann, Ursula; Stuurman, Nico; Vale, Ronald D.
2003-01-01
Cell migration occurs through the protrusion of the actin-enriched lamella. Here, we investigated the effects of RNAi depletion of ∼90 proteins implicated in actin function on lamella formation in Drosophila S2 cells. Similar to in vitro reconstitution studies of actin-based Listeria movement, we find that lamellae formation requires a relatively small set of proteins that participate in actin nucleation (Arp2/3 and SCAR), barbed end capping (capping protein), filament depolymerization (cofilin and Aip1), and actin monomer binding (profilin and cyclase-associated protein). Lamellae are initiated by parallel and partially redundant signaling pathways involving Rac GTPases and the adaptor protein Nck, which stimulate SCAR, an Arp2/3 activator. We also show that RNAi of three proteins (kette, Abi, and Sra-1) known to copurify with and inhibit SCAR in vitro leads to SCAR degradation, revealing a novel function of this protein complex in SCAR stability. Our results have identified an essential set of proteins involved in actin dynamics during lamella formation in Drosophila S2 cells. PMID:12975351
Akhter, Nasrin; Shehu, Amarda
2018-01-19
Due to the essential role that the three-dimensional conformation of a protein plays in regulating interactions with molecular partners, wet and dry laboratories seek biologically-active conformations of a protein to decode its function. Computational approaches are gaining prominence due to the labor and cost demands of wet laboratory investigations. Template-free methods can now compute thousands of conformations known as decoys, but selecting native conformations from the generated decoys remains challenging. Repeatedly, research has shown that the protein energy functions whose minima are sought in the generation of decoys are unreliable indicators of nativeness. The prevalent approach ignores energy altogether and clusters decoys by conformational similarity. Complementary recent efforts design protein-specific scoring functions or train machine learning models on labeled decoys. In this paper, we show that an informative consideration of energy can be carried out under the energy landscape view. Specifically, we leverage local structures known as basins in the energy landscape probed by a template-free method. We propose and compare various strategies of basin-based decoy selection that we demonstrate are superior to clustering-based strategies. The presented results point to further directions of research for improving decoy selection, including the ability to properly consider the multiplicity of native conformations of proteins.
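The basin idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: it assumes each decoy is reduced to a single hypothetical coordinate (a stand-in for a conformational embedding), treats decoys within a chosen `radius` as neighbors, and lets every decoy descend to its lowest-energy neighbor until it reaches a local minimum. Decoys that reach the same minimum form one basin, and basins are ranked by size. All names and the toy data are assumptions made for illustration.

```python
def assign_basins(coords, energies, radius):
    """Map each local-minimum index to the list of decoys that descend to it."""
    n = len(coords)

    def neighbors(i):
        return [j for j in range(n) if j != i and abs(coords[j] - coords[i]) <= radius]

    def descend(i):
        # Follow the lowest-energy neighbor; (energy, index) breaks ties
        # so the walk always terminates.
        while True:
            best = min(neighbors(i) + [i], key=lambda j: (energies[j], j))
            if best == i:
                return i
            i = best

    basins = {}
    for i in range(n):
        basins.setdefault(descend(i), []).append(i)
    return basins


def rank_basins_by_size(basins):
    """Largest basin first; size is only one of several possible criteria."""
    return sorted(basins.items(), key=lambda kv: len(kv[1]), reverse=True)


# Toy data (hypothetical): six decoys on a line with made-up energies.
coords = [0.0, 1.0, 2.0, 3.0, 10.0, 11.0]
energies = [-1.0, -3.0, -2.0, -1.5, -5.0, -4.0]
basins = assign_basins(coords, energies, radius=1.5)
ranked = rank_basins_by_size(basins)
```

In this toy case the largest basin is not the one holding the lowest-energy decoy, which is the kind of disagreement between raw energy and basin-level structure that motivates looking at the landscape rather than at single energy values.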
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.
Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H
2013-12-01
Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.
Busk, Peter Kamp; Lange, Lene
2013-06-01
Functional prediction of carbohydrate-active enzymes is difficult due to low sequence identity. However, similar enzymes often share a few short motifs, e.g., around the active site, even when the overall sequences are very different. To exploit this notion for functional prediction of carbohydrate-active enzymes, we developed a simple algorithm, peptide pattern recognition (PPR), that can divide proteins into groups of sequences that share a set of short conserved sequences. When this method was used on 118 glycoside hydrolase 5 proteins with 9% average pairwise identity and representing four characterized enzymatic functions, 97% of the proteins were sorted into groups correlating with their enzymatic activity. Furthermore, we analyzed 8,138 glycoside hydrolase 13 proteins including 204 experimentally characterized enzymes with 28 different functions. There was a 91% correlation between group and enzyme activity. These results indicate that the function of carbohydrate-active enzymes can be predicted with high precision by finding short, conserved motifs in their sequences. The glycoside hydrolase 61 family is important for fungal biomass conversion, but only a few proteins of this family have been functionally characterized. Interestingly, PPR divided 743 glycoside hydrolase 61 proteins into 16 subfamilies useful for targeted investigation of the function of these proteins and pinpointed three conserved motifs with putative importance for enzyme activity. Furthermore, the conserved sequences were useful for cloning of new, subfamily-specific glycoside hydrolase 61 proteins from 14 fungi. In conclusion, identification of conserved sequence motifs is a new approach to sequence analysis that can predict carbohydrate-active enzyme functions with high precision.
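The grouping principle described above (sequences that share short conserved peptides belong together even when overall identity is low) can be sketched as follows. This is a simplified stand-in, not the published PPR algorithm: it assumes fixed-length peptides (`k`), links two sequences when they share at least `min_shared` peptides, and reports connected components as groups. The sequences and parameter values are invented for illustration.

```python
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k peptides in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def group_by_shared_peptides(seqs, k=4, min_shared=2):
    """Group sequence names whose peptide sets overlap (single-linkage)."""
    peptide_sets = {name: kmers(s, k) for name, s in seqs.items()}
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find root lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if len(peptide_sets[a] & peptide_sets[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical toy sequences: e1/e2 share one motif, e3/e4 another.
seqs = {
    "e1": "MKTAYHGDWEQ",
    "e2": "PLSAYHGDWRR",
    "e3": "MNNCCPFVLLK",
    "e4": "QQRCCPFVTTS",
}
groups = group_by_shared_peptides(seqs)
```

The pairwise comparison is quadratic in the number of sequences, which is fine for a sketch but would need an inverted peptide index for families with thousands of members.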
Liu, Zhiming; Luo, Jiawei
2017-08-01
Associating protein complexes with human inherited diseases is critical for a better understanding of the biological processes and functional mechanisms of disease. Many protein complexes have been identified and functionally annotated by computational and purification methods so far; however, the particular roles they play in causing disease have not yet been well determined. In this study, we present a novel method to identify associations between protein complexes and diseases. First, we construct a disease-protein heterogeneous network based on data integration and Laplacian normalization. Second, we apply a random walk with restart on heterogeneous network (RWRH) algorithm on this network to quantify the strength of the association between proteins and the query disease. Third, we sum over the scores of member proteins to obtain a summary score for each candidate protein complex, and then rank all candidate protein complexes according to their scores. With a series of leave-one-out cross-validation experiments, we found that our method not only possesses high performance but also demonstrates robustness regarding the parameters and the network structure. We test our approach with breast cancer and select the top 20 highly ranked protein complexes; 17 of the selected protein complexes are evidenced to be connected with breast cancer. Our proposed method is effective in identifying disease-related protein complexes based on data integration and Laplacian normalization. Copyright © 2017. Published by Elsevier Ltd.
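The second and third steps above can be sketched on a toy network. The sketch is a simplification under stated assumptions: it runs a plain random walk with restart on a single protein network rather than on the heterogeneous disease-protein network, it uses column normalization in place of the Laplacian normalization named in the abstract, and the restart probability, network, and complexes are all hypothetical.

```python
def column_normalize(adj):
    """Scale each column of a square adjacency matrix to sum to 1."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change falls below tol."""
    n = len(adj)
    w = column_normalize(adj)
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def rank_complexes(scores, complexes):
    """Sum member-protein scores per complex and sort, highest first."""
    totals = {name: sum(scores[m] for m in members)
              for name, members in complexes.items()}
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Toy path network 0-1-2-3-4; protein 0 is the (hypothetical) disease seed.
adj = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
scores = random_walk_with_restart(adj, seeds={0})
ranking = rank_complexes(scores, {"A": [1, 2], "B": [3, 4]})
```

Summing member scores favors large complexes, so a size-normalized score (for example, the mean) may be worth comparing when complex sizes vary widely; the abstract itself describes only the sum.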
Click-MS: Tagless Protein Enrichment Using Bioorthogonal Chemistry for Quantitative Proteomics.
Smits, Arne H; Borrmann, Annika; Roosjen, Mark; van Hest, Jan C M; Vermeulen, Michiel
2016-12-16
Epitope-tagging is an effective tool to facilitate protein enrichment from crude cell extracts. Traditionally, N- or C-terminal fused tags are employed, which, however, can perturb protein function. Unnatural amino acids (UAAs) harboring small reactive handles can be site-specifically incorporated into proteins, thus serving as a potential alternative for conventional protein tags. Here, we introduce Click-MS, which combines the power of site-specific UAA incorporation, bioorthogonal chemistry, and quantitative mass spectrometry-based proteomics to specifically enrich a single protein of interest from crude mammalian cell extracts. By genetic encoding of p-azido-l-phenylalanine, the protein of interest can be selectively captured using copper-free click chemistry. We use Click-MS to enrich proteins that function in different cellular compartments, and we identify protein-protein interactions, showing the great potential of Click-MS for interaction proteomics workflows.
Li, H; Ji, H; Wu, S S; Hou, B X
2016-12-09
Objective: To analyze the protein expression profile and the potential virulence factors of Porphyromonas endodontalis (Pe) via comparison with those of two strains of Porphyromonas gingivalis (Pg) with high and low virulence, respectively. Methods: Whole cell comparative proteomics of Pe ATCC35406 was examined and compared with that of the high-virulence strain Pg W83 and the low-virulence strain Pg ATCC33277, respectively. Isobaric tags for relative and absolute quantitation (iTRAQ) combined with nano liquid chromatography-tandem mass spectrometry (Nano-LC-MS/MS) were adopted to identify and quantitate the proteins of Pe and the two strains of Pg with different virulence, using isotopically labeled peptides, mass spectrometric detection and bioinformatics analysis. The biological functions of similar proteins expressed by Pe ATCC35406 and the two strains of Pg were quantified and analyzed. Results: A total of 1 210 proteins were identified when Pe was compared with Pg W83. There were 130 proteins (10.74% of the total proteins) expressed similarly, including 89 proteins of known function and 41 proteins of unknown function. A total of 1 223 proteins were identified when Pe was compared with Pg ATCC33277. There were 110 proteins (8.99% of the total proteins) expressed similarly, including 72 proteins of known function and 38 proteins of unknown function. The similarly expressed proteins in Pe and the Pg strains mainly had catalytic activity and binding functions, and included recombination activation gene (RagA), lipoprotein, chaperonin DnaK, Clp family proteins (ClpC and ClpX) and various iron-binding proteins. They were involved in metabolism and cellular processes. In addition, the type and number of similar virulence proteins between Pe and high-virulence Pg were higher than those between Pe and low-virulence Pg. Conclusions: Lipoprotein, oxygen resistance protein and iron-binding protein were probably the potential virulence factors of Pe ATCC35406. It was speculated that the pathogenicity of Pe is more similar to that of high-virulence Pg than to that of the low-virulence strain.
Yu, Geng; Rosenberg, Julian N; Betenbaugh, Michael J; Oyler, George A
2015-12-01
Protein degradation in normal living cells is precisely regulated to match the cells' physiological requirements. The selectivity of protein degradation is determined by an elaborate degron-tagging system. Degron refers to an amino acid sequence that encodes a protein degradation signal, which is oftentimes a poly-ubiquitin chain that can be transferred to other proteins. Current understanding of ubiquitination dependent and independent protein degradation processes has expanded the application of degrons for targeted protein degradation and novel cell engineering strategies. Recent findings suggest that small molecules inducing protein association can be exploited to create degrons that target proteins for degradation. Here, recent applications of degron-based targeted protein degradation in eukaryotic organisms are reviewed. The degron mediated protein degradation represents a rapidly tunable methodology to control protein abundance, which has broad application in therapeutics and cellular function control and monitoring. Copyright © 2015. Published by Elsevier Ltd.
Protein Structure Classification and Loop Modeling Using Multiple Ramachandran Distributions.
Najibi, Seyed Morteza; Maadooliat, Mehdi; Zhou, Lan; Huang, Jianhua Z; Gao, Xin
2017-01-01
Recently, the study of protein structures using angular representations has attracted much attention among structural biologists. The main challenge is how to efficiently model the continuous conformational space of the protein structures based on the differences and similarities between different Ramachandran plots. Despite the presence of statistical methods for modeling angular data of proteins, there is still a substantial need for more sophisticated and faster statistical tools to model the large-scale circular datasets. To address this need, we have developed a nonparametric method for collective estimation of multiple bivariate density functions for a collection of populations of protein backbone angles. The proposed method takes into account the circular nature of the angular data using trigonometric spline which is more efficient compared to existing methods. This collective density estimation approach is widely applicable when there is a need to estimate multiple density functions from different populations with common features. Moreover, the coefficients of adaptive basis expansion for the fitted densities provide a low-dimensional representation that is useful for visualization, clustering, and classification of the densities. The proposed method provides a novel and unique perspective to two important and challenging problems in protein structure research: structure-based protein classification and angular-sampling-based protein loop structure prediction.
The actin binding cytoskeletal protein Moesin is involved in nuclear mRNA export.
Kristó, Ildikó; Bajusz, Csaba; Borsos, Barbara N; Pankotai, Tibor; Dopie, Joseph; Jankovics, Ferenc; Vartiainen, Maria K; Erdélyi, Miklós; Vilmos, Péter
2017-10-01
Current models imply that the evolutionarily conserved, actin-binding Ezrin-Radixin-Moesin (ERM) proteins perform their activities at the plasma membrane by anchoring membrane proteins to the cortical actin network. Here we show that beside its cytoplasmic functions, the single ERM protein of Drosophila, Moesin, has a novel role in the nucleus. The activation of transcription by heat shock or hormonal treatment increases the amount of nuclear Moesin, indicating biological function for the protein in the nucleus. The distribution of Moesin in the nucleus suggests a function in transcription and the depletion of mRNA export factors Nup98 or its interacting partner, Rae1, leads to the nuclear accumulation of Moesin, suggesting that the nuclear function of the protein is linked to mRNA export. Moesin localizes to mRNP particles through the interaction with the mRNA export factor PCID2 and knock down of Moesin leads to the accumulation of mRNA in the nucleus. Based on our results we propose that, beyond its well-known, manifold functions in the cytoplasm, the ERM protein of Drosophila is a new, functional component of the nucleus where it participates in mRNA export. Copyright © 2017 Elsevier B.V. All rights reserved.
The Protein Information Resource: an integrated public resource of functional annotation of proteins
Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.
2002-01-01
The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247
Protein-based underwater adhesives and the prospects for their biotechnological production
Stewart, Russell J.
2011-01-01
Biotechnological approaches to practical production of biological protein-based adhesives have had limited success over the last several decades. Broader efforts to produce recombinant adhesive proteins may have been limited by early disappointments. More recent synthetic polymer approaches have successfully replicated some aspects of natural underwater adhesives. For example, synthetic polymers, inspired by mussels, containing the catecholic functional group of 3,4-L-dihydroxyphenylalanine adhere strongly to wet metal oxide surfaces. Synthetic complex coacervates inspired by the Sandcastle worm are water-borne adhesives that can be delivered underwater without dispersing. Synthetic approaches offer several advantages, including versatile chemistries and scalable production. In the future, more sophisticated mimetic adhesives may combine synthetic copolymers with recombinant or agriculture-derived proteins to better replicate the structural and functional organization of natural adhesives. PMID:20890598
Linking the proteins--elucidation of proteome-scale networks using mass spectrometry.
Pflieger, Delphine; Gonnet, Florence; de la Fuente van Bentem, Sergio; Hirt, Heribert; de la Fuente, Alberto
2011-01-01
Proteomes are intricate. Typically, thousands of proteins interact through physical association and post-translational modifications (PTMs) to give rise to the emergent functions of cells. Understanding these functions requires one to study proteomes as "systems" rather than collections of individual protein molecules. The abstraction of the interacting proteome to "protein networks" has recently gained much attention, as networks are effective representations that lose specific molecular details but provide the ability to see the proteome as a whole. Two aspects of the proteome have mostly been represented by network models: proteome-wide physical protein-protein-binding interactions organized into Protein Interaction Networks (PINs), and proteome-wide PTM relations organized into Protein Signaling Networks (PSNs). Mass spectrometry (MS) techniques have been shown to be essential to reveal both of these aspects on a proteome-wide scale. Techniques such as affinity purification followed by MS have been used to elucidate protein-protein interactions, and MS-based quantitative phosphoproteomics is critical to understand the structure and dynamics of signaling through the proteome. We here review the current state-of-the-art MS-based analytical pipelines for characterizing proteome-scale networks. Copyright © 2010 Wiley Periodicals, Inc.
Hackenberg, Dieter; McKain, Michael R; Lee, Soon Goo; Roy Choudhury, Swarup; McCann, Tyler; Schreier, Spencer; Harkess, Alex; Pires, J Chris; Wong, Gane Ka-Shu; Jez, Joseph M; Kellogg, Elizabeth A; Pandey, Sona
2017-10-01
Signaling pathways regulated by heterotrimeric G-proteins exist in all eukaryotes. The regulator of G-protein signaling (RGS) proteins are key interactors and critical modulators of the Gα protein of the heterotrimer. However, while G-proteins are widespread in plants, RGS proteins have been reported to be missing from the entire monocot lineage, with two exceptions. A single amino acid substitution-based adaptive coevolution of the Gα:RGS proteins was proposed to enable the loss of RGS in monocots. We used a combination of evolutionary and biochemical analyses and homology modeling of the Gα and RGS proteins to address their expansion and its potential effects on the G-protein cycle in plants. Our results show that RGS proteins are widely distributed in the monocot lineage, despite their frequent loss. There is no support for the adaptive coevolution of the Gα:RGS protein pair based on single amino acid substitutions. RGS proteins interact with, and affect the activity of, Gα proteins from species with or without endogenous RGS. This cross-functional compatibility expands between the metazoan and plant kingdoms, illustrating striking conservation of their interaction interface. We propose that additional proteins or alternative mechanisms may exist which compensate for the loss of RGS in certain plant species. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Beta-propellers: associated functions and their role in human diseases.
Pons, Tirso; Gómez, Raúl; Chinea, Glay; Valencia, Alfonso
2003-03-01
The beta-propeller fold is a fascinating architecture based on four-stranded antiparallel and twisted beta-sheets, radially arranged around a central tunnel. Similar to the alpha/beta-barrel (TIM-barrel) fold, the beta-propeller has a wide range of different functions, and is gaining substantial attention. Some proteins containing beta-propeller domains have been implicated in the pathogenesis of a variety of diseases such as cancer, Alzheimer, Huntington, arthritis, familial hypercholesterolemia, retinitis pigmentosa, osteogenesis, hypertension, and microbial and viral infections. This article reviews some aspects of the 3D structure, amino acid sequence regularities, and biological functions of the proteins containing beta-propeller domains. Major emphasis has been laid on beta-propellers whose functions are associated with human diseases. Recent research efforts reported in the fields of protein engineering, drug design, and protein structure-function relationship studies, concerning the beta-propeller architecture, have also been discussed.
GOLabeler: Improving Sequence-based Large-scale Protein Function Prediction by Learning to Rank.
You, Ronghui; Zhang, Zihan; Xiong, Yi; Sun, Fengzhu; Mamitsuka, Hiroshi; Zhu, Shanfeng
2018-03-07
Gene Ontology (GO) has been widely used to annotate functions of proteins and understand their biological roles. Currently, <1% of the more than 70 million proteins in UniProtKB have experimental GO annotations, implying a strong need for automated function prediction (AFP) of proteins. AFP is a hard multilabel classification problem because one protein can carry a varying number of GO terms. Most of these proteins have only sequences as input information, indicating the importance of sequence-based AFP (SAFP: sequences are the only input). Furthermore, homology-based SAFP tools are competitive in AFP competitions, but they do not necessarily work well for so-called difficult proteins, which have <60% sequence identity to proteins that already have annotations. Thus the vital and challenging problem now is how to develop a method for SAFP, particularly for difficult proteins. The key to such a method is to extract not only homology information but also diverse, deep-rooted information/evidence from sequence inputs and integrate them into a predictor in a manner that is both effective and efficient. We propose GOLabeler, which integrates five component classifiers, trained from different features, including GO term frequency, sequence alignment, amino acid trigram, domains and motifs, and biophysical properties, in the framework of learning to rank (LTR), a paradigm of machine learning that is especially powerful for multilabel classification. The empirical results obtained by examining GOLabeler extensively and thoroughly on large-scale datasets revealed numerous favorable aspects of GOLabeler, including a significant performance advantage over state-of-the-art AFP methods. Availability: http://datamining-iip.fudan.edu.cn/golabeler. Contact: zhusf@fudan.edu.cn. Supplementary data are available at Bioinformatics online.
Chen, Fu; Sun, Huiyong; Wang, Junmei; Zhu, Feng; Liu, Hui; Wang, Zhe; Lei, Tailong; Li, Youyong; Hou, Tingjun
2018-06-21
Molecular docking provides a computationally efficient way to predict the atomic structural details of protein-RNA interactions (PRI), but accurate prediction of the three-dimensional structures and binding affinities for PRI is still notoriously difficult, partly due to the unreliability of the existing scoring functions for PRI. MM/PBSA and MM/GBSA are more theoretically rigorous than most scoring functions for protein-RNA docking, but their prediction performance for protein-RNA systems remains unclear. Here, we systematically evaluated the capability of MM/PBSA and MM/GBSA to predict the binding affinities and recognize the near-native binding structures for protein-RNA systems with different solvent models and interior dielectric constants (ϵin). For predicting the binding affinities, the predictions given by MM/GBSA based on the minimized structures in explicit solvent and the GBGBn1 model with ϵin = 2 yielded the highest correlation with the experimental data. Moreover, the MM/GBSA calculations based on the minimized structures in implicit solvent and the GBGBn1 model distinguished the near-native binding structures within the top 10 decoys for 118 out of the 149 protein-RNA systems (79.2%). This performance is better than all docking scoring functions studied here. Therefore, the MM/GBSA rescoring is an efficient way to improve the prediction capability of scoring functions for protein-RNA systems. Published by Cold Spring Harbor Laboratory Press for the RNA Society.
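The "near-native within the top 10 decoys" figure above is a success rate over systems, and a small sketch makes the bookkeeping explicit. It assumes each system is a list of `(score, is_near_native)` pairs in which a lower score means a better rank; that data layout and the toy values are hypothetical and do not come from the study.

```python
def top_n_success_rate(systems, n=10):
    """Fraction of systems with at least one near-native decoy in the top n."""
    hits = 0
    for decoys in systems:
        best = sorted(decoys, key=lambda d: d[0])[:n]  # lowest scores first
        if any(is_native for _, is_native in best):
            hits += 1
    return hits / len(systems)


# Toy data (hypothetical): two systems, three decoys each.
toy_systems = [
    [(-42.0, True), (-30.5, False), (-12.1, False)],   # near-native ranked first
    [(-25.0, False), (-20.0, False), (-5.0, True)],    # near-native ranked last
]
rate_top1 = top_n_success_rate(toy_systems, n=1)
rate_top3 = top_n_success_rate(toy_systems, n=3)

# The percentage reported in the abstract follows from its own counts.
reported = round(100 * 118 / 149, 1)
```

With the toy data, only the first system succeeds at n = 1, while both succeed at n = 3, which shows how strongly the reported rate depends on the chosen cutoff.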
Functional analysis of proteins and protein species using shotgun proteomics and linear mathematics.
Hoehenwarter, Wolfgang; Chen, Yanmei; Recuenco-Munoz, Luis; Wienkoop, Stefanie; Weckwerth, Wolfram
2011-07-01
Covalent post-translational modification of proteins is the primary modulator of protein function in the cell. It greatly expands the functional potential of the proteome compared to the genome. In the past few years, shotgun proteomics-based research, in which the proteome is digested into peptides prior to mass spectrometric analysis, has been prolific in this area. It has determined the kinetics of tens of thousands of sites of covalent modification on an equally large number of proteins under various biological conditions and uncovered a transiently active regulatory network that extends into diverse branches of cellular physiology. In this review, we discuss this work in light of the concept of protein speciation, which emphasizes the entire post-translationally modified molecule and its interactions, and not just the modification site, as the functional entity. Sometimes, particularly when considering complex multisite modification, all of the modified molecular species involved in the investigated condition (the protein species) must be completely resolved for full understanding. We present a mathematical technique that delivers a good approximation for shotgun proteomics data.
Hughes, Stephen R; Butt, Tauseef R; Bartolett, Scott; Riedmuller, Steven B; Farrelly, Philip
2011-08-01
The molecular biological techniques for plasmid-based assembly and cloning of gene open reading frames are essential for elucidating the function of the proteins encoded by the genes. High-throughput integrated robotic molecular biology platforms that have the capacity to rapidly clone and express heterologous gene open reading frames in bacteria and yeast and to screen large numbers of expressed proteins for optimized function are an important technology for improving microbial strains for biofuel production. The process involves the production of full-length complementary DNA libraries as a source of plasmid-based clones to express the desired proteins in active form for determination of their functions. Proteins that were identified by high-throughput screening as having desired characteristics are overexpressed in microbes to enable them to perform functions that will allow more cost-effective and sustainable production of biofuels. Because the plasmid libraries are composed of several thousand unique genes, automation of the process is essential. This review describes the design and implementation of an automated integrated programmable robotic workcell capable of producing complementary DNA libraries, colony picking, isolating plasmid DNA, transforming yeast and bacteria, expressing protein, and performing appropriate functional assays. These operations will allow tailoring microbial strains to use renewable feedstocks for production of biofuels, bioderived chemicals, fertilizers, and other coproducts for profitable and sustainable biorefineries. Published by Elsevier Inc.
Modulation of electronic structures of bases through DNA recognition of protein.
Hagiwara, Yohsuke; Kino, Hiori; Tateno, Masaru
2010-04-21
The effects of environmental structures on the electronic states of functional regions in a fully solvated DNA·protein complex were investigated using combined ab initio quantum mechanics/molecular mechanics calculations. A complex of a transcriptional factor, PU.1, and the target DNA was used for the calculations. The effects of solvent on the energies of molecular orbitals (MOs) of some DNA bases strongly correlate with the magnitude of masking of the DNA bases from the solvent by the protein. In the complex, PU.1 causes this magnitude to vary among DNA bases by directly recognizing the DNA bases through hydrogen bonds and by inducing changes in the DNA structure away from the canonical one. Thus, the strong correlation found in this study is the first evidence showing the close quantitative relationship between recognition modes of DNA bases and the energy levels of the corresponding MOs. This reveals that the electronic state of each base is highly regulated and organized by the DNA recognition of the protein. Other biological macromolecular systems can also be expected to possess similar modulation mechanisms, suggesting that this finding provides a novel basis for understanding the regulatory functions of biological macromolecular systems.
Dutagaci, Bercem; Wittayanarakul, Kitiyaporn; Mori, Takaharu; Feig, Michael
2017-06-13
A scoring protocol based on implicit membrane-based scoring functions and a new protocol for optimizing the positioning of proteins inside the membrane were evaluated for their capacity to discriminate native-like states from misfolded decoys. A decoy set previously established by the Baker lab (Proteins: Struct., Funct., Genet. 2006, 62, 1010-1025) was used along with a second set that was generated to cover higher resolution models. The Implicit Membrane Model 1 (IMM1), IMM1 model with CHARMM 36 parameters (IMM1-p36), generalized Born with simple switching (GBSW), and heterogeneous dielectric generalized Born versions 2 (HDGBv2) and 3 (HDGBv3) were tested along with the new HDGB van der Waals (HDGBvdW) model that adds implicit van der Waals contributions to the solvation free energy. For comparison, scores were also calculated with the distance-scaled finite ideal-gas reference (DFIRE) scoring function. Z-scores for native state discrimination, energy vs root-mean-square deviation (RMSD) correlations, and the ability to select the most native-like structures as top-scoring decoys were evaluated to assess the performance of the scoring functions. Ranking of the decoys in the Baker set that were relatively far from the native state was challenging and dominated largely by packing interactions that were captured best by DFIRE, with less benefit from the implicit membrane-based models. Accounting for the membrane environment was much more important in the second decoy set, where the HDGB-based scoring functions in particular performed very well in ranking decoys and providing significant correlations between scores and RMSD, which shows promise for improving membrane protein structure prediction and refinement applications. The new membrane structure scoring protocol was implemented in the MEMScore web server ( http://feiglab.org/memscore ).
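Two of the evaluation metrics named in this abstract, the Z-score for native state discrimination and the score vs RMSD correlation, can be illustrated with a short sketch. The definitions below (population standard deviation over decoy scores, Pearson correlation) are common conventions and are assumed here. The abstract does not state the exact formulas used, and the function names are hypothetical.

```python
import statistics
from typing import Sequence


def native_z_score(native_score: float, decoy_scores: Sequence[float]) -> float:
    """Z-score of the native score relative to the decoy score distribution.

    Assumes lower scores are better, so a more negative Z-score means the
    native state is better separated from the decoys.
    """
    mean = statistics.mean(decoy_scores)
    std = statistics.pstdev(decoy_scores)  # population standard deviation
    return (native_score - mean) / std


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient, e.g. between scores and RMSD values."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Toy numbers (hypothetical), not taken from the decoy sets in the study.
    decoy_scores = [-10.0, -8.0, -6.0, -4.0]
    decoy_rmsd = [2.0, 4.0, 6.0, 8.0]
    print(native_z_score(-12.0, decoy_scores))
    print(pearson(decoy_scores, decoy_rmsd))
```

A scoring function that works well on a decoy set would typically give a strongly negative native Z-score and a positive score vs RMSD correlation, since lower (better) scores should go with lower RMSD.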
Bioengineered protein-based nanocage for drug delivery.
Lee, Eun Jung; Lee, Na Kyeong; Kim, In-San
2016-11-15
Nature, in its wonders, presents and assembles the most intricate and delicate protein structures, and this remarkable phenomenon occurs in all kingdoms and phyla of life. Of these proteins, cage-like multimeric proteins provide spatial control over biological processes and also compartmentalize compounds that may be toxic or unstable, preventing their contact with the environment. Protein-based nanocages are of particular interest because of their potential applicability as drug delivery carriers and their perfect and complex symmetry and ideal physical properties, which have stimulated researchers to engineer, modify or mimic these qualities. This article reviews various existing types of protein-based nanocages that are used for therapeutic purposes, and outlines their drug-loading mechanisms and bioengineering strategies via genetic and chemical functionalization. Through a critical evaluation of recent advances in protein nanocage-based drug delivery in vitro and in vivo, an outlook for de novo and in silico nanocage design, as well as for preclinical and future clinical applications of protein-based nanocages, is presented. Copyright © 2016 Elsevier B.V. All rights reserved.
Processing and characteristics of canola protein-based biodegradable packaging: A review.
Zhang, Yachuan; Liu, Qiang; Rempel, Curtis
2018-02-11
Interest has recently increased in manufacturing food packaging, such as films and coatings, from protein-based biopolymers. Among various protein sources, canola protein is a novel source for manufacturing polymer films. It can be concentrated or isolated by aqueous extraction technology followed by protein precipitation. Using this procedure, it was claimed that more than 99% of protein was extracted from the defatted canola meal, and protein recovery was 87.5%. Canola protein exhibits thermoplastic properties when plasticizers are present, including water, glycerol, polyethylene glycol, and sorbitol. Addition of these plasticizers allows the canola protein to undergo glass transition and facilitates deformation and processability. Canola protein-based bioplastics normally showed poor mechanical properties, with tensile strength (TS) of 1.19 to 4.31 MPa. Various approaches were therefore explored to improve them, including blending with synthetic polymers, modifying protein functionality through controlled denaturation, and adding cross-linking agents. Canola protein-based bioplastics were reported to have a glass transition temperature (Tg) below -50°C, but this depends strongly on the plasticizer content. Canola protein-based bioplastics have demonstrated mechanical and moisture barrier properties comparable to those of other plant protein-based bioplastics. They have great potential in food packaging applications, including their use as wraps, sacks, sachets, or pouches.
Recent advances in proteomics of cereals.
Bansal, Monika; Sharma, Madhu; Kanwar, Priyanka; Goyal, Aakash
Cereals contribute a major part of human nutrition and are considered an integral source of energy in human diets. With genomic databases already available for cereals such as rice, wheat, barley, and maize, the focus has now moved to proteome analysis. Proteomics studies involve developing appropriate databases based on suitable separation and purification protocols and identifying protein functions, and they can confirm functional networks based on data already available from other sources. Tremendous progress has been made in the past decade in generating huge datasets covering interactions among proteins, the protein composition of various organs and organelles, and quantitative and qualitative analysis of proteins, and in characterizing their modulation during plant development and under biotic and abiotic stresses. Proteomics platforms have been used to identify and improve our understanding of various metabolic pathways. This article gives a brief review of efforts made by different research groups on comparative, descriptive, and functional analysis of proteomics applications achieved in cereal science so far.
Generation of Rab-based transgenic lines for in vivo studies of endosome biology in zebrafish
Clark, Brian S.; Winter, Mark; Cohen, Andrew R.; Link, Brian A.
2011-01-01
The Rab family of small GTPases functions as molecular switches regulating membrane and protein trafficking. Individual Rab isoforms define and are required for specific endosomal compartments. To facilitate in vivo investigation of specific Rab proteins, and endosome biology in general, we have generated transgenic zebrafish lines to mark and manipulate Rab proteins. We also developed software to track and quantify endosome dynamics within time-lapse movies. The established transgenic lines ubiquitously express EGFP fusions of Rab5c (early endosomes), Rab11a (recycling endosomes), and Rab7 (late endosomes) to study localization and dynamics during development. Additionally, we generated UAS-based transgenic lines expressing constitutively active (CA) and dominant negative (DN) versions of each of these Rab proteins. Predicted localization and functional consequences for each line were verified through a variety of assays, including lipophilic dye uptake and Crumbs2a localization. In summary, we have established a toolset for in vivo analyses of endosome dynamics and functions. PMID:21976318
Pesaranghader, Ahmad; Matwin, Stan; Sokolova, Marina; Beiko, Robert G
2016-05-01
Measures of protein functional similarity are essential tools for function prediction, evaluation of protein-protein interactions (PPIs) and other applications. Several existing methods perform comparisons between proteins based on the semantic similarity of their GO terms; however, these measures are highly sensitive to modifications in the topological structure of GO, tend to be focused on specific analytical tasks and concentrate on the GO terms themselves rather than considering their textual definitions. We introduce simDEF, an efficient method for measuring semantic similarity of GO terms using their GO definitions, which is based on the Gloss Vector measure commonly used in natural language processing. The simDEF approach builds optimized definition vectors for all relevant GO terms, and expresses the similarity of a pair of proteins as the cosine of the angle between their definition vectors. Relative to existing similarity measures, when validated on a yeast reference database, simDEF improves correlation with sequence homology by up to 50%, shows a correlation improvement >4% with gene expression in the biological process hierarchy of GO and increases PPI predictability by >2.5% in F1 score for the molecular function hierarchy. Datasets, results and source code are available at http://kiwi.cs.dal.ca/Software/simDEF. Contact: ahmad.pgh@dal.ca or beiko@cs.dal.ca. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
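The similarity measure described in this abstract, the cosine of the angle between two definition vectors, can be sketched in a few lines. This is only the generic cosine formula. How simDEF builds and optimizes its definition vectors is not reproduced here, and the function name and toy vectors are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector is all zeros, a convention assumed here
    to avoid division by zero.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy definition vectors (hypothetical word-count style features).
    protein_a = [1.0, 2.0, 0.0, 3.0]
    protein_b = [2.0, 4.0, 0.0, 6.0]   # same direction as protein_a
    protein_c = [0.0, 0.0, 5.0, 0.0]   # orthogonal to protein_a
    print(cosine_similarity(protein_a, protein_b))
    print(cosine_similarity(protein_a, protein_c))
```

Vectors pointing in the same direction give a value near 1 and orthogonal vectors give 0, which is why the measure is insensitive to the overall length of a definition.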
Expanded microbial genome coverage and improved protein family annotation in the COG database.
Galperin, Michael Y; Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V
2015-01-01
Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign to them tentative functions. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on complete microbial genomes, which allowed reliable assignment of orthologs and paralogs for most genes; (ii) orthology-based approach, which used the function(s) of the characterized member(s) of the protein family (COG) to assign function(s) to the entire set of carefully identified orthologs and describe the range of potential functions when there were more than one; and (iii) careful manual curation of the annotation of the COGs, aimed at detailed prediction of the biological function(s) for each COG while avoiding annotation errors and overprediction. Here we present an update of the COGs, the first since 2003, and a comprehensive revision of the COG annotations and expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level. This re-analysis of the COGs shows that the original COG assignments had an error rate below 0.5% and allows an assessment of the progress in functional genomics in the past 12 years. During this time, functions of many previously uncharacterized COGs have been elucidated and tentative functional assignments of many COGs have been validated, either by targeted experiments or through the use of high-throughput methods. A particularly important development is the assignment of functions to several widespread, conserved proteins many of which turned out to participate in translation, in particular rRNA maturation and tRNA modification. 
The new version of the COGs is expected to become an important tool for microbial genomics. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by US Government employees and is in the public domain in the US.
Assessment of the reliability of protein-protein interactions and protein function prediction.
Deng, Minghua; Sun, Fengzhu; Chen, Ting
2003-01-01
As more and more high-throughput protein-protein interaction data are collected, the task of estimating the reliability of different data sets becomes increasingly important. In this paper, we present our study of two groups of protein-protein interaction data, the physical interaction data and the protein complex data, and estimate the reliability of these data sets using three different measurements: (1) the distribution of gene expression correlation coefficients, (2) the reliability based on gene expression correlation coefficients, and (3) the accuracy of protein function predictions. We develop a maximum likelihood method to estimate the reliability of protein interaction data sets according to the distribution of correlation coefficients of gene expression profiles of putative interacting protein pairs. The results of the three measurements are consistent with each other. The MIPS protein complex data have the highest mean gene expression correlation coefficients (0.256) and the highest accuracy in predicting protein functions (70% sensitivity and specificity), while Ito's Yeast two-hybrid data have the lowest mean (0.041) and the lowest accuracy (15% sensitivity and specificity). Uetz's data are more reliable than Ito's data in all three measurements, and the TAP protein complex data are more reliable than the HMS-PCI data in all three measurements as well. The complex data sets generally perform better in function predictions than do the physical interaction data sets. Proteins in complexes are shown to be more highly correlated in gene expression. The results confirm that the components of a protein complex can be assigned to functions that the complex carries out within a cell. There are three interaction data sets different from the above two groups: the genetic interaction data, the in-silico data and the syn-express data. 
Their capability of predicting protein functions generally falls between that of the Y2H data and that of the MIPS protein complex data. The supplementary information is available at the following Web site: http://www-hto.usc.edu/-msms/AssessInteraction/.
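The first reliability measure in this abstract rests on the gene expression correlation of putative interacting protein pairs. The sketch below shows how a mean correlation over a set of pairs could be computed. It does not reproduce the maximum likelihood method of the paper, and the function names, gene identifiers, and toy expression profiles are hypothetical.

```python
import statistics
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two expression profiles."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mean_pair_correlation(expression: Dict[str, Sequence[float]],
                          pairs: List[Tuple[str, str]]) -> float:
    """Mean expression correlation over a list of putative interacting pairs."""
    correlations = [pearson(expression[a], expression[b]) for a, b in pairs]
    return statistics.mean(correlations)


if __name__ == "__main__":
    # Toy expression profiles across three conditions (hypothetical values).
    expression = {
        "geneA": [1.0, 2.0, 3.0],
        "geneB": [2.0, 4.0, 6.0],   # perfectly correlated with geneA
        "geneC": [3.0, 2.0, 1.0],   # perfectly anti-correlated with geneA
    }
    pairs = [("geneA", "geneB"), ("geneA", "geneC")]
    print(mean_pair_correlation(expression, pairs))
```

Under this kind of measure, a data set whose reported pairs are mostly true interactions would be expected to show a higher mean correlation, in line with the 0.256 vs 0.041 contrast reported above.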
Buj, Raquel; Iglesias, Noa; Planas, Anna M; Santalucía, Tomàs
2013-08-20
Background Valuable clone collections encoding the complete ORFeomes for some model organisms have been constructed following the completion of their genome sequencing projects. These libraries are based on Gateway cloning technology, which facilitates the study of protein function by simplifying the subcloning of open reading frames (ORF) into any suitable destination vector. The expression of proteins of interest as fusions with functional modules is a frequent approach in their initial functional characterization. A limited number of Gateway destination expression vectors allow the construction of fusion proteins from ORFeome-derived sequences, but they are restricted to the possibilities offered by their inbuilt functional modules and their pre-defined model organism-specificity. Thus, the availability of cloning systems that overcome these limitations would be highly advantageous. Results We present a versatile cloning toolkit for constructing fully-customizable three-part fusion proteins based on the MultiSite Gateway cloning system. The fusion protein components are encoded in the three plasmids integral to the kit. These can recombine with any purposely-engineered destination vector that uses a heterologous promoter external to the Gateway cassette, leading to the in-frame cloning of an ORF of interest flanked by two functional modules. In contrast to previous systems, a third part becomes available for peptide-encoding as it no longer needs to contain a promoter, resulting in an increased number of possible fusion combinations. We have constructed the kit’s component plasmids and demonstrate its functionality by providing proof-of-principle data on the expression of prototype fluorescent fusions in transiently-transfected cells. Conclusions We have developed a toolkit for creating fusion proteins with customized N- and C-term modules from Gateway entry clones encoding ORFs of interest. Importantly, our method allows entry clones obtained from ORFeome collections to be used without prior modifications. Using this technology, any existing Gateway destination expression vector with its model-specific properties could be easily adapted for expressing fusion proteins. PMID:23957834
COPRED: prediction of fold, GO molecular function and functional residues at the domain level.
López, Daniel; Pazos, Florencio
2013-07-15
Only recently have the first resources devoted to the functional annotation of proteins at the domain level started to appear. The next step is to develop specific methodologies for predicting function at the domain level based on these resources, and to implement them in web servers to be used by the community. In this work, we present COPRED, a web server for the concomitant prediction of fold, molecular function and functional sites at the domain level, based on a previously developed and benchmarked methodology for domain molecular function prediction and resource of domain functional annotations. COPRED can be freely accessed at http://csbg.cnb.csic.es/copred. The interface works in all standard web browsers. WebGL (natively supported by most browsers) is required for the in-line preview and manipulation of protein 3D structures. The website includes a detailed help section and usage examples. Contact: pazos@cnb.csic.es.
FunTree: advances in a resource for exploring and contextualising protein function evolution.
Sillitoe, Ian; Furnham, Nicholas
2016-01-04
FunTree is a resource that brings together protein sequence, structure and functional information, including overall chemical reaction and mechanistic data, for structurally defined domain superfamilies. Developed in tandem with the CATH database, the original FunTree contained just 276 superfamilies focused on enzymes. Here, we present an update of FunTree that has expanded to include 2340 superfamilies including both enzymes and proteins with non-enzymatic functions annotated by Gene Ontology (GO) terms. This allows the investigation of how novel functions have evolved within a structurally defined superfamily and provides a means to analyse trends across many superfamilies. This is done not only within the context of a protein's sequence and structure but also the relationships of their functions. New measures of functional similarity have been integrated, including for enzymes comparisons of overall reactions based on overall bond changes, reaction centres (the local environment atoms involved in the reaction) and the sub-structure similarities of the metabolites involved in the reaction and for non-enzymes semantic similarities based on the GO. To identify and highlight changes in function through evolution, ancestral character estimations are made and presented. All this is accessible through a new re-designed web interface that can be found at http://www.funtree.info. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Chowdhary, Nupoor; Selvaraj, Ashok; KrishnaKumaar, Lakshmi; Kumar, Gopal Ramesh
2015-01-01
Caldicellulosiruptor saccharolyticus has proven itself to be an excellent candidate for biological hydrogen (H2) production, but it still has major drawbacks, such as sensitivity to high osmotic pressure and low volumetric H2 productivity, which should be considered before it can be used industrially. A whole-genome re-annotation was carried out to update the incomplete genome information, which leaves gaps in knowledge, especially in the area of metabolic engineering, with the aim of improving the H2-producing capabilities of C. saccharolyticus. Whole-genome re-annotation was performed manually for 2,682 coding sequences (CDSs). Bioinformatics tools based on sequence similarity, motif search, phylogenetic analysis and fold recognition were employed for re-annotation. Our methodology successfully added functions for 409 hypothetical proteins (HPs) and 46 proteins previously annotated as putative, and assigned more accurate functions to the known protein sequences. Homology-based gene annotation has been used as a standard method for assigning function to novel proteins, but over the past few years many non-homology-based methods, such as genomic context approaches for protein function prediction, have been developed. Using non-homology-based functional prediction methods, we were able to assign cellular processes or physical complexes for 249 hypothetical sequences. Our re-annotation pipeline highlights the addition of 231 new CDSs generated from the MicroScope platform to the original genome, with functional prediction for 49 of them. The re-annotation of HPs and new CDSs is stored in the relational database that is available on the MicroScope web-based platform. In parallel, comparative genome analyses were performed among the members of the genus Caldicellulosiruptor to understand their function and evolutionary processes.
Further, with results from integrated re-annotation studies (homology and genomic context approach), we strongly suggest that Csac_0437 and Csac_0424 encode for glycoside hydrolases (GH) and are proposed to be involved in the decomposition of recalcitrant plant polysaccharides. Similarly, HPs: Csac_0732, Csac_1862, Csac_1294 and Csac_0668 are suggested to play a significant role in biohydrogen production. Function prediction of these HPs by using our integrated approach will considerably enhance the interpretation of large-scale experiments targeting this industrially important organism. PMID:26196387
Potential toxicity of graphene to cell functions via disrupting protein-protein interactions.
Luan, Binquan; Huynh, Tien; Zhao, Lin; Zhou, Ruhong
2015-01-27
While carbon-based nanomaterials such as graphene and carbon nanotubes (CNTs) have become popular in state-of-the-art nanotechnology, their biological safety and the underlying molecular mechanisms are still largely unknown. Experimental studies have focused on the cellular level and revealed good correlations between cell death and the application of CNTs or graphene. Using large-scale all-atom molecular dynamics simulations, we theoretically investigate the potential toxicity of graphene to a biological cell at the molecular level. Simulation results show that the hydrophobic protein-protein interaction (or recognition) that is essential to biological functions can be interrupted by a graphene nanosheet. Due to the hydrophobic nature of graphene, it is energetically favorable for a graphene nanosheet to enter the hydrophobic interface of two contacting proteins, such as a dimer. The forced separation of two functional proteins can disrupt the cell's metabolism and even lead to cell death.
C-Myc Protein-Protein and Protein-DNA Interactions: Targets for Therapeutic Intervention.
1997-09-01
including those of the Myc family. In fact, members of different bHLH protein subgroups, including the Myc proteins, are characterized by conserved BR...important functional consequences, and they provide insights into how different bHLH proteins can act on different targets. The zinc finger protein...roles for a number of BR residues which do not contact bases, yet are conserved within different bHLH protein sub- families (Benezra et al. 1990), and
Protein-protein recognition control by modulating electrostatic interactions.
Han, Song; Yin, Shijin; Yi, Hong; Mouhat, Stéphanie; Qiu, Su; Cao, Zhijian; Sabatier, Jean-Marc; Wu, Yingliang; Li, Wenxin
2010-06-04
Protein-protein control recognition remains a huge challenge, and its development depends on understanding the chemical and biological mechanisms by which these interactions occur. Here we describe a protein-protein control recognition technique based on the dominant electrostatic interactions occurring between the proteins. We designed a potassium channel inhibitor, BmP05-T, that was 90.32% identical to wild-type BmP05. Negatively charged residues were translocated from the nonbinding interface to the binding interface of BmP05 inhibitor, such that BmP05-T now used BmP05 nonbinding interface as the binding interface. This switch demonstrated that nonbinding interfaces were able to control the orientation of protein binding interfaces in the process of protein-protein recognition. The novel function findings of BmP05-T peptide suggested that the control recognition technique described here had the potential for use in designing and utilizing functional proteins in many biological scenarios.
Ohno, Yusuke; Kashio, Atsushi; Ogata, Ren; Ishitomi, Akihiro; Yamazaki, Yuki; Kihara, Akio
2012-01-01
Palmitoylation plays important roles in the regulation of protein localization, stability, and activity. The protein acyltransferases (PATs) have a common DHHC Cys-rich domain. Twenty-three DHHC proteins have been identified in humans. However, it is unclear whether all of these DHHC proteins function as PATs. In addition, their substrate specificities remain largely unknown. Here we develop a useful method to examine substrate specificities of PATs using a yeast expression system with six distinct model substrates. We identify 17 human DHHC proteins as PATs. Moreover, we classify 11 human and 5 yeast DHHC proteins into three classes (I, II, and III), based on the cellular localization of their respective substrates (class I, soluble proteins; class II, integral membrane proteins; class III, lipidated proteins). Our results may provide an important clue for understanding the function of individual DHHC proteins. PMID:23034182
ERIC Educational Resources Information Center
Bethel, Casey M.; Lieberman, Raquel L.
2014-01-01
Here we present a multidisciplinary educational unit intended for general, advanced placement, or international baccalaureate-level high school science, focused on the three-dimensional structure of proteins and their connection to function and disease. The lessons are designed within the framework of the Next Generation Science Standards to make…
Reversible Immobilization of Proteins in Sensors and Solid-State Nanopores.
Ananth, Adithya; Genua, María; Aissaoui, Nesrine; Díaz, Leire; Eisele, Nico B; Frey, Steffen; Dekker, Cees; Richter, Ralf P; Görlich, Dirk
2018-05-01
The controlled functionalization of surfaces with proteins is crucial for many analytical methods in life science research and biomedical applications. Here, a coating for silica-based surfaces is established which enables stable and selective immobilization of proteins with controlled orientation and tunable surface density. The coating is reusable, retains functionality upon long-term storage in air, and is applicable to surfaces of complex geometry. The protein anchoring method is validated on planar surfaces, and then a method is developed to measure the anchoring process in real time using silicon nitride solid-state nanopores. For surface attachment, polyhistidine tags that are site specifically introduced into recombinant proteins are exploited, and the yeast nucleoporin Nsp1 is used as model protein. Contrary to the commonly used covalent thiol chemistry, the anchoring of proteins via polyhistidine tag is reversible, permitting to take proteins off and replace them by other ones. Such switching in real time in experiments on individual nanopores is monitored using ion conductivity. Finally, it is demonstrated that silica and gold surfaces can be orthogonally functionalized to accommodate polyhistidine-tagged proteins on silica but prevent protein binding to gold, which extends the applicability of this surface functionalization method to even more complex sensor devices. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Zheng, Wenjun
2010-01-01
Protein conformational dynamics, despite its significant anharmonicity, has been widely explored by normal mode analysis (NMA) based on atomic or coarse-grained potential functions. To account for the anharmonic aspects of protein dynamics, this study proposes, and has performed, an anharmonic NMA (ANMA) based on the Cα-only elastic network models, which assume elastic interactions between pairs of residues whose Cα atoms or heavy atoms are within a cutoff distance. The key step of ANMA is to sample an anharmonic potential function along the directions of eigenvectors of the lowest normal modes to determine the mean-squared fluctuations along these directions. ANMA was evaluated based on the modeling of anisotropic displacement parameters (ADPs) from a list of 83 high-resolution protein crystal structures. Significant improvement was found in the modeling of ADPs by ANMA compared with standard NMA. Further improvement in the modeling of ADPs is attained if the interactions between a protein and its crystalline environment are taken into account. In addition, this study has determined the optimal cutoff distances for ADP modeling based on elastic network models, and these agree well with the peaks of the statistical distributions of distances between Cα atoms or heavy atoms derived from a large set of protein crystal structures. PMID:20550915
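The cutoff-based elastic network idea in the abstract above can be sketched in a few lines. This is a minimal illustration of the standard harmonic elastic network potential only, not the anharmonic ANMA procedure of the paper; the function names, the spring constant k, the 8.0 Å cutoff, and the toy coordinates are all assumptions made for the example.

```python
import math
from itertools import combinations

def enm_contacts(coords, cutoff):
    """Return index pairs (i, j), i < j, whose distance is within the cutoff."""
    return [(i, j) for i, j in combinations(range(len(coords)), 2)
            if math.dist(coords[i], coords[j]) <= cutoff]

def enm_energy(ref, new, cutoff, k=1.0):
    """Harmonic elastic-network energy of `new` relative to the reference `ref`.

    Springs connect only the pairs that are in contact in the reference
    structure; each spring has its reference distance as rest length.
    """
    energy = 0.0
    for i, j in enm_contacts(ref, cutoff):
        d0 = math.dist(ref[i], ref[j])   # rest length from the reference
        d = math.dist(new[i], new[j])    # length in the displaced structure
        energy += 0.5 * k * (d - d0) ** 2
    return energy

# Toy "structure" (hypothetical): four pseudo-C-alpha positions on a line, 3.8 apart.
ref = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
contacts = enm_contacts(ref, cutoff=8.0)

# Displace the last position by 1.0 along x and evaluate the energy.
moved = ref[:3] + [(12.4, 0.0, 0.0)]
stretched_energy = enm_energy(ref, moved, cutoff=8.0)
```

In this toy chain only the two springs attached to the displaced position are stretched, so the energy depends on the cutoff through the number of springs it creates, which is why the choice of cutoff distance matters for such models.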
Comparative Analysis and Distribution of Omega-3 lcPUFA Biosynthesis Genes in Marine Molluscs
Surm, Joachim M.; Prentis, Peter J.; Pavasovic, Ana
2015-01-01
Recent research has identified marine molluscs as an excellent source of omega-3 long-chain polyunsaturated fatty acids (lcPUFAs), based on their potential for endogenous synthesis of lcPUFAs. In this study we generated a representative list of fatty acyl desaturase (Fad) and elongation of very long-chain fatty acid (Elovl) genes from major orders of Phylum Mollusca, through the interrogation of transcriptome and genome sequences, and various publicly available databases. We have identified novel and uncharacterised Fad and Elovl sequences in the following species: Anadara trapezia, Nerita albicilla, Nerita melanotragus, Crassostrea gigas, Lottia gigantea, Aplysia californica, Loligo pealeii and Chlamys farreri. Based on alignments of translated protein sequences of Fad and Elovl genes, the haeme binding motif and histidine boxes of Fad proteins, and the histidine box and seventeen important amino acids in Elovl proteins, were highly conserved. Phylogenetic analysis of aligned reference sequences was used to reconstruct the evolutionary relationships for Fad and Elovl genes separately. Multiple, well resolved clades for both the Fad and Elovl sequences were observed, suggesting that repeated rounds of gene duplication best explain the distribution of Fad and Elovl proteins across the major orders of molluscs. For Elovl sequences, one clade contained the functionally characterised Elovl5 proteins, while another clade contained proteins hypothesised to have Elovl4 function. Additional well resolved clades consisted only of uncharacterised Elovl sequences. One clade from the Fad phylogeny contained only uncharacterised proteins, while the other clade contained functionally characterised delta-5 desaturase proteins. The discovery of an uncharacterised Fad clade is particularly interesting as these divergent proteins may have novel functions. 
Overall, this paper presents a number of novel Fad and Elovl genes suggesting that many mollusc groups possess most of the required enzymes for the synthesis of lcPUFAs. PMID:26308548
Predicting nucleic acid binding interfaces from structural models of proteins.
Dror, Iris; Shazman, Shula; Mukherjee, Srayanta; Zhang, Yang; Glaser, Fabian; Mandel-Gutfreund, Yael
2012-02-01
The function of DNA- and RNA-binding proteins can be inferred from the characterization and accurate prediction of their binding interfaces. However, the main pitfall of various structure-based methods for predicting nucleic acid binding function is that they are all limited to a relatively small number of proteins for which high-resolution three-dimensional structures are available. In this study, we developed a pipeline for extracting functional electrostatic patches from surfaces of protein structural models, obtained using the I-TASSER protein structure predictor. The largest positive patches are extracted from the protein surface using the patchfinder algorithm. We show that functional electrostatic patches extracted from an ensemble of structural models highly overlap the patches extracted from high-resolution structures. Furthermore, by testing our pipeline on a set of 55 known nucleic acid binding proteins for which I-TASSER produces high-quality models, we show that the method accurately identifies the nucleic acids binding interface on structural models of proteins. Employing a combined patch approach we show that patches extracted from an ensemble of models better predicts the real nucleic acid binding interfaces compared with patches extracted from independent models. Overall, these results suggest that combining information from a collection of low-resolution structural models could be a valuable approach for functional annotation. We suggest that our method will be further applicable for predicting other functional surfaces of proteins with unknown structure. Copyright © 2011 Wiley Periodicals, Inc.
PDB2Graph: A toolbox for identifying critical amino acids map in proteins based on graph theory.
Niknam, Niloofar; Khakzad, Hamed; Arab, Seyed Shahriar; Naderi-Manesh, Hossein
2016-05-01
The integrative and cooperative nature of protein structure involves the assessment of topological and global features of constituent parts. Network concept takes complete advantage of both of these properties in the analysis concomitantly. High compatibility to structural concepts or physicochemical properties in addition to exploiting a remarkable simplification in the system has made network an ideal tool to explore biological systems. There are numerous examples in which different protein structural and functional characteristics have been clarified by the network approach. Here, we present an interactive and user-friendly Matlab-based toolbox, PDB2Graph, devoted to protein structure network construction, visualization, and analysis. Moreover, PDB2Graph is an appropriate tool for identifying critical nodes involved in protein structural robustness and function based on centrality indices. It maps critical amino acids in protein networks and can greatly aid structural biologists in selecting proper amino acid candidates for manipulating protein structures in a more reasonable and rational manner. To introduce the capability and efficiency of PDB2Graph in detail, the structural modification of Calmodulin through allosteric binding of Ca(2+) is considered. In addition, a mutational analysis for three well-identified model proteins including Phage T4 lysozyme, Barnase and Ribonuclease HI, was performed to inspect the influence of mutating important central residues on protein activity. Copyright © 2016 Elsevier Ltd. All rights reserved.
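As a rough illustration of the network approach described above, the following sketch builds a residue contact graph from coordinates and ranks nodes by closeness centrality. It is not PDB2Graph (which is Matlab-based) and does not reproduce its interface; the function names, the 5.0 cutoff, and the toy coordinates are assumptions for the example.

```python
import math
from collections import deque

def contact_graph(coords, cutoff):
    """Adjacency sets for an undirected graph linking points within the cutoff."""
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def closeness(adj, source):
    """Closeness centrality: reachable nodes divided by the sum of hop distances."""
    dist = {source: 0}
    queue = deque([source])
    while queue:                      # breadth-first search from the source
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

# Toy chain (hypothetical): five residues on a line, 4.0 apart, so only
# sequence neighbours are in contact at a cutoff of 5.0.
coords = [(4.0 * i, 0.0, 0.0) for i in range(5)]
adj = contact_graph(coords, cutoff=5.0)
scores = {i: closeness(adj, i) for i in adj}
most_central = max(scores, key=scores.get)
```

For a real structure the coordinates would come from a parsed PDB file, and other indices (degree, betweenness) could be substituted for closeness in the same way.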
Multifarious Roles of Intrinsic Disorder in Proteins Illustrate Its Broad Impact on Plant Biology
Sun, Xiaolin; Rikkerink, Erik H.A.; Jones, William T.; Uversky, Vladimir N.
2013-01-01
Intrinsically disordered proteins (IDPs) are highly abundant in eukaryotic proteomes. Plant IDPs play critical roles in plant biology and often act as integrators of signals from multiple plant regulatory and environmental inputs. Binding promiscuity and plasticity allow IDPs to interact with multiple partners in protein interaction networks and provide important functional advantages in molecular recognition through transient protein–protein interactions. Short interaction-prone segments within IDPs, termed molecular recognition features, represent potential binding sites that can undergo disorder-to-order transition upon binding to their partners. In this review, we summarize the evidence for the importance of IDPs in plant biology and evaluate the functions associated with intrinsic disorder in five different types of plant protein families experimentally confirmed as IDPs. Functional studies of these proteins illustrate the broad impact of disorder on many areas of plant biology, including abiotic stress, transcriptional regulation, light perception, and development. Based on the roles of disorder in the protein–protein interactions, we propose various modes of action for plant IDPs that may provide insight for future experimental approaches aimed at understanding the molecular basis of protein function within important plant pathways. PMID:23362206
Effect of fullerenol surface chemistry on nanoparticle binding-induced protein misfolding
NASA Astrophysics Data System (ADS)
Radic, Slaven; Nedumpully-Govindan, Praveen; Chen, Ran; Salonen, Emppu; Brown, Jared M.; Ke, Pu Chun; Ding, Feng
2014-06-01
Fullerene and its derivatives with different surface chemistry have great potential in biomedical applications. Accordingly, it is important to delineate the impact of these carbon-based nanoparticles on protein structure, dynamics, and subsequently function. Here, we focused on the effect of hydroxylation -- a common strategy for solubilizing and functionalizing fullerene -- on protein-nanoparticle interactions using a model protein, ubiquitin. We applied a set of complementary computational modeling methods, including docking and molecular dynamics simulations with both explicit and implicit solvent, to illustrate the impact of hydroxylated fullerenes on the structure and dynamics of ubiquitin. We found that all derivatives bound to the model protein. Specifically, the more hydrophilic nanoparticles with a higher number of hydroxyl groups bound to the surface of the protein via hydrogen bonds, which stabilized the protein without inducing large conformational changes in the protein structure. In contrast, fullerene derivatives with a smaller number of hydroxyl groups buried their hydrophobic surface inside the protein, thereby causing protein denaturation. Overall, our results revealed a distinct role of surface chemistry on nanoparticle-protein binding and binding-induced protein misfolding. Electronic supplementary information (ESI) is available: Fluorescence spectra, ITC, CD spectra and other data as described in the text. See DOI: 10.1039/c4nr01544d
Lisowska-Myjak, B; Skarżyńska, E; Bakun, M
2018-06-01
Intrauterine environmental factors can be associated with perinatal complications and long-term health outcomes although the underlying mechanisms remain poorly defined. Meconium formed exclusively in utero and passed naturally by a neonate may contain proteins which characterise the intrauterine environment. The aim of the study was proteomic analysis of the composition of meconium proteins and their classification by biological function. Proteomic techniques combining isoelectrofocussing fractionation and LC-MS/MS analysis were used to study the protein composition of a meconium sample obtained by pooling 50 serial meconium portions from 10 healthy full-term neonates. The proteins were classified by function based on the literature search for each protein in the PubMed database. A total of 946 proteins were identified in the meconium, including 430 proteins represented by two or more peptides. When the proteins were classified by their biological function the following were identified: immunoglobulin fragments and enzymatic, neutrophil-derived, structural and fetal intestine-specific proteins. Meconium is a rich source of proteins deposited in the fetal intestine during its development in utero. A better understanding of their specific biological functions in the intrauterine environment may help to identify these proteins which may serve as biomarkers associated with specific clinical conditions/diseases with the possible impact on the fetal development and further health consequences in infants, older children and adults.
Genetics pathway-based imaging approaches in Chinese Han population with Alzheimer's disease risk.
Bai, Feng; Liao, Wei; Yue, Chunxian; Pu, Mengjia; Shi, Yongmei; Yu, Hui; Yuan, Yonggui; Geng, Leiyu; Zhang, Zhijun
2016-01-01
The tau hypothesis has been raised with regard to the pathophysiology of Alzheimer's disease (AD). Mild cognitive impairment (MCI) is associated with a high risk for developing AD. However, no study has directly examined the brain topological alterations based on combined effects of tau protein pathway genes in the MCI population. Forty-three patients with MCI and 30 healthy controls from a Chinese Han population underwent resting-state functional magnetic resonance imaging (fMRI), and a tau protein pathway-based imaging approach (7 candidate genes: 17 SNPs) was used to investigate changes in the topological organisation of brain activation associated with MCI. Impaired regional activation is related to tau protein pathway genes (5/7 candidate genes) in patients with MCI and likely in topologically convergent and divergent functional alteration patterns associated with genes, and combined effects of tau protein pathway genes disrupt the topological architecture of cortico-cerebellar loops. The associations between the loops and behaviours further suggest that tau protein pathway genes do play a significant role in non-episodic memory impairment. Tau pathway-based imaging approaches might strengthen the credibility in imaging genetic associations and generate pathway frameworks that might provide powerful new insights into the neural mechanisms that underlie MCI.
Liu, Suxuan; Xiong, Xinyu; Zhao, Xianxian; Yang, Xiaofeng; Wang, Hong
2015-05-09
Eukaryotic cell membrane dynamics change in curvature during physiological and pathological processes. In the past ten years, a novel protein family, Fes/CIP4 homology-Bin/Amphiphysin/Rvs (F-BAR) domain proteins, has been identified to be the most important coordinators in membrane curvature regulation. The F-BAR domain family is a member of the Bin/Amphiphysin/Rvs (BAR) domain superfamily that is associated with dynamic changes in cell membrane. However, the molecular basis in membrane structure regulation and the biological functions of F-BAR protein are unclear. The pathophysiological role of F-BAR protein is unknown. This review summarizes the current understanding of structure and function in the BAR domain superfamily, classifies F-BAR family proteins into nine subfamilies based on domain structure, and characterizes F-BAR protein structure, domain interaction, and functional relevance. In general, F-BAR protein binds to cell membrane via F-BAR domain association with membrane phospholipids and initiates membrane curvature and scission via Src homology-3 (SH3) domain interaction with its partner proteins. This process causes membrane dynamic changes and leads to seven important cellular biological functions, which include endocytosis, phagocytosis, filopodium, lamellipodium, cytokinesis, adhesion, and podosome formation, via distinct signaling pathways determined by specific domain-binding partners. These cellular functions play important roles in many physiological and pathophysiological processes. We further summarize F-BAR protein expression and mutation changes observed in various diseases and developmental disorders. 
Considering the structure feature and functional implication of F-BAR proteins, we anticipate that F-BAR proteins modulate physiological and pathophysiological processes via transferring extracellular materials, regulating cell trafficking and mobility, presenting antigens, mediating extracellular matrix degradation, and transmitting signaling for cell proliferation.
NovelFam3000 – Uncharacterized human protein domains conserved across model organisms
Kemmer, Danielle; Podowski, Raf M; Arenillas, David; Lim, Jonathan; Hodges, Emily; Roth, Peggy; Sonnhammer, Erik LL; Höög, Christer; Wasserman, Wyeth W
2006-01-01
Background: Despite significant efforts from the research community, an extensive portion of the proteins encoded by human genes lack an assigned cellular function. Most metazoan proteins are composed of structural and/or functional domains, of which many appear in multiple proteins. Once a domain is characterized in one protein, the presence of a similar sequence in an uncharacterized protein serves as a basis for inference of function. Thus knowledge of a domain's function, or the protein within which it arises, can facilitate the analysis of an entire set of proteins. Description: From the Pfam domain database, we extracted uncharacterized protein domains represented in proteins from humans, worms, and flies. A data centre was created to facilitate the analysis of the uncharacterized domain-containing proteins. The centre both provides researchers with links to dispersed internet resources containing gene-specific experimental data and enables them to post relevant experimental results or comments. For each human gene in the system, a characterization score is posted, allowing users to track the progress of characterization over time or to identify for study uncharacterized domains in well-characterized genes. As a test of the system, a subset of 39 domains was selected for analysis and the experimental results posted to the NovelFam3000 system. For 25 human protein members of these 39 domain families, detailed sub-cellular localizations were determined. Specific observations are presented based on the analysis of the integrated information provided through the online NovelFam3000 system. Conclusion: Consistent experimental results between multiple members of a domain family allow for inferences of the domain's functional role. We unite bioinformatics resources and experimental data in order to accelerate the functional characterization of scarcely annotated domain families. PMID:16533400
High-Throughput Cloning and Expression Library Creation for Functional Proteomics
Festa, Fernanda; Steel, Jason; Bian, Xiaofang; Labaer, Joshua
2013-01-01
The study of protein function usually requires the use of a cloned version of the gene for protein expression and functional assays. This strategy is particularly important when the information available regarding function is limited. The functional characterization of the thousands of newly identified proteins revealed by genomics requires faster methods than traditional single gene experiments, creating the need for fast, flexible and reliable cloning systems. These collections of open reading frame (ORF) clones can be coupled with high-throughput proteomics platforms, such as protein microarrays and cell-based assays, to answer biological questions. In this tutorial we provide the background for DNA cloning, discuss the major high-throughput cloning systems (Gateway® Technology, Flexi® Vector Systems, and Creator™ DNA Cloning System) and compare them side-by-side. We also report an example of a high-throughput cloning study and its application in functional proteomics. This Tutorial is part of the International Proteomics Tutorial Programme (IPTP12). Details can be found at http://www.proteomicstutorials.org. PMID:23457047
High-throughput cloning and expression library creation for functional proteomics.
Festa, Fernanda; Steel, Jason; Bian, Xiaofang; Labaer, Joshua
2013-05-01
The study of protein function usually requires the use of a cloned version of the gene for protein expression and functional assays. This strategy is particularly important when the information available regarding function is limited. The functional characterization of the thousands of newly identified proteins revealed by genomics requires faster methods than traditional single-gene experiments, creating the need for fast, flexible, and reliable cloning systems. These collections of ORF clones can be coupled with high-throughput proteomics platforms, such as protein microarrays and cell-based assays, to answer biological questions. In this tutorial, we provide the background for DNA cloning, discuss the major high-throughput cloning systems (Gateway® Technology, Flexi® Vector Systems, and Creator(TM) DNA Cloning System) and compare them side-by-side. We also report an example of high-throughput cloning study and its application in functional proteomics. This tutorial is part of the International Proteomics Tutorial Programme (IPTP12). © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Zhou, Mi; Tang, Min; Li, Shuiming; Peng, Li; Huang, Haojun; Fang, Qihua; Liu, Zhao; Xie, Peng; Li, Gao; Zhou, Jian
2018-06-21
For specific applications, gold nanoparticles (GNPs) are commonly functionalized with various biological ligands, including amino-free ligands such as amino acids, peptides, proteins, and nucleic acids. Upon entering a biological fluid, the protein corona that forms around GNPs can conceal the targeting ligands and sterically hinder the functional properties. The protein corona is routinely prepared by standard centrifugation or sucrose cushion centrifugation. However, such methodologies are not applicable to the exclusive analysis of a ligand-binding protein corona. In this study, we first proposed a lock-in strategy based on a combination of rapid crosslinking and stringent washing. Cysteine was used as a model of amino-free ligands and attached to GNPs. After corona formation in the human plasma, GNP cysteine and corona proteins were quickly fixed by 5 s of crosslinking with 7.5% formaldehyde. After stringent washing using SDS buffer with sonication, the cysteine-bound proteins were effectively separated from unbound proteins. Qualitative and quantitative analyses using a mass spectrometry-based proteomics approach indicated that the protein composition of the cysteine-binding corona from the new method was significantly different from the composition of the whole corona from the two conventional methods. Furthermore, network and formaldehyde-linked site analyses of cysteine-binding proteins provided useful information toward a better knowledge of the behavior of protein-ligand and protein-protein interactions. Collectively, our new strategy has the capability to particularly characterize the protein composition of a cysteine-binding corona. The presented methodology in principle provides a generic way to analyze a nanoparticle corona bound to amino-free ligands and has the potential to decipher corona-masked ligand functions.
Lymphocyte signaling : beyond knockouts
Saveliev, Alexander; Tybulewicz, Victor L. J.
2016-01-01
The analysis of lymphocyte signaling was greatly enhanced by the advent of gene targeting, which allows the selective inactivation of a single gene. Whereas this gene ‘knockout’ approach is often informative, in many cases the phenotype resulting from gene ablation might not provide a complete picture of the function of the corresponding protein. If a protein has multiple functions within a single or several signaling pathways, or stabilizes other proteins in a complex, the phenotypic consequences of a gene knockout may manifest as a combination of several different perturbations. In these cases, gene targeting to ‘knockin’ subtle point mutations might provide more accurate insight into protein function. However, to be informative, such mutations must be carefully designed based on structural and biophysical data. PMID:19295633
Prototype Protein-Based Three-Dimensional Memory
2003-01-01
9 Figure 3.2: Hypothetical mutational landscape ...to explore the genetic mutational landscape of a protein without any a priori knowledge of structure- function relationships. As such, it explores...native organism, Halobacterium salinarum, the protein acts as a photosynthetic sunlight to chemical energy transducer. Through several billion years of
Micromotor-based lab-on-chip immunoassays.
García, Miguel; Orozco, Jahir; Guix, Maria; Gao, Wei; Sattayasamitsathit, Sirilak; Escarpa, Alberto; Merkoçi, Arben; Wang, Joseph
2013-02-21
Here we describe the first example of using self-propelled antibody-functionalized synthetic catalytic microengines for capturing and transporting target proteins between the different reservoirs of a lab-on-a-chip (LOC) device. A new catalytic polymer/Ni/Pt microtube engine, containing carboxy moieties on its mixed poly(3,4-ethylenedioxythiophene) (PEDOT)/COOH-PEDOT polymeric outermost layer, is further functionalized with the antibody receptor to selectively recognize and capture the target protein. The new motor-based microchip immunoassay operations are carried out without any bulk fluid flow, replacing the common washing steps in antibody-based protein bioassays with the active transport of the captured protein throughout the different reservoirs, where each step of the immunoassay takes place. A first microchip format involving an 'on-the-fly' double-antibody sandwich assay (DASA) is used for demonstrating the selective capture of the target protein, in the presence of excess of non-target proteins. A secondary antibody tagged with a polymeric-sphere tracer allows the direct visualization of the binding events. In a second approach the immuno-nanomotor captures and transports the microsphere-tagged antigen through a microchannel network. An anti-protein-A modified microengine is finally used to demonstrate the selective capture, transport and convenient label-free optical detection of a Staphylococcus aureus target bacteria (containing proteinA in its cell wall) in the presence of a large excess of non-target (Saccharomyces cerevisiae) cells. The resulting nanomotor-based microchip immunoassay offers considerable potential for diverse applications in clinical diagnostics, environmental and security monitoring fields.
Bromberg, Yana; Yachdav, Guy; Ofran, Yanay; Schneider, Reinhard; Rost, Burkhard
2009-05-01
The rapidly increasing quantity of protein sequence data continues to widen the gap between available sequences and annotations. Comparative modeling suggests some aspects of the 3D structures of approximately half of all known proteins; homology- and network-based inferences annotate some aspect of function for a similar fraction of the proteome. For most known protein sequences, however, there is detailed knowledge about neither their function nor their structure. Comprehensive efforts towards the expert curation of sequence annotations have failed to meet the demand of the rapidly increasing number of available sequences. Only the automated prediction of protein function in the absence of homology can close the gap between available sequences and annotations in the foreseeable future. This review focuses on two novel methods for automated annotation, and briefly presents an outlook on how modern web software may revolutionize the field of protein sequence annotation. First, predictions of protein binding sites and functional hotspots, and the evolution of these into the most successful type of prediction of protein function from sequence will be discussed. Second, a new tool, comprehensive in silico mutagenesis, which contributes important novel predictions of function and at the same time prepares for the onset of the next sequencing revolution, will be described. While these two new sub-fields of protein prediction represent the breakthroughs that have been achieved methodologically, it will then be argued that a different development might further change the way biomedical researchers benefit from annotations: modern web software can connect the worldwide web in any browser with the 'Deep Web' (ie, proprietary data resources). The availability of this direct connection, and the resulting access to a wealth of data, may impact drug discovery and development more than any existing method that contributes to protein annotation.
Actions of plant Argonautes: predictable or unpredictable?
Ma, Zeyang; Zhang, Xiuren
2018-05-29
Argonaute (AGO) proteins are the key effectors of the RNA-induced silencing complex (RISC). Land plants typically encode numerous AGO proteins, which can generally be divided into two major functional groups based on the species of small RNAs (sRNAs) they house. One group of AGOs, guided by 24-nucleotide (nt) sRNAs, canonically functions in nuclei to implement transcriptional gene silencing (TGS), whereas the other group of AGOs, guided by 21-nt sRNAs, acts in the cytoplasm to carry out posttranscriptional gene silencing (PTGS). Many new discoveries have recently been made on the functions and mechanisms of AGO proteins in plants, and some of the findings change our views on the conventional classification and roles of AGO proteins. In this review, we summarize our current knowledge of AGO proteins in plants. Copyright © 2018 Elsevier Ltd. All rights reserved.
Protein domains of unknown function are essential in bacteria.
Goodacre, Norman F; Gerloff, Dietlind L; Uetz, Peter
2013-12-31
More than 20% of all protein domains are currently annotated as "domains of unknown function" (DUFs). About 2,700 DUFs are found in bacteria compared with just over 1,500 in eukaryotes. Over 800 DUFs are shared between bacteria and eukaryotes, and about 300 of these are also present in archaea. A total of 2,786 bacterial Pfam domains even occur in animals, including 320 DUFs. Evolutionary conservation suggests that many of these DUFs are important. Here we show that 355 essential proteins in 16 model bacterial species contain 238 DUFs, most of which represent single-domain proteins, clearly establishing the biological essentiality of DUFs. We suggest that experimental research should focus on conserved and essential DUFs (eDUFs) for functional analysis given their important function and wide taxonomic distribution, including bacterial pathogens. The functional units of proteins are domains. Typically, each domain has a distinct structure and function. Genomes encode thousands of domains, and many of the domains have no known function (domains of unknown function [DUFs]). They are often ignored as of little relevance, given that many of them are found in only a few genomes. Here we show that many DUFs are essential DUFs (eDUFs) based on their presence in essential proteins. We also show that eDUFs are often essential even if they are found in relatively few genomes. However, in general, more common DUFs are more often essential than rare DUFs.
Accounting for epistatic interactions improves the functional analysis of protein structures.
Wilkins, Angela D; Venner, Eric; Marciano, David C; Erdin, Serkan; Atri, Benu; Lua, Rhonald C; Lichtarge, Olivier
2013-11-01
The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: lichtarge@bcm.edu. Supplementary data are available at Bioinformatics online.
Accounting for epistatic interactions improves the functional analysis of protein structures
Wilkins, Angela D.; Venner, Eric; Marciano, David C.; Erdin, Serkan; Atri, Benu; Lua, Rhonald C.; Lichtarge, Olivier
2013-01-01
Motivation: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact: lichtarge@bcm.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24021383
Structure-based approach to the prediction of disulfide bonds in proteins.
Salam, Noeris K; Adzhigirey, Matvey; Sherman, Woody; Pearlman, David A
2014-10-01
Protein engineering remains an area of growing importance in pharmaceutical and biotechnology research. Stabilization of a folded protein conformation is a frequent goal in projects that deal with affinity optimization, enzyme design, protein construct design, and reducing the size of functional proteins. Indeed, it can be desirable to assess and improve protein stability in order to avoid liabilities such as aggregation, degradation, and immunogenic response that may arise during development. One way to stabilize a protein is through the introduction of disulfide bonds. Here, we describe a method to predict pairs of protein residues that can be mutated to form a disulfide bond. We combine a physics-based approach that incorporates implicit solvent molecular mechanics with a knowledge-based approach. We first assign relative weights to the terms that comprise our scoring function using a genetic algorithm applied to a set of 75 wild-type structures that each contains a disulfide bond. The method is then tested on a separate set of 13 engineered proteins comprising 15 artificial stabilizing disulfides introduced via site-directed mutagenesis. We find that the native disulfide in the wild-type proteins is scored well, on average (within the top 6% of the reasonable pairs of residues that could form a disulfide bond) while 6 out of the 15 artificial stabilizing disulfides scored within the top 13% of ranked predictions. Overall, this suggests that the physics-based approach presented here can be useful for triaging possible pairs of mutations for disulfide bond formation to improve protein stability. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Amyloid Fibrils as Building Blocks for Natural and Artificial Functional Materials.
Knowles, Tuomas P J; Mezzenga, Raffaele
2016-08-01
Proteinaceous materials based on the amyloid core structure have recently been discovered at the origin of biological functionality in a remarkably diverse set of roles, and attention is increasingly turning towards such structures as the basis of artificial self-assembling materials. These roles contrast markedly with the original picture of amyloid fibrils as inherently pathological structures. Here we outline the salient features of this class of functional materials, both in the context of the functional roles that have been revealed for amyloid fibrils in nature, as well as in relation to their potential as artificial materials. We discuss how amyloid materials exemplify the emergence of function from protein self-assembly at multiple length scales. We focus on the connections between mesoscale structure and material function, and demonstrate how the natural examples of functional amyloids illuminate the potential applications for future artificial protein based materials. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
T-RMSD: a web server for automated fine-grained protein structural classification.
Magis, Cedrik; Di Tommaso, Paolo; Notredame, Cedric
2013-07-01
This article introduces the T-RMSD web server (tree-based on root-mean-square deviation), a service allowing the online computation of structure-based protein classification. It has been developed to address the relation between structural and functional similarity in proteins, and it allows a fine-grained structural clustering of a given protein family or group of structurally related proteins using distance RMSD (dRMSD) variations. These distances are computed between all pairs of equivalent residues, as defined by the ungapped columns within a given multiple sequence alignment. Using these generated distance matrices (one per equivalent position), T-RMSD produces a structural tree with support values for each cluster node, reminiscent of bootstrap values. These values, associated with the tree topology, allow a quantitative estimate of structural distances between proteins or group of proteins defined by the tree topology. The clusters thus defined have been shown to be structurally and functionally informative. The T-RMSD web server is a free website open to all users and available at http://tcoffee.crg.cat/apps/tcoffee/do:trmsd.
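The distance RMSD (dRMSD) idea mentioned in this abstract can be illustrated with a minimal Python sketch that compares the intramolecular distances of two structures over their equivalent residues. This is not the T-RMSD server's code: the function names, the use of a single coordinate per residue, and the toy coordinates are assumptions made for the example, and the per-position distance matrices and tree construction described in the abstract are not reproduced.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Intramolecular distances for all residue pairs (i < j)."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(coords_a, coords_b):
    """Distance RMSD between two structures.

    Both inputs list one (x, y, z) point per equivalent residue, in the
    same order (for example, the ungapped columns of an alignment).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of equivalent residues")
    dist_a = pairwise_distances(coords_a)
    dist_b = pairwise_distances(coords_b)
    if not dist_a:
        raise ValueError("at least two residues are required")
    squared = sum((x - y) ** 2 for x, y in zip(dist_a, dist_b))
    return math.sqrt(squared / len(dist_a))


# Hypothetical toy coordinates (not taken from any real structure).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shifted = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in reference]
scaled = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in reference]

print(drmsd(reference, shifted))  # translation leaves internal distances unchanged
print(drmsd(reference, scaled))   # stretching the structure gives a nonzero value
```

Because dRMSD compares internal distances only, no superposition of the two structures is needed before the comparison.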
T-RMSD: a web server for automated fine-grained protein structural classification
Magis, Cedrik; Di Tommaso, Paolo; Notredame, Cedric
2013-01-01
This article introduces the T-RMSD web server (tree-based on root-mean-square deviation), a service allowing the online computation of structure-based protein classification. It has been developed to address the relation between structural and functional similarity in proteins, and it allows a fine-grained structural clustering of a given protein family or group of structurally related proteins using distance RMSD (dRMSD) variations. These distances are computed between all pairs of equivalent residues, as defined by the ungapped columns within a given multiple sequence alignment. Using these generated distance matrices (one per equivalent position), T-RMSD produces a structural tree with support values for each cluster node, reminiscent of bootstrap values. These values, associated with the tree topology, allow a quantitative estimate of structural distances between proteins or group of proteins defined by the tree topology. The clusters thus defined have been shown to be structurally and functionally informative. The T-RMSD web server is a free website open to all users and available at http://tcoffee.crg.cat/apps/tcoffee/do:trmsd. PMID:23716642
NASA Astrophysics Data System (ADS)
Xu, Jie; Wang, Hongqi; Kong, Dekang
2018-01-01
Although the degradation pathways of polycyclic aromatic hydrocarbons (PAHs) have been extensively studied in many bacteria, variations in the expression levels of the key proteins involved in functional regulation during catabolism are still not quantitatively understood. In this study, we compared two proteomic methods, two-dimensional gel electrophoresis (2-DE), a traditional and widely used technique, and isobaric tags for relative and absolute quantitation (iTRAQ), an innovative approach, to analyze functional regulation at the protein level in the highly effective fluoranthene-degrading bacterium Rhodococcus sp. BAP-1. The number of differentially expressed proteins identified using iTRAQ was much larger than the number identified using 2-DE. In response to fluoranthene, the key overexpressed proteins in BAP-1 included NADPH-dependent FMN reductase, 30S ribosomal protein S2, and S-ribosylhomocysteinase; the significantly down-regulated proteins included cytochrome ubiquinol oxidase subunit, NAD(P) transhydrogenase subunit alpha, and 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferase.
Refinement of protein termini in template-based modeling using conformational space annealing.
Park, Hahnbeom; Ko, Junsu; Joo, Keehyoung; Lee, Julian; Seok, Chaok; Lee, Jooyoung
2011-09-01
The rapid increase in the number of experimentally determined protein structures in recent years enables us to obtain more reliable protein tertiary structure models than ever by template-based modeling. However, refinement of template-based models beyond the limit available from the best templates is still needed for understanding protein function in atomic detail. In this work, we develop a new method for protein terminus modeling that can be applied to refinement of models with unreliable terminus structures. The energy function for terminus modeling consists of both physics-based and knowledge-based potential terms with carefully optimized relative weights. Effective sampling of both the framework and terminus is performed using the conformational space annealing technique. This method has been tested on a set of termini derived from a nonredundant structure database and two sets of termini from the CASP8 targets. The performance of the terminus modeling method is significantly improved over our previous method that does not employ terminus refinement. It is also comparable or superior to the best server methods tested in CASP8. The success of the current approach suggests that similar strategy may be applied to other types of refinement problems such as loop modeling or secondary structure rearrangement. Copyright © 2011 Wiley-Liss, Inc.
Ayyar, Vivaswath S; Almon, Richard R; DuBois, Debra C; Sukumaran, Siddharth; Qu, Jun; Jusko, William J
2017-05-08
Corticosteroids (CS) are anti-inflammatory agents that cause extensive pharmacogenomic and proteomic changes in multiple tissues. An understanding of the proteome-wide effects of CS in liver and its relationships to altered hepatic and systemic physiology remains incomplete. Here, we report the application of a functional pharmacoproteomic approach to gain integrated insight into the complex nature of CS responses in liver in vivo. An in-depth functional analysis was performed using rich pharmacodynamic (temporal-based) proteomic data measured over 66h in rat liver following a single dose of methylprednisolone (MPL). Data mining identified 451 differentially regulated proteins. These proteins were analyzed on the basis of temporal regulation, cellular localization, and literature-mined functional information. Of the 451 proteins, 378 were clustered into six functional groups based on major clinically-relevant effects of CS in liver. MPL-responsive proteins were highly localized in the mitochondria (20%) and cytosol (24%). Interestingly, several proteins were related to hepatic stress and signaling processes, which appear to be involved in secondary signaling cascades and in protecting the liver from CS-induced oxidative damage. Consistent with known adverse metabolic effects of CS, several rate-controlling enzymes involved in amino acid metabolism, gluconeogenesis, and fatty-acid metabolism were altered by MPL. In addition, proteins involved in the metabolism of endogenous compounds, xenobiotics, and therapeutic drugs including cytochrome P450 and Phase-II enzymes were differentially regulated. Proteins related to the inflammatory acute-phase response were up-regulated in response to MPL. Functionally-similar proteins showed large diversity in their temporal profiles, indicating complex mechanisms of regulation by CS. Clinical use of corticosteroid (CS) therapy is frequent and chronic. 
However, current knowledge on the proteome-level effects of CS in liver and other tissues is sparse. While transcriptomic regulation following methylprednisolone (MPL) dosing has been temporally examined in rat liver, proteomic assessments are needed to better characterize the tissue-specific functional aspects of MPL actions. This study describes a functional pharmacoproteomic analysis of dynamic changes in MPL-regulated proteins in liver and provides biological insight into how steroid-induced perturbations on a molecular level may relate to both adverse and therapeutic responses presented clinically. Copyright © 2017 Elsevier B.V. All rights reserved.
iTRAQ-Based Proteomics Analysis and Network Integration for Kernel Tissue Development in Maize
Dong, Yongbin; Wang, Qilei; Du, Chunguang; Xiong, Wenwei; Li, Xinyu; Zhu, Sailan; Li, Yuling
2017-01-01
Grain weight is one of the most important yield components in maize (Zea mays L.), and the grain is a developmentally complex structure comprising two major compartments (endosperm and pericarp); however, very little is known concerning the coordinated accumulation of the numerous proteins involved. Herein, we used an isobaric tags for relative and absolute quantitation (iTRAQ)-based comparative proteomic method to analyze the dynamic proteomes of the endosperm and pericarp during grain development. In total, 9539 proteins were identified for both components at four developmental stages, among which 1401 proteins were non-redundant, 232 proteins were specific to the pericarp and 153 proteins were specific to the endosperm. A functional annotation of the identified proteins revealed the importance of metabolic and cellular processes, and of binding and catalytic activities, for tissue development. Three and 76 proteins involved in 49 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were integrated for the specific endosperm and pericarp proteins, respectively, reflecting their complex metabolic interactions. In addition, four proteins with important functions and different expression levels were chosen for gene cloning and expression analysis. As in previous research, the concordance between mRNA level and protein abundance varied across proteins, stages, and tissues. These results could provide useful information for understanding the mechanisms of grain development in maize. PMID:28837076
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leung, Elo; Huang, Amy; Cadag, Eithon; ...
2016-01-20
In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.
Cloning strategy for producing brush-forming protein-based polymers.
Henderson, Douglas B; Davis, Richey M; Ducker, William A; Van Cott, Kevin E
2005-01-01
Brush-forming polymers are being used in a variety of applications, and by using recombinant DNA technology, there exists the potential to produce protein-based polymers that incorporate unique structures and functions in these brush layers. Despite this potential, production of protein-based brush-forming polymers is not routinely performed. For the design and production of new protein-based polymers with optimal brush-forming properties, it would be desirable to have a cloning strategy that allows an iterative approach wherein the protein-based polymer product can be produced and evaluated, and then, if necessary, sequentially modified in a controlled manner to obtain optimal surface density and brush extension. In this work, we report on the development of a cloning strategy intended for the production of protein-based brush-forming polymers. This strategy is based on the assembly of modules of DNA that encode blocks of protein-based polymers into a commercially available expression vector; there is no need for custom-modified vectors and no need for intermediate cloning vectors. Additionally, because the design of new protein-based biopolymers can be an iterative process, our method enables sequential modification of a protein-based polymer product. With at least 21 bacterial expression vectors and 11 yeast expression vectors compatible with this strategy, there are a number of options available for production of protein-based polymers. It is our intent that this strategy will aid in advancing the production of protein-based brush-forming polymers.
Iliuk, Anton; Li, Li; Melesse, Michael; Hall, Mark C; Tao, W Andy
2016-05-17
Accurate protein phosphorylation analysis reveals dynamic cellular signaling events not evident from protein expression levels. The most dominant biochemical assay, western blotting, suffers from the inadequate availability and poor quality of phospho-specific antibodies for phosphorylated proteins. Furthermore, multiplexed assays based on antibodies are limited by steric interference between the antibodies. Here we introduce a multifunctionalized nanopolymer for the universal detection of phosphoproteins that, in combination with regular antibodies, allows multiplexed imaging and accurate determination of protein phosphorylation on membranes. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Quignot, Chloé; Rey, Julien; Yu, Jinchao; Tufféry, Pierre; Guerois, Raphaël; Andreani, Jessica
2018-05-08
Computational protein docking is a powerful strategy to predict structures of protein-protein interactions and provides crucial insights for the functional characterization of macromolecular cross-talks. We previously developed InterEvDock, a server for ab initio protein docking based on rigid-body sampling followed by consensus scoring using physics-based and statistical potentials, including the InterEvScore function specifically developed to incorporate co-evolutionary information in docking. InterEvDock2 is a major evolution of InterEvDock which allows users to submit input sequences - not only structures - and multimeric inputs and to specify constraints for the pairwise docking process based on previous knowledge about the interaction. For this purpose, we added modules in InterEvDock2 for automatic template search and comparative modeling of the input proteins. The InterEvDock2 pipeline was benchmarked on 812 complexes for which unbound homology models of the two partners and co-evolutionary information are available in the PPI4DOCK database. InterEvDock2 identified a correct model among the top 10 consensus in 29% of these cases (compared to 15-24% for individual scoring functions) and at least one correct interface residue among 10 predicted in 91% of these cases. InterEvDock2 is thus a unique protein docking server, designed to be useful for the experimental biology community. The InterEvDock2 web interface is available at http://bioserv.rpbs.univ-paris-diderot.fr/services/InterEvDock2/.
Jeong, Hyundoo; Qian, Xiaoning; Yoon, Byung-Jun
2016-10-06
Comparative analysis of protein-protein interaction (PPI) networks provides an effective means of detecting conserved functional network modules across different species. Such modules typically consist of orthologous proteins with conserved interactions, which can be exploited to computationally predict the modules through network comparison. In this work, we propose a novel probabilistic framework for comparing PPI networks and effectively predicting the correspondence between proteins, represented as network nodes, that belong to conserved functional modules across the given PPI networks. The basic idea is to estimate the steady-state network flow between nodes that belong to different PPI networks based on a Markov random walk model. The random walker is designed to make random moves to adjacent nodes within a PPI network as well as cross-network moves between potential orthologous nodes with high sequence similarity. Based on this Markov random walk model, we estimate the steady-state network flow - or the long-term relative frequency of the transitions that the random walker makes - between nodes in different PPI networks, which can be used as a probabilistic score measuring their potential correspondence. Subsequently, the estimated scores can be used for detecting orthologous proteins in conserved functional modules through network alignment. Through evaluations based on multiple real PPI networks, we demonstrate that the proposed scheme leads to improved alignment results that are biologically more meaningful at reduced computational cost, outperforming the current state-of-the-art algorithms. The source code and datasets can be downloaded from http://www.ece.tamu.edu/~bjyoon/CUFID .
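The steady-state network flow described in this abstract can be sketched in a few lines of Python. This is an illustrative toy, not the authors' CUFID implementation: the node names, edge weights, the merged-graph construction, and the helper functions below are all assumptions. It builds one graph from two small PPI networks plus similarity-weighted cross-network edges, estimates the stationary distribution of a random walk by power iteration, and reports the long-run frequency of transitions across each cross-network edge.

```python
def build_weights(edges):
    """Turn (u, v, weight) triples into a symmetric adjacency dict."""
    weights = {}
    for u, v, w in edges:
        weights.setdefault(u, {})[v] = w
        weights.setdefault(v, {})[u] = w
    return weights


def transition_probabilities(weights):
    """Row-normalize weights so each node's outgoing probabilities sum to 1."""
    return {
        u: {v: w / sum(nbrs.values()) for v, w in nbrs.items()}
        for u, nbrs in weights.items()
    }


def stationary_distribution(P, max_iter=10000, tol=1e-12):
    """Power iteration on the lazy walk (stay put with probability 1/2).

    The lazy walk has the same stationary distribution as P but cannot
    oscillate forever on bipartite graphs.
    """
    pi = {u: 1.0 / len(P) for u in P}
    for _ in range(max_iter):
        nxt = {u: 0.5 * pi[u] for u in P}
        for u, row in P.items():
            for v, p in row.items():
                nxt[v] += 0.5 * pi[u] * p
        if max(abs(nxt[u] - pi[u]) for u in P) < tol:
            return nxt
        pi = nxt
    return pi


def cross_network_flow(edges, cross_pairs):
    """Long-run frequency of transitions (both directions) across each cross pair."""
    P = transition_probabilities(build_weights(edges))
    pi = stationary_distribution(P)
    return {(u, v): pi[u] * P[u][v] + pi[v] * P[v][u] for u, v in cross_pairs}


# Hypothetical toy input: network A (a1, a2), network B (b1, b2), and two
# cross-network edges weighted by an assumed sequence-similarity score.
edges = [
    ("a1", "a2", 1.0),  # interaction within network A
    ("b1", "b2", 1.0),  # interaction within network B
    ("a1", "b1", 1.0),  # strong sequence similarity
    ("a2", "b2", 0.5),  # weaker sequence similarity
]
flow = cross_network_flow(edges, [("a1", "b1"), ("a2", "b2")])
print(flow)
```

For an undirected weighted graph like this toy, the flow across an edge equals its weight divided by the total edge weight, which makes the result easy to check by hand. How the published method constructs the merged network and converts flows into alignment scores may differ from this simplification.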
Dissecting Orthosteric Contacts for a Reverse-Fragment-Based Ligand Design.
Chandramohan, Arun; Tulsian, Nikhil K; Anand, Ganesh S
2017-08-01
Orthosteric sites on proteins are formed typically from noncontiguous interacting sites in three-dimensional space where the composite binding interaction of a biological ligand is mediated by multiple synergistic interactions of its constituent functional groups. Through these multiple interactions, ligands stabilize both the ligand binding site and the local secondary structure. However, relative energetic contributions of the individual contacts in these protein-ligand interactions are difficult to resolve. Deconvolution of the contributions of these various functional groups in natural inhibitors/ligand would greatly aid in iterative fragment-based drug discovery (FBDD). In this study, we describe an approach of progressive unfolding of a target protein using a gradient of denaturant urea to reveal the individual energetic contributions of various ligand-functional groups to the affinity of the entire ligand. Through calibrated unfolding of two protein-ligand systems: cAMP-bound regulatory subunit of Protein Kinase A (RIα) and IBMX-bound phosphodiesterase8 (PDE8), monitored by amide hydrogen-deuterium exchange mass spectrometry, we show progressive disruption of individual orthosteric contacts in the ligand binding sites, allowing us to rank the energetic contributions of these individual interactions. In the two cAMP-binding sites of RIα, exocyclic phosphate oxygens of cAMP were identified to mediate stronger interactions than ribose 2'-OH in both the RIα-cAMP binding interfaces. Further, we have also ranked the relative contributions of the different functional groups of IBMX based on their interactions with the orthosteric residues of PDE8. This strategy for deconstruction of individual binding sites and identification of the strongest functional group interaction in enzyme orthosteric sites offers a rational starting point for FBDD.
Recognition of functional sites in protein structures.
Shulman-Peleg, Alexandra; Nussinov, Ruth; Wolfson, Haim J
2004-06-04
Recognition of regions on the surface of one protein that are similar to a binding site of another is crucial for the prediction of molecular interactions and for functional classifications. We first describe a novel method, SiteEngine, that assumes no sequence or fold similarities and is able to recognize proteins that have similar binding sites and may perform similar functions. We achieve high efficiency and speed by introducing a low-resolution surface representation via chemically important surface points, by hashing triangles of physico-chemical properties and by application of hierarchical scoring schemes for a thorough exploration of global and local similarities. We proceed to rigorously apply this method to functional site recognition in three possible ways: first, we search a given functional site on a large set of complete protein structures. Second, a potential functional site on a protein of interest is compared with known binding sites, to recognize similar features. Third, a complete protein structure is searched for the presence of an a priori unknown functional site, similar to known sites. Our method is robust and efficient enough to allow computationally demanding applications such as the first and the third. From the biological standpoint, the first application may identify secondary binding sites of drugs that may lead to side-effects. The third application finds new potential sites on the protein that may provide targets for drug design. Each of the three applications may aid in assigning a function and in classification of binding patterns. We highlight the advantages and disadvantages of each type of search, provide examples of large-scale searches of the entire Protein Data Bank and make functional predictions.
You, Zhu-Hong; Li, Shuai; Gao, Xin; Luo, Xin; Ji, Zhen
2014-01-01
Protein-protein interactions are the basis of biological functions, and studying these interactions on a molecular level is of crucial importance for understanding the functionality of a living cell. During the past decade, biosensors have emerged as an important tool for the high-throughput identification of proteins and their interactions. However, the high-throughput experimental methods for identifying PPIs are both time-consuming and expensive. On the other hand, high-throughput PPI data are often associated with high false-positive and high false-negative rates. To address these problems, we propose a method for PPI detection by integrating biosensor-based PPI data with a novel computational model. This method was developed based on the extreme learning machine algorithm combined with a novel representation of protein sequence descriptor. When applied to the large-scale human protein interaction dataset, the proposed method achieved 84.8% prediction accuracy with 84.08% sensitivity at a specificity of 85.53%. We conducted more extensive experiments to compare the proposed method with a state-of-the-art technique, the support vector machine. The achieved results demonstrate that our approach is very promising for detecting new PPIs, and it can be a helpful supplement for biosensor-based PPI data detection.
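The three figures quoted in this abstract are related through the confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical, chosen only so that a balanced set of 10,000 positives and 10,000 negatives (an assumption, not stated in the abstract) reproduces the reported sensitivity and specificity.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of true interactions recovered
    specificity = tn / (tn + fp)          # fraction of non-interactions rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts on an assumed balanced test set (not the paper's data).
acc, sens, spec = classification_metrics(tp=8408, fn=1592, tn=8553, fp=1447)
```

On a balanced set, accuracy is the mean of sensitivity and specificity, which is consistent with the reported 84.8% lying midway between 84.08% and 85.53%.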
Serna, Naroa; Sánchez-García, Laura; Sánchez-Chardi, Alejandro; Unzueta, Ugutz; Roldán, Mónica; Mangues, Ramón; Vázquez, Esther; Villaverde, Antonio
2017-09-15
The emergence of bacterial antibiotic resistances is a serious concern in human and animal health. In this context, naturally occurring cationic antimicrobial peptides (AMPs) might play a main role in a next generation of drugs against bacterial infections. Taking an innovative approach to design self-organizing functional proteins, we have generated here protein-only nanoparticles with intrinsic AMP microbicide activity. Using a recombinant version of the GWH1 antimicrobial peptide as building block, these materials show a wide antibacterial activity spectrum in the absence of detectable toxicity on mammalian cells. The GWH1-based nanoparticles combine clinically appealing properties of nanoscale materials with full biocompatibility, structural and functional plasticity and biological efficacy exhibited by proteins. Because of the largely implemented biological fabrication of recombinant protein drugs, the protein-based platform presented here represents a novel and scalable strategy in antimicrobial drug design that, by solving some of the limitations of AMPs, offers a promising alternative to conventional antibiotics. The low molecular weight antimicrobial peptide GWH1 has been engineered to oligomerize as self-assembling protein-only nanoparticles of around 50 nm. In this form, the peptide exhibits potent and broad antibacterial activities against both Gram-positive and Gram-negative bacteria, without any harmful effect on mammalian cells. As a solid proof-of-concept, this finding strongly supports the design and biofabrication of nanoscale antimicrobial materials with in-built functionalities. The protein-based homogeneous composition offers advantages over alternative materials explored as antimicrobial agents, regarding biocompatibility, biodegradability and environmental suitability. Beyond the described prototype, this transversal engineering concept has wide applicability in the design of novel nanomedicines for advanced treatments of bacterial infections.
Copyright © 2017 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Otwell, Annie E.; Sherwood, Roberts; Zhang, Sheng
Metal reduction capability has been found in numerous species of environmentally abundant Gram-positive bacteria. However, understanding of microbial metal reduction is based almost solely on studies of Gram-negative organisms. In this study, we focus on Desulfotomaculum reducens MI-1, a Gram-positive metal reducer whose genome lacks genes with similarity to any characterized metal reductase. D. reducens has been shown to reduce not only Fe(III), but also the environmentally important contaminants U(VI) and Cr(VI). By extracting, separating, and analyzing the functional proteome of D. reducens, using a ferrozine-based assay in order to screen for chelated Fe(III)-NTA reduction with NADH as electron donor, we have identified proteins not previously characterized as iron reductases. Their function was confirmed by heterologous expression in E. coli. These are the protein NADH:flavin oxidoreductase (Dred_2421) and a protein complex composed of oxidoreductase FAD/NAD(P)-binding subunit (Dred_1685) and dihydroorotate dehydrogenase 1B (Dred_1686). Dred_2421 was identified in the soluble proteome and is predicted to be a cytoplasmic protein. Dred_1685 and Dred_1686 were identified in both the soluble as well as the insoluble (presumably membrane) protein fraction, suggesting a type of membrane-association, although PSORTb predicts both proteins are cytoplasmic. Furthermore, we show that these proteins have the capability to reduce soluble Cr(VI) and U(VI) with NADH as electron donor. This study is the first functional proteomic analysis of D. reducens, and one of the first analyses of metal and radionuclide reduction in an environmentally relevant Gram-positive bacterium.
Insulator function and topological domain border strength scale with architectural protein occupancy
2014-01-01
Background Chromosome conformation capture studies suggest that eukaryotic genomes are organized into structures called topologically associating domains. The borders of these domains are highly enriched for architectural proteins with characterized roles in insulator function. However, a majority of architectural protein binding sites localize within topological domains, suggesting sites associated with domain borders represent a functionally different subclass of these regulatory elements. How topologically associating domains are established and what differentiates border-associated from non-border architectural protein binding sites remain unanswered questions. Results By mapping the genome-wide target sites for several Drosophila architectural proteins, including previously uncharacterized profiles for TFIIIC and SMC-containing condensin complexes, we uncover an extensive pattern of colocalization in which architectural proteins establish dense clusters at the borders of topological domains. Reporter-based enhancer-blocking insulator activity as well as endogenous domain border strength scale with the occupancy level of architectural protein binding sites, suggesting co-binding by architectural proteins underlies the functional potential of these loci. Analyses in mouse and human stem cells suggest that clustering of architectural proteins is a general feature of genome organization, and conserved architectural protein binding sites may underlie the tissue-invariant nature of topologically associating domains observed in mammals. Conclusions We identify a spectrum of architectural protein occupancy that scales with the topological structure of chromosomes and the regulatory potential of these elements. Whereas high occupancy architectural protein binding sites associate with robust partitioning of topologically associating domains and robust insulator function, low occupancy sites appear reserved for gene-specific regulation within topological domains. 
PMID:24981874
DOE Office of Scientific and Technical Information (OSTI.GOV)
Syed, Aleem
Systematic spatial and temporal fluctuations are a fundamental part of any biological process. For example, lateral diffusion of membrane proteins is one of the key mechanisms in their cellular function. Lateral diffusion governs how membrane proteins interact with intracellular, transmembrane, and extracellular components to achieve their function. Herein, fluorescence-based techniques are used to elucidate the dynamics of receptor for advanced glycation end-products (RAGE) and integrin membrane proteins. RAGE is a transmembrane protein that is being used as a biomarker for various diseases. RAGE dependent signaling in numerous pathological conditions is well studied. However, RAGE lateral diffusion in the cell membrane is poorly understood. For this purpose, effect of cholesterol, cytoskeleton dynamics, and presence of ligand on RAGE lateral diffusion is investigated.
Collision induced unfolding of isolated proteins in the gas phase: past, present, and future.
Dixit, Sugyan M; Polasky, Daniel A; Ruotolo, Brandon T
2018-02-01
Rapidly characterizing the three-dimensional structures of proteins and the multimeric machines they form remains one of the great challenges facing modern biological and medical sciences. Ion mobility-mass spectrometry based techniques are playing an expanding role in characterizing these functional complexes, especially in drug discovery and development workflows. Despite this expansion, ion mobility-mass spectrometry faces many challenges, especially in the context of detecting small differences in protein tertiary structure that bear functional consequences. Collision induced unfolding is an ion mobility-mass spectrometry method that enables the rapid differentiation of subtly-different protein isoforms based on their unfolding patterns and stabilities. In this review, we summarize the modern implementation of such gas-phase unfolding experiments and provide an overview of recent developments in both methods and applications. Copyright © 2017 Elsevier Ltd. All rights reserved.
Solubility of glucose isomerase in ammonium sulphate solutions
NASA Astrophysics Data System (ADS)
Chayen, N.; Akins, J.; Campbell-Smith, S.; Blow, D. M.
1988-07-01
In order to quantify protein crystallization techniques, a method for measuring protein solubility in high salt concentration has been developed. It is based on a sensitive protein concentration assay, using binding to Coomassie blue dye. The protein concentration in a supernatant from which glucose isomerase is crystallising has been studied as a function of time. Equilibrium is established in 3-5 weeks, and the protein concentration remaining in solution is defined as the solubility of the protein. The solubility of glucose isomerase has been determined as a function of ammonium sulphate concentration; its variation with pH in 1.50M ammonium sulphate has also been studied. A remarkable dependence on pH over the range of 5.5 to 6.5 has been observed.
Proteome-wide Subcellular Topologies of E. coli Polypeptides Database (STEPdb)
Orfanoudaki, Georgia; Economou, Anastassios
2014-01-01
Cell compartmentalization serves both the isolation and the specialization of cell functions. After synthesis in the cytoplasm, over a third of all proteins are targeted to other subcellular compartments. Knowing how proteins are distributed within the cell and how they interact is a prerequisite for understanding it as a whole. Surface and secreted proteins are important pathogenicity determinants. Here we present the STEP database (STEPdb) that contains a comprehensive characterization of subcellular localization and topology of the complete proteome of Escherichia coli. Two widely used E. coli proteomes (K-12 and BL21) are presented organized into thirteen subcellular classes. STEPdb exploits the wealth of genetic, proteomic, biochemical, and functional information on protein localization, secretion, and targeting in E. coli, one of the best understood model organisms. Subcellular annotations were derived from a combination of bioinformatics prediction, proteomic, biochemical, functional, topological data and extensive literature re-examination that were refined through manual curation. Strong experimental support for the location of 1553 out of 4303 proteins was based on 426 articles and some experimental indications for another 526. Annotations were provided for another 320 proteins based on firm bioinformatic predictions. STEPdb is the first database that contains an extensive set of peripheral IM proteins (PIM proteins) and includes their graphical visualization into complexes, cellular functions, and interactions. It also summarizes all currently known protein export machineries of E. coli K-12 and pairs them, where available, with the secretory proteins that use them. It catalogs the Sec- and TAT-utilizing secretomes and summarizes their topological features such as signal peptides and transmembrane regions, transmembrane topologies and orientations. 
It also catalogs physicochemical and structural features that influence topology such as abundance, solubility, disorder, heat resistance, and structural domain families. Finally, STEPdb incorporates prediction tools for topology (TMHMM, SignalP, and Phobius) and disorder (IUPred) and implements the BLAST2STEP that performs protein homology searches against the STEPdb. PMID:25210196
Production of membrane proteins without cells or detergents.
Rajesh, Sundaresan; Knowles, Timothy; Overduin, Michael
2011-04-30
The production of membrane proteins in cellular systems is besieged by several problems due to their hydrophobic nature which often causes misfolding, protein aggregation and cytotoxicity, resulting in poor yields of stable proteins. Cell-free expression has emerged as one of the most versatile alternatives for circumventing these obstacles by producing membrane proteins directly into designed hydrophobic environments. Efficient optimisation of expression and solubilisation conditions using a variety of detergents, membrane mimetics and lipids has yielded structurally and functionally intact membrane proteins, with yields several fold above the levels possible from cell-based systems. Here we review recently developed techniques available to produce functional membrane proteins, and discuss amphipols, nanodisc and styrene maleic acid lipid particle (SMALP) technologies that can be exploited alongside cell-free expression of membrane proteins. Copyright © 2010 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Barak, Miri; Hussein-Farraj, Rania
2013-01-01
This paper describes a study conducted in the context of chemistry education reforms in Israel. The study examined a new biochemistry learning unit that was developed to promote in-depth understanding of 3D structures and functions of proteins and nucleic acids. Our goal was to examine whether, and to what extent teaching and learning via…
ERIC Educational Resources Information Center
Owen, Rebecca L.; Breyer, Emelita D.
2005-01-01
The Molecular Genetics and Protein Structure and Function workshop is one of a series of workshops offered by the National Science Foundation-funded Center for Workshops in the Chemical Sciences. The workshop provides a hands-on introduction to current topics and techniques in molecular genetics and protein structure/function as applied to…
Fourier-based classification of protein secondary structures.
Shu, Jian-Jun; Yong, Kian Yan
2017-04-15
The correct prediction of protein secondary structures is one of the key issues in predicting the correct protein folded shape, which is used for determining gene function. Existing methods make use of amino acid properties as indices to classify protein secondary structures, but are faced with a significant number of misclassifications. The paper presents a technique for the classification of protein secondary structures based on protein "signal-plotting" and the use of the Fourier technique for digital signal processing. New indices are proposed to classify protein secondary structures by analyzing hydrophobicity profiles. The approach is simple and straightforward. Results show that more types of protein secondary structures can be classified by means of these newly-proposed indices. Copyright © 2017 Elsevier Inc. All rights reserved.
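As a rough illustration of treating a hydrophobicity profile as a signal, the sketch below maps a sequence onto the Kyte-Doolittle hydropathy scale and finds the dominant period of its discrete Fourier power spectrum. This does not reproduce the paper's indices, which the abstract does not specify; the choice of scale and the function names are assumptions.

```python
import cmath

# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobicity_profile(seq):
    """Convert a one-letter amino acid sequence into a numeric signal."""
    return [KD[aa] for aa in seq.upper()]


def power_spectrum(signal):
    """Return [(frequency, power)] for k = 1 .. n//2 of the mean-centered signal."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(centered))
        spectrum.append((k / n, abs(coeff) ** 2))
    return spectrum


def dominant_period(seq):
    """Period (in residues) of the strongest component in the hydropathy signal."""
    freq, _ = max(power_spectrum(hydrophobicity_profile(seq)), key=lambda fp: fp[1])
    return 1.0 / freq


# A strictly alternating hydrophobic/polar toy sequence has period 2.
period = dominant_period("VK" * 8)
```

An alternating pattern with a period near 2 residues is commonly associated with amphipathic β-strands, and a period near 3.6 residues with amphipathic α-helices, which is why the spectral peak can serve as a classification feature.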
A protein domain-based interactome network for C. elegans early embryogenesis
Boxem, Mike; Maliga, Zoltan; Klitgord, Niels; Li, Na; Lemmens, Irma; Mana, Miyeko; de Lichtervelde, Lorenzo; Mul, Joram D.; van de Peut, Diederik; Devos, Maxime; Simonis, Nicolas; Yildirim, Muhammed A.; Cokol, Murat; Kao, Huey-Ling; de Smet, Anne-Sophie; Wang, Haidong; Schlaitz, Anne-Lore; Hao, Tong; Milstein, Stuart; Fan, Changyu; Tipsword, Mike; Drew, Kevin; Galli, Matilde; Rhrissorrakrai, Kahn; Drechsel, David; Koller, Daphne; Roth, Frederick P.; Iakoucheva, Lilia M.; Dunker, A. Keith; Bonneau, Richard; Gunsalus, Kristin C.; Hill, David E.; Piano, Fabio; Tavernier, Jan; van den Heuvel, Sander; Hyman, Anthony A.; Vidal, Marc
2008-01-01
Summary Many protein-protein interactions are mediated through independently folding modular domains. Proteome-wide efforts to model protein-protein interaction or “interactome” networks have largely ignored this modular organization of proteins. We developed an experimental strategy to efficiently identify interaction domains and generated a domain-based interactome network for proteins involved in C. elegans early embryonic cell divisions. Minimal interacting regions were identified for over 200 proteins, providing important information on their domain organization. Furthermore, our approach increased the sensitivity of the two-hybrid system, resulting in a more complete interactome network. This interactome modeling strategy revealed new insights into C. elegans centrosome function and is applicable to other biological processes in this and other organisms. PMID:18692475
Activity-Based Protein Profiling of Microbes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sadler, Natalie C.; Wright, Aaron T.
Activity-Based Protein Profiling (ABPP) in conjunction with multimodal characterization techniques has yielded impactful findings in microbiology, particularly in pathogen, bioenergy, drug discovery, and environmental research. Using small molecule chemical probes that react irreversibly with specific proteins or protein families in complex systems has provided insights into enzyme functions in central metabolic pathways, drug-protein interactions, and regulatory protein redox, for systems ranging from photoautotrophic cyanobacteria to mycobacteria, and combining live cell or cell extract ABPP with proteomics, molecular biology, modeling, and other techniques has greatly expanded our understanding of these systems. New opportunities for application of ABPP to microbial systems include: enhancing protein annotation, characterizing protein activities in myriad environments, and revealing signal transduction and regulatory mechanisms in microbial systems.
del Sol, Antonio; Araúzo-Bravo, Marcos J; Amoros, Dolors; Nussinov, Ruth
2007-01-01
Background Allosteric communications are vital for cellular signaling. Here we explore a relationship between protein architectural organization and shortcuts in signaling pathways. Results We show that protein domains consist of modules interconnected by residues that mediate signaling through the shortest pathways. These mediating residues tend to be located at the inter-modular boundaries, which are more rigid and display a larger number of long-range interactions than intra-modular regions. The inter-modular boundaries contain most of the residues centrally conserved in the protein fold, which may be crucial for information transfer between amino acids. Our approach to modular decomposition relies on a representation of protein structures as residue-interacting networks, and removal of the most central residue contacts, which are assumed to be crucial for allosteric communications. The modular decomposition of 100 multi-domain protein structures indicates that modules constitute the building blocks of domains. The analysis of 13 allosteric proteins revealed that modules characterize experimentally identified functional regions. Based on the study of an additional functionally annotated dataset of 115 proteins, we propose that high-modularity modules include functional sites and are the basic functional units. We provide examples (the Gαs subunit and P450 cytochromes) to illustrate that the modular architecture of active sites is linked to their functional specialization. Conclusion Our method decomposes protein structures into modules, allowing the study of signal transmission between functional sites. A modular configuration might be advantageous: it allows signaling proteins to expand their regulatory linkages and may elicit a broader range of control mechanisms either via modular combinations or through modulation of inter-modular linkages. PMID:17531094
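The decomposition step described in this abstract (represent the structure as a residue-interaction network, then remove the most central contacts) can be sketched as follows. The abstract does not name the centrality measure or the stopping rule, so this sketch assumes edge betweenness, in the style of the Girvan-Newman procedure, and performs a single removal on a hypothetical toy network.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an undirected, unweighted graph.

    adj: dict mapping node -> set of neighbor nodes.
    """
    score = defaultdict(float)
    for s in adj:
        dist = {s: 0}
        sigma = defaultdict(float)      # number of shortest paths from s
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):       # accumulate dependencies back toward s
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    # Every unordered pair of endpoints was counted from both sides.
    return {edge: value / 2.0 for edge, value in score.items()}


def components(adj):
    """Connected components of the graph as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def split_once(adj):
    """Remove the single most central contact and return the resulting modules."""
    scores = edge_betweenness(adj)
    a, b = max(scores, key=scores.get)
    pruned = {v: set(nbrs) for v, nbrs in adj.items()}
    pruned[a].discard(b)
    pruned[b].discard(a)
    return components(pruned)


# Hypothetical toy network: two tightly packed clusters joined by one contact.
contacts = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
modules = split_once(contacts)
```

In the toy network the contact between residues 3 and 4 carries every shortest path between the two clusters, so it has the highest betweenness and its removal yields two modules. A full decomposition would repeat the removal until some quality criterion (for example modularity) is satisfied.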
Ravikumar, Ke; Liu, Haibin; Cohn, Judith D; Wall, Michael E; Verspoor, Karin
2012-10-05
We propose a method for automatic extraction of protein-specific residue mentions from the biomedical literature. The method searches text for mentions of amino acids at specific sequence positions and attempts to correctly associate each mention with a protein also named in the text. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method made use of linguistic patterns for identifying the amino acid residue mentions in text. Further, we applied an automated graph-based method to learn syntactic patterns corresponding to protein-residue pairs mentioned in the text. We finally present an approach to automated construction of relevant training and test data using the distant supervision model. The performance of the method was assessed by extracting protein-residue relations from a new automatically generated test set of sentences containing high confidence examples found using distant supervision. It achieved a F-measure of 0.84 on automatically created silver corpus and 0.79 on a manually annotated gold data set for this task, outperforming previous methods. The primary contributions of this work are to (1) demonstrate the effectiveness of distant supervision for automatic creation of training data for protein-residue relation extraction, substantially reducing the effort and time involved in manual annotation of a data set and (2) show that the graph-based relation extraction approach we used generalizes well to the problem of protein-residue association extraction. This work paves the way towards effective extraction of protein functional residues from the literature.
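A linguistic pattern for residue mentions of the kind this abstract describes might look like the sketch below. The regular expression and the function name are illustrative assumptions, not the authors' pattern set; the sketch covers only three-letter codes and full residue names followed by a sequence position, and it does not attempt the protein-association step.

```python
import re

FULL_TO_THREE = {
    "alanine": "Ala", "arginine": "Arg", "asparagine": "Asn", "aspartate": "Asp",
    "cysteine": "Cys", "glutamine": "Gln", "glutamate": "Glu", "glycine": "Gly",
    "histidine": "His", "isoleucine": "Ile", "leucine": "Leu", "lysine": "Lys",
    "methionine": "Met", "phenylalanine": "Phe", "proline": "Pro", "serine": "Ser",
    "threonine": "Thr", "tryptophan": "Trp", "tyrosine": "Tyr", "valine": "Val",
}

# Full names first, then three-letter codes; an optional hyphen or space
# separates the residue name from its position.
_NAMES = "|".join(list(FULL_TO_THREE) + sorted(set(FULL_TO_THREE.values())))
RESIDUE_MENTION = re.compile(rf"\b({_NAMES})[- ]?(\d+)\b", re.IGNORECASE)


def extract_residue_mentions(text):
    """Return a list of (three_letter_code, position) found in the text."""
    mentions = []
    for name, position in RESIDUE_MENTION.findall(text):
        code = FULL_TO_THREE.get(name.lower(), name.capitalize())
        mentions.append((code, int(position)))
    return mentions


sentence = ("Mutation of Ser-123 and histidine 57 abolished binding, "
            "whereas Gly45 was tolerated.")
found = extract_residue_mentions(sentence)
```

Normalizing every mention to a (code, position) pair makes it straightforward to compare extracted mentions against annotated ones when computing precision, recall, and the F-measure.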
Protein mechanics: from single molecules to functional biomaterials.
Li, Hongbin; Cao, Yi
2010-10-19
Elastomeric proteins act as the essential functional units in a wide variety of biomechanical machinery and serve as the basic building blocks for biological materials that exhibit superb mechanical properties. These proteins provide the desired elasticity, mechanical strength, resilience, and toughness within these materials. Understanding the mechanical properties of elastomeric protein-based biomaterials is a multiscale problem spanning from the atomistic/molecular level to the macroscopic level. Uncovering the design principles of individual elastomeric building blocks is critical both for the scientific understanding of multiscale mechanics of biomaterials and for the rational engineering of novel biomaterials with desirable mechanical properties. The development of single-molecule force spectroscopy techniques has provided methods for characterizing mechanical properties of elastomeric proteins one molecule at a time. Single-molecule atomic force microscopy (AFM) is uniquely suited to this purpose. Molecular dynamic simulations, protein engineering techniques, and single-molecule AFM study have collectively revealed tremendous insights into the molecular design of single elastomeric proteins, which can guide the design and engineering of elastomeric proteins with tailored mechanical properties. Researchers are focusing experimental efforts toward engineering artificial elastomeric proteins with mechanical properties that mimic or even surpass those of natural elastomeric proteins. In this Account, we summarize our recent experimental efforts to engineer novel artificial elastomeric proteins and develop general and rational methodologies to tune the nanomechanical properties of elastomeric proteins at the single-molecule level. We focus on general design principles used for enhancing the mechanical stability of proteins. 
These principles include the development of metal-chelation-based general methodology, strategies to control the unfolding hierarchy of multidomain elastomeric proteins, and the design of novel elastomeric proteins that exhibit stimuli-responsive mechanical properties. Moving forward, we are now exploring the use of these artificial elastomeric proteins as building blocks of protein-based biomaterials. Ultimately, we would like to rationally tailor mechanical properties of elastomeric protein-based materials by programming the molecular sequence, and thus nanomechanical properties, of elastomeric proteins at the single-molecule level. This step would help bridge the gap between single protein mechanics and material biomechanics, revealing how the mechanical properties of individual elastomeric proteins are translated into the properties of macroscopic materials.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sadler, Natalie C.; Angel, Thomas E.; Lewis, Michael P.
High-fat diet (HFD) induced obesity and concomitant development of insulin resistance (IR) and type 2 diabetes mellitus have been linked to mitochondrial dysfunction. However, it is not clear whether mitochondrial dysfunction is a direct effect of a HFD or if the mitochondrial function is reduced with increased HFD duration. We hypothesized that the function of mitochondrial oxidative and lipid metabolism functions in skeletal muscle mitochondria for HFD mice are similar or elevated relative to standard diet (SD) mice, thereby IR is neither cause nor consequence of mitochondrial dysfunction. We applied a chemical probe approach to identify functionally reactive ATPases and nucleotide-binding proteins in mitochondria isolated from skeletal muscle of C57Bl/6J mice fed HFD or SD chow for 2-, 8-, or 16-weeks; feeding time points known to induce IR. A total of 293 probe-labeled proteins were identified by mass spectrometry-based proteomics, of which 54 differed in abundance between HFD and SD mice. We found proteins associated with the TCA cycle, oxidative phosphorylation (OXPHOS), and lipid metabolism were altered in function when comparing SD to HFD fed mice at 2-weeks, however by 16-weeks HFD mice had TCA cycle, β-oxidation, and respiratory chain function at levels similar to or higher than SD mice.
The mechanism of protein export enhancement by the SecDF membrane component
Tsukazaki, Tomoya; Nureki, Osamu
2011-01-01
Protein transport across membranes is a fundamental and essential cellular activity in all organisms. In bacteria, protein export across the cytoplasmic membrane, driven by dynamic interplays between the protein-conducting SecYEG channel (Sec translocon) and the SecA ATPase, is enhanced by the proton motive force (PMF) and a membrane-integrated Sec component, SecDF. However, the structure and function of SecDF have remained unclear. We solved the first crystal structure of SecDF, consisting of a pseudo-symmetrical 12-helix transmembrane domain and two protruding periplasmic domains. Based on the structural features, we proposed that SecDF functions as a membrane-integrated chaperone, which drives protein movement without using the major energetic currency, ATP, but with remarkable cycles of conformational changes, powered by the proton gradient across the membrane. By a series of biochemical and biophysical approaches, several functionally important residues in the transmembrane region have been identified and our model of the SecDF function has been verified. PMID:27857601
PROFESS: a PROtein Function, Evolution, Structure and Sequence database
Triplet, Thomas; Shortridge, Matthew D.; Griep, Mark A.; Stark, Jaime L.; Powers, Robert; Revesz, Peter
2010-01-01
The proliferation of biological databases and the easy access enabled by the Internet is having a beneficial impact on biological sciences and transforming the way research is conducted. There are ∼1100 molecular biology databases dispersed throughout the Internet. To assist in the functional, structural and evolutionary analysis of the abundant number of novel proteins continually identified from whole-genome sequencing, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database. Our database is designed to be versatile and expandable and will not confine analysis to a pre-existing set of data relationships. A fundamental component of this approach is the development of an intuitive query system that incorporates a variety of similarity functions capable of generating data relationships not conceived during the creation of the database. The utility of PROFESS is demonstrated by the analysis of the structural drift of homologous proteins and the identification of potential pancreatic cancer therapeutic targets based on the observation of protein–protein interaction networks. Database URL: http://cse.unl.edu/∼profess/ PMID:20624718
Comparative proteomic analysis of outer membrane protein 43 (omp43)-deficient Bartonella henselae.
Kang, Jun-Gu; Lee, Hee-Woo; Ko, Sungjin; Chae, Joon-Seok
2018-01-31
Outer membrane proteins (OMPs) of Gram-negative bacteria constitute the first line of defense protecting cells against environmental stresses including chemical, biophysical, and biological attacks. Although the 43-kDa OMP (OMP43) is a major porin protein among Bartonella henselae-derived OMPs, its function remains unreported. In this study, OMP43-deficient mutant B. henselae (Δomp43) was generated to investigate OMP43 function. Interestingly, Δomp43 exhibited weaker proliferative ability than that of wild-type (WT) B. henselae. To study the differences in proteomic expression between WT and Δomp43, two-dimensional gel electrophoresis-based proteomic analysis was performed. Based on Clusters of Orthologous Groups functional assignments, 12 proteins were associated with metabolism, 7 proteins associated with information storage and processing, and 3 proteins associated with cellular processing and signaling. By semi-quantitative reverse transcriptase polymerase chain reaction, increases in tldD, efp, ntrX, pdhA, purB, and ATPA mRNA expression and decreases in Rho and yfeA mRNA expression were confirmed in Δomp43. In conclusion, this is the first report showing that a loss of OMP43 expression in B. henselae leads to retarded proliferation. Furthermore, our proteomic data provide useful information for the further investigation of mechanisms related to the growth of B. henselae.
Phytochemicals perturb membranes and promiscuously alter protein function.
Ingólfsson, Helgi I; Thakur, Pratima; Herold, Karl F; Hobart, E Ashley; Ramsey, Nicole B; Periole, Xavier; de Jong, Djurre H; Zwama, Martijn; Yilmaz, Duygu; Hall, Katherine; Maretzky, Thorsten; Hemmings, Hugh C; Blobel, Carl; Marrink, Siewert J; Koçer, Armağan; Sack, Jon T; Andersen, Olaf S
2014-08-15
A wide variety of phytochemicals are consumed for their perceived health benefits. Many of these phytochemicals have been found to alter numerous cell functions, but the mechanisms underlying their biological activity tend to be poorly understood. Phenolic phytochemicals are particularly promiscuous modifiers of membrane protein function, suggesting that some of their actions may be due to a common, membrane bilayer-mediated mechanism. To test whether bilayer perturbation may underlie this diversity of actions, we examined five bioactive phenols reported to have medicinal value: capsaicin from chili peppers, curcumin from turmeric, EGCG from green tea, genistein from soybeans, and resveratrol from grapes. We find that each of these widely consumed phytochemicals alters lipid bilayer properties and the function of diverse membrane proteins. Molecular dynamics simulations show that these phytochemicals modify bilayer properties by localizing to the bilayer/solution interface. Bilayer-modifying propensity was verified using a gramicidin-based assay, and indiscriminate modulation of membrane protein function was demonstrated using four proteins: membrane-anchored metalloproteases, mechanosensitive ion channels, and voltage-dependent potassium and sodium channels. Each protein exhibited similar responses to multiple phytochemicals, consistent with a common, bilayer-mediated mechanism. Our results suggest that many effects of amphiphilic phytochemicals are due to cell membrane perturbations, rather than specific protein binding.
Phytochemicals Perturb Membranes and Promiscuously Alter Protein Function
2015-01-01
A wide variety of phytochemicals are consumed for their perceived health benefits. Many of these phytochemicals have been found to alter numerous cell functions, but the mechanisms underlying their biological activity tend to be poorly understood. Phenolic phytochemicals are particularly promiscuous modifiers of membrane protein function, suggesting that some of their actions may be due to a common, membrane bilayer-mediated mechanism. To test whether bilayer perturbation may underlie this diversity of actions, we examined five bioactive phenols reported to have medicinal value: capsaicin from chili peppers, curcumin from turmeric, EGCG from green tea, genistein from soybeans, and resveratrol from grapes. We find that each of these widely consumed phytochemicals alters lipid bilayer properties and the function of diverse membrane proteins. Molecular dynamics simulations show that these phytochemicals modify bilayer properties by localizing to the bilayer/solution interface. Bilayer-modifying propensity was verified using a gramicidin-based assay, and indiscriminate modulation of membrane protein function was demonstrated using four proteins: membrane-anchored metalloproteases, mechanosensitive ion channels, and voltage-dependent potassium and sodium channels. Each protein exhibited similar responses to multiple phytochemicals, consistent with a common, bilayer-mediated mechanism. Our results suggest that many effects of amphiphilic phytochemicals are due to cell membrane perturbations, rather than specific protein binding. PMID:24901212
Identifying functionally informative evolutionary sequence profiles.
Gil, Nelson; Fiser, Andras
2018-04-15
Multiple sequence alignments (MSAs) can provide essential input to many bioinformatics applications, including protein structure prediction and functional annotation. However, the optimal selection of sequences to obtain biologically informative MSAs for such purposes is poorly explored, and has traditionally been performed manually. We present Selection of Alignment by Maximal Mutual Information (SAMMI), an automated, sequence-based approach to objectively select an optimal MSA from a large set of alternatives sampled from a general sequence database search. The hypothesis of this approach is that the mutual information among MSA columns will be maximal for those MSAs that contain the most diverse set possible of the most structurally and functionally homogeneous protein sequences. SAMMI was tested to select MSAs for functional site residue prediction by analysis of conservation patterns on a set of 435 proteins obtained from protein-ligand (peptides, nucleic acids and small substrates) and protein-protein interaction databases. Availability and implementation: A freely accessible program, including source code, implementing SAMMI is available at https://github.com/nelsongil92/SAMMI.git. Contact: andras.fiser@einstein.yu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
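To make the quantity in this abstract concrete, the sketch below computes the Shannon mutual information between two alignment columns and averages it over all column pairs of a toy MSA. It is only an illustration: the abstract does not give SAMMI's exact formulation or selection procedure, so the function names (`column_mutual_information`, `mean_pairwise_mi`), the averaging over column pairs, and the treatment of gap characters as ordinary symbols are all assumptions made here.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Shannon mutual information (in bits) between two alignment columns.

    Each column is a sequence of residue symbols, one per aligned sequence.
    Assumption: gaps are treated like any other symbol.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of rows")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b) written in terms of raw counts
        mi += p_ab * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def mean_pairwise_mi(msa):
    """Average mutual information over all pairs of columns in an MSA.

    `msa` is a list of equally long aligned sequences (strings).
    """
    columns = list(zip(*msa))
    pairs = [(i, j) for i in range(len(columns)) for j in range(i + 1, len(columns))]
    if not pairs:
        return 0.0
    total = sum(column_mutual_information(columns[i], columns[j]) for i, j in pairs)
    return total / len(pairs)
```

A selection step in this spirit would score each candidate alignment with `mean_pairwise_mi` and keep the highest-scoring one; whether SAMMI aggregates column pairs this way is not stated in the abstract.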
Real-Time Ligand Binding Pocket Database Search Using Local Surface Descriptors
Chikhi, Rayan; Sael, Lee; Kihara, Daisuke
2010-01-01
Due to the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two-dimensional pseudo-Zernike moments or the 3D Zernike descriptors. These compact representations allow a fast real-time pocket searching against a database. Thorough benchmark studies employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed. PMID:20455259
Real-time ligand binding pocket database search using local surface descriptors.
Chikhi, Rayan; Sael, Lee; Kihara, Daisuke
2010-07-01
Because of the increasing number of structures of unknown function accumulated by ongoing structural genomics projects, there is an urgent need for computational methods for characterizing protein tertiary structures. As functions of many of these proteins are not easily predicted by conventional sequence database searches, a legitimate strategy is to utilize structure information in function characterization. Of particular interest is prediction of ligand binding to a protein, as ligand molecule recognition is a major part of molecular function of proteins. Predicting whether a ligand molecule binds a protein is a complex problem due to the physical nature of protein-ligand interactions and the flexibility of both binding sites and ligand molecules. However, geometric and physicochemical complementarity is observed between the ligand and its binding site in many cases. Therefore, ligand molecules which bind to a local surface site in a protein can be predicted by finding similar local pockets of known binding ligands in the structure database. Here, we present two representations of ligand binding pockets and utilize them for ligand binding prediction by pocket shape comparison. These representations are based on mapping of surface properties of binding pockets, which are compactly described either by the two-dimensional pseudo-Zernike moments or the three-dimensional Zernike descriptors. These compact representations allow a fast real-time pocket searching against a database. Thorough benchmark studies employing two different datasets show that our representations are competitive with the other existing methods. Limitations and potentials of the shape-based methods as well as possible improvements are discussed.
Modelling and enhanced molecular dynamics to steer structure-based drug discovery.
Kalyaanamoorthy, Subha; Chen, Yi-Ping Phoebe
2014-05-01
The ever-increasing gap between the availability of genome sequences and that of protein crystal structures remains one of the significant challenges to modern drug discovery efforts. Knowledge of the structure, dynamics and function of proteins is important for understanding several key aspects of structure-based drug discovery, such as drug-protein interactions, drug binding and unbinding mechanisms and protein-protein interactions. This review presents a brief overview of the different state-of-the-art computational approaches that are applied for protein structure modelling and molecular dynamics simulations of biological systems. We outline how different enhanced sampling molecular dynamics approaches, together with regular molecular dynamics methods, assist in steering structure-based drug discovery processes. Copyright © 2013 Elsevier Ltd. All rights reserved.
Towards quantitative classification of folded proteins in terms of elementary functions.
Hu, Shuangwei; Krokhotin, Andrei; Niemi, Antti J; Peng, Xubiao
2011-04-01
A comparative classification scheme provides a good basis for several approaches to understand proteins, including prediction of relations between their structure and biological function. But it remains a challenge to combine a classification scheme that describes a protein starting from its well-organized secondary structures and often involves direct human involvement, with an atomary-level physics-based approach where a protein is fundamentally nothing more than an ensemble of mutually interacting carbon, hydrogen, oxygen, and nitrogen atoms. In order to bridge these two complementary approaches to proteins, conceptually novel tools need to be introduced. Here we explain how an approach toward geometric characterization of entire folded proteins can be based on a single explicit elementary function that is familiar from nonlinear physical systems where it is known as the kink soliton. Our approach enables the conversion of hierarchical structural information into a quantitative form that allows for a folded protein to be characterized in terms of a small number of global parameters that are in principle computable from atomary-level considerations. As an example we describe in detail how the native fold of the myoglobin 1M6C emerges from a combination of kink solitons with a very high atomary-level accuracy. We also verify that our approach describes longer loops and loops connecting α helices with β strands, with the same overall accuracy. ©2011 American Physical Society
Rattei, Thomas; Tischler, Patrick; Götz, Stefan; Jehl, Marc-André; Hoser, Jonathan; Arnold, Roland; Conesa, Ana; Mewes, Hans-Werner
2010-01-01
The prediction of protein function as well as the reconstruction of evolutionary genesis employing sequence comparison at large is still the most powerful tool in sequence analysis. Due to the exponential growth of the number of known protein sequences and the subsequent quadratic growth of the similarity matrix, the computation of the Similarity Matrix of Proteins (SIMAP) becomes a computationally intensive task. The SIMAP database provides a comprehensive and up-to-date pre-calculation of the protein sequence similarity matrix, sequence-based features and sequence clusters. As of September 2009, SIMAP covers 48 million proteins and more than 23 million non-redundant sequences. Novel features of SIMAP include the expansion of the sequence space by including databases such as ENSEMBL as well as the integration of metagenomes based on their consistent processing and annotation. Furthermore, protein function predictions by Blast2GO are pre-calculated for all sequences in SIMAP and the data access and query functions have been improved. SIMAP assists biologists to query the up-to-date sequence space systematically and facilitates large-scale downstream projects in computational biology. Access to SIMAP is freely provided through the web portal for individuals (http://mips.gsf.de/simap/) and for programmatic access through DAS (http://webclu.bio.wzw.tum.de/das/) and Web-Service (http://mips.gsf.de/webservices/services/SimapService2.0?wsdl).
Semi-supervised protein subcellular localization.
Xu, Qian; Hu, Derek Hao; Xue, Hong; Yu, Weichuan; Yang, Qiang
2009-01-30
Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational methods. The location information can indicate key functionalities of proteins. Accurate predictions of the subcellular localization of proteins can aid the prediction of protein function and genome annotation, as well as the identification of drug targets. Computational methods based on machine learning, such as support vector machine approaches, have already been widely used in the prediction of protein subcellular localization. However, a major drawback of these machine learning-based approaches is that a large amount of data must be labeled in order for the prediction system to learn a classifier with good generalization ability. In real-world cases, it is laborious, expensive and time-consuming to experimentally determine the subcellular localization of a protein and prepare instances of labeled data. In this paper, we present an approach based on a new learning framework, semi-supervised learning, which can use far fewer labeled instances to construct a high-quality prediction model. We first construct an initial classifier using a small set of labeled examples, and then use unlabeled instances to refine the classifier for future predictions. Experimental results show that our methods can effectively reduce the workload for labeling data by using the unlabeled data. Our method is shown to enhance the state-of-the-art prediction results of SVM classifiers by more than 10%.
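The two-stage idea described here (fit on a few labeled examples, then refine with unlabeled ones) can be illustrated with generic self-training. This is a minimal sketch under stated assumptions, not the authors' method: the abstract does not specify their algorithm, and a nearest-centroid classifier stands in for the SVM they mention. All names (`fit_centroids`, `predict`, `self_train`, `min_margin`) and the toy feature vectors are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_centroids(labeled):
    """Fit a nearest-centroid classifier from (features, label) pairs."""
    by_label = {}
    for features, label in labeled:
        by_label.setdefault(label, []).append(features)
    return {
        label: [sum(p[d] for p in pts) / len(pts) for d in range(len(pts[0]))]
        for label, pts in by_label.items()
    }


def predict(centroids, features):
    """Return (label, margin): margin is the distance gap to the runner-up class."""
    ranked = sorted((sq_dist(features, c), label) for label, c in centroids.items())
    best_dist, best_label = ranked[0]
    margin = ranked[1][0] - best_dist if len(ranked) > 1 else float("inf")
    return best_label, margin


def self_train(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Self-training: repeatedly pseudo-label confident unlabeled points and refit."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labeled)
        confident, remaining = [], []
        for features in pool:
            label, margin = predict(centroids, features)
            if margin >= min_margin:
                confident.append((features, label))
            else:
                remaining.append(features)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = remaining
    return fit_centroids(labeled)
```

The confidence threshold is the key design choice: set it too low and early mistakes are reinforced in later rounds, set it too high and the unlabeled pool is never used.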
Application of activity-based protein profiling to study enzyme function in adipocytes.
Galmozzi, Andrea; Dominguez, Eduardo; Cravatt, Benjamin F; Saez, Enrique
2014-01-01
Activity-based protein profiling (ABPP) is a chemical proteomics approach that utilizes small-molecule probes to determine the functional state of enzymes directly in native systems. ABPP probes selectively label active enzymes, but not their inactive forms, facilitating the characterization of changes in enzyme activity that occur without alterations in protein levels. ABPP can be a tool superior to conventional gene expression and proteomic profiling methods to discover new enzymes active in adipocytes and to detect differences in the activity of characterized enzymes that may be associated with disorders of adipose tissue function. ABPP probes have been developed that react selectively with most members of specific enzyme classes. Here, using as an example the serine hydrolase family that includes many enzymes with critical roles in adipocyte physiology, we describe methods to apply ABPP analysis to the study of adipocyte enzymatic pathways. © 2014 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Cai, Bingna; Pan, Jianyu; Wu, Yuantao; Wan, Peng; Sun, Huili
2013-07-01
Oyster peptides were produced from Crassostrea hongkongensis and used as a new protein source for the preparation of an oyster peptide-based enteral nutrition formula (OPENF). Reserpine-induced malabsorption mice and cyclophosphamide-induced immunosuppression mice were used in this study. OPENF powder is light yellow-green and has a protein-fat-carbohydrate ratio of 16:9:75 with good solubility in water. A pilot study investigating the immune functional impacts of the OPENF on mice showed that the OPENF enhanced spleen lymphocyte proliferation and the activity of natural killer (NK) cells in BALB/c mice. Furthermore, OPENF can improve intestinal absorption, increase food utilization ratio, and maintain the normal physiological function of mice. These results suggest that oyster peptides could serve as a new protein source for use in enteral nutrition formula, but more importantly, also indicate that OPENF has an immunostimulating effect in mice.
HSP27, 70 and 90, anti-apoptotic proteins, in clinical cancer therapy (Review).
Wang, Xiaoxia; Chen, Meijuan; Zhou, Jing; Zhang, Xu
2014-07-01
Among the heat shock proteins (HSPs), HSP27, HSP70 and HSP90 are the most studied stress-inducible HSPs. They are induced in response to a wide variety of physiological and environmental insults, and their powerful cytoprotective functions allow cells to survive otherwise lethal conditions. Different functions of HSPs have been described to explain their cytoprotective effects, including their most basic role as molecular chaperones, that is, regulating protein folding, transport, translocation and assembly, and especially helping in the refolding of misfolded proteins, as well as their anti-apoptotic properties. In cancer cells, the expression and/or activity of the three HSPs is abnormally high, and is associated with increased tumorigenicity, metastatic potential of cancer cells and resistance to chemotherapy. By associating with key apoptotic factors, they act as powerful anti-apoptotic proteins with the capacity to block the cell death process at different levels. Altogether, these properties suggest that HSP27, HSP70 and HSP90 are appropriate targets for modulating cell death pathways. In this review, we summarize the role of HSP90, HSP70 and HSP27 in apoptosis and the emerging strategies that have been developed for cancer therapy based on the inhibition of the three HSPs.
Ohuchi, Shoji J; Sagawa, Fumihiko; Sakamoto, Taiichi; Inoue, Tan
2015-10-23
RNA-protein complexes (RNPs) are useful for constructing functional nano-objects because a variety of functional proteins can be displayed on a designed RNA scaffold. Here, we report circular permutations of an RNA-binding protein L7Ae based on the three-dimensional structure information to alter the orientation of the displayed proteins on the RNA scaffold. An electrophoretic mobility shift assay and atomic force microscopy (AFM) analysis revealed that most of the designed circular permutants formed an RNP nano-object. Moreover, the alteration of the enhanced green fluorescent protein (EGFP) orientation was confirmed with AFM by employing EGFP on the L7Ae permutant on the RNA. The results demonstrate that targeted fine-tuning of the stereo-specific fixation of a protein on a protein-binding RNA is feasible by using the circular permutation technique. Copyright © 2015 Elsevier Inc. All rights reserved.
Controlling allosteric networks in proteins
NASA Astrophysics Data System (ADS)
Dokholyan, Nikolay
2013-03-01
We present a novel methodology based on graph theory and discrete molecular dynamics simulations for delineating allosteric pathways in proteins. We use this methodology to uncover the structural mechanisms responsible for coupling of distal sites on proteins and utilize it for allosteric modulation of proteins. We will present examples where inference of allosteric networks and their rewiring allows us to "rescue" cystic fibrosis transmembrane conductance regulator (CFTR), a protein associated with the fatal genetic disease cystic fibrosis. We also use our methodology to control protein function allosterically. We design a novel protein domain that can be inserted into an identified allosteric site of a target protein. Using a drug that binds to our domain, we alter the function of the target protein. We successfully tested this methodology in vitro, in living cells and in zebrafish. We further demonstrate transferability of our allosteric modulation methodology to other systems and extend it to become light-activatable.
Conjugated Polymers/DNA Hybrid Materials for Protein Inactivation.
Zhao, Likun; Zhang, Jiangyan; Xu, Huiming; Geng, Hao; Cheng, Yongqiang
2016-09-07
Chromophore-assisted light inactivation (CALI) is a powerful tool for analyzing protein functions due to the high degree of spatial and temporal resolution. In this work, we demonstrate a CALI approach based on conjugated polymers (CPs)/DNA hybrid material for protein inactivation. The target protein is conjugated with single-stranded DNA in advance. Single-stranded DNA can form CPs/DNA hybrid material with cationic CPs via electrostatic and hydrophobic interactions. Through the formation of CPs/DNA hybrid material, the target protein that is conjugated with DNA is brought into close proximity to CPs. Under irradiation, CPs harvest light and generate reactive oxygen species (ROS), resulting in the inactivation of the adjacent target protein. This approach can efficiently inactivate any target protein which is conjugated with DNA and has good specificity and universality, providing a new strategy for studies of protein function and adjustment of protein activity.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ohuchi, Shoji J.; Sagawa, Fumihiko; Sakamoto, Taiichi
RNA-protein complexes (RNPs) are useful for constructing functional nano-objects because a variety of functional proteins can be displayed on a designed RNA scaffold. Here, we report circular permutations of an RNA-binding protein L7Ae based on the three-dimensional structure information to alter the orientation of the displayed proteins on the RNA scaffold. An electrophoretic mobility shift assay and atomic force microscopy (AFM) analysis revealed that most of the designed circular permutants formed an RNP nano-object. Moreover, the alteration of the enhanced green fluorescent protein (EGFP) orientation was confirmed with AFM by employing EGFP on the L7Ae permutant on the RNA. The results demonstrate that targeted fine-tuning of the stereo-specific fixation of a protein on a protein-binding RNA is feasible by using the circular permutation technique.
Lizunov, A Y; Gonchar, A L; Zaitseva, N I; Zosimov, V V
2015-10-26
We analyzed the frequency with which intraligand contacts occurred in a set of 1300 protein-ligand complexes [Plewczynski et al. J. Comput. Chem. 2011, 32, 742-755]. Our analysis showed that flexible ligands often form intraligand hydrophobic contacts, while intraligand hydrogen bonds are rare. The test set was also thoroughly investigated and classified. We suggest a universal method for enhancement of a scoring function based on a potential of mean force (PMF-based score) by adding a term accounting for intraligand interactions. The method was implemented via an in-house developed program, utilizing an Algo_score scoring function [Ramensky et al. Proteins: Struct., Funct., Genet. 2007, 69, 349-357] based on the Tarasov-Muryshev PMF [Muryshev et al. J. Comput.-Aided Mol. Des. 2003, 17, 597-605]. The enhancement of the scoring function was shown to significantly improve the docking and scoring quality for flexible ligands in the test set of 1300 protein-ligand complexes [Plewczynski et al. J. Comput. Chem. 2011, 32, 742-755]. We then investigated the correlation of the docking results with two parameters used in estimating intraligand interactions. These parameters are the weight of intraligand interactions and the minimum number of bonds between the ligand atoms required to take their interaction into account.
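The two parameters named at the end of this abstract (the weight of intraligand interactions and the minimum number of bonds separating two ligand atoms) suggest a score of the form "protein-ligand term plus weight times a filtered intraligand sum". The sketch below illustrates that structure only. It is not the Algo_score or Tarasov-Muryshev PMF implementation: the pair potential is a caller-supplied placeholder, and every function and parameter name is hypothetical.

```python
from collections import deque
from math import dist


def bond_separations(n_atoms, bonds):
    """All-pairs topological distance (number of bonds) via BFS from each atom."""
    neighbors = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
    sep = [[None] * n_atoms for _ in range(n_atoms)]
    for start in range(n_atoms):
        sep[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if sep[start][v] is None:
                    sep[start][v] = sep[start][u] + 1
                    queue.append(v)
    return sep


def intraligand_term(coords, bonds, pair_potential, min_bonds):
    """Sum a distance-dependent pair potential over ligand atom pairs that are
    separated by at least `min_bonds` bonds (closer pairs are ignored)."""
    sep = bond_separations(len(coords), bonds)
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            # Pairs in disconnected fragments (sep is None) are skipped here.
            if sep[i][j] is not None and sep[i][j] >= min_bonds:
                total += pair_potential(dist(coords[i], coords[j]))
    return total


def enhanced_score(protein_ligand_score, coords, bonds, pair_potential, weight, min_bonds):
    """Protein-ligand score plus a weighted intraligand contribution."""
    return protein_ligand_score + weight * intraligand_term(
        coords, bonds, pair_potential, min_bonds
    )
```

Excluding pairs closer than `min_bonds` keeps covalently constrained neighbors, whose distances barely change between poses, from dominating the intraligand term.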
Hierarchical Ensemble Methods for Protein Function Prediction
2014-01-01
Protein function prediction is a complex multiclass multilabel classification problem, characterized by multiple issues such as the incompleteness of the available annotations, the integration of multiple sources of high dimensional biomolecular data, the imbalance of several functional classes, and the difficulty of univocally determining negative examples. Moreover, the hierarchical relationships between functional classes that characterize both the Gene Ontology and FunCat taxonomies motivate the development of hierarchy-aware prediction methods, which have shown significantly better performance than hierarchy-unaware “flat” prediction methods. In this paper, we provide a comprehensive review of hierarchical methods for protein function prediction based on ensembles of learning machines. According to this general approach, a separate learning machine is trained to learn a specific functional term and then the resulting predictions are assembled in a “consensus” ensemble decision, taking into account the hierarchical relationships between classes. The main hierarchical ensemble methods proposed in the literature are discussed in the context of existing computational methods for protein function prediction, highlighting their characteristics, advantages, and limitations. Open problems of this exciting research area of computational biology are finally considered, outlining novel perspectives for future research. PMID:25937954
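One simple way to combine per-term "flat" predictions into a hierarchy-consistent decision is to cap each term's score by the scores of its ancestors, so that a specific term is never predicted more confidently than the more general terms it implies. The sketch below shows this top-down correction. It is an illustrative assumption rather than any particular method from the review, and the function name, the `parents` mapping and the example terms are hypothetical.

```python
def enforce_hierarchy(flat_scores, parents):
    """Make per-term scores consistent with a class hierarchy.

    flat_scores: dict mapping term -> score in [0, 1] from independent classifiers.
    parents: dict mapping term -> list of parent terms (roots may be absent).
    Assumption: the hierarchy is a tree or DAG (no cycles).
    Returns a dict in which no term scores higher than any of its ancestors.
    """
    consistent = {}

    def resolve(term):
        if term not in consistent:
            score = flat_scores[term]
            for parent in parents.get(term, ()):
                # A child cannot be more likely than its (corrected) parent.
                score = min(score, resolve(parent))
            consistent[term] = score
        return consistent[term]

    for term in flat_scores:
        resolve(term)
    return consistent
```

For example, with flat scores of 0.9 for a root term, 0.4 for its child and 0.8 for its grandchild, the grandchild is lowered to 0.4 while the other two scores are unchanged.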
Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude
2011-06-20
One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
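Once structures have been reduced to one-dimensional strings over a structural alphabet, looking for over-represented seven-letter motifs becomes a word-counting problem. The sketch below counts overlapping words and compares a word's frequency in one group of sequences against a background. It is a deliberately naive stand-in: the abstract refers to "advanced pattern statistics adapted to short sequences", which are not reproduced here, and the function names and toy strings are hypothetical rather than real HMM-SA encodings.

```python
from collections import Counter


def count_words(sequences, k=7):
    """Count all overlapping words of length k across the given strings."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def frequency_ratio(word, group_seqs, background_seqs):
    """Ratio of a word's relative frequency in a group to that in a background.

    Values well above 1 hint at over-representation. This is a raw ratio,
    not a significance test.
    """
    k = len(word)
    group = count_words(group_seqs, k)
    background = count_words(background_seqs, k)
    group_total = sum(group.values())
    background_total = sum(background.values())
    if group_total == 0 or background_total == 0:
        raise ValueError("sequences are too short to contain any word of this length")
    if background[word] == 0:
        return float("inf")
    return (group[word] / group_total) / (background[word] / background_total)
```

A raw ratio like this is unstable for rare words, which is why a proper analysis needs statistics that account for word counts in short sequences.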
2011-01-01
Background One of the strategies for protein function annotation is to search particular structural motifs that are known to be shared by proteins with a given function. Results Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins. PMID:21689388
Phylogeny-dominant classification of J-proteins in Arabidopsis thaliana and Brassica oleracea.
Zhang, Bin; Qiu, Han-Lin; Qu, Dong-Hai; Ruan, Ying; Chen, Dong-Hong
2018-04-05
Hsp40s or DnaJ/J-proteins are evolutionarily conserved in all organisms as co-chaperones of molecular chaperone HSP70s that mainly participate in maintaining cellular protein homeostasis, such as protein folding, assembly, stabilization, and translocation under normal conditions as well as refolding and degradation under environmental stresses. It has been reported that Arabidopsis J-proteins are classified into four classes (types A-D) according to domain organization, but their phylogenetic relationships are unknown. Here, we identified 129 J-proteins in the globally popular vegetable Brassica oleracea, a close relative of the model plant Arabidopsis, and also revised the information on Arabidopsis J-proteins based on the latest online bioresources. According to phylogenetic analysis with domain organization and gene structure as references, the J-proteins from Arabidopsis and B. oleracea were classified into 15 main clades (I-XV) separated by a number of undefined small branches with remote relationships. Based on the number of members, the clades are divided into multigene clades, oligo-gene clades, and mono-gene clades. The J-protein genes from different clades may function together or separately to constitute a complicated regulatory network. This study provides a constructive viewpoint for J-protein classification and an informative platform for further functional dissection and for the discovery of resistance genes related to genetic improvement of crop plants.
Zhang, Jingjing; Zhang, Lei; Qiu, Jinkui; Nian, Hongjuan
2015-10-01
Cryptococcus humicola is a highly aluminum (Al) tolerant yeast strain isolated from a tea field. Here the relative changes of protein expression in C. humicola undergoing aluminum stress were analyzed to understand the genetic basis of aluminum tolerance. In this work, iTRAQ-based (isobaric tags for relative and absolute quantification) quantitative proteomic technology was used to detect statistically significant proteins associated with the response to aluminum stress. A total of 625 proteins were identified and were mainly involved in translation/ribosomal structure and biogenesis, posttranslational modification/protein turnover/chaperones, energy production and conversion, and amino acid transport and metabolism. Of these proteins, 59 exhibited differential expression during aluminum stress. Twenty-nine proteins up-regulated by aluminum were mainly involved in translation/ribosomal structure and biogenesis, posttranslational modification/protein turnover and chaperones, and lipid transport and metabolism. Thirty proteins down-regulated by aluminum were mainly associated with energy transport and metabolism, translation/ribosomal structure and biogenesis, posttranslational modification/protein turnover/chaperones, and lipid transport and metabolism. The potential functions of some proteins in aluminum tolerance are discussed. These functional changes may be beneficial for cells to protect themselves from aluminum toxic conditions. Crown Copyright © 2015. Published by Elsevier B.V. All rights reserved.
Khafizov, Kamil; Madrid-Aliste, Carlos; Almo, Steven C; Fiser, Andras
2014-03-11
The exponential growth of protein sequence data provides an ever-expanding body of unannotated and misannotated proteins. The National Institutes of Health-supported Protein Structure Initiative and related worldwide structural genomics efforts facilitate functional annotation of proteins through structural characterization. Recently there have been profound changes in the taxonomic composition of sequence databases, which are effectively redefining the scope and contribution of these large-scale structure-based efforts. The faster-growing bacterial genomic entries have overtaken the eukaryotic entries over the last 5 y, but also have become more redundant. Despite the enormous increase in the number of sequences, the overall structural coverage of proteins--including proteins for which reliable homology models can be generated--on the residue level has increased from 30% to 40% over the last 10 y. Structural genomics efforts contributed ∼50% of this new structural coverage, despite determining only ∼10% of all new structures. Based on current trends, it is expected that ∼55% structural coverage (the level required for significant functional insight) will be achieved within 15 y, whereas without structural genomics efforts, realizing this goal will take approximately twice as long.
GAL4 transactivation-based assay for the detection of selective intercellular protein movement.
Kumar, Dhinesh; Chen, Huan; Rim, Yeonggil; Kim, Jae-Yean
2015-01-01
Several plant proteins function as intercellular messengers to specify cell fate and coordinate plant development. Such intercellular communication can be achieved by direct, selective, or nonselective (diffusion-based) trafficking through plasmodesmata (PD), the symplasmic membrane-lined nanochannels adjoining two cells. A trichome rescue trafficking assay was reported to allow the detection of protein movement in Arabidopsis leaf tissue using transgenic gene expression. Here, we provide a protocol to dissect the mode of intercellular protein movement in the Arabidopsis root. This assay system involves a root ground tissue-specific GAL4/UAS transactivation expression system in combination with fluorescent reporter proteins. In this system, mCherry, a red fluorescent protein, can move cell to cell via diffusion, while mCherry-H2B is tightly cell autonomous. Thus, a protein fused to mCherry-H2B that can move out from the site of synthesis likely contains a selective trafficking signal to impart a cell-to-cell gain-of-trafficking function to the cell-autonomous mCherry-H2B. This approach can be adapted to investigate the cell-to-cell trafficking properties of any protein of interest.
Development of the field of structural physiology
FUJIYOSHI, Yoshinori
2015-01-01
Electron crystallography is especially useful for studying the structure and function of membrane proteins, key molecules with important functions in neural and other cells. Electron crystallography is now an established technique for analyzing the structures of membrane proteins in lipid bilayers that closely simulate their natural biological environment. Using cryo-electron microscopes with helium-cooled specimen stages, which were developed out of a personal motivation to understand the functions of neural systems from a structural point of view, the structures of membrane proteins can be analyzed at resolutions better than 3 Å. This review covers four objectives. First, I introduce the new research field of structural physiology. Second, I recount some of the struggles involved in developing cryo-electron microscopes. Third, I review the structural and functional analyses of membrane proteins, mainly by electron crystallography using cryo-electron microscopes. Finally, I discuss multifunctional channels named “adhennels” based on structures analyzed using electron and X-ray crystallography. PMID:26560835
Energy design for protein-protein interactions
Ravikant, D. V. S.; Elber, Ron
2011-01-01
Proteins bind to other proteins efficiently and specifically to carry out many cell functions such as signaling, activation, transport, enzymatic reactions, and more. To determine the geometry and strength of binding of a protein pair, an energy function is required. An algorithm to design an optimal energy function, based on empirical data of protein complexes, is proposed and applied. Emphasis is placed on negative design, in which incorrect geometries are presented to the algorithm, which learns to avoid them. For the docking problem, the search for plausible geometries can be performed exhaustively. The possible geometries of the complex are generated on a grid with the help of a fast Fourier transform algorithm. A novel formulation of negative design makes it possible to investigate iteratively hundreds of millions of negative examples while monotonically improving the quality of the potential. Experimental structures for 640 protein complexes are used to generate positive and negative examples for learning parameters. The algorithm designed in this work finds the correct binding structure as the lowest energy minimum in 318 of the 640 examples. Further benchmarks on independent sets confirm the significant capacity of the scoring function to recognize correct modes of interaction. PMID:21842951
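The grid-based search mentioned above is commonly done by scoring every rigid translation at once with a Fourier-domain correlation. The following is a minimal sketch of that general idea using NumPy, not the authors' implementation: the simple overlap score, the toy occupancy grids, and the names `translation_scores` and `best_translation` are all assumptions made for illustration. Rotations and the learned energy function are omitted.

```python
import numpy as np


def translation_scores(receptor, ligand):
    """Score every circular translation t of `ligand` against `receptor`.

    scores[t] = sum over x of receptor[x] * ligand[x - t],
    computed for all t at once via the correlation theorem.
    """
    f_rec = np.fft.fftn(receptor)
    f_lig = np.fft.fftn(ligand)
    return np.fft.ifftn(f_rec * np.conj(f_lig)).real


def best_translation(receptor, ligand):
    """Return the grid translation with the highest overlap score."""
    scores = translation_scores(receptor, ligand)
    index = np.unravel_index(np.argmax(scores), scores.shape)
    return tuple(int(i) for i in index)


# Toy example: the same three-voxel shape placed at two different positions.
shape = (8, 8, 8)
receptor = np.zeros(shape)
ligand = np.zeros(shape)
for voxel in [(5, 3, 2), (5, 4, 2), (6, 3, 2)]:
    receptor[voxel] = 1.0
for voxel in [(1, 1, 1), (1, 2, 1), (2, 1, 1)]:
    ligand[voxel] = 1.0

shift = best_translation(receptor, ligand)
print("best translation:", shift)

# Cross-check one entry against a direct (non-FFT) evaluation.
direct = float(np.sum(receptor * np.roll(ligand, shift, axis=(0, 1, 2))))
print("overlap at best translation:", direct)
```

The FFT evaluates all translations in O(N log N) for a grid of N voxels, compared with O(N^2) for the direct sum, which is what makes an exhaustive translational search practical. A real docking score would replace the binary overlap with terms derived from the energy function.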