Scavuzzo-Duggan, Tess R.; Chaves, Arielle M.; Roberts, Alison W.
2015-07-14
Here, a method for rapid in vivo functional analysis of engineered proteins was developed using Physcomitrella patens. A complementation assay was designed for testing structure/function relationships in cellulose synthase (CESA) proteins. The components of the assay include (1) construction of test vectors that drive expression of epitope-tagged PpCESA5 carrying engineered mutations, (2) transformation of a ppcesa5 knockout line that fails to produce gametophores with test and control vectors, (3) scoring the stable transformants for gametophore production, (4) statistical analysis comparing complementation rates for test vectors to positive and negative control vectors, and (5) analysis of transgenic protein expression by Western blotting. The assay distinguished mutations that generate fully functional, nonfunctional, and partially functional proteins. In conclusion, compared with existing methods for in vivo testing of protein function, this complementation assay provides a rapid method for investigating protein structure/function relationships in plants.
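Step (4) of the assay compares complementation rates between vectors. The abstract does not name the statistical test, so the following is only a minimal sketch that assumes a two-sided Fisher's exact test on a 2x2 table of transformant counts. The function name and all counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout (hypothetical): rows are vectors (test, control),
    columns are transformants (with gametophores, without gametophores).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x gametophore-producing lines in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


if __name__ == "__main__":
    # Hypothetical counts: 3 of 40 test-vector lines vs 30 of 40 positive-control lines.
    print(fisher_exact_two_sided(3, 37, 30, 10))
```

A small p-value against the positive control together with a large one against the negative control would correspond to a nonfunctional protein; intermediate outcomes would correspond to the partially functional class the assay reports.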
Genome-wide protein-protein interactions and protein function exploration in cyanobacteria
Lv, Qi; Ma, Weimin; Liu, Hui; Li, Jiang; Wang, Huan; Lu, Fang; Zhao, Chen; Shi, Tieliu
2015-01-01
Genome-wide network analysis is widely used to study proteins of unknown function. Here, we explored protein functions and biological mechanisms based on an inferred high-confidence protein-protein interaction (PPI) network in cyanobacteria. We integrated data from seven different sources and predicted 1,997 PPIs, which were evaluated against experimental evidence of molecular mechanism, text mining of the literature for direct or indirect evidence, and conservation as “interologs”. Combining the predicted PPIs with known PPIs, we obtained 4,715 non-redundant PPIs (involving 3,231 proteins covering over 90% of the genome) to generate the PPI network. Based on the PPI network, Gene Ontology (GO) terms were assigned to proteins of unknown function. Functional modules were identified by dissecting the PPI network into sub-networks and analyzing pathway enrichment, which we used to investigate novel functions of the underlying proteins in protein complexes and pathways. Examples from photosynthesis and DNA repair indicate that the network approach is a powerful tool in protein function analysis. Overall, this systems biology approach provides new insight into posterior functional analysis of PPIs in cyanobacteria. PMID:26490033
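Assigning GO terms to uncharacterized proteins from a PPI network is often done by "guilt by association". The abstract does not give the exact rule used, so the sketch below assumes a simple neighbor vote: a term is transferred when at least `min_support` annotated interaction partners carry it. The protein names, the threshold, and the function name are hypothetical.

```python
from collections import Counter


def predict_go_terms(edges, annotations, protein, min_support=2):
    """Return GO terms carried by at least `min_support` neighbors of `protein`.

    edges: iterable of (protein_a, protein_b) undirected interactions.
    annotations: dict mapping protein -> set of GO term IDs.
    """
    neighbors = set()
    for a, b in edges:
        if a == protein:
            neighbors.add(b)
        elif b == protein:
            neighbors.add(a)
    neighbors.discard(protein)  # ignore self-interactions

    # Count how many neighbors carry each term.
    counts = Counter(term for n in neighbors for term in annotations.get(n, ()))
    return sorted(term for term, c in counts.items() if c >= min_support)


if __name__ == "__main__":
    edges = [("P1", "X"), ("P2", "X"), ("P3", "X"), ("P1", "P2")]
    annotations = {
        "P1": {"GO:0015979"},                 # photosynthesis
        "P2": {"GO:0015979", "GO:0006281"},   # photosynthesis, DNA repair
        "P3": {"GO:0006281"},                 # DNA repair
    }
    print(predict_go_terms(edges, annotations, "X"))
```

A real pipeline would typically weight votes by interaction confidence and test for enrichment rather than use a fixed count.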
Nadzirin, Nurul; Firdaus-Raih, Mohd
2012-10-08
Proteins of uncharacterized function form a large part of many of the currently available biological databases, and this situation exists even in the Protein Data Bank (PDB). Our analysis of recent PDB data revealed that only 42.53% of PDB entries (1084 coordinate files) that were categorized under "unknown function" are true examples of proteins of unknown function at this point in time. The remaining 1465 entries annotated as such could have their annotations re-assessed, based on the availability of direct functional characterization experiments for the protein itself, or for homologous sequences or structures, thus enabling computational function inference.
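The two counts in this abstract are consistent with the stated percentage, assuming that the 1084 and 1465 entries together make up all entries categorized under "unknown function":

```latex
\[
\frac{1084}{1084 + 1465} = \frac{1084}{2549} \approx 0.4253 = 42.53\%
\]
```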
Analysis of sequence repeats of proteins in the PDB.
Mary Rajathei, David; Selvaraj, Samuel
2013-12-01
Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the sequence identity of the repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain a similar overall fold. Analysis of sequence repeats at the functional level reveals that most of them are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single- and two-domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
Functional Proteomic Analysis of Human Nucleolus
Scherl, Alexander; Couté, Yohann; Déon, Catherine; Callé, Aleth; Kindbeiter, Karine; Sanchez, Jean-Charles; Greco, Anna; Hochstrasser, Denis; Diaz, Jean-Jacques
2002-01-01
The notion of a “plurifunctional” nucleolus is now well established. However, molecular mechanisms underlying the biological processes occurring within this nuclear domain remain only partially understood. As a first step in elucidating these mechanisms we have carried out a proteomic analysis to draw up a list of proteins present within nucleoli of HeLa cells. This analysis allowed the identification of 213 different nucleolar proteins. This catalog complements that of the 271 proteins obtained recently by others, giving a total of ∼350 different nucleolar proteins. Functional classification of these proteins allowed outlining several biological processes taking place within nucleoli. Bioinformatic analyses permitted the assignment of hypothetical functions for 43 proteins for which no functional information is available. Notably, a role in ribosome biogenesis was proposed for 31 proteins. More generally, this functional classification reinforces the plurifunctional nature of nucleoli and provides convincing evidence that nucleoli may play a central role in the control of gene expression. Finally, this analysis supports the recent demonstration of a coupling of transcription and translation in higher eukaryotes. PMID:12429849
Zebra: a web server for bioinformatic analysis of diverse protein families.
Suplatov, Dmitry; Kirilin, Evgeny; Takhaveev, Vakil; Svedas, Vytas
2014-01-01
During evolution of proteins from a common ancestor, one functional property can be preserved while others can vary leading to functional diversity. A systematic study of the corresponding adaptive mutations provides a key to one of the most challenging problems of modern structural biology - understanding the impact of amino acid substitutions on protein function. The subfamily-specific positions (SSPs) are conserved within functional subfamilies but are different between them and, therefore, seem to be responsible for functional diversity in protein superfamilies. Consequently, a corresponding method to perform the bioinformatic analysis of sequence and structural data has to be implemented in the common laboratory practice to study the structure-function relationship in proteins and develop novel protein engineering strategies. This paper describes Zebra web server - a powerful remote platform that implements a novel bioinformatic analysis algorithm to study diverse protein families. It is the first application that provides specificity determinants at different levels of functional classification, therefore addressing complex functional diversity of large superfamilies. Statistical analysis is implemented to automatically select a set of highly significant SSPs to be used as hotspots for directed evolution or rational design experiments and analyzed studying the structure-function relationship. Zebra results are provided in two ways - (1) as a single all-in-one parsable text file and (2) as PyMol sessions with structural representation of SSPs. Zebra web server is available at http://biokinet.belozersky.msu.ru/zebra .
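The definition of subfamily-specific positions (conserved within each functional subfamily, different between subfamilies) can be illustrated with a strict toy version. This is not the Zebra algorithm, which the abstract says relies on statistical analysis; it is a minimal sketch that assumes a gap-free alignment already split into subfamilies, and the function name and sequences are hypothetical.

```python
def subfamily_specific_positions(subfamilies):
    """Return 0-based alignment columns that are invariant within every
    subfamily but not identical across all subfamilies.

    subfamilies: dict mapping subfamily name -> list of aligned,
    equal-length sequences.
    """
    length = len(next(iter(subfamilies.values()))[0])
    ssps = []
    for i in range(length):
        consensus = []
        for seqs in subfamilies.values():
            residues = {s[i] for s in seqs}
            if len(residues) != 1:
                break  # column varies inside this subfamily
            consensus.append(residues.pop())
        else:
            # Conserved in every subfamily; keep it only if subfamilies differ.
            if len(set(consensus)) > 1:
                ssps.append(i)
    return ssps


if __name__ == "__main__":
    alignment = {
        "subfamily_A": ["MKSA", "MKSG"],
        "subfamily_B": ["MRSA", "MRSC"],
    }
    print(subfamily_specific_positions(alignment))
```

In this toy alignment only the second column qualifies: the first and third columns are identical everywhere, and the last varies within each subfamily.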
Boyanova, Desislava; Nilla, Santosh; Klau, Gunnar W.; Dandekar, Thomas; Müller, Tobias; Dittrich, Marcus
2014-01-01
The continuously evolving field of proteomics produces increasing amounts of data while improving the quality of protein identifications. Although quantitative measurements are becoming more popular, many proteomic studies are still based on non-quantitative methods for protein identification. These studies result in potentially large sets of identified proteins, where the biological interpretation of proteins can be challenging. Systems biology develops innovative network-based methods, which allow an integrated analysis of these data. Here we present a novel approach, which combines prior knowledge of protein-protein interactions (PPI) with proteomics data using functional similarity measurements of interacting proteins. This integrated network analysis exactly identifies network modules with a maximal consistent functional similarity reflecting biological processes of the investigated cells. We validated our approach on small (H9N2 virus-infected gastric cells) and large (blood constituents) proteomic data sets. Using this novel algorithm, we identified characteristic functional modules in virus-infected cells, comprising key signaling proteins (e.g. the stress-related kinase RAF1) and demonstrate that this method allows a module-based functional characterization of cell types. Analysis of a large proteome data set of blood constituents resulted in clear separation of blood cells according to their developmental origin. A detailed investigation of the T-cell proteome further illustrates how the algorithm partitions large networks into functional subnetworks each representing specific cellular functions. These results demonstrate that the integrated network approach not only allows a detailed analysis of proteome networks but also yields a functional decomposition of complex proteomic data sets and thereby provides deeper insights into the underlying cellular processes of the investigated system. PMID:24807868
Advances in structural and functional analysis of membrane proteins by electron crystallography
Wisedchaisri, Goragot; Reichow, Steve L.; Gonen, Tamir
2011-01-01
Electron crystallography is a powerful technique for the study of membrane protein structure and function in the lipid environment. When well-ordered two-dimensional crystals are obtained the structure of both protein and lipid can be determined and lipid-protein interactions analyzed. Protons and ionic charges can be visualized by electron crystallography and the protein of interest can be captured for structural analysis in a variety of physiologically distinct states. This review highlights the strengths of electron crystallography and the momentum that is building up in automation and the development of high throughput tools and methods for structural and functional analysis of membrane proteins by electron crystallography. PMID:22000511
Advances in structural and functional analysis of membrane proteins by electron crystallography.
Wisedchaisri, Goragot; Reichow, Steve L; Gonen, Tamir
2011-10-12
Electron crystallography is a powerful technique for the study of membrane protein structure and function in the lipid environment. When well-ordered two-dimensional crystals are obtained the structure of both protein and lipid can be determined and lipid-protein interactions analyzed. Protons and ionic charges can be visualized by electron crystallography and the protein of interest can be captured for structural analysis in a variety of physiologically distinct states. This review highlights the strengths of electron crystallography and the momentum that is building up in automation and the development of high throughput tools and methods for structural and functional analysis of membrane proteins by electron crystallography. Copyright © 2011 Elsevier Ltd. All rights reserved.
Text Mining Improves Prediction of Protein Functional Sites
Cohn, Judith D.; Ravikumar, Komandur E.
2012-01-01
We present an approach that integrates protein structure analysis and text mining for protein functional site prediction, called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are functionally important. We assessed the significance of each of these methods by analyzing their performance in finding known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text were roughly six times more likely to be found in a functional site. The overlap of predictions with annotations improved when the text-based and structure-based methods agreed. Our analysis also yielded new high-quality predictions of many functional site residues that were not catalogued in the curated data sources we inspected. We conclude that both DPA and text mining independently provide valuable high-throughput protein functional site predictions, and that integrating the two methods using LEAP-FS further improves the quality of these predictions. PMID:22393388
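The text-mining step extracts residue mentions from the literature. The abstract does not describe the extraction rules, so the following is only a rough sketch that assumes mentions written as a three-letter amino acid code followed by a position (for example "Ser195", "His-57", "Asp 102"). The pattern and function name are hypothetical and would miss one-letter and full-name forms.

```python
import re

# Standard three-letter codes for the 20 common amino acids.
THREE_LETTER = (
    "Ala|Arg|Asn|Asp|Cys|Gln|Glu|Gly|His|Ile|"
    "Leu|Lys|Met|Phe|Pro|Ser|Thr|Trp|Tyr|Val"
)
# Code, optional hyphen or space, then the residue number.
RESIDUE_MENTION = re.compile(rf"\b({THREE_LETTER})[- ]?(\d+)\b")


def extract_residue_mentions(text):
    """Return (amino_acid, position) pairs in order of appearance."""
    return [(m.group(1), int(m.group(2))) for m in RESIDUE_MENTION.finditer(text)]


if __name__ == "__main__":
    sentence = "The catalytic triad Ser195, His-57 and Asp 102 is conserved."
    print(extract_residue_mentions(sentence))
```

Extracted positions would still need to be mapped onto the numbering of a specific structure before they can be compared with functional site annotations.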
Dewhurst, Henry M.; Choudhury, Shilpa; Torres, Matthew P.
2015-01-01
Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)—a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits—conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit–N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. PMID:26070665
Zhang, Yu-Juan; Yang, Chun-Lin; Hao, You-Jin; Li, Ying; Chen, Bin; Wen, Jian-Fan
2014-01-25
To fully explore the trends of atomic composition during the macroevolution from prokaryote to eukaryote, five elements (oxygen, sulfur, nitrogen, carbon, hydrogen) and related functional groups in prokaryotic and eukaryotic proteins were surveyed and compared. Genome-wide analysis showed that eukaryotic proteins have more oxygen, sulfur and nitrogen atoms than prokaryotic proteins do. Clusters of Orthologous Groups (COG) analysis revealed that oxygen, sulfur, carbon and hydrogen frequencies are higher in eukaryotic proteins than in their prokaryotic orthologs. Furthermore, functional group analysis demonstrated that eukaryotic proteins tend to have higher proportions of sulfhydryl, hydroxyl and acylamino groups, but lower proportions of sulfide and carboxyl groups. Taken together, an apparent increasing trend was observed for oxygen and sulfur atoms during macroevolution; the variation in oxygen and sulfur composition and the related functional groups led eukaryotic proteins to carry more useful functional groups. These results will be helpful for better understanding the functional significance of the evolution of atomic composition. Copyright © 2013 Elsevier B.V. All rights reserved.
Proteomic Analysis of the Arabidopsis Nucleolus Suggests Novel Nucleolar Functions
Pendle, Alison F.; Clark, Gillian P.; Boon, Reinier; Lewandowska, Dominika; Lam, Yun Wah; Andersen, Jens; Mann, Matthias; Lamond, Angus I.; Brown, John W. S.; Shaw, Peter J.
2005-01-01
The eukaryotic nucleolus is involved in ribosome biogenesis and a wide range of other RNA metabolism and cellular functions. An important step in the functional analysis of the nucleolus is to determine the complement of proteins of this nuclear compartment. Here, we describe the first proteomic analysis of plant (Arabidopsis thaliana) nucleoli, in which we have identified 217 proteins. This allows a direct comparison of the proteomes of an important nuclear structure between two widely divergent species: human and Arabidopsis. The comparison identified many common proteins, plant-specific proteins, proteins of unknown function found in both proteomes, and proteins that were nucleolar in plants but nonnucleolar in human. Seventy-two proteins were expressed as GFP fusions and 87% showed nucleolar or nucleolar-associated localization. In a striking and unexpected finding, we have identified six components of the postsplicing exon-junction complex (EJC) involved in mRNA export and nonsense-mediated decay (NMD)/mRNA surveillance. This association was confirmed by GFP-fusion protein localization. These results raise the possibility that in plants, nucleoli may have additional functions in mRNA export or surveillance. PMID:15496452
Liu, Xiuying; Luo, GuanZheng; Bai, Xiujuan; Wang, Xiu-Jie
2009-10-01
MicroRNAs are small non-coding RNAs approximately 22 nt long that play important regulatory roles in eukaryotes. The biogenesis and function of microRNAs require the participation of many proteins, of which the well-studied ones are Dicer, Drosha, Argonaute and Exportin 5. To systematically study these four protein families, we screened 11 animal genomes for genes encoding the above-mentioned proteins and identified some new members of each family. Domain analysis revealed that most proteins within the same family share identical or similar domains. Alternatively spliced transcript variants were found for some proteins. We also examined the expression patterns of these proteins in different human tissues and identified other proteins that could potentially interact with them. These findings provide systematic information on the four key proteins involved in microRNA biogenesis and functional pathways in animals, and will shed light on further functional studies of these proteins.
Rice proteome analysis: a step toward functional analysis of the rice genome.
Komatsu, Setsuko; Tanaka, Naoki
2005-03-01
The technique of proteome analysis using 2-DE has the power to monitor global changes that occur in the protein complement of tissues and subcellular compartments. In this review, we describe construction of the rice proteome database, the cataloging of rice proteins, and the functional characterization of some of the proteins identified. Initially, proteins extracted from various tissues and organelles were separated by 2-DE and an image analyzer was used to construct a display or reference map of the proteins. The rice proteome database currently contains 23 reference maps based on 2-DE of proteins from different rice tissues and subcellular compartments. These reference maps comprise 13 129 rice proteins, and the amino acid sequences of 5092 of these proteins are entered in the database. Major proteins involved in growth or stress responses have been identified by using a proteomics approach and some of these proteins have unique functions. Furthermore, initial work has also begun on analyzing the phosphoproteome and protein-protein interactions in rice. The information obtained from the rice proteome database will aid in the molecular cloning of rice genes and in predicting the function of unknown proteins.
Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes
NASA Astrophysics Data System (ADS)
Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy
2007-01-01
Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.
Du, Yushen; Wu, Nicholas C.; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting
2016-01-01
Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. PMID:27803181
NovelFam3000 – Uncharacterized human protein domains conserved across model organisms
Kemmer, Danielle; Podowski, Raf M; Arenillas, David; Lim, Jonathan; Hodges, Emily; Roth, Peggy; Sonnhammer, Erik LL; Höög, Christer; Wasserman, Wyeth W
2006-01-01
Background: Despite significant efforts from the research community, an extensive portion of the proteins encoded by human genes lack an assigned cellular function. Most metazoan proteins are composed of structural and/or functional domains, of which many appear in multiple proteins. Once a domain is characterized in one protein, the presence of a similar sequence in an uncharacterized protein serves as a basis for inference of function. Thus knowledge of a domain's function, or the protein within which it arises, can facilitate the analysis of an entire set of proteins. Description: From the Pfam domain database, we extracted uncharacterized protein domains represented in proteins from humans, worms, and flies. A data centre was created to facilitate the analysis of the uncharacterized domain-containing proteins. The centre both provides researchers with links to dispersed internet resources containing gene-specific experimental data and enables them to post relevant experimental results or comments. For each human gene in the system, a characterization score is posted, allowing users to track the progress of characterization over time or to identify for study uncharacterized domains in well-characterized genes. As a test of the system, a subset of 39 domains was selected for analysis and the experimental results posted to the NovelFam3000 system. For 25 human protein members of these 39 domain families, detailed sub-cellular localizations were determined. Specific observations are presented based on the analysis of the integrated information provided through the online NovelFam3000 system. Conclusion: Consistent experimental results between multiple members of a domain family allow for inferences of the domain's functional role. We unite bioinformatics resources and experimental data in order to accelerate the functional characterization of scarcely annotated domain families. PMID:16533400
EMSA Analysis of DNA Binding By Rgg Proteins
LaSarre, Breah; Federle, Michael J.
2016-01-01
In bacteria, interaction of various proteins with DNA is essential for the regulation of specific target gene expression. Electrophoretic mobility shift assay (EMSA) is an in vitro approach allowing for the visualization of these protein-DNA interactions. Rgg proteins comprise a family of transcriptional regulators widespread among the Firmicutes. Some of these proteins function independently to regulate target gene expression, while others have now been demonstrated to function as effectors of cell-to-cell communication, having regulatory activities that are modulated via direct interaction with small signaling peptides. EMSA analysis can be used to assess DNA binding of either type of Rgg protein. EMSA analysis of Rgg protein activity has facilitated in vitro confirmation of regulatory targets, identification of precise DNA binding sites via DNA probe mutagenesis, and characterization of the mechanism by which some cognate signaling peptides modulate Rgg protein function (e.g. interruption of DNA-binding in some cases). PMID:27430004
EMSA Analysis of DNA Binding By Rgg Proteins.
LaSarre, Breah; Federle, Michael J
2013-08-20
In bacteria, interaction of various proteins with DNA is essential for the regulation of specific target gene expression. Electrophoretic mobility shift assay (EMSA) is an in vitro approach allowing for the visualization of these protein-DNA interactions. Rgg proteins comprise a family of transcriptional regulators widespread among the Firmicutes. Some of these proteins function independently to regulate target gene expression, while others have now been demonstrated to function as effectors of cell-to-cell communication, having regulatory activities that are modulated via direct interaction with small signaling peptides. EMSA analysis can be used to assess DNA binding of either type of Rgg protein. EMSA analysis of Rgg protein activity has facilitated in vitro confirmation of regulatory targets, identification of precise DNA binding sites via DNA probe mutagenesis, and characterization of the mechanism by which some cognate signaling peptides modulate Rgg protein function (e.g. interruption of DNA-binding in some cases).
Huang, He; Sarai, Akinori
2012-12-01
The evolvability of proteins is not only restricted by functional and structural importance, but also by other factors such as gene duplication, protein stability, and an organism's robustness. Recently, intrinsically disordered proteins (IDPs)/regions (IDRs) have been suggested to play a role in facilitating protein evolution. However, the mechanisms by which this occurs remain largely unknown. To address this, we have systematically analyzed the relationship between the evolvability, stability, and function of IDPs/IDRs. Evolutionary analysis shows that more recently emerged IDRs have higher evolutionary rates with more functional constraints relaxed (or experiencing more positive selection), and that this may have caused accelerated evolution in the flanking regions and in the whole protein. A systematic analysis of observed stability changes due to single amino acid mutations in IDRs and ordered regions shows that while most mutations induce a destabilizing effect in proteins, mutations in IDRs cause smaller stability changes than in ordered regions. The weaker impact of mutations in IDRs on protein stability may have advantages for protein evolvability in the gain of new functions. Interestingly, however, an analysis of functional motifs in the PROSITE and ELM databases showed that motifs in IDRs are more conserved, characterized by smaller entropy and lower evolutionary rate, than in ordered regions. This apparently opposing evolutionary effect may be partly due to the flexible nature of motifs in IDRs, which require some key amino acid residues to engage in tighter interactions with other molecules. Our study suggests that the unique conformational and thermodynamic characteristics of IDPs/IDRs play an important role in the evolvability of proteins to gain new functions. Copyright © 2012 Elsevier Ltd. All rights reserved.
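The abstract characterizes motif conservation by "smaller entropy". The exact measure is not stated, so this minimal sketch assumes the usual per-column Shannon entropy over the residues observed at one aligned position. The function name and the example columns are hypothetical.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column.

    0.0 means a perfectly conserved position; higher values mean more variation.
    """
    total = len(column)
    counts = Counter(column)
    entropy = -sum((c / total) * log2(c / total) for c in counts.values())
    return max(0.0, entropy)  # avoid returning -0.0 for invariant columns


if __name__ == "__main__":
    conserved_position = "SSSS"   # hypothetical motif position
    variable_position = "SATG"    # hypothetical non-motif position
    print(column_entropy(conserved_position), column_entropy(variable_position))
```

Averaging such values over the positions of a motif gives one simple way to compare motifs in disordered and ordered regions, although real analyses usually correct for alignment depth and background residue frequencies.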
Naqvi, Ahmad Abu Turab; Ahmad, Faizan; Hassan, Md Imtaiyaz
2015-01-01
Mycobacterium leprae is an intracellular obligate parasite that causes leprosy in humans, and it leads to the destruction of peripheral nerves and skin deformation. Here, we report an extensive analysis of the hypothetical proteins (HPs) from M. leprae strain Br4923, assigning their functions to better understand the mechanism of pathogenesis and to search for potential therapeutic interventions. The genome of M. leprae encodes 1604 proteins, of which the functions of 632 are not known (HPs). In this paper, we predicted the probable functions of 312 HPs. First, we classified all HPs into families and subfamilies on the basis of sequence similarity, followed by domain assignment, which provides many clues for their possible function. However, the functions of 320 proteins were not predicted because of low sequence similarity with proteins of known function. Annotated HPs were categorized into enzymes, binding proteins, transporters, and proteins involved in cellular processes. We found several novel proteins whose functions were unknown for M. leprae. These proteins have a requisite association with bacterial virulence and pathogenicity. Finally, our sequence-based analysis will be helpful for further validation and the search for potential drug targets while developing effective drugs to cure leprosy.
Dong, Zheng; Zhou, Hongyu; Tao, Peng
2018-02-01
PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore functional evolutionary relationship among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from RCSB Protein Data Bank of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The result showed that the proteins with same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant difference in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.
Dewhurst, Henry M; Choudhury, Shilpa; Torres, Matthew P
2015-08-01
Predicting the biological function potential of post-translational modifications (PTMs) is becoming increasingly important in light of the exponential increase in available PTM data from high-throughput proteomics. We developed structural analysis of PTM hotspots (SAPH-ire)--a quantitative PTM ranking method that integrates experimental PTM observations, sequence conservation, protein structure, and interaction data to allow rank order comparisons within or between protein families. Here, we applied SAPH-ire to the study of PTMs in diverse G protein families, a conserved and ubiquitous class of proteins essential for maintenance of intracellular structure (tubulins) and signal transduction (large and small Ras-like G proteins). A total of 1728 experimentally verified PTMs from eight unique G protein families were clustered into 451 unique hotspots, 51 of which have a known and cited biological function or response. Using customized software, the hotspots were analyzed in the context of 598 unique protein structures. By comparing distributions of hotspots with known versus unknown function, we show that SAPH-ire analysis is predictive for PTM biological function. Notably, SAPH-ire revealed high-ranking hotspots for which a functional impact has not yet been determined, including phosphorylation hotspots in the N-terminal tails of G protein gamma subunits--conserved protein structures never before reported as regulators of G protein coupled receptor signaling. To validate this prediction we used the yeast model system for G protein coupled receptor signaling, revealing that gamma subunit-N-terminal tail phosphorylation is activated in response to G protein coupled receptor stimulation and regulates protein stability in vivo. These results demonstrate the utility of integrating protein structural and sequence features into PTM prioritization schemes that can improve the analysis and functional power of modification-specific proteomics data. 
© 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Assessment of protein set coherence using functional annotations
Chagoyen, Monica; Carazo, Jose M; Pascual-Montano, Alberto
2008-01-01
Background: Analysis of large-scale experimental datasets frequently produces one or more sets of proteins that are subsequently mined for functional interpretation and validation. To this end, a number of computational methods have been devised that rely on the analysis of functional annotations. Although current methods provide valuable information (e.g. significantly enriched annotations, pairwise functional similarities), they do not specifically measure the degree of homogeneity of a protein set. Results: In this work we present a method that scores the degree of functional homogeneity, or coherence, of a set of proteins on the basis of the global similarity of their functional annotations. The method uses statistical hypothesis testing to assess the significance of the set in the context of the functional space of a reference set. As such, it can be used as a first step in the validation of sets expected to be homogeneous prior to further functional interpretation. Conclusion: We evaluate our method by analysing known biologically relevant sets as well as random ones. The known relevant sets comprise macromolecular complexes, cellular components and pathways described for Saccharomyces cerevisiae, which are mostly significantly coherent. Finally, we illustrate the usefulness of our approach for validating 'functional modules' obtained from computational analysis of protein-protein interaction networks. Matlab code and supplementary data are available at PMID:18937846
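The idea of scoring a protein set's coherence against a reference set can be illustrated with a small sketch. It is not the published method: it assumes Jaccard similarity between annotation sets as the similarity measure and a simple random-sampling test for significance, and all names and identifiers are hypothetical.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two annotation sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coherence(proteins, annotations):
    """Mean pairwise annotation similarity within a protein set."""
    pairs = list(combinations(proteins, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(annotations[p], annotations[q]) for p, q in pairs) / len(pairs)


def coherence_p_value(protein_set, annotations, n_perm=1000, seed=0):
    """Empirical p-value: how often a random same-size set drawn from the
    reference (all keys of `annotations`) is at least as coherent."""
    rng = random.Random(seed)
    reference = sorted(annotations)
    observed = coherence(protein_set, annotations)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(reference, len(protein_set))
        if coherence(sample, annotations) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical reference: four proteins share one term, sixteen do not.
    ann = {f"P{i}": {"GO:shared"} for i in range(4)}
    ann.update({f"Q{i}": {f"GO:{i}"} for i in range(16)})
    print(coherence_p_value(["P0", "P1", "P2", "P3"], ann, n_perm=200, seed=1))
```

The add-one correction is a conventional choice for sampling-based tests; a parametric test against the reference distribution would be an equally reasonable alternative.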
FunShift: a database of function shift analysis on protein subfamilies
Abhiman, Saraswathi; Sonnhammer, Erik L. L.
2005-01-01
Members of a protein family normally have a general biochemical function in common, but frequently one or more subgroups have evolved a slightly different function, such as different substrate specificity. It is important to detect such function shifts for a more accurate functional annotation. The FunShift database described here is a compilation of function shift analysis performed between subfamilies in protein families. It consists of two main components: (i) subfamilies derived from protein domain families and (ii) pairwise subfamily comparisons analyzed for function shift. The present release, FunShift 12, was derived from Pfam 12 and consists of 151 934 subfamilies derived from 7300 families. We carried out function shift analysis by two complementary methods on families with up to 500 members. From a total of 179 210 subfamily pairs, 62 384 were predicted to be functionally shifted in 2881 families. Each subfamily pair is provided with a markup of probable functional specificity-determining sites. Tools for searching and exploring the data are provided to make this database a valuable resource for protein function annotation. Knowledge of these functionally important sites will be useful for experimental biologists performing functional mutation studies. FunShift is available at http://FunShift.cgb.ki.se. PMID:15608176
Functional Evolution of PLP-dependent Enzymes based on Active-Site Structural Similarities
Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert
2014-01-01
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5’-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the Comparison of Protein Active Site Structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. PMID:24920327
Functional evolution of PLP-dependent enzymes based on active-site structural similarities.
Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert
2014-10-01
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5'-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the comparison of protein active site structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional-fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. © 2014 Wiley Periodicals, Inc.
Du, Yushen; Wu, Nicholas C; Jiang, Lin; Zhang, Tianhao; Gong, Danyang; Shu, Sara; Wu, Ting-Ting; Sun, Ren
2016-11-01
Identification and annotation of functional residues are fundamental questions in protein sequence analysis. Sequence and structure conservation provides valuable information to tackle these questions. It is, however, limited by the incomplete sampling of sequence space in natural evolution. Moreover, proteins often have multiple functions, with overlapping sequences that present challenges to accurate annotation of the exact functions of individual residues by conservation-based methods. Using the influenza A virus PB1 protein as an example, we developed a method to systematically identify and annotate functional residues. We used saturation mutagenesis and high-throughput sequencing to measure the replication capacity of single nucleotide mutations across the entire PB1 protein. After predicting protein stability upon mutations, we identified functional PB1 residues that are essential for viral replication. To further annotate the functional residues important to the canonical or noncanonical functions of viral RNA-dependent RNA polymerase (vRdRp), we performed a homologous-structure analysis with 16 different vRdRp structures. We achieved high sensitivity in annotating the known canonical polymerase functional residues. Moreover, we identified a cluster of noncanonical functional residues located in the loop region of the PB1 β-ribbon. We further demonstrated that these residues were important for PB1 protein nuclear import through the interaction with Ran-binding protein 5. In summary, we developed a systematic and sensitive method to identify and annotate functional residues that are not restrained by sequence conservation. Importantly, this method is generally applicable to other proteins about which homologous-structure information is available. To fully comprehend the diverse functions of a protein, it is essential to understand the functionality of individual residues. 
Current methods are highly dependent on evolutionary sequence conservation, which is usually limited by sampling size. Sequence conservation-based methods are further confounded by structural constraints and multifunctionality of proteins. Here we present a method that can systematically identify and annotate functional residues of a given protein. We used a high-throughput functional profiling platform to identify essential residues. Coupling it with homologous-structure comparison, we were able to annotate multiple functions of proteins. We demonstrated the method with the PB1 protein of influenza A virus and identified novel functional residues in addition to its canonical function as an RNA-dependent RNA polymerase. Not limited to virology, this method is generally applicable to other proteins that can be functionally selected and about which homologous-structure information is available. Copyright © 2016 Du et al.
A phylogenetic analysis of normal modes evolution in enzymes and its relationship to enzyme function
Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A.
2012-01-01
Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that the functional evolution can be inferred from the changes in the protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced Cα representation of the protein structure while enzymatic function is described by Enzyme Commission (EC) numbers. Similarity of the binding pocket dynamics at each branch of the protein family’s phylogeny was analyzed in two ways: 1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and 2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the alpha-amylase, D-isomer specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal modes analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. PMID:22651983
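The normal mode overlap mentioned above is commonly defined as the absolute cosine between two mode displacement vectors. The sketch below assumes that definition and that both modes are given over the same set of structurally equivalent coordinates. The function name and input layout are illustrative rather than taken from the paper.

```python
import math


def mode_overlap(mode_a, mode_b):
    """Absolute cosine between two normal-mode displacement vectors.

    Each mode is a flat sequence of displacements (x1, y1, z1, x2, ...)
    over the same, structurally aligned set of C-alpha atoms.
    """
    if len(mode_a) != len(mode_b):
        raise ValueError("modes must cover the same coordinates")
    dot = sum(a * b for a, b in zip(mode_a, mode_b))
    norm_a = math.sqrt(sum(a * a for a in mode_a))
    norm_b = math.sqrt(sum(b * b for b in mode_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-length mode vector")
    return abs(dot) / (norm_a * norm_b)


if __name__ == "__main__":
    # A mode and its sign-flipped copy describe the same motion.
    print(mode_overlap([1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]))
```

The absolute value is taken because the sign of a normal mode is arbitrary; an overlap near 1 means the two structures move along nearly the same direction, and a value near 0 means the motions are unrelated.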
Lai, Jason; Jin, Jing; Kubelka, Jan; Liberles, David A
2012-09-21
Since the dynamic nature of protein structures is essential for enzymatic function, it is expected that functional evolution can be inferred from the changes in protein dynamics. However, dynamics can also diverge neutrally with sequence substitution between enzymes without changes of function. In this study, a phylogenetic approach is implemented to explore the relationship between enzyme dynamics and function through evolutionary history. Protein dynamics are described by normal mode analysis based on a simplified harmonic potential force field applied to the reduced C(α) representation of the protein structure while enzymatic function is described by Enzyme Commission numbers. Similarity of the binding pocket dynamics at each branch of the protein family's phylogeny was analyzed in two ways: (1) explicitly by quantifying the normal mode overlap calculated for the reconstructed ancestral proteins at each end and (2) implicitly using a diffusion model to obtain the reconstructed lineage-specific changes in the normal modes. Both explicit and implicit ancestral reconstruction identified generally faster rates of change in dynamics compared with the expected change from neutral evolution at the branches of potential functional divergences for the α-amylase, D-isomer-specific 2-hydroxyacid dehydrogenase, and copper-containing amine oxidase protein families. Normal mode analysis added additional information over just comparing the RMSD of static structures. However, the branch-specific changes were not statistically significant compared to background function-independent neutral rates of change of dynamic properties and blind application of the analysis would not enable prediction of changes in enzyme specificity. Copyright © 2012 Elsevier Ltd. All rights reserved.
Stein, Matthias; Pilli, Manohar; Bernauer, Sabine; Habermann, Bianca H.; Zerial, Marino; Wade, Rebecca C.
2012-01-01
Background: Rab GTPases constitute the largest subfamily of the Ras protein superfamily. Rab proteins regulate organelle biogenesis and transport, and display distinct binding preferences for effector and activator proteins, many of which have not been elucidated yet. The underlying molecular recognition motifs, binding partner preferences and selectivities are not well understood. Methodology/Principal Findings: Comparative analysis of the amino acid sequences and the three-dimensional electrostatic and hydrophobic molecular interaction fields of 62 human Rab proteins revealed a wide range of binding properties with large differences between some Rab proteins. This analysis assists the functional annotation of Rab proteins 12, 14, 26, 37 and 41 and provided an explanation for the shared function of Rab3 and 27. Rab7a and 7b have very different electrostatic potentials, indicating that they may bind to different effector proteins and thus exert different functions. The subfamily V Rab GTPases, which are associated with endosomes, differ subtly in the interaction properties of their switch regions, and this may explain exchange factor specificity and exchange kinetics. Conclusions/Significance: We have analysed conservation of sequence and of molecular interaction fields to cluster and annotate the human Rab proteins. The analysis of three-dimensional molecular interaction fields provides detailed insight that is not available from a sequence-based approach alone. Based on our results, we predict novel functions for some Rab proteins and provide insights into their divergent functions and the determinants of their binding partner selectivity. PMID:22523562
Loo, Lit-Hsin; Laksameethanasan, Danai; Tung, Yi-Ling
2014-03-01
Protein subcellular localization is a major determinant of protein function. However, this important protein feature is often described in terms of discrete and qualitative categories of subcellular compartments, and therefore it has limited applications in quantitative protein function analyses. Here, we present Protein Localization Analysis and Search Tools (PLAST), an automated analysis framework for constructing and comparing quantitative signatures of protein subcellular localization patterns based on microscopy images. PLAST produces human-interpretable protein localization maps that quantitatively describe the similarities in the localization patterns of proteins and major subcellular compartments, without requiring manual assignment or supervised learning of these compartments. Using the budding yeast Saccharomyces cerevisiae as a model system, we show that PLAST is more accurate than existing, qualitative protein localization annotations in identifying known co-localized proteins. Furthermore, we demonstrate that PLAST can reveal protein localization-function relationships that are not obvious from these annotations. First, we identified proteins that have similar localization patterns and participate in closely-related biological processes, but do not necessarily form stable complexes with each other or localize at the same organelles. Second, we found an association between spatial and functional divergences of proteins during evolution. Surprisingly, as proteins with common ancestors evolve, they tend to develop more diverged subcellular localization patterns, but still occupy similar numbers of compartments. This suggests that divergence of protein localization might be more frequently due to the development of more specific localization patterns over ancestral compartments than the occupation of new compartments. 
PLAST enables systematic and quantitative analyses of protein localization-function relationships, and will be useful to elucidate protein functions and how these functions were acquired in cells from different organisms or species. A public web interface of PLAST is available at http://plast.bii.a-star.edu.sg. PMID:24603469
Fungal proteomics: from identification to function.
Doyle, Sean
2011-08-01
Some fungi cause disease in humans and plants, while others have demonstrable potential for the control of insect pests. In addition, fungi are also a rich reservoir of therapeutic metabolites and industrially useful enzymes. Detailed analysis of fungal biochemistry is now enabled by multiple technologies including protein mass spectrometry, genome and transcriptome sequencing and advances in bioinformatics. Yet, the assignment of function to fungal proteins, encoded either by in silico annotated, or unannotated genes, remains problematic. The purpose of this review is to describe the strategies used by many researchers to reveal protein function in fungi, and more importantly, to consolidate the nomenclature of 'unknown function protein' as opposed to 'hypothetical protein' - once any protein has been identified by protein mass spectrometry. A combination of approaches including comparative proteomics, pathogen-induced protein expression and immunoproteomics are outlined, which, when used in combination with a variety of other techniques (e.g. functional genomics, microarray analysis, immunochemical and infection model systems), appear to yield comprehensive and definitive information on protein function in fungi. The relative advantages of proteomic, as opposed to transcriptomic-only, analyses are also described. In the future, combined high-throughput, quantitative proteomics, allied to transcriptomic sequencing, are set to reveal much about protein function in fungi. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
Alamgir, Md; Eroukova, Veronika; Jessulat, Matthew; Xu, Jianhua; Golshani, Ashkan
2008-01-01
Background: Functional genomics has received considerable attention in the post-genomic era, as it aims to identify function(s) for different genes. One way to study gene function is to investigate the alterations in the responses of deletion mutants to different stimuli. Here we investigate the genetic profile of yeast non-essential gene deletion array (yGDA, ~4700 strains) for increased sensitivity to paromomycin, which targets the process of protein synthesis. Results: As expected, our analysis indicated that the majority of deletion strains (134) with increased sensitivity to paromomycin are involved in protein biosynthesis. The remaining strains can be divided into smaller functional categories: metabolism (45), cellular component biogenesis and organization (28), DNA maintenance (21), transport (20), others (38) and unknown (39). These may represent minor cellular target sites (side-effects) for paromomycin. They may also represent novel links to protein synthesis. One of these strains carries a deletion for a previously uncharacterized ORF, YBR261C, that we term TAE1 for Translation Associated Element 1. Our focused follow-up experiments indicated that deletion of TAE1 alters the ribosomal profile of the mutant cells. Also, the gene deletion strain for TAE1 has defects in both translation efficiency and fidelity. Miniaturized synthetic genetic array analysis further indicates that TAE1 genetically interacts with 16 ribosomal protein genes. Phenotypic suppression analysis using TAE1 overexpression also links TAE1 to protein synthesis. Conclusion: We show that a previously uncharacterized ORF, YBR261C, affects the process of protein synthesis and reaffirm that large-scale genetic profile analysis can be a useful tool to study novel gene function(s). PMID:19055778
Alamgir, Md; Eroukova, Veronika; Jessulat, Matthew; Xu, Jianhua; Golshani, Ashkan
2008-12-03
Functional genomics has received considerable attention in the post-genomic era, as it aims to identify function(s) for different genes. One way to study gene function is to investigate the alterations in the responses of deletion mutants to different stimuli. Here we investigate the genetic profile of yeast non-essential gene deletion array (yGDA, approximately 4700 strains) for increased sensitivity to paromomycin, which targets the process of protein synthesis. As expected, our analysis indicated that the majority of deletion strains (134) with increased sensitivity to paromomycin, are involved in protein biosynthesis. The remaining strains can be divided into smaller functional categories: metabolism (45), cellular component biogenesis and organization (28), DNA maintenance (21), transport (20), others (38) and unknown (39). These may represent minor cellular target sites (side-effects) for paromomycin. They may also represent novel links to protein synthesis. One of these strains carries a deletion for a previously uncharacterized ORF, YBR261C, that we term TAE1 for Translation Associated Element 1. Our focused follow-up experiments indicated that deletion of TAE1 alters the ribosomal profile of the mutant cells. Also, gene deletion strain for TAE1 has defects in both translation efficiency and fidelity. Miniaturized synthetic genetic array analysis further indicates that TAE1 genetically interacts with 16 ribosomal protein genes. Phenotypic suppression analysis using TAE1 overexpression also links TAE1 to protein synthesis. We show that a previously uncharacterized ORF, YBR261C, affects the process of protein synthesis and reaffirm that large-scale genetic profile analysis can be a useful tool to study novel gene function(s).
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.
Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H
2013-12-01
Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.
Pan-Cancer Analysis of Mutation Hotspots in Protein Domains.
Miller, Martin L; Reznik, Ed; Gauthier, Nicholas P; Aksoy, Bülent Arman; Korkut, Anil; Gao, Jianjiong; Ciriello, Giovanni; Schultz, Nikolaus; Sander, Chris
2015-09-23
In cancer genomics, recurrence of mutations in independent tumor samples is a strong indicator of functional impact. However, rare functional mutations can escape detection by recurrence analysis owing to lack of statistical power. We enhance statistical power by extending the notion of recurrence of mutations from single genes to gene families that share homologous protein domains. Domain mutation analysis also sharpens the functional interpretation of the impact of mutations, as domains more succinctly embody function than entire genes. By mapping mutations in 22 different tumor types to equivalent positions in multiple sequence alignments of domains, we confirm well-known functional mutation hotspots, identify uncharacterized rare variants in one gene that are equivalent to well-characterized mutations in another gene, detect previously unknown mutation hotspots, and provide hypotheses about molecular mechanisms and downstream effects of domain mutations. With the rapid expansion of cancer genomics projects, protein domain hotspot analysis will likely provide many more leads linking mutations in proteins to the cancer phenotype. Copyright © 2015 Elsevier Inc. All rights reserved.
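Pooling mutations across genes that share a domain depends on translating each protein position into a column of the domain's multiple sequence alignment. The sketch below shows that bookkeeping under simplifying assumptions: gene names and aligned sequences are hypothetical, each aligned sequence starts at residue 1 of its protein, and no statistical test is applied to the resulting counts.

```python
from collections import Counter


def residue_to_column(aligned_seq):
    """Map 1-based residue positions to 0-based alignment columns, skipping gaps."""
    mapping = {}
    position = 0
    for column, char in enumerate(aligned_seq):
        if char != "-":
            position += 1
            mapping[position] = column
    return mapping


def domain_hotspots(alignment, mutations):
    """Count mutations per alignment column across all genes in a domain family.

    alignment: {gene: aligned domain sequence with '-' gaps}
    mutations: iterable of (gene, 1-based residue position)
    """
    maps = {gene: residue_to_column(seq) for gene, seq in alignment.items()}
    counts = Counter()
    for gene, position in mutations:
        column = maps.get(gene, {}).get(position)
        if column is not None:  # ignore positions outside the aligned domain
            counts[column] += 1
    return counts


if __name__ == "__main__":
    alignment = {"GENE1": "MK-LV", "GENE2": "M-ALV"}  # hypothetical family
    mutations = [("GENE1", 3), ("GENE2", 3), ("GENE1", 1)]
    print(domain_hotspots(alignment, mutations).most_common())
```

A column hit in several genes, each only rarely, accumulates a count that none of the genes would reach alone, which is the gain in statistical power described above.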
ERIC Educational Resources Information Center
Terrell, Cassidy R.; Listenberger, Laura L.
2017-01-01
Recognizing that undergraduate students can benefit from analysis of 3D protein structure and function, we have developed a multiweek, inquiry-based molecular visualization project for Biochemistry I students. This project uses a virtual model of cyclooxygenase-1 (COX-1) to guide students through multiple levels of protein structure analysis. The…
Global Proteome Analysis Links Lysine Acetylation to Diverse Functions in Oryza Sativa.
Xue, Chao; Liu, Shuai; Chen, Chen; Zhu, Jun; Yang, Xibin; Zhou, Yong; Guo, Rui; Liu, Xiaoyu; Gong, Zhiyun
2018-01-01
Lysine acetylation (Kac) is an important protein post-translational modification in both eukaryotes and prokaryotes. Herein, we report the results of a global proteome analysis of Kac and its diverse functions in rice (Oryza sativa). We identified 1353 Kac sites in 866 proteins in rice seedlings. A total of 11 Kac motifs are conserved, and 45% of the identified proteins are localized to the chloroplast. Among all acetylated proteins, 38 Kac sites are combined in core histones. Bioinformatics analysis revealed that Kac occurs on a diverse range of proteins involved in a wide variety of biological processes, especially photosynthesis. Protein-protein interaction networks of the identified proteins provided further evidence that Kac contributes to a wide range of regulatory functions. Furthermore, we demonstrated that the acetylation level of histone H3 (lysine 27 and 36) is increased in response to cold stress. In summary, our approach comprehensively profiles the regulatory roles of Kac in the growth and development of rice. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Kwon, Daehong; Lee, Daehwan; Kim, Juyeon; Lee, Jongin; Sim, Mikang; Kim, Jaebum
2018-05-09
Proteins perform biological functions through cascading interactions with each other by forming protein complexes. As a result, interactions among proteins, called protein-protein interactions (PPIs), are not completely free from selection constraint during evolution. Therefore, the identification and analysis of PPI changes during evolution can give us new insight into the evolution of functions. Although many algorithms, databases and websites have been developed to aid the study of PPIs, most of them are limited to visualizing the structure and features of PPIs in a single chosen species and offer limited visualization functions. This leads to difficulties in the identification of different patterns of PPIs in different species and their functional consequences. To resolve these issues, we developed a web application, called INTER-Species Protein Interaction Analysis (INTERSPIA). Given a set of proteins of interest to the user, INTERSPIA first discovers additional proteins that are functionally associated with the input proteins and searches for different patterns of PPIs in multiple species through a server-side pipeline, and second visualizes the dynamics of PPIs in multiple species using an easy-to-use web interface. INTERSPIA is freely available at http://bioinfo.konkuk.ac.kr/INTERSPIA/.
Proteome analysis of the almond kernel (Prunus dulcis).
Li, Shugang; Geng, Fang; Wang, Ping; Lu, Jiankang; Ma, Meihu
2016-08-01
Almond (Prunus dulcis) is a popular tree nut worldwide and offers many benefits to human health. However, the importance of almond kernel proteins in the nutrition and function in human health requires further evaluation. The present study presents a systematic evaluation of the proteins in the almond kernel using proteomic analysis. The nutrient and amino acid content in almond kernels from Xinjiang is similar to that of American varieties; however, Xinjiang varieties have a higher protein content. Two-dimensional electrophoresis analysis demonstrated a wide distribution of molecular weights and isoelectric points of almond kernel proteins. A total of 434 proteins were identified by LC-MS/MS, and most were proteins that were experimentally confirmed for the first time. Gene ontology (GO) analysis of the 434 proteins indicated that they are mainly involved in primary biological processes, including metabolic processes (67.5%), cellular processes (54.1%), and single-organism processes (43.4%); that their main molecular functions are catalytic activity (48.0%), binding (45.4%), and structural molecule activity (11.9%); and that they are primarily distributed in the cell (59.9%), organelle (44.9%), and membrane (22.8%). Almond kernel is a source of a wide variety of proteins. This study provides important information contributing to the screening and identification of almond proteins, the understanding of almond protein function, and the development of almond protein products. © 2015 Society of Chemical Industry.
PredictProtein—an open resource for online prediction of protein structural and functional features
Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard
2014-01-01
PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431
Functionalizing Microporous Membranes for Protein Purification and Protein Digestion
NASA Astrophysics Data System (ADS)
Dong, Jinlan; Bruening, Merlin L.
2015-07-01
This review examines advances in the functionalization of microporous membranes for protein purification and the development of protease-containing membranes for controlled protein digestion prior to mass spectrometry analysis. Recent studies confirm that membranes are superior to bead-based columns for rapid protein capture, presumably because convective mass transport in membrane pores rapidly brings proteins to binding sites. Modification of porous membranes with functional polymeric films or TiO2 nanoparticles yields materials that selectively capture species ranging from phosphopeptides to His-tagged proteins, and protein-binding capacities often exceed those of commercial beads. Thin membranes also provide a convenient framework for creating enzyme-containing reactors that afford control over residence times. With millisecond residence times, reactors with immobilized proteases limit protein digestion to increase sequence coverage in mass spectrometry analysis and facilitate elucidation of protein structures. This review emphasizes the advantages of membrane-based techniques and concludes with some challenges for their practical application.
Functionalizing Microporous Membranes for Protein Purification and Protein Digestion.
Dong, Jinlan; Bruening, Merlin L
2015-01-01
This review examines advances in the functionalization of microporous membranes for protein purification and the development of protease-containing membranes for controlled protein digestion prior to mass spectrometry analysis. Recent studies confirm that membranes are superior to bead-based columns for rapid protein capture, presumably because convective mass transport in membrane pores rapidly brings proteins to binding sites. Modification of porous membranes with functional polymeric films or TiO₂ nanoparticles yields materials that selectively capture species ranging from phosphopeptides to His-tagged proteins, and protein-binding capacities often exceed those of commercial beads. Thin membranes also provide a convenient framework for creating enzyme-containing reactors that afford control over residence times. With millisecond residence times, reactors with immobilized proteases limit protein digestion to increase sequence coverage in mass spectrometry analysis and facilitate elucidation of protein structures. This review emphasizes the advantages of membrane-based techniques and concludes with some challenges for their practical application.
Protein analysis: key to the future.
Boodhun, Nawsheen
2018-05-01
Protein analysis is crucial to elucidating the function of proteins and understanding the impact of their presence, absence and alteration. This is key to advancing knowledge about diseases, providing the opportunity for biomarker discovery and development of therapeutics. In this issue of Tech News, Nawsheen Boodhun explores the various means of protein analysis.
Müller, Boje; Groscurth, Sira; Menzel, Matthias; Rüping, Boris A.; Twyman, Richard M.; Prüfer, Dirk; Noll, Gundula A.
2014-01-01
Background and Aims Forisomes are specialized structural phloem proteins that mediate sieve element occlusion after wounding exclusively in papilionoid legumes, but most studies of forisome structure and function have focused on the Old World clade rather than the early lineages. A comprehensive phylogenetic, molecular, structural and functional analysis of forisomes from species covering a broad spectrum of the papilionoid legumes was therefore carried out, including the first analysis of Dipteryx panamensis forisomes, representing the earliest branch of the Papilionoideae lineage. The aim was to study the molecular, structural and functional conservation among forisomes from different tribes and to establish the roles of individual forisome subunits. Methods Sequence analysis and bioinformatics were combined with structural and functional analysis of native forisomes and artificial forisome-like protein bodies, the latter produced by expressing forisome genes from different legumes in a heterologous background. The structure of these bodies was analysed using a combination of confocal laser scanning microscopy (CLSM), scanning electron microscopy (SEM) and transmission electron microscopy (TEM), and the function of individual subunits was examined by combinatorial expression, micromanipulation and light microscopy. Key Results Dipteryx panamensis native forisomes and homomeric protein bodies assembled from the single sieve element occlusion by forisome (SEO-F) subunit identified in this species were structurally and functionally similar to forisomes from the Old World clade. In contrast, homomeric protein bodies assembled from individual SEO-F subunits from Old World species yielded artificial forisomes differing in proportion to their native counterparts, suggesting that multiple SEO-F proteins are required for forisome assembly in these plants. 
Structural differences between Medicago truncatula native forisomes, homomeric protein bodies and heteromeric bodies containing all possible subunit combinations suggested that combinations of SEO-F proteins may fine-tune the geometric proportions and reactivity of forisomes. Conclusions It is concluded that forisome structure and function have been strongly conserved during evolution and that species-dependent subsets of SEO-F proteins may have evolved to fine-tune the structure of native forisomes. PMID:24694827
Proteomics profiling of interactome dynamics by colocalisation analysis (COLA).
Mardakheh, Faraz K; Sailem, Heba Z; Kümper, Sandra; Tape, Christopher J; McCully, Ryan R; Paul, Angela; Anjomani-Virmouni, Sara; Jørgensen, Claus; Poulogiannis, George; Marshall, Christopher J; Bakal, Chris
2016-12-20
Localisation and protein function are intimately linked in eukaryotes, as proteins are localised to specific compartments where they come into proximity of other functionally relevant proteins. Significant co-localisation of two proteins can therefore be indicative of their functional association. We here present COLA, a proteomics based strategy coupled with a bioinformatics framework to detect protein-protein co-localisations on a global scale. COLA reveals functional interactions by matching proteins with significant similarity in their subcellular localisation signatures. The rapid nature of COLA allows mapping of interactome dynamics across different conditions or treatments with high precision.
Shah, Anup D; Inder, Kerry L; Shah, Alok K; Cristino, Alexandre S; McKie, Arthur B; Gabra, Hani; Davis, Melissa J; Hill, Michelle M
2016-10-07
Lipid rafts are dynamic membrane microdomains that orchestrate molecular interactions and are implicated in cancer development. To understand the functions of lipid rafts in cancer, we performed an integrated analysis of quantitative lipid raft proteomics data sets modeling progression in breast cancer, melanoma, and renal cell carcinoma. This analysis revealed that cancer development is associated with increased membrane raft-cytoskeleton interactions, with ∼40% of elevated lipid raft proteins being cytoskeletal components. Previous studies suggest a potential functional role for the raft-cytoskeleton in the action of the putative tumor suppressors PTRF/Cavin-1 and Merlin. To extend the observation, we examined lipid raft proteome modulation by an unrelated tumor suppressor opioid binding protein cell-adhesion molecule (OPCML) in ovarian cancer SKOV3 cells. In agreement with the other model systems, quantitative proteomics revealed that 39% of OPCML-depleted lipid raft proteins are cytoskeletal components, with microfilaments and intermediate filaments specifically down-regulated. Furthermore, protein-protein interaction network and simulation analysis showed significantly higher interactions among cancer raft proteins compared with general human raft proteins. Collectively, these results suggest increased cytoskeleton-mediated stabilization of lipid raft domains with greater molecular interactions as a common, functional, and reversible feature of cancer cells.
DNA mimic proteins: functions, structures, and bioinformatic analysis.
Wang, Hao-Ching; Ho, Chun-Han; Hsu, Kai-Cheng; Yang, Jinn-Moon; Wang, Andrew H-J
2014-05-13
DNA mimic proteins have DNA-like negative surface charge distributions, and they function by occupying the DNA binding sites of DNA binding proteins to prevent these sites from being accessed by DNA. DNA mimic proteins control the activities of a variety of DNA binding proteins and are involved in a wide range of cellular mechanisms such as chromatin assembly, DNA repair, transcription regulation, and gene recombination. However, the sequences and structures of DNA mimic proteins are diverse, making them difficult to predict by bioinformatic search. To date, only a few DNA mimic proteins have been reported. These DNA mimics were not found by searching for functional motifs in their sequences but were revealed only by structural analysis of their charge distribution. This review highlights the biological roles and structures of 16 reported DNA mimic proteins. We also discuss approaches that might be used to discover new DNA mimic proteins.
Greenfield, Norma J.
2009-01-01
Circular dichroism (CD) is an excellent spectroscopic technique for following the unfolding and folding of proteins as a function of temperature. One of its principal applications is to determine the effects of mutations and ligands on protein and polypeptide stability. If the change in CD as a function of temperature is reversible, analysis of the data may be used to determine the van't Hoff enthalpy (ΔH) and entropy (ΔS) of unfolding, the midpoint of the unfolding transition (TM) and the free energy (ΔG) of unfolding. Binding constants of protein-protein and protein-ligand interactions may also be estimated from the unfolding curves. Analysis of CD spectra obtained as a function of temperature is also useful to determine whether a protein has unfolding intermediates. Measurement of the spectra of five folded proteins and their unfolding curves at a single wavelength takes approximately eight hours. PMID:17406506
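The relationship between ΔH, TM, ΔG and the shape of a thermal unfolding curve can be illustrated with a short sketch. The abstract does not give the equations, so this assumes the standard two-state model with a temperature-independent ΔH (ΔCp = 0), in which ΔG(T) = ΔH(1 − T/TM). The function names and the numerical values used in the usage note (ΔH = 300 kJ/mol, TM = 330 K) are hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)

def fraction_unfolded(temp_k, delta_h, tm_k):
    """Two-state fraction unfolded at temp_k (K).

    Assumes delta_h (kJ/mol) is constant with temperature (ΔCp = 0).
    """
    delta_g = delta_h * (1.0 - temp_k / tm_k)      # free energy of unfolding
    k_unfold = math.exp(-delta_g / (R * temp_k))   # unfolding equilibrium constant
    return k_unfold / (1.0 + k_unfold)

def unfolding_entropy(delta_h, tm_k):
    """ΔS of unfolding at the midpoint, where ΔG = 0 and so ΔS = ΔH / TM."""
    return delta_h / tm_k
```

For example, `fraction_unfolded(330.0, 300.0, 330.0)` evaluates to 0.5 at the midpoint by construction. In practice the observed CD signal would be fitted to this curve to estimate ΔH and TM rather than the parameters being supplied directly.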
Mao, Song; Chai, Xiaoqiang; Hu, Yuling; Hou, Xugang; Tang, Yiheng; Bi, Cheng; Li, Xiao
2014-01-01
The mitochondrion plays a central role in diverse biological processes in most eukaryotes, and its dysfunctions are critically involved in a large number of diseases and the aging process. A systematic identification of mitochondrial proteomes and characterization of functional linkages among mitochondrial proteins are fundamental in understanding the mechanisms underlying biological functions and human diseases associated with mitochondria. Here we present MitProNet, a database that provides a comprehensive knowledgebase for the mitochondrial proteome, interactome and human diseases. First, an inventory of mammalian mitochondrial proteins was compiled by widely collecting proteomic datasets, and the proteins were classified by machine learning to achieve a high-confidence list of mitochondrial proteins. The current version of MitProNet covers 1124 high-confidence proteins, and the remainder were further classified as middle- or low-confidence. An organelle-specific network of functional linkages among mitochondrial proteins was then generated by integrating genomic features encoded by a wide range of datasets including genomic context, gene expression profiles, protein-protein interactions, functional similarity and metabolic pathways. The functional-linkage network should be a valuable resource for the study of biological functions of mitochondrial proteins and human mitochondrial diseases. Furthermore, we utilized the network to predict candidate genes for mitochondrial diseases using prioritization algorithms. All proteins, functional linkages and disease candidate genes in MitProNet were annotated according to the information collected from their original sources including GO, GEO, OMIM, KEGG, MIPS, HPRD and so on. MitProNet features a user-friendly graphic visualization interface to present functional analysis of linkage networks. 
As an up-to-date database and analysis platform, MitProNet should be particularly helpful in comprehensive studies of complicated biological mechanisms underlying mitochondrial functions and human mitochondrial diseases. MitProNet is freely accessible at http://bio.scu.edu.cn:8085/MitProNet. PMID:25347823
PROFESS: a PROtein Function, Evolution, Structure and Sequence database
Triplet, Thomas; Shortridge, Matthew D.; Griep, Mark A.; Stark, Jaime L.; Powers, Robert; Revesz, Peter
2010-01-01
The proliferation of biological databases and the easy access enabled by the Internet is having a beneficial impact on biological sciences and transforming the way research is conducted. There are ∼1100 molecular biology databases dispersed throughout the Internet. To assist in the functional, structural and evolutionary analysis of the abundant number of novel proteins continually identified from whole-genome sequencing, we introduce the PROFESS (PROtein Function, Evolution, Structure and Sequence) database. Our database is designed to be versatile and expandable and will not confine analysis to a pre-existing set of data relationships. A fundamental component of this approach is the development of an intuitive query system that incorporates a variety of similarity functions capable of generating data relationships not conceived during the creation of the database. The utility of PROFESS is demonstrated by the analysis of the structural drift of homologous proteins and the identification of potential pancreatic cancer therapeutic targets based on the observation of protein–protein interaction networks. Database URL: http://cse.unl.edu/∼profess/ PMID:20624718
Vyas, Sejal; Chesarone-Cataldo, Melissa; Todorova, Tanya; Huang, Yun-Han; Chang, Paul
2013-01-01
The poly(ADP-ribose) polymerase (PARP) family of proteins use NAD+ as their substrate to modify acceptor proteins with adenosine diphosphate-ribose (ADPr) modifications. The function of most PARPs under physiological conditions is unknown. Here, to better understand this protein family, we systematically analyze the cell cycle localization of each PARP and of poly(ADP-ribose), a product of PARP activity, then identify the knock-down phenotype of each protein and perform secondary assays to elucidate function. We show that most PARPs are cytoplasmic, identify cell cycle differences in the ratio of nuclear to cytoplasmic poly(ADP-ribose), and identify four phenotypic classes of PARP function. These include the regulation of membrane structures, cell viability, cell division, and the actin cytoskeleton. Further analysis of PARP14 shows that it is a component of focal adhesion complexes required for proper cell motility and focal adhesion function. In total, we show that PARP proteins are critical regulators of eukaryotic physiology. PMID:23917125
Heinz, Eva; Lithgow, Trevor
2014-01-01
Members of the Omp85/TpsB protein superfamily are ubiquitously distributed in Gram-negative bacteria, and function in protein translocation (e.g., FhaC) or the assembly of outer membrane proteins (e.g., BamA). Several recent findings are suggestive of a further level of variation in the superfamily, including the identification of the novel membrane protein assembly factor TamA and protein translocase PlpD. To investigate the diversity and the causal evolutionary events, we undertook a comprehensive comparative sequence analysis of the Omp85/TpsB proteins. A total of 10 protein subfamilies were apparent, distinguished in their domain structure and sequence signatures. In addition to the proteins FhaC, BamA, and TamA, for which structural and functional information is available, are families of proteins with so far undescribed domain architectures linked to the Omp85 β-barrel domain. This study brings a classification structure to a dynamic protein superfamily of high interest given its essential function for Gram-negative bacteria as well as its diverse domain architecture, and we discuss several scenarios of putative functions of these so far undescribed proteins. PMID:25101071
Network Analysis of Protein Adaptation: Modeling the Functional Impact of Multiple Mutations
Beleva Guthrie, Violeta; Masica, David L; Fraser, Andrew; Federico, Joseph; Fan, Yunfan; Camps, Manel; Karchin, Rachel
2018-01-01
The evolution of new biochemical activities frequently involves complex dependencies between mutations and rapid evolutionary radiation. Mutation co-occurrence and covariation have previously been used to identify compensating mutations that are the result of physical contacts and preserve protein function and fold. Here, we model pairwise functional dependencies and higher order interactions that enable evolution of new protein functions. We use a network model to find complex dependencies between mutations resulting from evolutionary trade-offs and pleiotropic effects. We present a method to construct these networks and to identify functionally interacting mutations in both extant and reconstructed ancestral sequences (Network Analysis of Protein Adaptation). The time ordering of mutations can be incorporated into the networks through phylogenetic reconstruction. We apply NAPA to three distantly homologous β-lactamase protein clusters (TEM, CTX-M-3, and OXA-51), each of which has experienced recent evolutionary radiation under substantially different selective pressures. By analyzing the network properties of each protein cluster, we identify key adaptive mutations, positive pairwise interactions, different adaptive solutions to the same selective pressure, and complex evolutionary trajectories likely to increase protein fitness. We also present evidence that incorporating information from phylogenetic reconstruction and ancestral sequence inference can reduce the number of spurious links in the network while preserving overall network community structure. The analysis does not require structural or biochemical data. In contrast to function-preserving mutation dependencies, which are frequently from structural contacts, gain-of-function mutation dependencies are most commonly between residues distal in protein structure. PMID:29522102
The Papillomavirus E2 proteins
DOE Office of Scientific and Technical Information (OSTI.GOV)
McBride, Alison A., E-mail: amcbride@nih.gov
2013-10-15
The papillomavirus E2 proteins are pivotal to the viral life cycle and have well characterized functions in transcriptional regulation, initiation of DNA replication and partitioning the viral genome. The E2 proteins also function in vegetative DNA replication, post-transcriptional processes and possibly packaging. This review describes structural and functional aspects of the E2 proteins and their binding sites on the viral genome. It is intended to be a reference guide to this viral protein. - Highlights: • Overview of E2 protein functions. • Structural domains of the papillomavirus E2 proteins. • Analysis of E2 binding sites in different genera of papillomaviruses. • Compilation of E2 associated proteins. • Comparison of key mutations in distinct E2 functions.
Mass Spectrometry Analysis of Spatial Protein Networks by Colocalization Analysis (COLA).
Mardakheh, Faraz K
2017-01-01
A major challenge in systems biology is comprehensive mapping of protein interaction networks. Crucially, such interactions are often dynamic in nature, necessitating methods that can rapidly mine the interactome across varied conditions and treatments to reveal change in the interaction networks. Recently, we described a fast mass spectrometry-based method to reveal functional interactions in mammalian cells on a global scale, by revealing spatial colocalizations between proteins (COLA) (Mardakheh et al., Mol Biosyst 13:92-105, 2017). As protein localization and function are inherently linked, significant colocalization between two proteins is a strong indication for their functional interaction. COLA uses rapid complete subcellular fractionation, coupled with quantitative proteomics to generate a subcellular localization profile for each protein quantified by the mass spectrometer. Robust clustering is then applied to reveal significant similarities in protein localization profiles, indicative of colocalization.
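The idea of inferring colocalization from similar subcellular localization profiles can be illustrated with a small sketch. It is not the COLA implementation: the abstract describes robust clustering, whereas this example swaps in a plain pairwise Pearson correlation with a fixed cutoff. The protein names, the four-fraction profiles, and the 0.9 threshold are all hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def colocalized_pairs(profiles, threshold=0.9):
    """Return protein pairs whose fractionation profiles correlate at or above threshold."""
    pairs = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(sorted(profiles.items()), 2):
        if pearson(prof_a, prof_b) >= threshold:
            pairs.append((name_a, name_b))
    return pairs

# Hypothetical relative abundances across four subcellular fractions (toy values).
profiles = {
    "P1": [0.70, 0.20, 0.05, 0.05],
    "P2": [0.65, 0.25, 0.05, 0.05],
    "P3": [0.05, 0.05, 0.20, 0.70],
}
pairs = colocalized_pairs(profiles)
```

A fixed correlation cutoff is the simplest possible choice here; a real analysis would need a significance estimate and a clustering step that tolerates measurement noise.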
Decomposition of Proteins into Dynamic Units from Atomic Cross-Correlation Functions.
Calligari, Paolo; Gerolin, Marco; Abergel, Daniel; Polimeno, Antonino
2017-01-10
In this article, we present a clustering method of atoms in proteins based on the analysis of the correlation times of interatomic distance correlation functions computed from MD simulations. The goal is to provide a coarse-grained description of the protein in terms of fewer elements that can be treated as dynamically independent subunits. Importantly, this domain decomposition method does not take into account structural properties of the protein. Instead, the clustering of protein residues in terms of networks of dynamically correlated domains is defined on the basis of the effective correlation times of the pair distance correlation functions. For these properties, our method stands as a complementary analysis to the customary protein decomposition in terms of quasi-rigid, structure-based domains. Results obtained for a prototypal protein structure illustrate the approach proposed.
USDA-ARS's Scientific Manuscript database
Male ejaculate proteins, including both sperm and seminal fluid proteins, play an important role in mediating reproductive biology. The function of ejaculate proteins can include enabling sperm-egg interactions, enhancing sperm storage, mediating female attractiveness, and even regulating female lif...
Protein function prediction using neighbor relativity in protein-protein interaction network.
Moosavi, Sobhan; Rahgozar, Masoud; Rahimi, Amir
2013-04-01
There is a large gap between the number of discovered proteins and the number of functionally annotated ones. Due to the high cost of determining protein function by wet-lab research, function prediction has become a major task for computational biology and bioinformatics. Some studies use protein interaction information to predict function for un-annotated proteins. In this paper, we propose a novel approach called "Neighbor Relativity Coefficient" (NRC) based on interaction network topology which estimates the functional similarity between two proteins. NRC is calculated for each pair of proteins based on their graph-based features including distance, common neighbors and the number of paths between them. In order to ascribe function to an un-annotated protein, NRC estimates a weight for each neighbor to transfer its annotation to the unknown protein. Finally, the unknown protein will be annotated by the top-scoring transferred functions. We also investigate the effect of using different coefficients for various types of functions. The proposed method has been evaluated on Saccharomyces cerevisiae and Homo sapiens interaction networks. The performance analysis demonstrates that NRC yields better results in comparison with previous protein function prediction approaches that use interaction networks. Copyright © 2012 Elsevier Ltd. All rights reserved.
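The general scheme of weighting each neighbor and transferring its annotations can be sketched as below. The abstract does not give the NRC formula, so this example does not reproduce it: the weight used here (one plus the number of common neighbors) is a simplified stand-in chosen for illustration, and the graph, annotations, and function names are hypothetical.

```python
from collections import defaultdict

def predict_functions(graph, annotations, protein, top_k=1):
    """Rank candidate functions for `protein` by weighted votes from its neighbors.

    graph: dict of protein -> set of interacting proteins (undirected)
    annotations: dict of protein -> set of known function labels
    The weight is an illustrative stand-in, not the published NRC.
    """
    scores = defaultdict(float)
    for neighbor in graph[protein]:
        common = len(graph[protein] & graph[neighbor])  # shared interaction partners
        weight = 1.0 + common
        for function in annotations.get(neighbor, ()):
            scores[function] += weight
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return ranked[:top_k]

# Hypothetical toy network: X is un-annotated.
graph = {
    "X": {"A", "B", "C"},
    "A": {"X", "B"},
    "B": {"X", "A"},
    "C": {"X"},
}
annotations = {"A": {"kinase"}, "B": {"kinase"}, "C": {"transport"}}
prediction = predict_functions(graph, annotations, "X", top_k=2)
```

Here the two mutually connected neighbors each carry more weight than the isolated one, so their shared label ranks first.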
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa.
Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A
2015-01-01
Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications in designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites, both at the protein level and at the level of the exon-intron structure of the coding gene, remains an open problem. One approach to this analysis is the projection of amino acid residue positions of the functional sites along with the exon boundaries to the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were located in the same exon or in nearby exons. We observed a tendency for exons that encode functional sites to cluster, and such clusters could be considered units of protein evolution. We studied the structural characteristics of the boundaries of exons that do and do not encode functional sites in 11 Metazoa species. We found a reduced frequency of intercodon gaps (phase 0) in exons encoding functional-site residues, which may be evidence of the existence of evolutionary limitations to exon shuffling. 
These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity.
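The projection of residue positions onto exon structure, and the notion of intron phase, can be illustrated with a minimal sketch. It assumes the only input is the list of coding-exon lengths in nucleotides, in transcript order, and uses the usual convention that a phase 0 intron falls between codons. The function names and the example exon lengths in the usage note are hypothetical.

```python
def intron_phases(exon_lengths):
    """Phase of each intron: coding bases preceding it, modulo 3 (0 = between codons)."""
    phases = []
    total = 0
    for length in exon_lengths[:-1]:  # there is no intron after the last exon
        total += length
        phases.append(total % 3)
    return phases

def exons_for_residue(exon_lengths, residue):
    """Return 0-based indices of the exons that encode a 1-based residue position.

    A residue whose codon is split by an intron maps to two exons.
    """
    first_base = 3 * (residue - 1) + 1  # 1-based coding coordinates of the codon
    last_base = 3 * residue
    hits = []
    start = 1
    for index, length in enumerate(exon_lengths):
        end = start + length - 1
        if start <= last_base and end >= first_base:
            hits.append(index)
        start = end + 1
    return hits
```

For instance, with coding exons of 10, 8 and 12 nucleotides, `intron_phases([10, 8, 12])` gives `[1, 0]`, and residue 4 (bases 10 to 12) is split across the first two exons because the first intron is phase 1.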
Protein interactions and ligand binding: from protein subfamilies to functional specificity.
Rausell, Antonio; Juan, David; Pazos, Florencio; Valencia, Alfonso
2010-02-02
The divergence accumulated during the evolution of protein families translates into their internal organization as subfamilies, and it is directly reflected in the characteristic patterns of differentially conserved residues. These specifically conserved positions in protein subfamilies are known as "specificity determining positions" (SDPs). Previous studies have limited their analysis to the study of the relationship between these positions and ligand-binding specificity, demonstrating significant yet limited predictive capacity. We have systematically extended this observation to include the role of differential protein interactions in the segregation of protein subfamilies and explored in detail the structural distribution of SDPs at protein interfaces. Our results show the extensive influence of protein interactions in the evolution of protein families and the widespread association of SDPs with protein interfaces. The combined analysis of SDPs in interfaces and ligand-binding sites provides a more complete picture of the organization of protein families, constituting the necessary framework for a large scale analysis of the evolution of protein function.
Zhang, Shuxing; Kaplan, Andrew H.; Tropsha, Alexander
2009-01-01
The Simplicial Neighborhood Analysis of Protein Packing (SNAPP) method was used to predict the effect of mutagenesis on the enzymatic activity of the HIV-1 protease (HIVP). SNAPP relies on a four-body statistical scoring function derived from the analysis of spatially nearest neighbor residue compositional preferences in a diverse and representative subset of protein structures from the Protein Data Bank. The method was applied to the analysis of HIVP mutants with residue substitutions in the hydrophobic core as well as at the interface between the two protease monomers. Both wild type and tethered structures were employed in the calculations. We obtained a strong correlation, with R² as high as 0.96, between ΔSNAPP score (i.e., the difference in SNAPP scores between wild type and mutant proteins) and the protease catalytic activity for tethered structures. A weaker but significant correlation was also obtained for non-tethered structures. Our analysis identified residues both in the hydrophobic core and at the dimeric interface (DI) that are very important for the protease function. This study demonstrates a potential utility of the SNAPP method for rational design of mutagenesis studies and protein engineering. PMID:18498108
Rice proteome database: a step toward functional analysis of the rice genome.
Komatsu, Setsuko
2005-09-01
The technique of proteome analysis using two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) has the power to monitor global changes that occur in the protein complement of tissues and subcellular compartments. In this study, the proteins of rice were cataloged, a rice proteome database was constructed, and a functional characterization of some of the identified proteins was undertaken. Proteins extracted from various tissues and subcellular compartments in rice were separated by 2D-PAGE and an image analyzer was used to construct a display of the proteins. The Rice Proteome Database contains 23 reference maps based on 2D-PAGE of proteins from various rice tissues and subcellular compartments. These reference maps comprise 13129 identified proteins, and the amino acid sequences of 5092 proteins are entered in the database. Major proteins involved in growth or stress responses were identified using the proteome approach. Some of these proteins, including a beta-tubulin, calreticulin, and ribulose-1,5-bisphosphate carboxylase/oxygenase activase in rice, have unexpected functions. The information obtained from the Rice Proteome Database will aid in cloning the genes for and predicting the function of unknown proteins.
Bioinformatics analysis of disordered proteins in prokaryotes.
Pavlović-Lažetić, Gordana M; Mitić, Nenad S; Kovačević, Jovana J; Obradović, Zoran; Malkov, Saša N; Beljanski, Miloš V
2011-03-02
A significant number of proteins have been shown to be intrinsically disordered, meaning that they lack a fixed 3D structure or contain regions that do not possess a well defined 3D structure. It has also been proven that a protein's disorder content is related to its function. We have performed an exhaustive analysis and comparison of the disorder content of proteins from prokaryotic organisms (i.e., superkingdoms Archaea and Bacteria) with respect to functional categories they belong to, i.e., Clusters of Orthologous Groups of proteins (COGs) and groups of COGs-Cellular processes (Cp), Information storage and processing (Isp), Metabolism (Me) and Poorly characterized (Pc). We also analyzed the disorder content of proteins with respect to various genomic, metabolic and ecological characteristics of the organism they belong to. We used correlations and association rule mining in order to identify the most confident associations between specific modalities of the characteristics considered and disorder content. Bacteria are shown to have a somewhat higher level of protein disorder than archaea, except for proteins in the Me functional group. It is demonstrated that the Isp and Cp functional groups in particular (specifically, the L-repair function and N-cell motility and secretion COGs of proteins) possess the highest disorder content, while Me proteins, in general, possess the lowest. Disorder fractions have been confirmed to have the lowest level for the so-called order-promoting amino acids and the highest level for the so-called disorder promoters. For each pair of organism characteristics, specific modalities are identified with the maximum disorder proteins in the corresponding organisms, e.g., high genome size-high GC content organisms, facultative anaerobic-low GC content organisms, aerobic-high genome size organisms, etc.
Maximum disorder in archaea is observed for high GC content-low genome size organisms, high GC content-facultative anaerobic or aquatic or mesophilic organisms, etc. Maximum disorder in bacteria is observed for high GC content-high genome size organisms, high genome size-aerobic organisms, etc. Some of the most reliable association rules mined establish relationships between high GC content and high protein disorder, medium GC content and both medium and low protein disorder, anaerobic organisms and medium protein disorder, Gammaproteobacteria and low protein disorder, etc. A web site Prokaryote Disorder Database has been designed and implemented at the address http://bioinfo.matf.bg.ac.rs/disorder, which contains complete results of the analysis of protein disorder performed for 296 prokaryotic completely sequenced genomes. Exhaustive disorder analysis has been performed by functional classes of proteins, for a larger dataset of prokaryotic organisms than previously done. Results obtained are well correlated to those previously published, with some extension in the range of disorder level and clear distinction between functional classes of proteins. Wide correlation and association analysis between protein disorder and genomic and ecological characteristics has been performed for the first time. The results obtained give insight into multi-relationships among the characteristics and protein disorder. Such analysis provides for better understanding of the evolutionary process and may be useful for taxon determination. The main drawback of the approach is the fact that the disorder considered has been predicted and not experimentally established.
A human functional protein interaction network and its application to cancer data analysis
2010-01-01
Background One challenge facing biologists is to tease out useful information from massive data sets for further analysis. A pathway-based analysis may shed light by projecting candidate genes onto protein functional relationship networks. We are building such a pathway-based analysis system. Results We have constructed a protein functional interaction network by extending curated pathways with non-curated sources of information, including protein-protein interactions, gene coexpression, protein domain interaction, Gene Ontology (GO) annotations and text-mined protein interactions, which cover close to 50% of the human proteome. By applying this network to two glioblastoma multiforme (GBM) data sets and projecting cancer candidate genes onto the network, we found that the majority of GBM candidate genes form a cluster and are closer than expected by chance, and the majority of GBM samples have sequence-altered genes in two network modules, one mainly comprising genes whose products are localized in the cytoplasm and plasma membrane, and another comprising gene products in the nucleus. Both modules are highly enriched in known oncogenes, tumor suppressors and genes involved in signal transduction. Similar network patterns were also found in breast, colorectal and pancreatic cancers. Conclusions We have built a highly reliable functional interaction network upon expert-curated pathways and applied this network to the analysis of two genome-wide GBM and several other cancer data sets. The network patterns revealed from our results suggest common mechanisms in the cancer biology. Our system should provide a foundation for a network or pathway-based analysis platform for cancer and other diseases. PMID:20482850
Exploring the evolution of protein function in Archaea.
Goncearenco, Alexander; Berezovsky, Igor N
2012-05-30
Despite recent progress in studies of the evolution of protein function, the questions what were the first functional protein domains and what were their basic building blocks remain unresolved. Previously, we introduced the concept of elementary functional loops (EFLs), which are the functional units of enzymes that provide elementary reactions in biochemical transformations. They are presumably descendants of primordial catalytic peptides. We analyzed distant evolutionary connections between protein functions in Archaea based on the EFLs comprising them. We show examples of the involvement of EFLs in new functional domains, as well as reutilization of EFLs and functional domains in building multidomain structures and protein complexes. Our analysis of the archaeal superkingdom yields the dominating mechanisms in different periods of protein evolution, which resulted in several levels of the organization of biochemical function. First, functional domains emerged as combinations of prebiotic peptides with the very basic functions, such as nucleotide/phosphate and metal cofactor binding. Second, domain recombination brought to the evolutionary scene the multidomain proteins and complexes. Later, reutilization and de novo design of functional domains and elementary functional loops complemented evolution of protein function.
A proteomic analysis of leaf sheaths from rice.
Shen, Shihua; Matsubae, Masami; Takao, Toshifumi; Tanaka, Naoki; Komatsu, Setsuko
2002-10-01
The proteins extracted from the leaf sheaths of rice seedlings were separated by 2-D PAGE, and analyzed by Edman sequencing and mass spectrometry, followed by database searching. Image analysis revealed 352 protein spots on 2-D PAGE after staining with Coomassie Brilliant Blue. The amino acid sequences of 44 of 84 proteins were determined; for 31 of these proteins, a clear function could be assigned, whereas for 12 proteins, no function could be assigned. Forty proteins did not yield amino acid sequence information, because they were N-terminally blocked, or the obtained sequences were too short and/or did not give unambiguous results. Fifty-nine proteins were analyzed by mass spectrometry; all of these proteins were identified by matching to the protein database. The amino acid sequences of 19 of 27 proteins analyzed by mass spectrometry were similar to the results of Edman sequencing. These results suggest that 2-D PAGE combined with Edman sequencing and mass spectrometry analysis can be effectively used to identify plant proteins.
atBioNet--an integrated network analysis tool for genomics and biomarker discovery.
Ding, Yijun; Chen, Minjun; Liu, Zhichao; Ding, Don; Ye, Yanbin; Zhang, Min; Kelly, Reagan; Guo, Li; Su, Zhenqiang; Harris, Stephen C; Qian, Feng; Ge, Weigong; Fang, Hong; Xu, Xiaowei; Tong, Weida
2012-07-20
Large amounts of mammalian protein-protein interaction (PPI) data have been generated and are available for public use. From a systems biology perspective, protein/gene interactions encode the key mechanisms distinguishing disease and health, and such mechanisms can be uncovered through network analysis. An effective network analysis tool should integrate different content-specific PPI databases into a comprehensive network format with a user-friendly platform to identify key functional modules/pathways and the underlying mechanisms of disease and toxicity. atBioNet integrates seven publicly available PPI databases into a network-specific knowledge base. Knowledge expansion is achieved by expanding a user-supplied protein/gene list with interactions from its integrated PPI network. The statistically significant functional modules are determined by applying a fast network-clustering algorithm (SCAN: a Structural Clustering Algorithm for Networks). The functional modules can be visualized either separately or together in the context of the whole network. Integration of pathway information enables enrichment analysis and assessment of the biological function of modules. Three case studies are presented using publicly available disease gene signatures as a basis to discover new biomarkers for acute leukemia, systemic lupus erythematosus, and breast cancer. The results demonstrated that atBioNet can not only identify functional modules and pathways related to the studied diseases, but this information can also be used to hypothesize novel biomarkers for future analysis. atBioNet is a free web-based network analysis tool that provides a systematic insight into protein/gene interactions through examining significant functional modules. The identified functional modules are useful for determining underlying mechanisms of disease and biomarker discovery. It can be accessed at: http://www.fda.gov/ScienceResearch/BioinformaticsTools/ucm285284.htm.
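To give a sense of how SCAN groups nodes, the Python sketch below implements the two quantities at the heart of the published algorithm: the structural similarity between adjacent nodes (shared closed neighborhood, normalized by the geometric mean of the neighborhood sizes) and the test for "core" nodes. It is not atBioNet's implementation and it stops short of the full clustering step; the function names, the toy network, and the thresholds `eps` and `mu` are illustrative assumptions.

```python
import math


def closed_neighborhoods(edges):
    """Map each node to the set of its neighbors plus the node itself."""
    gamma = {}
    for u, v in edges:
        gamma.setdefault(u, {u}).add(v)
        gamma.setdefault(v, {v}).add(u)
    return gamma


def structural_similarity(gamma, u, v):
    """Size of the shared closed neighborhood over the geometric mean of sizes."""
    shared = len(gamma[u] & gamma[v])
    return shared / math.sqrt(len(gamma[u]) * len(gamma[v]))


def core_nodes(gamma, eps, mu):
    """Nodes whose eps-neighborhood (self included) has at least mu members."""
    cores = set()
    for u in gamma:
        eps_neighborhood = [
            v for v in gamma[u] if structural_similarity(gamma, u, v) >= eps
        ]
        if len(eps_neighborhood) >= mu:
            cores.add(u)
    return cores


# Hypothetical toy network: a triangle A-B-C with a pendant node D attached to C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
gamma = closed_neighborhoods(toy_edges)
print(round(structural_similarity(gamma, "A", "C"), 3))  # 0.866
print(sorted(core_nodes(gamma, eps=0.7, mu=3)))          # ['A', 'B', 'C']
```

In the full algorithm, clusters are then grown outward from core nodes through their eps-neighborhoods, and nodes that join no cluster are labeled as hubs or outliers.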
Qian, Guoliang; Zhou, Yijing; Zhao, Yancun; Song, Zhiwei; Wang, Suyan; Fan, Jiaqin; Hu, Baishi; Venturi, Vittorio; Liu, Fengquan
2013-07-05
Quorum sensing (QS) in Xanthomonas oryzae pv. oryzicola (Xoc), the causal agent of bacterial leaf streak, is mediated by the diffusible signal factor (DSF). DSF-mediating QS has been shown to control virulence and a set of virulence-related functions; however, the expression profiles and functions of extracellular proteins controlled by DSF signal remain largely unclear. In the present study, 33 DSF-regulated extracellular proteins, whose functions include small-protein mediating QS, oxidative adaptation, macromolecule metabolism, cell structure, biosynthesis of small molecules, intermediary metabolism, cellular process, protein catabolism, and hypothetical function, were identified by proteomics in Xoc. Of these, 15 protein encoding genes were in-frame deleted, and 4 of them, including three genes encoding type II secretion system (T2SS)-dependent proteins and one gene encoding an Ax21 (activator of XA21-mediated immunity)-like protein (a novel small-protein type QS signal) were determined to be required for full virulence in Xoc. The contributions of these four genes to important virulence-associated functions, including bacterial colonization, extracellular polysaccharide, cell motility, biofilm formation, and antioxidative ability, are presented. To our knowledge, our analysis is the first complete list of DSF-regulated extracellular proteins and functions in a Xanthomonas species. Our results show that DSF-type QS played critical roles in regulation of T2SS and Ax21-mediating QS, which sheds light on the role of DSF signaling in Xanthomonas.
Functional annotation from the genome sequence of the giant panda.
Huo, Tong; Zhang, Yinjie; Lin, Jianping
2012-08-01
The giant panda is one of the most critically endangered species due to the fragmentation and loss of its habitat. Studying the functions of proteins in this animal, especially specific trait-related proteins, is therefore necessary to protect the species. In this work, the functions of these proteins were investigated using the genome sequence of the giant panda. Data on 21,001 proteins and their functions were stored in the Giant Panda Protein Database, in which the proteins were divided into two groups: 20,179 proteins whose functions can be predicted by GeneScan formed the known-function group, whereas 822 proteins whose functions cannot be predicted by GeneScan comprised the unknown-function group. For the known-function group, we further classified the proteins by molecular function, biological process, cellular component, and tissue specificity. For the unknown-function group, we developed a strategy in which the proteins were filtered by cross-Blast to identify panda-specific proteins under the assumption that proteins related to the panda-specific traits in the unknown-function group exist. After this filtering procedure, we identified 32 proteins (2 of which are membrane proteins) specific to the giant panda genome as compared against the dog and horse genomes. Based on their amino acid sequences, these 32 proteins were further analyzed by functional classification using SVM-Prot, motif prediction using MyHits, and interacting protein prediction using the Database of Interacting Proteins. Nineteen proteins were predicted to be zinc-binding proteins, thus affecting the activities of nucleic acids. The 32 panda-specific proteins will be further investigated by structural and functional analysis.
Hashemi, Seirana; Nowzari Dalini, Abbas; Jalali, Adrin; Banaei-Moghaddam, Ali Mohammad; Razaghi-Moghadam, Zahra
2017-08-16
Discriminating driver mutations from the ones that play no role in cancer is a severe bottleneck in elucidating molecular mechanisms underlying cancer development. Since protein domains are representatives of functional regions within proteins, mutations on them may disturb the protein functionality. Therefore, studying mutations at domain level may point researchers to more accurate assessment of the functional impact of the mutations. This article presents a comprehensive study to map mutations from 29 cancer types to both sequence- and structure-based domains. Statistical analysis was performed to identify candidate domains in which mutations occur with high statistical significance. For each cancer type, the corresponding type-specific domains were distinguished among all candidate domains. Subsequently, cancer type-specific domains facilitated the identification of specific proteins for each cancer type. Besides, performing interactome analysis on specific proteins of each cancer type showed high levels of interconnectivity among them, which implies their functional relationship. To evaluate the role of mitochondrial genes, stem cell-specific genes and DNA repair genes in cancer development, their mutation frequency was determined via further analysis. This study has provided researchers with a publicly available data repository for studying both CATH and Pfam domain regions on protein-coding genes. Moreover, the associations between different groups of genes/domains and various cancer types have been clarified. The work is available at http://www.cancerouspdomains.ir .
Directed proteomic analysis of the human nucleolus.
Andersen, Jens S; Lyon, Carol E; Fox, Archa H; Leung, Anthony K L; Lam, Yun Wah; Steen, Hanno; Mann, Matthias; Lamond, Angus I
2002-01-08
The nucleolus is a subnuclear organelle containing the ribosomal RNA gene clusters and ribosome biogenesis factors. Recent studies suggest it may also have roles in RNA transport, RNA modification, and cell cycle regulation. Despite over 150 years of research into nucleoli, many aspects of their structure and function remain uncharacterized. We report a proteomic analysis of human nucleoli. Using a combination of mass spectrometry (MS) and sequence database searches, including online analysis of the draft human genome sequence, 271 proteins were identified. Over 30% of the nucleolar proteins were encoded by novel or uncharacterized genes, while the known proteins included several unexpected factors with no previously known nucleolar functions. MS analysis of nucleoli isolated from HeLa cells in which transcription had been inhibited showed that a subset of proteins was enriched. These data highlight the dynamic nature of the nucleolar proteome and show that proteins can either associate with nucleoli transiently or accumulate only under specific metabolic conditions. This extensive proteomic analysis shows that nucleoli have a surprisingly large protein complexity. The many novel factors and separate classes of proteins identified support the view that the nucleolus may perform additional functions beyond its known role in ribosome subunit biogenesis. The data also show that the protein composition of nucleoli is not static and can alter significantly in response to the metabolic state of the cell.
Zhang, Dong-Mei; Feng, Li-Xing; Li, Lu; Liu, Miao; Jiang, Bao-Hong; Yang, Min; Li, Guo-Qiang; Wu, Wan-Ying; Guo, De-An; Liu, Xuan
2016-09-01
The sea dragon Solenognathus hardwickii has long been used as a traditional Chinese medicine for the treatment of various diseases, such as male impotency. To gain a comprehensive insight into the protein components of the sea dragon, shotgun proteomic analysis of its protein expression profiling was conducted in the present study. Proteins were extracted from dried sea dragon using a trichloroacetic acid/acetone precipitation method and then separated by SDS-PAGE. The protein bands were cut from the gel and digested by trypsin to generate a peptide mixture. The peptide fragments were then analyzed using nano liquid chromatography tandem mass spectrometry (nano-LC-ESI MS/MS). In total, 810 proteins and 1,577 peptides were identified in the dried sea dragon. The identified proteins exhibited molecular weight values ranging from 1,900 to 3,516,900 Da and pI values from 3.8 to 12.18. Bioinformatic analysis was conducted using the DAVID Bioinformatics Resources 6.7 Gene Ontology (GO) analysis tool to explore possible functions of the identified proteins. Ascribed functions of the proteins mainly included intracellular non-membrane-bound organelle, non-membrane-bounded organelle, cytoskeleton, structural molecule activity, calcium ion binding, etc. Furthermore, possible signal networks of the identified proteins were predicted using the STRING (Search Tool for the Retrieval of Interacting Genes) database. Ribosomal protein synthesis was found to play an important role in the signal network. The results of this study, to the best of our knowledge, were the first to provide a reference proteome profile for the sea dragon, and would aid in the understanding of the expression and functions of the identified proteins. Copyright © 2016 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.
An overview of the structures of protein-DNA complexes
Luscombe, Nicholas M; Austin, Susan E; Berman, Helen M; Thornton, Janet M
2000-01-01
On the basis of a structural analysis of 240 protein-DNA complexes contained in the Protein Data Bank (PDB), we have classified the DNA-binding proteins involved into eight different structural/functional groups, which are further classified into 54 structural families. Here we present this classification and review the functions, structures and binding interactions of these protein-DNA complexes. PMID:11104519
Metagenomics and the protein universe
Godzik, Adam
2011-01-01
Metagenomics sequencing projects have dramatically increased our knowledge of the protein universe and provided over one-half of currently known protein sequences; they have also introduced a much broader phylogenetic diversity into the protein databases. The full analysis of metagenomic datasets is only beginning, but it has already led to the discovery of thousands of new protein families, likely representing novel functions specific to given environments. At the same time, a deeper analysis of such novel families, including experimental structure determination of some representatives, suggests that most of them represent distant homologs of already characterized protein families, and thus most of the protein diversity present in the new environments is due to functional divergence of the known protein families rather than the emergence of new ones. PMID:21497084
Polanco, Carlos; Samaniego Mendoza, José Lino; Buhse, Thomas; Uversky, Vladimir N; Bañuelos Chao, Ingrid Paola; Bañuelos Cedano, Marcela Angola; Tavera, Fernando Michel; Tavera, Daniel Michel; Falconi, Manuel; Ponce de León, Abelardo Vela
2018-03-06
The number of fatalities and economic losses caused by the Ebola virus infection across the planet culminated in the havoc that occurred between August and November 2014. However, little is known about the molecular protein profile of this devastating virus. This work represents a thorough bioinformatics analysis of the regularities of charge distribution (polar profiles) in two groups of proteins and their functional domains associated with Ebola virus disease: Ebola virus proteins and human proteins interacting with Ebola virus. Our analysis reveals that each of these proteins contains a fragment, termed the "functional domain", whose polar profile is similar to that of the protein that contains it. Each protein is formed by a group of short sub-sequences, where each fragment has a different and distinctive polar profile and where the polar profile between adjacent short sub-sequences changes orderly and gradually to coincide with the polar profile of the whole protein. When using the charge distribution as a metric, it was observed that it effectively discriminates the proteins from their functional domains. As a counterexample, the same test was applied to a set of synthetic proteins built for that purpose, revealing that none of the regularities reported here for the Ebola virus proteins and human proteins interacting with Ebola virus were present in the synthetic proteins. Our results indicate that the polar profile of each protein studied and its corresponding functional domain are similar. Thus, when building each protein from its functional domain (adding one amino acid at a time and plotting its polar profile each time), it was observed that the resulting graphs can be divided into groups with similar polar profiles.
Li, Ying Hong; Xu, Jing Yu; Tao, Lin; Li, Xiao Feng; Li, Shuang; Zeng, Xian; Chen, Shang Ying; Zhang, Peng; Qin, Chu; Zhang, Cheng; Chen, Zhe; Zhu, Feng; Chen, Yu Zong
2016-01-01
Knowledge of protein function is important for biological, medical and therapeutic studies, but the functions of many proteins are still unknown. There is a need for improved functional prediction methods. Our SVM-Prot web-server employed a machine learning method for predicting protein functional families from protein sequences irrespective of similarity, which complemented similarity-based and other methods in predicting diverse classes of proteins, including distantly related proteins and homologous proteins of different functions. Since its publication in 2003, we have made major improvements to SVM-Prot with (1) expanded coverage from 54 to 192 functional families, (2) more diverse protein descriptors for protein representation, (3) improved predictive performance due to the use of more enriched training datasets and a greater variety of protein descriptors, (4) a newly integrated BLAST analysis option for assessing proteins in the SVM-Prot predicted functional families that were similar in sequence to a query protein, and (5) a newly added batch submission option for supporting the classification of multiple proteins. Moreover, two more machine learning approaches, K nearest neighbor and probabilistic neural networks, were added for facilitating collective assessment of protein functions by multiple methods. SVM-Prot can be accessed at http://bidd2.nus.edu.sg/cgi-bin/svmprot/svmprot.cgi.
Clustering and Network Analysis of Reverse Phase Protein Array Data.
Byron, Adam
2017-01-01
Molecular profiling of proteins and phosphoproteins using a reverse phase protein array (RPPA) platform, with a panel of target-specific antibodies, enables the parallel, quantitative proteomic analysis of many biological samples in a microarray format. Hence, RPPA analysis can generate a high volume of multidimensional data that must be effectively interrogated and interpreted. A range of computational techniques for data mining can be applied to detect and explore data structure and to form functional predictions from large datasets. Here, two approaches for the computational analysis of RPPA data are detailed: the identification of similar patterns of protein expression by hierarchical cluster analysis and the modeling of protein interactions and signaling relationships by network analysis. The protocols use freely available, cross-platform software, are easy to implement, and do not require any programming expertise. Serving as data-driven starting points for further in-depth analysis, validation, and biological experimentation, these and related bioinformatic approaches can accelerate the functional interpretation of RPPA data.
Sheffield, Jeanne; Taylor, Nigel; Fauquet, Claude; Chen, Sixue
2006-03-01
Using high-resolution 2-DE, we resolved proteins extracted from fibrous and tuberous root tissues of 3-month-old cassava plants. Gel image analysis revealed an average of 1467 electrophoretically resolved spots on the fibrous gels and 1595 spots on the tuberous gels in pH 3-10 range. Protein spots from both sets of gels were digested with trypsin. The digests were subjected to nanoelectrospray quadrupole TOF tandem mass analysis. Currently, we have obtained 299 protein identifications for 292 gel spots corresponding to 237 proteins. The proteins span various functional categories from energy, primary and secondary metabolism, disease and defense, destination and storage, transport, signal transduction, protein synthesis, cell structure, and transcription to cell growth and division. Gel image analysis has shown unique, as well as up- and down-regulated proteins, present in the tuberous and the fibrous tissues. Quantitative and qualitative analysis of the cassava root proteome is an important step towards further characterization of differentially expressed proteins and the elucidation of the mechanisms underlying the development and biological functions of the two types of roots.
Combs, Steven A; Mueller, Benjamin K; Meiler, Jens
2018-05-29
Partial covalent interactions (PCIs) in proteins, which include hydrogen bonds, salt bridges, cation-π, and π-π interactions, contribute to thermodynamic stability and facilitate interactions with other biomolecules. Several score functions have been developed within the Rosetta protein modeling framework that identify and evaluate these PCIs through analyzing the geometry between participating atoms. However, we hypothesize that PCIs can be unified through a simplified electron orbital representation. To test this hypothesis, we have introduced orbital based chemical descriptors for PCIs into Rosetta, called the PCI score function. Optimal geometries for the PCIs are derived from a statistical analysis of high-quality protein structures obtained from the Protein Data Bank (PDB), and the relative orientation of electron deficient hydrogen atoms and electron-rich lone pair or π orbitals are evaluated. We demonstrate that nativelike geometries of hydrogen bonds, salt bridges, cation-π, and π-π interactions are recapitulated during minimization of protein conformation. The packing density of tested protein structures increased from the standard score function from 0.62 to 0.64, closer to the native value of 0.70. Overall, rotamer recovery improved when using the PCI score function (75%) as compared to the standard Rosetta score function (74%). The PCI score function represents an improvement over the standard Rosetta score function for protein model scoring; in addition, it provides a platform for future directions in the analysis of small molecule to protein interactions, which depend on partial covalent interactions.
Evolution-Based Functional Decomposition of Proteins
Rivoire, Olivier; Reynolds, Kimberly A.; Ranganathan, Rama
2016-01-01
The essential biological properties of proteins—folding, biochemical activities, and the capacity to adapt—arise from the global pattern of interactions between amino acid residues. The statistical coupling analysis (SCA) is an approach to defining this pattern that involves the study of amino acid coevolution in an ensemble of sequences comprising a protein family. This approach indicates a functional architecture within proteins in which the basic units are coupled networks of amino acids termed sectors. This evolution-based decomposition has potential for new understandings of the structural basis for protein function. To facilitate its usage, we present here the principles and practice of the SCA and introduce new methods for sector analysis in a python-based software package (pySCA). We show that the pattern of amino acid interactions within sectors is linked to the divergence of functional lineages in a multiple sequence alignment—a model for how sector properties might be differentially tuned in members of a protein family. This work provides new tools for studying proteins and for generally testing the concept of sectors as the principal units of function and adaptive variation. PMID:27254668
Son, Ji-Hye; Hwang, Eurim C; Kim, Joungmok
2016-03-01
Ultraviolet radiation resistance-associated gene product (UVRAG) was originally identified as a protein involved in cellular responses to UV irradiation. Subsequent studies have demonstrated that UVRAG plays an important role in autophagy, a lysosome-dependent catabolic program, as a part of a pro-autophagy PIK3C3/VPS34 lipid kinase complex. Several recent studies have shown that UVRAG is also involved in autophagy-independent cellular functions, such as DNA repair/stability and vesicular trafficking/fusion. Here, we examined the UVRAG protein interactome to obtain information about its functional network. To this end, we screened UVRAG-interacting proteins using a tandem affinity purification method coupled with MALDI-TOF/MS analysis. Our results demonstrate that UVRAG interacts with various proteins involved in a wide spectrum of cellular functions, including genome stability, protein translational elongation, protein localization (trafficking), vacuole organization, transmembrane transport as well as autophagy. Notably, the interactome list of high-confidence UVRAG-interacting proteins is enriched for proteins involved in the regulation of genome stability. Our systematic UVRAG interactome analysis should provide important clues for understanding a variety of UVRAG functions.
Ramos, H.; Shannon, P.; Aebersold, R.
2008-01-01
Motivation: Mass spectrometry experiments in the field of proteomics produce lists containing tens to thousands of identified proteins. With the protein information and property explorer (PIPE), the biologist can acquire functional annotations for these proteins and explore the enrichment of the list, or fraction thereof, with respect to functional classes. These protein lists may be saved for access at a later time or different location. The PIPE is interoperable with the Firegoose and the Gaggle, permitting wide-ranging data exploration and analysis. The PIPE is a rich-client web application which uses AJAX capabilities provided by the Google Web Toolkit, and server-side data storage using Hibernate. Availability: http://pipe.systemsbiology.net Contact: pshannon@systemsbiology.org PMID:18635572
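The abstract above does not say which statistic the PIPE uses to score enrichment of a protein list for a functional class. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for this kind of question, not confirmed for the PIPE), the calculation could look like the following. The function name and argument names are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided hypergeometric p-value, P(X >= k).

    Assumed setting (not taken from the PIPE itself):
      N: proteins in the background set
      K: background proteins annotated with the functional class
      n: proteins in the identified list
      k: listed proteins annotated with the functional class
    """
    upper = min(n, K)
    # Sum the probabilities of observing k, k+1, ..., upper annotated proteins.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


if __name__ == "__main__":
    # Toy numbers: 2 of 2 listed proteins carry a class held by 3 of 10 proteins.
    print(enrichment_p_value(k=2, n=2, K=3, N=10))
```

A small p-value suggests the class is over-represented in the list relative to the background. With many classes tested at once, a multiple-testing correction would normally follow.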
Insights into Hox protein function from a large scale combinatorial analysis of protein domains.
Merabet, Samir; Litim-Mecheri, Isma; Karlsson, Daniel; Dixit, Richa; Saadaoui, Mehdi; Monier, Bruno; Brun, Christine; Thor, Stefan; Vijayraghavan, K; Perrin, Laurent; Pradel, Jacques; Graba, Yacine
2011-10-01
Protein function is encoded within protein sequence and protein domains. However, how protein domains cooperate within a protein to modulate overall activity and how this impacts functional diversification at the molecular and organism levels remains largely unaddressed. Focusing on three domains of the central class Drosophila Hox transcription factor AbdominalA (AbdA), we used combinatorial domain mutations and most known AbdA developmental functions as biological readouts to investigate how protein domains collectively shape protein activity. The results uncover redundancy, interactivity, and multifunctionality of protein domains as salient features underlying overall AbdA protein activity, providing means to apprehend functional diversity and accounting for the robustness of Hox-controlled developmental programs. Importantly, the results highlight context-dependency in protein domain usage and interaction, allowing major modifications in domains to be tolerated without general functional loss. The non-pleiotropic effect of domain mutation suggests that protein modification may contribute more broadly to molecular changes underlying morphological diversification during evolution, so far thought to rely largely on modification in gene cis-regulatory sequences.
Insights into Hox Protein Function from a Large Scale Combinatorial Analysis of Protein Domains
Karlsson, Daniel; Dixit, Richa; Saadaoui, Mehdi; Monier, Bruno; Brun, Christine; Thor, Stefan; Vijayraghavan, K.; Perrin, Laurent; Pradel, Jacques; Graba, Yacine
2011-01-01
Protein function is encoded within protein sequence and protein domains. However, how protein domains cooperate within a protein to modulate overall activity and how this impacts functional diversification at the molecular and organism levels remains largely unaddressed. Focusing on three domains of the central class Drosophila Hox transcription factor AbdominalA (AbdA), we used combinatorial domain mutations and most known AbdA developmental functions as biological readouts to investigate how protein domains collectively shape protein activity. The results uncover redundancy, interactivity, and multifunctionality of protein domains as salient features underlying overall AbdA protein activity, providing means to apprehend functional diversity and accounting for the robustness of Hox-controlled developmental programs. Importantly, the results highlight context-dependency in protein domain usage and interaction, allowing major modifications in domains to be tolerated without general functional loss. The non-pleiotropic effect of domain mutation suggests that protein modification may contribute more broadly to molecular changes underlying morphological diversification during evolution, so far thought to rely largely on modification in gene cis-regulatory sequences. PMID:22046139
Bioinformatics analysis of disordered proteins in prokaryotes
2011-01-01
Background A significant number of proteins have been shown to be intrinsically disordered, meaning that they lack a fixed 3D structure or contain regions that do not possess a well-defined 3D structure. It has also been proven that a protein's disorder content is related to its function. We have performed an exhaustive analysis and comparison of the disorder content of proteins from prokaryotic organisms (i.e., superkingdoms Archaea and Bacteria) with respect to functional categories they belong to, i.e., Clusters of Orthologous Groups of proteins (COGs) and groups of COGs-Cellular processes (Cp), Information storage and processing (Isp), Metabolism (Me) and Poorly characterized (Pc). We also analyzed the disorder content of proteins with respect to various genomic, metabolic and ecological characteristics of the organism they belong to. We used correlations and association rule mining in order to identify the most confident associations between specific modalities of the characteristics considered and disorder content. Results Bacteria are shown to have a somewhat higher level of protein disorder than archaea, except for proteins in the Me functional group. It is demonstrated that the Isp and Cp functional groups in particular (specifically the L-repair function and N-cell motility and secretion COGs of proteins) possess the highest disorder content, while Me proteins, in general, possess the lowest. Disorder fractions have been confirmed to have the lowest level for the so-called order-promoting amino acids and the highest level for the so-called disorder promoters. For each pair of organism characteristics, specific modalities are identified with the maximum disorder proteins in the corresponding organisms, e.g., high genome size-high GC content organisms, facultative anaerobic-low GC content organisms, aerobic-high genome size organisms, etc. 
Maximum disorder in archaea is observed for high GC content-low genome size organisms, high GC content-facultative anaerobic or aquatic or mesophilic organisms, etc. Maximum disorder in bacteria is observed for high GC content-high genome size organisms, high genome size-aerobic organisms, etc. Some of the most reliable association rules mined establish relationships between high GC content and high protein disorder, medium GC content and both medium and low protein disorder, anaerobic organisms and medium protein disorder, Gammaproteobacteria and low protein disorder, etc. A web site Prokaryote Disorder Database has been designed and implemented at the address http://bioinfo.matf.bg.ac.rs/disorder, which contains complete results of the analysis of protein disorder performed for 296 prokaryotic completely sequenced genomes. Conclusions Exhaustive disorder analysis has been performed by functional classes of proteins, for a larger dataset of prokaryotic organisms than previously done. Results obtained are well correlated to those previously published, with some extension in the range of disorder level and clear distinction between functional classes of proteins. Wide correlation and association analysis between protein disorder and genomic and ecological characteristics has been performed for the first time. The results obtained give insight into multi-relationships among the characteristics and protein disorder. Such analysis provides for better understanding of the evolutionary process and may be useful for taxon determination. The main drawback of the approach is the fact that the disorder considered has been predicted and not experimentally established. PMID:21366926
Network representation of protein interactions: Theory of graph description and analysis.
Kurzbach, Dennis
2016-09-01
A methodological framework is presented for the graph theoretical interpretation of NMR data of protein interactions. The proposed analysis generalizes the idea of network representations of protein structures by expanding it to protein interactions. This approach is based on regularization of residue-resolved NMR relaxation times and chemical shift data and subsequent construction of an adjacency matrix that represents the underlying protein interaction as a graph or network. The network nodes represent protein residues. Two nodes are connected if two residues are functionally correlated during the protein interaction event. The analysis of the resulting network enables the quantification of the importance of each amino acid of a protein for its interactions. Furthermore, the determination of the pattern of correlations between residues yields insights into the functional architecture of an interaction. This is of special interest for intrinsically disordered proteins, since the structural (three-dimensional) architecture of these proteins and their complexes is difficult to determine. The power of the proposed methodology is demonstrated using the example of the interaction between the intrinsically disordered protein osteopontin and its natural ligand heparin. © 2016 The Protein Society.
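The graph construction described above can be illustrated with a small sketch. It is not the author's procedure. It assumes that each residue has a short numeric profile (for example, an observable followed over a titration), that "functionally correlated" is approximated by an absolute Pearson correlation above a cutoff, and that residue importance is approximated by node degree. The regularization step is omitted, and the function names and the 0.9 cutoff are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def adjacency_matrix(profiles, threshold=0.9):
    """Connect residues i and j when |correlation| >= threshold (assumed cutoff)."""
    n = len(profiles)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(profiles[i], profiles[j])) >= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj


def degree(adj):
    """Number of neighbours per residue, a simple stand-in for residue importance."""
    return [sum(row) for row in adj]


if __name__ == "__main__":
    # Four hypothetical residues, each observed at four titration points.
    profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, -1, 1, -1]]
    print(degree(adjacency_matrix(profiles)))
```

Other centrality measures could replace the degree once the adjacency matrix exists. The abstract does not say which measure the method uses.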
Friso, Giulia; Giacomelli, Lisa; Ytterberg, A Jimmy; Peltier, Jean-Benoit; Rudella, Andrea; Sun, Qi; Wijk, Klaas J van
2004-02-01
An extensive analysis of the Arabidopsis thaliana peripheral and integral thylakoid membrane proteome was performed by sequential extractions with salt, detergent, and organic solvents, followed by multidimensional protein separation steps (reverse-phase HPLC and one- and two-dimensional electrophoresis gels), different enzymatic and nonenzymatic protein cleavage techniques, mass spectrometry, and bioinformatics. Altogether, 154 proteins were identified, of which 76 (49%) were alpha-helical integral membrane proteins. Twenty-seven new proteins without known function but with predicted chloroplast transit peptides were identified, of which 17 (63%) are integral membrane proteins. These new proteins, likely important in thylakoid biogenesis, include two rubredoxins, a potential metallochaperone, and a new DnaJ-like protein. The data were integrated with our analysis of the lumenal-enriched proteome. We identified 83 out of 100 known proteins of the thylakoid localized photosynthetic apparatus, including several new paralogues and some 20 proteins involved in protein insertion, assembly, folding, or proteolysis. An additional 16 proteins are involved in translation, demonstrating that the thylakoid membrane surface is an important site for protein synthesis. The high coverage of the photosynthetic apparatus and the identification of known hydrophobic proteins with low expression levels, such as cpSecE, Ohp1, and Ohp2, indicate an excellent dynamic resolution of the analysis. The sequential extraction process proved very helpful to validate transmembrane prediction. Our data also were cross-correlated to chloroplast subproteome analyses by other laboratories. All data are deposited in a new curated plastid proteome database (PPDB) with multiple search functions (http://cbsusrv01.tc.cornell.edu/users/ppdb/). This PPDB will serve as an expandable resource for the plant community.
Domain fusion analysis by applying relational algebra to protein sequence and domain databases
Truong, Kevin; Ikura, Mitsuhiko
2003-01-01
Background Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. Results This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at . Conclusion As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time. PMID:12734020
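The idea of expressing domain fusion analysis as relational joins can be sketched with an in-memory SQLite database. This is an illustration under stated assumptions, not the query from the paper. The single table `protein_domain(protein, organism, domain)`, the function name, and the toy identifiers are all hypothetical. The sketch does not filter out promiscuous domains or target proteins that already carry both domains.

```python
import sqlite3


def find_domain_fusions(rows, target_organism):
    """Return (protein_a, protein_b, composite) triples.

    rows: iterable of (protein, organism, domain) tuples (assumed schema).
    Two distinct proteins of the target organism are reported as putatively
    linked when some third protein contains a domain from each of them.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE protein_domain (protein TEXT, organism TEXT, domain TEXT)"
    )
    con.executemany("INSERT INTO protein_domain VALUES (?, ?, ?)", rows)
    query = """
        SELECT DISTINCT a.protein, b.protein, c1.protein
        FROM protein_domain AS c1
        JOIN protein_domain AS c2              -- composite protein: two domains
          ON c1.protein = c2.protein AND c1.domain < c2.domain
        JOIN protein_domain AS a               -- target protein with domain 1
          ON a.domain = c1.domain AND a.organism = ?
        JOIN protein_domain AS b               -- target protein with domain 2
          ON b.domain = c2.domain AND b.organism = ?
        WHERE a.protein <> b.protein
          AND a.protein <> c1.protein
          AND b.protein <> c1.protein
        ORDER BY a.protein, b.protein, c1.protein
    """
    result = con.execute(query, (target_organism, target_organism)).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    rows = [
        ("fusedAB", "E. coli", "PF_A"),
        ("fusedAB", "E. coli", "PF_B"),
        ("yA", "yeast", "PF_A"),
        ("yB", "yeast", "PF_B"),
        ("yC", "yeast", "PF_C"),
    ]
    print(find_domain_fusions(rows, "yeast"))
```

Because the logic is plain SQL, the same self-join could in principle be re-run whenever the underlying sequence and domain tables are refreshed, which is the property the abstract emphasizes.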
In Silico Analysis for the Study of Botulinum Toxin Structure
NASA Astrophysics Data System (ADS)
Suzuki, Tomonori; Miyazaki, Satoru
2010-01-01
Protein-protein interactions play many important roles in biological function. Knowledge of protein-protein complex structure is required for understanding that function. Because determining protein-protein complex structures experimentally remains difficult, computational prediction of protein structures by structure modeling and docking studies is a valuable method. In addition, MD simulation is one of the most popular methods for modeling and characterizing protein structure. Here, we attempt to predict the structure and properties of a protein-protein complex using several bioinformatic methods, focusing on the botulinum toxin complex as the target structure.
Functional analysis of proteins and protein species using shotgun proteomics and linear mathematics.
Hoehenwarter, Wolfgang; Chen, Yanmei; Recuenco-Munoz, Luis; Wienkoop, Stefanie; Weckwerth, Wolfram
2011-07-01
Covalent post-translational modification of proteins is the primary modulator of protein function in the cell. It greatly expands the functional potential of the proteome compared to the genome. In the past few years shotgun proteomics-based research, where the proteome is digested into peptides prior to mass spectrometric analysis, has been prolific in this area. It has determined the kinetics of tens of thousands of sites of covalent modification on an equally large number of proteins under various biological conditions and uncovered a transiently active regulatory network that extends into diverse branches of cellular physiology. In this review, we discuss this work in light of the concept of protein speciation, which emphasizes the entire post-translationally modified molecule and its interactions, and not just the modification site, as the functional entity. Sometimes, particularly when considering complex multisite modification, all of the modified molecular species involved in the investigated condition (the protein species) must be completely resolved for full understanding. We present a mathematical technique that delivers a good approximation for shotgun proteomics data.
[FANCA gene mutation analysis in Fanconi anemia patients].
Chen, Fei; Peng, Guang-Jie; Zhang, Kejian; Hu, Qun; Zhang, Liu-Qing; Liu, Ai-Guo
2005-10-01
To screen the FANCA gene mutation and explore the FANCA protein function in Fanconi anemia (FA) patients. FANCA protein expression and its interaction with FANCF were analyzed using Western blot and immunoprecipitation in 3 cases of FA-A. Genomic DNA was used for MLPA analysis followed by sequencing. FANCA protein was undetectable and FANCA and FANCF protein interaction was impaired in these 3 cases of FA-A. Each case of FA-A contained biallelic pathogenic mutations in FANCA gene. No functional FANCA protein was found in these 3 cases of FA-A, and intragenic deletion, frame shift and splice site mutation were the major pathogenic mutations found in FANCA gene.
Protein arginine methylation: Cellular functions and methods of analysis.
Pahlich, Steffen; Zakaryan, Rouzanna P; Gehring, Heinz
2006-12-01
During the last few years, new members of the growing family of protein arginine methyltransferases (PRMTs) have been identified and the role of arginine methylation in manifold cellular processes like signaling, RNA processing, transcription, and subcellular transport has been extensively investigated. In this review, we describe recent methods and findings that have yielded new insights into the cellular functions of arginine-methylated proteins, and we evaluate the currently used procedures for the detection and analysis of arginine methylation.
Lessons on RNA Silencing Mechanisms in Plants from Eukaryotic Argonaute Structures
Poulsen, Christian; Vaucheret, Hervé; Brodersen, Peter
2013-01-01
RNA silencing refers to a collection of gene regulatory mechanisms that use small RNAs for sequence specific repression. These mechanisms rely on ARGONAUTE (AGO) proteins that directly bind small RNAs and thereby constitute the central component of the RNA-induced silencing complex (RISC). AGO protein function has been probed extensively by mutational analyses, particularly in plants where large allelic series of several AGO proteins have been isolated. Structures of entire human and yeast AGO proteins have only very recently been obtained, and they allow more precise analyses of functional consequences of mutations obtained by forward genetics. To a large extent, these analyses support current models of regions of particular functional importance of AGO proteins. Interestingly, they also identify previously unrecognized parts of AGO proteins with profound structural and functional importance and provide the first hints at structural elements that have important functions specific to individual AGO family members. A particularly important outcome of the analysis concerns the evidence for existence of Gly-Trp (GW) repeat interactors of AGO proteins acting in the plant microRNA pathway. The parallel analysis of AGO structures and plant AGO mutations also suggests that such interactions with GW proteins may be a determinant of whether an endonucleolytically competent RISC is formed. PMID:23303917
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa
2015-01-01
Background Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications for designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of the functional sites at the protein level and at the level of the exon-intron structure of the coding gene remains a relevant problem. Results One approach to this analysis is the projection of amino acid residue positions of the functional sites along with the exon boundaries to the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. The observed tendency of exons that code functional sites to cluster suggests that such clusters could be considered as the unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residue functional site, which may be evidence of the existence of evolutionary limitations to the exon shuffling. 
Conclusions These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity. PMID:26693737
Li, Lei; Nelson, Clark J.; Solheim, Cory; Whelan, James; Millar, A. Harvey
2012-01-01
The growth and development of plant tissues is associated with an ordered succession of cellular processes that are reflected in the appearance and disappearance of proteins. The control of the kinetics of protein turnover is central to how plants can rapidly and specifically alter protein abundance and thus molecular function in response to environmental or developmental cues. However, the processes of turnover are largely hidden during periods of apparent steady-state protein abundance, and even when proteins accumulate it is unclear whether enhanced synthesis or decreased degradation is responsible. We have used a 15N labeling strategy with inorganic nitrogen sources coupled to a two-dimensional fluorescence difference gel electrophoresis and mass spectrometry analysis of two-dimensional IEF/SDS-PAGE gel spots to define the rate of protein synthesis (KS) and degradation (KD) of Arabidopsis cell culture proteins. Through analysis of MALDI-TOF/TOF mass spectra from 120 protein spots, we were able to quantify KS and KD for 84 proteins across six functional groups and observe over 65-fold variation in protein degradation rates. KS and KD correlate with functional roles of the proteins in the cell and the time in the cell culture cycle. This approach is based on progressive 15N labeling that is innocuous for the plant cells and, because it can be used to target analysis of proteins through the use of specific gel spots, it has broad applicability. PMID:22215636
DWARF – a data warehouse system for analyzing protein families
Fischer, Markus; Thai, Quan K; Grieb, Melanie; Pleiss, Jürgen
2006-01-01
Background The emerging field of integrative bioinformatics provides the tools to organize and systematically analyze vast amounts of highly diverse biological data and thus makes it possible to gain a novel understanding of complex biological systems. The data warehouse DWARF applies integrative bioinformatics approaches to the analysis of large protein families. Description The data warehouse system DWARF integrates data on sequence, structure, and functional annotation for protein fold families. The underlying relational data model consists of three major sections representing entities related to the protein (biochemical function, source organism, classification to homologous families and superfamilies), the protein sequence (position-specific annotation, mutant information), and the protein structure (secondary structure information, superimposed tertiary structure). Tools for extracting, transforming and loading data from publicly available resources (ExPDB, GenBank, DSSP) are provided to populate the database. The data can be accessed by an interface for searching and browsing, and by analysis tools that operate on annotation, sequence, or structure. We applied DWARF to the family of α/β-hydrolases to host the Lipase Engineering database. Release 2.3 contains 6138 sequences and 167 experimentally determined protein structures, which are assigned to 37 superfamilies and 103 homologous families. Conclusion DWARF has been designed for constructing databases of large structurally related protein families and for evaluating their sequence-structure-function relationships by a systematic analysis of sequence, structure and functional annotation. It has been applied to predict biochemical properties from sequence, and serves as a valuable tool for protein engineering. PMID:17094801
Li, H; Ji, H; Wu, S S; Hou, B X
2016-12-09
Objective: To analyze the protein expression profile and the potential virulence factors of Porphyromonas endodontalis (Pe) via comparison with that of two strains of Porphyromonas gingivalis (Pg) with high and low virulence, respectively. Methods: Whole cell comparative proteomics of Pe ATCC35406 was examined and compared with that of the high-virulence strain Pg W83 and the low-virulence strain Pg ATCC33277, respectively. Isobaric tags for relative and absolute quantitation (iTRAQ) combined with nano liquid chromatography-tandem mass spectrometry (Nano-LC-MS/MS) were adopted to identify and quantitate the proteins of Pe and the two strains of Pg with different virulence, using isotopically labeled peptides, mass spectrometric detection and bioinformatics analysis. The biological functions of similar proteins expressed by Pe ATCC35406 and the two strains of Pg were quantified and analyzed. Results: A total of 1 210 proteins were identified when Pe was compared with Pg W83. There were 130 proteins (10.74% of the total proteins) expressed similarly, including 89 proteins of known function and 41 proteins of unknown function. A total of 1 223 proteins were identified when Pe was compared with Pg ATCC33277. There were 110 proteins (8.99% of the total proteins) expressed similarly, including 72 proteins of known function and 38 proteins of unknown function. The similarly expressed proteins in Pe and the Pg strains of different virulence were mainly associated with catalytic activity and binding function, including recombination activation gene (RagA), lipoprotein, chaperonin DnaK, Clp family proteins (ClpC and ClpX) and various iron-binding proteins. They were involved in metabolism and cellular processes. In addition, the type and number of similar virulence proteins between Pe and high-virulence Pg were higher than those between Pe and low-virulence Pg. Conclusions: Lipoprotein, oxygen resistance protein and iron-binding protein were probably the potential virulence factors of Pe ATCC35406. 
It was speculated that the pathogenicity of Pe was more similar to that of the high-virulence Pg strain than to that of the low-virulence strain.
Chen, Qing; Rehman, S; Smant, G; Jones, John T
2005-07-01
RNA interference (RNAi) has been used widely as a tool for examining gene function, and a method that allows its use with plant-parasitic nematodes has recently been described. Here, we use a modified method to analyze the function of secreted beta-1,4-endoglucanases of the potato cyst nematode Globodera rostochiensis, the first in vivo functional analysis of a pathogenicity protein of a plant-parasitic nematode. Knockout of the beta-1,4-endoglucanases reduced the ability of the nematodes to invade roots. We also use RNAi to show that gr-ams-1, a secreted protein of the main sense organs (the amphids), is essential for host location.
Functional Advantages of Conserved Intrinsic Disorder in RNA-Binding Proteins.
Varadi, Mihaly; Zsolyomi, Fruzsina; Guharoy, Mainak; Tompa, Peter
2015-01-01
Proteins form large macromolecular assemblies with RNA that govern essential molecular processes. RNA-binding proteins have often been associated with conformational flexibility, yet the extent and functional implications of their intrinsic disorder have never been fully assessed. Here, through large-scale analysis of comprehensive protein sequence and structure datasets we demonstrate the prevalence of intrinsic structural disorder in RNA-binding proteins and domains. We addressed their functionality through a quantitative description of the evolutionary conservation of disordered segments involved in binding, and investigated the structural implications of flexibility in terms of conformational stability and interface formation. We conclude that the functional role of intrinsically disordered protein segments in RNA-binding is two-fold: first, these regions establish extended, conserved electrostatic interfaces with RNAs via induced fit. Second, conformational flexibility enables them to target different RNA partners, providing multi-functionality, while also ensuring specificity. These findings emphasize the functional importance of intrinsically disordered regions in RNA-binding proteins.
Barradas-Bautista, Didier; Moal, Iain H; Fernández-Recio, Juan
2017-07-01
Protein-protein interactions play fundamental roles in biological processes including signaling, metabolism, and trafficking. While the structure of a protein complex reveals crucial details about the interaction, it is often difficult to acquire this information experimentally. As the number of interactions discovered increases faster than they can be characterized, protein-protein docking calculations may be able to reduce this disparity by providing models of the interacting proteins. Rigid-body docking is a widely used docking approach, and is often capable of generating a pool of models within which a near-native structure can be found. These models need to be scored in order to select the acceptable ones from the set of poses. Recently, more than 100 scoring functions from the CCharPPI server were evaluated for this task using decoy structures generated with SwarmDock. Here, we extend this analysis to identify the predictive success rates of the scoring functions on decoys from three rigid-body docking programs, ZDOCK, FTDock, and SDOCK, allowing us to assess the transferability of the functions. We also apply a set-theoretic measure to test whether the scoring functions are capable of identifying near-native poses within different subsets of the benchmark. This information can guide the choice of the most efficient scoring function for each docking method, as well as inform future scoring function development efforts. Proteins 2017; 85:1287-1297. © 2017 Wiley Periodicals, Inc.
Analysis of the functional aspects and seminal plasma proteomic profile of sperm from smokers.
Antoniassi, Mariana Pereira; Intasqui, Paula; Camargo, Mariana; Zylbersztejn, Daniel Suslik; Carvalho, Valdemir Melechco; Cardozo, Karina H M; Bertolla, Ricardo Pimenta
2016-11-01
To evaluate the effect of smoking on sperm functional quality and seminal plasma proteomic profile. Sperm functional tests were performed in 20 non-smoking men with normal semen quality, according to the World Health Organization (2010) and in 20 smoking patients. These included: evaluation of DNA fragmentation by alkaline Comet assay; analysis of mitochondrial activity using DAB staining; and acrosomal integrity evaluation by PNA binding. The remaining semen was centrifuged and seminal plasma was used for proteomic analysis (liquid chromatography-tandem mass spectrometry). The quantified proteins were used for Venn diagram construction in Cytoscape 3.2.1 software, using the PINA4MS plug-in. Then, differentially expressed proteins were used for functional enrichment analysis of Gene Ontology categories, Kyoto Encyclopedia of Genes and Genomes and Reactome, using Cytoscape software and the ClueGO 2.2.0 plug-in. Smokers had a higher percentage of sperm DNA damage (Comet classes III and IV; P < 0.01), partially and fully inactive mitochondria (DAB classes III and IV; P = 0.001 and P = 0.006, respectively) and non-intact acrosomes (P < 0.01) when compared with the control group. With respect to proteomic analysis, 422 proteins were identified and quantified, of which one protein was absent, 27 proteins were under-represented and six proteins were over-represented in smokers. Functional enrichment analysis showed the enrichment of antigen processing and presentation, positive regulation of prostaglandin secretion involved in immune response, protein kinase A signalling and arachidonic acid secretion, complement activation, regulation of the cytokine-mediated signalling pathway and regulation of acute inflammatory response in the study group (smokers). In conclusion, cigarette smoking was associated with an inflammatory state in the accessory glands and in the testis, as shown by enriched proteomic pathways. 
This state causes an alteration in sperm functional quality, which is characterized by decreased acrosome integrity and mitochondrial activity, as well as by increased nuclear DNA fragmentation. © 2016 The Authors BJU International © 2016 BJU International Published by John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Champeimont, Raphaël; Laine, Elodie; Hu, Shuang-Wei; Penin, Francois; Carbone, Alessandra
2016-05-01
A novel computational approach of coevolution analysis allowed us to reconstruct the protein-protein interaction network of the Hepatitis C Virus (HCV) at the residue resolution. For the first time, coevolution analysis of an entire viral genome was realized, based on a limited set of protein sequences with high sequence identity within genotypes. The identified coevolving residues constitute highly relevant predictions of protein-protein interactions for further experimental identification of HCV protein complexes. The method can be used to analyse other viral genomes and to predict the associated protein interaction networks.
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.
Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin
2017-08-31
Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial for displaying the structure of biological networks. Here we present CytoCluster, a Cytoscape plugin integrating six clustering algorithms: HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), and IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), together with the BinGO (Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App Store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.
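The overrepresentation test mentioned in the abstract above can be illustrated with a short sketch. It assumes a hypergeometric tail probability, which is one common choice for GO enrichment; whether CytoCluster's BinGO function uses exactly this test, or applies a multiple-testing correction afterwards, is not stated in the abstract. The function name and arguments are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes when `sample`
    genes are drawn without replacement from `population` genes, of which
    `annotated` carry the GO term (hypergeometric upper tail)."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probabilities of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

# Toy numbers: 10 genes in the network, 5 carry the term,
# a cluster of 2 genes contains 2 annotated genes.
p_value = hypergeom_enrichment_p(10, 5, 2, 2)
```

A small p-value suggests the term is overrepresented in the cluster. In practice one p-value is computed per GO category, so a correction for multiple testing would normally follow.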
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks
Li, Min; Li, Dongyan; Tang, Yu; Wang, Jianxin
2017-01-01
Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial for displaying the structure of biological networks. Here we present CytoCluster, a Cytoscape plugin integrating six clustering algorithms: HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), and IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), together with the BinGO (Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App Store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster. PMID:28858211
Gao, Yanpan; Chen, Yanyu; Zhan, Shaohua; Zhang, Wenhao; Xiong, Feng; Ge, Wei
2017-01-31
Phagocytosis and autophagy in macrophages have been shown to be essential to both innate and adaptive immunity. Lysosomes are the main catabolic subcellular organelles responsible for degradation and recycling of both extracellular and intracellular material, which are the final steps in phagocytosis and autophagy. However, the molecular mechanisms underlying lysosomal functions after infection remain obscure. In this study, we conducted a quantitative proteomics analysis of the changes in constitution and glycosylation of proteins in lysosomes derived from murine RAW 264.7 macrophage cells treated with different types of pathogens comprising examples of bacteria (Listeria monocytogenes, L. m), DNA viruses (herpes simplex virus type-1, HSV-1) and RNA viruses (vesicular stomatitis virus, VSV). In total, 3,704 lysosome-related proteins and 300 potential glycosylation sites on 193 proteins were identified. Comparative analysis showed that the aforementioned pathogens induced distinct alterations in the proteome of the lysosome, which is closely associated with the immune functions of macrophages, such as toll-like receptor activation, inflammation and antigen-presentation. The most significant changes in proteins and fluctuations in glycosylation were also determined. Furthermore, Western blot analysis showed that the changes in expression of these proteins were undetectable at the whole cell level. Thus, our study provides unique insights into the function of lysosomes in macrophage activation and immune responses.
Verma, Vikash; Mallik, Leena; Hariadi, Rizal F.; Sivaramakrishnan, Sivaraj; Skiniotis, Georgios; Joglekar, Ajit P.
2015-01-01
DNA origami provides a versatile platform for conducting ‘architecture-function’ analysis to determine how the nanoscale organization of multiple copies of a protein component within a multi-protein machine affects its overall function. Such analysis requires that the copy number of protein molecules bound to the origami scaffold exactly matches the desired number, and that it is uniform over an entire scaffold population. This requirement is challenging to satisfy for origami scaffolds with many protein hybridization sites, because it requires the successful completion of multiple, independent hybridization reactions. Here, we show that a cleavable dimerization domain on the hybridizing protein can be used to multiplex hybridization reactions on an origami scaffold. This strategy yields nearly 100% hybridization efficiency on a 6-site scaffold even when using low protein concentration and short incubation time. It can also be developed further to enable reliable patterning of a large number of molecules on DNA origami for architecture-function analysis. PMID:26348722
SAFE Software and FED Database to Uncover Protein-Protein Interactions using Gene Fusion Analysis.
Tsagrasoulis, Dimosthenis; Danos, Vasilis; Kissa, Maria; Trimpalis, Philip; Koumandou, V Lila; Karagouni, Amalia D; Tsakalidis, Athanasios; Kossida, Sophia
2012-01-01
Domain Fusion Analysis takes advantage of the fact that certain proteins in a given proteome A are found to have statistically significant similarity to two separate proteins in another proteome B. In other words, the result of a fusion event between two separate proteins in proteome B is a specific full-length protein in proteome A. In such a case, it can be safely concluded that the protein pair has a common biological function or even interacts physically. In this paper, we present the Fusion Events Database (FED), a database for the maintenance and retrieval of fusion data in both prokaryotic and eukaryotic organisms, and the Software for the Analysis of Fusion Events (SAFE), a computational platform implemented for the automated detection, filtering and visualization of fusion events (both available at: http://www.bioacademy.gr/bioinformatics/projects/ProteinFusion/index.htm). Finally, we analyze the proteomes of three microorganisms using these tools in order to demonstrate their functionality.
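The idea behind domain fusion analysis can be sketched in a few lines. This is only an illustration of the principle described in the abstract above, not SAFE's actual algorithm: the input format (one tuple per statistically significant similarity hit, with the matched coordinates on the proteome A protein), the function name, and the `max_overlap` parameter are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

def find_fusion_candidates(hits, max_overlap=0):
    """hits: iterable of (a_id, b_id, a_start, a_end), one per significant
    similarity hit between a proteome A protein and a proteome B protein.
    Returns (a_id, (b_id_1, b_id_2)) for every A protein that matches two
    different B proteins on essentially non-overlapping regions."""
    by_query = defaultdict(list)
    for a_id, b_id, start, end in hits:
        by_query[a_id].append((b_id, start, end))

    candidates = []
    for a_id, matches in by_query.items():
        for (b1, s1, e1), (b2, s2, e2) in combinations(matches, 2):
            if b1 == b2:
                continue  # two hits to the same B protein are not a fusion
            # Number of shared positions on the A protein (<= 0 if disjoint).
            overlap = min(e1, e2) - max(s1, s2) + 1
            if overlap <= max_overlap:
                candidates.append((a_id, tuple(sorted((b1, b2)))))
    return candidates

# Toy hits: A1 looks like a fusion of B1 and B2; A2 does not,
# because its two hits cover the same region.
example_hits = [
    ("A1", "B1", 1, 200),
    ("A1", "B2", 210, 400),
    ("A2", "B3", 1, 300),
    ("A2", "B4", 50, 280),
]
fusion_candidates = find_fusion_candidates(example_hits)
```

Requiring the two matched regions to be disjoint helps separate a genuine fusion from two B proteins that are simply homologous to the same domain. A real pipeline would add further filtering on top of this.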
SAFE Software and FED Database to Uncover Protein-Protein Interactions using Gene Fusion Analysis
Tsagrasoulis, Dimosthenis; Danos, Vasilis; Kissa, Maria; Trimpalis, Philip; Koumandou, V. Lila; Karagouni, Amalia D.; Tsakalidis, Athanasios; Kossida, Sophia
2012-01-01
Domain Fusion Analysis takes advantage of the fact that certain proteins in a given proteome A are found to have statistically significant similarity to two separate proteins in another proteome B. In other words, the result of a fusion event between two separate proteins in proteome B is a specific full-length protein in proteome A. In such a case, it can be safely concluded that the protein pair has a common biological function or even interacts physically. In this paper, we present the Fusion Events Database (FED), a database for the maintenance and retrieval of fusion data in both prokaryotic and eukaryotic organisms, and the Software for the Analysis of Fusion Events (SAFE), a computational platform implemented for the automated detection, filtering and visualization of fusion events (both available at: http://www.bioacademy.gr/bioinformatics/projects/ProteinFusion/index.htm). Finally, we analyze the proteomes of three microorganisms using these tools in order to demonstrate their functionality. PMID:22267904
Protein Science by DNA Sequencing: How Advances in Molecular Biology Are Accelerating Biochemistry.
Higgins, Sean A; Savage, David F
2018-01-09
A fundamental goal of protein biochemistry is to determine the sequence-function relationship, but the vastness of sequence space makes comprehensive evaluation of this landscape difficult. However, advances in DNA synthesis and sequencing now allow researchers to assess the functional impact of every single mutation in many proteins, but challenges remain in library construction and the development of general assays applicable to a diverse range of protein functions. This Perspective briefly outlines the technical innovations in DNA manipulation that allow massively parallel protein biochemistry and then summarizes the methods currently available for library construction and the functional assays of protein variants. Areas in need of future innovation are highlighted with a particular focus on assay development and the use of computational analysis with machine learning to effectively traverse the sequence-function landscape. Finally, applications in the fundamentals of protein biochemistry, disease prediction, and protein engineering are presented.
Lisowska-Myjak, B; Skarżyńska, E; Bakun, M
2018-06-01
Intrauterine environmental factors can be associated with perinatal complications and long-term health outcomes although the underlying mechanisms remain poorly defined. Meconium formed exclusively in utero and passed naturally by a neonate may contain proteins which characterise the intrauterine environment. The aim of the study was proteomic analysis of the composition of meconium proteins and their classification by biological function. Proteomic techniques combining isoelectrofocussing fractionation and LC-MS/MS analysis were used to study the protein composition of a meconium sample obtained by pooling 50 serial meconium portions from 10 healthy full-term neonates. The proteins were classified by function based on the literature search for each protein in the PubMed database. A total of 946 proteins were identified in the meconium, including 430 proteins represented by two or more peptides. When the proteins were classified by their biological function the following were identified: immunoglobulin fragments and enzymatic, neutrophil-derived, structural and fetal intestine-specific proteins. Meconium is a rich source of proteins deposited in the fetal intestine during its development in utero. A better understanding of their specific biological functions in the intrauterine environment may help to identify these proteins which may serve as biomarkers associated with specific clinical conditions/diseases with the possible impact on the fetal development and further health consequences in infants, older children and adults.
Phosphoproteomic analysis of the Chlamydia caviae elementary body and reticulate body forms
Adams, Nancy E.; Maurelli, Anthony T.
2015-01-01
Chlamydia are Gram-negative, obligate intracellular bacteria responsible for significant diseases in humans and economically important domestic animals. These pathogens undergo a unique biphasic developmental cycle transitioning between the environmentally stable elementary body (EB) and the replicative intracellular reticulate body (RB), a conversion that appears to require extensive regulation of protein synthesis and function. However, Chlamydia possess a limited number of canonical mechanisms of transcriptional regulation. Ser/Thr/Tyr phosphorylation of proteins in bacteria has been increasingly recognized as an important mechanism of post-translational control of protein function. We utilized 2D gel electrophoresis coupled with phosphoprotein staining and MALDI-TOF/TOF analysis to map the phosphoproteome of the EB and RB forms of Chlamydia caviae. Forty-two non-redundant phosphorylated proteins were identified (some proteins were present in multiple locations within the gels). Thirty-four phosphorylated proteins were identified in EBs, including proteins found in central metabolism and protein synthesis, Chlamydia-specific hypothetical proteins and virulence-related proteins. Eleven phosphorylated proteins were identified in RBs, mostly involved in protein synthesis and folding and a single virulence-related protein. Only three phosphoproteins were found in both EB and RB phosphoproteomes. Collectively, 41 of 42 C. caviae phosphoproteins were present across Chlamydia species, consistent with the existence of a conserved chlamydial phosphoproteome. The abundance of stage-specific phosphoproteins suggests that protein phosphorylation may play a role in regulating the function of developmental-stage-specific proteins and/or may function in concert with other factors in directing EB–RB transitions. PMID:25998263
Phosphoproteomic analysis of the Chlamydia caviae elementary body and reticulate body forms.
Fisher, Derek J; Adams, Nancy E; Maurelli, Anthony T
2015-08-01
Chlamydia are Gram-negative, obligate intracellular bacteria responsible for significant diseases in humans and economically important domestic animals. These pathogens undergo a unique biphasic developmental cycle transitioning between the environmentally stable elementary body (EB) and the replicative intracellular reticulate body (RB), a conversion that appears to require extensive regulation of protein synthesis and function. However, Chlamydia possess a limited number of canonical mechanisms of transcriptional regulation. Ser/Thr/Tyr phosphorylation of proteins in bacteria has been increasingly recognized as an important mechanism of post-translational control of protein function. We utilized 2D gel electrophoresis coupled with phosphoprotein staining and MALDI-TOF/TOF analysis to map the phosphoproteome of the EB and RB forms of Chlamydia caviae. Forty-two non-redundant phosphorylated proteins were identified (some proteins were present in multiple locations within the gels). Thirty-four phosphorylated proteins were identified in EBs, including proteins found in central metabolism and protein synthesis, Chlamydia-specific hypothetical proteins and virulence-related proteins. Eleven phosphorylated proteins were identified in RBs, mostly involved in protein synthesis and folding and a single virulence-related protein. Only three phosphoproteins were found in both EB and RB phosphoproteomes. Collectively, 41 of 42 C. caviae phosphoproteins were present across Chlamydia species, consistent with the existence of a conserved chlamydial phosphoproteome. The abundance of stage-specific phosphoproteins suggests that protein phosphorylation may play a role in regulating the function of developmental-stage-specific proteins and/or may function in concert with other factors in directing EB-RB transitions.
WEBnm@ v2.0: Web server and services for comparing protein flexibility.
Tiwari, Sandhya P; Fuglebakk, Edvin; Hollup, Siv M; Skjærven, Lars; Cragnolini, Tristan; Grindhaug, Svenn H; Tekle, Kidane M; Reuter, Nathalie
2014-12-30
Normal mode analysis (NMA) using elastic network models is a reliable and cost-effective computational method to characterise protein flexibility and, by extension, protein dynamics. Further insight into the dynamics-function relationship can be gained by comparing protein motions between protein homologs and functional classifications. This can be achieved by comparing normal modes obtained from sets of evolutionarily related proteins. We have developed an automated tool for comparative NMA of a set of pre-aligned protein structures. The user can submit a sequence alignment in the FASTA format and the corresponding coordinate files in the Protein Data Bank (PDB) format. The computed normalised squared atomic fluctuations and atomic deformation energies of the submitted structures can be easily compared on graphs provided by the web user interface. The web server provides pairwise comparison of the dynamics of all proteins included in the submitted set using two measures: the Root Mean Squared Inner Product and the Bhattacharyya Coefficient. The Comparative Analysis has been implemented on our web server for NMA, WEBnm@, which also provides recently upgraded functionality for NMA of single protein structures. This includes new visualisations of protein motion, visualisation of inter-residue correlations and the analysis of conformational change using overlap analysis. In addition, programmatic access to WEBnm@ is now available through a SOAP-based web service. WEBnm@ is available at http://apps.cbu.uib.no/webnma. WEBnm@ v2.0 is an online tool offering unique capability for comparative NMA on multiple protein structures. Along with a convenient web interface, powerful computing resources, and several methods for mode analyses, WEBnm@ facilitates the assessment of protein flexibility within protein families and superfamilies. These analyses can give a good view of how the structures move and how the flexibility is conserved over the different structures.
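As an illustration of the first of the two comparison measures named in the abstract above, the sketch below computes a Root Mean Squared Inner Product (RMSIP) using its usual definition, the square root of the mean over modes of the summed squared dot products between two mode sets. It assumes both sets are orthonormal, equally sized, and already restricted to the aligned residues; how many modes WEBnm@ uses and how it handles alignment gaps are not stated in the abstract, and the function name is hypothetical.

```python
from math import sqrt

def rmsip(modes_a, modes_b):
    """RMSIP between two equally sized sets of orthonormal mode vectors.
    Each mode is a flat list of floats over the same aligned coordinates.
    The value is 1.0 for identical subspaces and 0.0 for orthogonal ones."""
    if not modes_a or len(modes_a) != len(modes_b):
        raise ValueError("both mode sets must be non-empty and equally sized")
    total = 0.0
    for u in modes_a:
        for v in modes_b:
            dot = sum(x * y for x, y in zip(u, v))
            total += dot * dot
    return sqrt(total / len(modes_a))

# Toy 4-dimensional example with two modes per "protein".
same = rmsip([[1, 0, 0, 0], [0, 1, 0, 0]], [[1, 0, 0, 0], [0, 1, 0, 0]])
disjoint = rmsip([[1, 0, 0, 0], [0, 1, 0, 0]], [[0, 0, 1, 0], [0, 0, 0, 1]])
```

Because the score depends only on the subspaces spanned by the two mode sets, it is insensitive to the order of modes within each set. That is useful when homologous proteins rank similar motions differently.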
Yang, Mei; Cong, Min; Peng, Xiuming; Wu, Junrui; Wu, Rina; Liu, Biao; Ye, Wenhui; Yue, Xiqing
2016-05-18
Milk fat globule membrane (MFGM) proteins have many functions. To explore the different proteomics of human and bovine MFGM, MFGM proteins were separated from human and bovine colostrum and mature milk, and analyzed by the iTRAQ proteomic approach. A total of 411 proteins were recognized and quantified. Among these, 232 kinds of differentially expressed proteins were identified. These differentially expressed proteins were analyzed based on multivariate analysis, gene ontology (GO) annotation and KEGG pathway. Biological processes involved were response to stimulus, localization, establishment of localization, and the immune system process. Cellular components engaged were the extracellular space, extracellular region parts, cell fractions, and vesicles. Molecular functions touched upon were protein binding, nucleotide binding, and enzyme inhibitor activity. The KEGG pathway analysis showed several pathways, including regulation of the actin cytoskeleton, focal adhesion, neurotrophin signaling pathway, leukocyte transendothelial migration, tight junction, complement and coagulation cascades, vascular endothelial growth factor signaling pathway, and adherens junction. These results enhance our understanding of different proteomes of human and bovine MFGM across different lactation phases, which could provide important information and potential directions for the infant milk powder and functional food industries.
Zurawski, S M; Zurawski, G
1988-01-01
We have analyzed structure-function relationships of the protein hormone murine interleukin 2 by fine structural deletion mapping. A total of 130 deletion mutant proteins, together with some substitution and insertion mutant proteins, was expressed in Escherichia coli and analyzed for their ability to sustain the proliferation of a cloned murine T cell line. This analysis has permitted a functional map of the protein to be drawn and classifies five segments of the protein, which together contain 48% of the sequence, as unessential to the biological activity of the protein. A further 26% of the protein is classified as important, but not crucial, for the activity. Three regions, consisting of amino acids 32-35, 66-77 and 119-141 contain the remaining 26% of the protein and are critical to the biological activity of the protein. The functional map is discussed in the context of the possible role of the identified critical regions in the structure of the hormone and its binding to the interleukin 2 receptor complex. PMID:3261239
Wang, Jingwen; Zhao, Yuqi; Wang, Yanjie; Huang, Jingfei
2013-01-16
Coevolution between proteins is crucial for understanding protein-protein interaction. Simultaneous changes allow a protein complex to maintain its overall structural-functional integrity. In this study, we combined statistical coupling analysis (SCA) and molecular dynamics simulations on the CDK6-CDKN2A protein complex to evaluate coevolution between proteins. We reconstructed an inter-protein residue coevolution network, consisting of 37 residues and 37 interactions. The network shows that most of the coevolved residue pairs are spatially proximal. When mutations occur, the stable local structures are disrupted, so the protein interaction is weakened or inhibited, with a consequent increase in the risk of melanoma. The identification of inter-protein coevolved residues in the CDK6-CDKN2A complex can be helpful for designing protein engineering experiments. Copyright © 2012 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Protein Sectors: Statistical Coupling Analysis versus Conservation
Teşileanu, Tiberiu; Colwell, Lucy J.; Leibler, Stanislas
2015-01-01
Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed “sectors”. The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. PMID:25723535
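The per-position sequence conservation that the abstract above identifies as the dominating factor can be sketched as follows. The example scores each alignment column by the Kullback-Leibler relative entropy of its amino acid frequencies against a background distribution, which is one common conservation measure; the uniform background, the skipping of gap characters, and the function name are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from math import log

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_conservation(alignment, background=None):
    """Relative entropy (in nats) of each alignment column against a
    background amino acid distribution. `alignment` is a list of equal-length
    sequences; characters outside the background (e.g. gaps) are ignored."""
    if background is None:
        # Assumption: uniform background; real analyses use observed frequencies.
        background = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    scores = []
    for pos in range(len(alignment[0])):
        residues = [seq[pos] for seq in alignment if seq[pos] in background]
        counts = Counter(residues)
        score = 0.0
        for aa, count in counts.items():
            freq = count / len(residues)
            score += freq * log(freq / background[aa])
        scores.append(score)
    return scores

# Toy alignment: column 0 is fully conserved, column 1 is fully variable.
toy_scores = column_conservation(["AC", "AD", "AE", "AF"])
```

Ranking positions by such a score gives a conservation-only set of predicted functional residues. That set can then be compared with the positions an SCA sector assigns to the same protein.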
Integrating protein structural dynamics and evolutionary analysis with Bio3D.
Skjærven, Lars; Yao, Xin-Qiu; Scarabelli, Guido; Grant, Barry J
2014-12-10
Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution. Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case. The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform independent R package under a GPL2 license from http://thegrantlab.org/bio3d/ .
Proteins related to the functions of fibroblast-like synoviocytes identified by proteomic analysis.
Zhang, Hui; Fan, Lie Ying; Zong, Ming; Sun, Li Shan; Lu, Liu
2012-01-01
It is well known that fibroblast-like synoviocytes (FLS) play a key role in the pathogenesis of rheumatoid arthritis (RA). This study was performed to separate the differentially expressed proteins of FLS from patients with RA or osteoarthritis (OA) by two-dimensional electrophoresis (2-DE), and to find proteins associated with the functions of FLS by mass spectrometry (MS). Total proteins were extracted and quantified from primary cultured FLS from patients with RA (n=8) or OA (n=6). Proteins were separated by high-resolution 2-DE, and the differentially expressed proteins were identified by MS. Western blot analysis was used to validate the expression of candidate proteins. The mRNA of these proteins was detected by semi-quantitative fluorescent PCR. The 2-DE gels showed 1147 protein spots from RA and 1324 protein spots from OA. We selected 84 protein spots for MS analysis, and 27 protein spots were successfully identified. We found that protein isoaspartyl methyltransferase (PIMT) and pirin (iron-binding nuclear protein, PIR), which had lower expression in RA, and thioredoxin 1 (Trx-1), which was expressed only in RA, may be associated with the functions of FLS. Western blot confirmed that the expression of PIMT and pirin was lower in RA and that Trx-1 was expressed only in RA. The results of semi-quantitative fluorescent PCR were also consistent with the 2-DE results. PIMT, pirin and Trx-1 affect the functions of FLS in some way and may be drug targets for RA.
Protein sectors: evolutionary units of three-dimensional structure
Halabi, Najeeb; Rivoire, Olivier; Leibler, Stanislas; Ranganathan, Rama
2011-01-01
Proteins display a hierarchy of structural features at primary, secondary, tertiary, and higher-order levels, an organization that guides our current understanding of their biological properties and evolutionary origins. Here, we reveal a structural organization distinct from this traditional hierarchy by statistical analysis of correlated evolution between amino acids. Applied to the S1A serine proteases, the analysis indicates a decomposition of the protein into three quasi-independent groups of correlated amino acids that we term “protein sectors”. Each sector is physically connected in the tertiary structure, has a distinct functional role, and constitutes an independent mode of sequence divergence in the protein family. Functionally relevant sectors are evident in other protein families as well, suggesting that they may be general features of proteins. We propose that sectors represent a structural organization of proteins that reflects their evolutionary histories. PMID:19703402
Xu, Shou-Ling; Chalkley, Robert J; Maynard, Jason C; Wang, Wenfei; Ni, Weimin; Jiang, Xiaoyue; Shin, Kihye; Cheng, Ling; Savage, Dasha; Hühmer, Andreas F R; Burlingame, Alma L; Wang, Zhi-Yong
2017-02-21
Genetic studies have shown essential functions of O-linked N-acetylglucosamine (O-GlcNAc) modification in plants. However, the proteins and sites subject to this posttranslational modification are largely unknown. Here, we report a large-scale proteomic identification of O-GlcNAc-modified proteins and sites in the model plant Arabidopsis thaliana. Using lectin weak affinity chromatography to enrich modified peptides, followed by mass spectrometry, we identified 971 O-GlcNAc-modified peptides belonging to 262 proteins. The modified proteins are involved in cellular regulatory processes, including transcription, translation, epigenetic gene regulation, and signal transduction. Many proteins have functions in developmental and physiological processes specific to plants, such as hormone responses and flower development. Mass spectrometric analysis of phosphopeptides from the same samples showed that a large number of peptides could be modified by either O-GlcNAcylation or phosphorylation, but cooccurrence of the two modifications in the same peptide molecule was rare. Our study generates a snapshot of the O-GlcNAc modification landscape in plants, indicating functions in many cellular regulation pathways and providing a powerful resource for further dissecting these functions at the molecular level.
Hu, Gang; Wu, Zhonghua
2017-01-01
Some of the intrinsically disordered proteins and protein regions are promiscuous interactors that are involved in one-to-many and many-to-one binding. Several studies have analyzed enrichment of intrinsic disorder among the promiscuous hub proteins. We extended these works by providing a detailed functional characterization of the disorder-enriched hub protein-protein interactions (PPIs), including both hubs and their interactors, and by analyzing their enrichment among disease-associated proteins. We focused on the human interactome, given its high degree of completeness and relevance to the analysis of the disease-linked proteins. We quantified and investigated numerous functional and structural characteristics of the disorder-enriched hub PPIs, including protein binding, structural stability, evolutionary conservation, several categories of functional sites, and presence of over twenty types of posttranslational modifications (PTMs). We showed that the disorder-enriched hub PPIs have a significantly enlarged number of disordered protein binding regions and long intrinsically disordered regions. They also include high numbers of targeting, catalytic, and many types of PTM sites. We empirically demonstrated that these hub PPIs are significantly enriched among 11 out of 18 considered classes of human diseases that are associated with at least 100 human proteins. Finally, we also illustrated how over a dozen specific human hubs utilize intrinsic disorder for their promiscuous PPIs. PMID:29257115
Elam, W Austin; Schrank, Travis P; Campagnolo, Andrew J; Hilser, Vincent J
2013-04-01
Intrinsically disordered (ID) proteins function in the absence of a unique stable structure and appear to challenge the classic structure-function paradigm. The extent to which ID proteins take advantage of subtle conformational biases to perform functions, and whether signals for such a mechanism can be identified in proteome-wide studies, is not well understood. Of particular interest is the polyproline II (PII) conformation, suggested to be highly populated in unfolded proteins. We experimentally determine a complete calorimetric propensity scale for the PII conformation. Projection of the scale into representative eukaryotic proteomes reveals significant PII bias in regions coding for ID proteins. Importantly, enrichment of PII in ID proteins, or protein segments, is also captured by other PII scales, indicating that this enrichment is robustly encoded and universally detectable regardless of the method of PII propensity determination. Gene ontology (GO) terms obtained using our PII scale and other scales demonstrate a consensus for molecular functions performed by high PII proteins across the proteome. Perhaps the most striking result of the GO analysis is conserved enrichment (P < 10^-8) of phosphorylation sites in high PII regions found by all PII scales. Subsequent conformational analysis reveals a phosphorylation-dependent modulation of PII, suggestive of a conserved "tunability" within these regions. In summary, the application of an experimentally determined polyproline II (PII) propensity scale to proteome-wide sequence analysis and gene ontology reveals an enrichment of PII bias near disordered phosphorylation sites that is conserved throughout eukaryotes. Copyright © 2013 The Protein Society.
Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S
2017-04-01
Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow identification of mechanistic determinants, the amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
Chakraborty, Sandeep; Nascimento, Rafael; Zaini, Paulo A; Gouran, Hossein; Rao, Basuthkar J; Goulart, Luiz R; Dandekar, Abhaya M
2016-01-01
Background. Xylella fastidiosa, the causative agent of various plant diseases including Pierce's disease in the US, and Citrus Variegated Chlorosis in Brazil, remains a continual source of concern and economic losses, especially since almost all commercial varieties are sensitive to this Gammaproteobacteria. Differential expression of proteins in infected tissue is an established methodology to identify key elements involved in plant defense pathways. Methods. In the current work, we developed a methodology named CHURNER that emphasizes relevant protein functions from proteomic data, based on identification of proteins with similar structures that do not necessarily have sequence homology. Such clustering emphasizes protein functions which have multiple copies that are up/down-regulated, and highlights similar proteins which are differentially regulated. As a working example we present proteomic data enumerating differentially expressed proteins in xylem sap from grapevines that were infected with X. fastidiosa. Results. Analysis of this data by CHURNER highlighted pathogenesis related PR-1 proteins, reinforcing this as the foremost protein function in xylem sap involved in the grapevine defense response to X. fastidiosa. β-1, 3-glucanase, which has both anti-microbial and anti-fungal activities, is also up-regulated. Simultaneously, chitinases are found to be both up and down-regulated by CHURNER, and thus the net gain of this protein function loses its significance in the defense response. Discussion. We demonstrate how structural data can be incorporated in the pipeline of proteomic data analysis prior to making inferences on the importance of individual proteins to plant defense mechanisms. We expect CHURNER to be applicable to any proteomic data set.
Proteomic analysis of bovine nucleolus.
Patel, Amrutlal K; Olson, Doug; Tikoo, Suresh K
2010-09-01
The nucleolus is the most prominent subnuclear structure and performs a wide variety of functions in eukaryotic cellular processes. In order to understand the structural and functional role of the nucleoli in bovine cells, we analyzed the proteomic composition of the bovine nucleoli. The nucleoli were isolated from Madin Darby bovine kidney cells and subjected to proteomic analysis by LC-MS/MS after fractionation by SDS-PAGE and strong cation exchange chromatography. Analysis of the data using the Mascot database search and the GPM database search identified 311 proteins in the bovine nucleoli, including 22 proteins not previously identified in the proteomic analysis of human nucleoli. Analysis of the identified proteins using the GoMiner software suggested that the bovine nucleoli contained proteins involved in ribosomal biogenesis, cell cycle control, transcriptional, translational and post-translational regulation, transport, and structural organization. Copyright © 2010 Beijing Genomics Institute. Published by Elsevier Ltd. All rights reserved.
Gene Fusion: A Genome Wide Survey
NASA Technical Reports Server (NTRS)
Liang, Ping; Riley, Monica
2001-01-01
It is well known that organisms form larger, complex multimodular (composite or chimeric) and mostly multi-functional proteins through fusion of two or more individual genes that have independent evolutionary histories and functions. We call each of these components a module. The existence of multimodular proteins may improve the efficiency of gene regulation and of cellular functions, and thus may give the host organism advantages in adapting to its environment. Analysis of all gene fusions in present-day organisms should allow us to examine the patterns of gene fusion in the context of cellular functions, to trace the evolutionary processes from the ancient smaller, uni-functional proteins to the present-day larger, complex multi-functional proteins, and to estimate the minimal number of ancestral proteins that existed in the last common ancestor of all life on earth. Although many multimodular proteins are known experimentally, systematic identification of gene fusion events at genome scale was not possible until recently, when large numbers of completed genome sequences became available. In addition, such analysis faces technical difficulties due to the complexity of this biological and evolutionary process. We report a new strategy to computationally identify multimodular proteins using completed genome sequences, together with the results of a survey of 22 organisms; data from over 40 organisms are to be presented during the meeting. Additional information is contained in the original extended abstract.
Assigning protein functions by comparative genome analysis protein phylogenetic profiles
Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.
2003-05-13
A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
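The phylogenetic-profile idea in this abstract can be illustrated with a minimal Python sketch. It assumes each protein family is described simply by the set of genomes in which a homolog occurs; the function names, the genome labels, and the `max_mismatches` threshold are hypothetical choices made for illustration and are not taken from the patent.

```python
from itertools import combinations


def phylogenetic_profile(present_in, genomes):
    """Return a tuple of 0/1 flags: is the family present in each genome?"""
    return tuple(1 if g in present_in else 0 for g in genomes)


def link_by_profile(families, genomes, max_mismatches=0):
    """Pair families whose profiles differ in at most max_mismatches genomes."""
    profiles = {name: phylogenetic_profile(present, genomes)
                for name, present in families.items()}
    links = []
    for a, b in combinations(sorted(profiles), 2):
        mismatches = sum(x != y for x, y in zip(profiles[a], profiles[b]))
        if mismatches <= max_mismatches:
            links.append((a, b))
    return links


# Toy data (hypothetical): four families across four genomes.
genomes = ["g1", "g2", "g3", "g4"]
families = {
    "P1": {"g1", "g3"},
    "P2": {"g1", "g3"},
    "P3": {"g2", "g4"},
    "P4": {"g1", "g2", "g3"},
}
exact_links = link_by_profile(families, genomes)
loose_links = link_by_profile(families, genomes, max_mismatches=1)
```

Here `exact_links` pairs only families with identical presence/absence patterns, while `loose_links` tolerates one differing genome. A real analysis would also need a homology search to decide presence and a statistical test for profile similarity, both of which this sketch leaves out.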
Sailem, Heba Z.; Kümper, Sandra; Tape, Christopher J.; McCully, Ryan R.; Paul, Angela; Anjomani-Virmouni, Sara; Jørgensen, Claus; Poulogiannis, George; Marshall, Christopher J.
2017-01-01
Localisation and protein function are intimately linked in eukaryotes, as proteins are localised to specific compartments where they come into proximity of other functionally relevant proteins. Significant co-localisation of two proteins can therefore be indicative of their functional association. We here present COLA, a proteomics based strategy coupled with a bioinformatics framework to detect protein–protein co-localisations on a global scale. COLA reveals functional interactions by matching proteins with significant similarity in their subcellular localisation signatures. The rapid nature of COLA allows mapping of interactome dynamics across different conditions or treatments with high precision. PMID:27824369
Domain fusion analysis by applying relational algebra to protein sequence and domain databases.
Truong, Kevin; Ikura, Mitsuhiko
2003-05-06
Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From the scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
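To make the relational-algebra idea concrete, the following is a small sketch of a domain fusion query written as a self-join in standard SQL and run through Python's built-in `sqlite3` module. The table name `protein_domain`, its columns, and the toy rows are assumptions made for illustration; the abstract does not give the authors' actual schema or queries.

```python
import sqlite3

# In-memory toy database: one row per (protein, organism, domain) assignment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE protein_domain (protein TEXT, organism TEXT, domain TEXT);
INSERT INTO protein_domain VALUES
  ('fusedAB', 'org2', 'D1'),
  ('fusedAB', 'org2', 'D2'),
  ('protA',   'org1', 'D1'),
  ('protB',   'org1', 'D2'),
  ('protC',   'org1', 'D3');
""")

# c1/c2: two different domains on the same composite protein.
# a/b:   two distinct proteins of the target organism, each carrying one
#        of those domains; they become a putative functionally linked pair.
query = """
SELECT DISTINCT a.protein, b.protein, c1.protein
FROM protein_domain AS c1
JOIN protein_domain AS c2
  ON c1.protein = c2.protein AND c1.domain < c2.domain
JOIN protein_domain AS a
  ON a.domain = c1.domain AND a.protein <> c1.protein
JOIN protein_domain AS b
  ON b.domain = c2.domain AND b.protein <> c1.protein
 AND b.organism = a.organism AND b.protein <> a.protein
WHERE a.organism = ?
ORDER BY a.protein, b.protein, c1.protein
"""
rows = conn.execute(query, ("org1",)).fetchall()
```

With the toy rows above, `rows` holds the single triple linking `protA` and `protB` through the composite `fusedAB`. This simplified query does not exclude component proteins that themselves carry both domains, which a fuller analysis would probably want to filter out.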
Random heteropolymers preserve protein function in foreign environments
NASA Astrophysics Data System (ADS)
Panganiban, Brian; Qiao, Baofu; Jiang, Tao; DelRe, Christopher; Obadia, Mona M.; Nguyen, Trung Dac; Smith, Anton A. A.; Hall, Aaron; Sit, Izaac; Crosby, Marquise G.; Dennis, Patrick B.; Drockenmuller, Eric; Olvera de la Cruz, Monica; Xu, Ting
2018-03-01
The successful incorporation of active proteins into synthetic polymers could lead to a new class of materials with functions found only in living systems. However, proteins rarely function under the conditions suitable for polymer processing. On the basis of an analysis of trends in protein sequences and characteristic chemical patterns on protein surfaces, we designed four-monomer random heteropolymers to mimic intrinsically disordered proteins for protein solubilization and stabilization in non-native environments. The heteropolymers, with optimized composition and statistical monomer distribution, enable cell-free synthesis of membrane proteins with proper protein folding for transport and enzyme-containing plastics for toxin bioremediation. Controlling the statistical monomer distribution in a heteropolymer, rather than the specific monomer sequence, affords a new strategy to interface with biological systems for protein-based biomaterials.
Huang, Hung-Jen; Chen, Wei-Yu; Wu, Jer-Horng
2014-01-01
Protein recovery is crucial for shotgun metaproteomic studies of the in situ functionality of microbial populations in complex biofilms, but it has so far been poorly addressed. To fill this knowledge gap, we systematically evaluated sample preparation with extraction buffers comprising four detergents for the metaproteomic analysis of a terephthalate-degrading methanogenic biofilm using an on-line two-dimensional liquid chromatography tandem mass spectrometry (2D-LC-MS/MS) system. In total, 1018 non-redundant proteins were identified with the four treatments. On the whole, each treatment recovered biofilm proteins with specific distributions of molecular weight, hydrophobicity, and isoelectric point. The extraction buffers containing zwitterionic and anionic detergents were found to harvest the proteins with better efficiency and quality, allowing identification of up to 76.2% of the total identified proteins with the LC-MS/MS analysis. According to the annotation with a relevant metagenomic database, we further observed different taxonomic profiles of bacterial and archaeal members and discriminable patterns of functional expression among the extraction buffers used. Overall, the findings of the present study provide a first insight into the effect of the detergents on the characteristics of extractable proteins from biofilm, and the developed protocol combined with nano 2D-LC/MS/MS analysis can improve metaproteomic studies of the microbial functionality of biofilms in wastewater treatment systems. PMID:24914765
Physical Model of the Genotype-to-Phenotype Map of Proteins
NASA Astrophysics Data System (ADS)
Tlusty, Tsvi; Libchaber, Albert; Eckmann, Jean-Pierre
2017-04-01
How DNA is mapped to functional proteins is a basic question of living matter. We introduce and study a physical model of protein evolution which suggests a mechanical basis for this map. Many proteins rely on large-scale motion to function. We therefore treat protein as learning amorphous matter that evolves towards such a mechanical function: Genes are binary sequences that encode the connectivity of the amino acid network that makes a protein. The gene is evolved until the network forms a shear band across the protein, which allows for long-range, soft modes required for protein function. The evolution reduces the high-dimensional sequence space to a low-dimensional space of mechanical modes, in accord with the observed dimensional reduction between genotype and phenotype of proteins. Spectral analysis of the space of 10^6 solutions shows a strong correspondence between localization around the shear band of both mechanical modes and the sequence structure. Specifically, our model shows how mutations are correlated among amino acids whose interactions determine the functional mode.
Predicting protein functions from redundancies in large-scale protein interaction networks
NASA Technical Reports Server (NTRS)
Samanta, Manoj Pratim; Liang, Shoudan
2003-01-01
Interpreting data from large-scale protein interaction experiments has been a challenging task because of the widespread presence of random false positives. Here, we present a network-based statistical algorithm that overcomes this difficulty and allows us to derive functions of unannotated proteins from large-scale interaction data. Our algorithm uses the insight that if two proteins share a significantly larger number of common interaction partners than expected at random, they have close functional associations. Analysis of publicly available data from Saccharomyces cerevisiae reveals >2,800 reliable functional associations, 29% of which involve at least one unannotated protein. By further analyzing these associations, we derive tentative functions for 81 unannotated proteins with high certainty. Our method is not overly sensitive to the false positives present in the data. Even after adding 50% randomly generated interactions to the measured data set, we are able to recover almost all (approximately 89%) of the original associations.
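The "more shared partners than expected at random" criterion can be sketched with a hypergeometric tail probability. This is an assumed stand-in chosen for illustration: the abstract does not state the exact statistic the authors use, and the function name and arguments below are hypothetical. For simplicity the sketch also draws partners from all `n_total` proteins rather than excluding the two proteins being compared.

```python
from math import comb


def shared_partner_pvalue(n_total, n_a, n_b, n_shared):
    """Probability of at least n_shared common partners by chance.

    Assumes protein A has n_a partners among n_total proteins and that
    the n_b partners of protein B are drawn uniformly at random from the
    same n_total proteins (a hypergeometric null model).
    """
    upper = min(n_a, n_b)
    total = comb(n_total, n_b)
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_shared, upper + 1))
    return tail / total


# Toy example: 10 proteins, A and B each have 3 partners, all 3 shared.
p_all_shared = shared_partner_pvalue(10, 3, 3, 3)
```

A small value suggests that the overlap is unlikely under the random model, so the pair would be a candidate functional association. In a genome-scale screen the resulting p-values would still need correction for the number of protein pairs tested.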
Longo, Liam; Lee, Jihun; Blaber, Michael
2012-12-01
The acquisition of function is often associated with destabilizing mutations, giving rise to the stability-function tradeoff hypothesis. To test whether function is also accommodated at the expense of foldability, fibroblast growth factor-1 (FGF-1) was subjected to a comprehensive φ-value analysis at each of the 11 turn regions. FGF-1, a β-trefoil fold, represents an excellent model system with which to evaluate the influence of function on foldability: because of its threefold symmetric structure, analysis of FGF-1 allows for direct comparisons between symmetry-related regions of the protein that are associated with function to those that are not; thus, a structural basis for regions of foldability can potentially be identified. The resulting φ-value distribution of FGF-1 is highly polarized, with the majority of positions described as either folded-like or denatured-like in the folding transition state. Regions important for folding are shown to be asymmetrically distributed within the protein architecture; furthermore, regions associated with function (i.e., heparin-binding affinity and receptor-binding affinity) are localized to regions of the protein that fold after barrier crossing (late in the folding pathway). These results provide experimental support for the foldability-function tradeoff hypothesis in the evolution of FGF-1. Notably, the results identify the potential for folding redundancy in symmetric protein architecture with important implications for protein evolution and design. Copyright © 2012 The Protein Society.
Odronitz, Florian; Kollmar, Martin
2006-11-29
Annotation of protein sequences of eukaryotic organisms is crucial for the understanding of their function in the cell. Manual annotation is still by far the most accurate way to correctly predict genes. The classification of protein sequences, their phylogenetic relation and the assignment of function involves information from various sources. This often leads to a collection of heterogeneous data, which is hard to track. Cytoskeletal and motor proteins consist of large and diverse superfamilies comprising up to several dozen members per organism. To date, there is no integrated tool available to assist in the manual large-scale comparative genomic analysis of protein families. Pfarao (Protein Family Application for Retrieval, Analysis and Organisation) is a database-driven online working environment for the analysis of manually annotated protein sequences and their relationships. Currently, the system can store and interrelate a wide range of information about protein sequences, species, phylogenetic relations and sequencing projects as well as links to literature and domain predictions. Sequences can be imported from multiple sequence alignments that are generated during the annotation process. A web interface allows users to conveniently browse the database and to compile tabular and graphical summaries of its content. We implemented a protein sequence-centric web application to store, organize, interrelate, and present heterogeneous data that is generated in manual genome annotation and comparative genomics. The application has been developed for the analysis of cytoskeletal and motor proteins (CyMoBase) but can easily be adapted for any protein.
Dynamical analysis of yeast protein interaction network during the sake brewing process.
Mirzarezaee, Mitra; Sadeghi, Mehdi; Araabi, Babak N
2011-12-01
Proteins interact with each other to perform the essential functions of an organism. They change partners to get involved in various processes at different times or locations. Studying variations of protein interactions within a specific process would help better understand the dynamic features of the protein interactions and their functions. We studied the protein interaction network of Saccharomyces cerevisiae (yeast) during the brewing of Japanese sake. In this process, yeast cells are exposed to several stresses. Analysis of protein interaction networks of yeast during this process helps to understand how protein interactions of yeast change during the sake brewing process. We used gene expression profiles of yeast cells for this purpose. Results of our experiments revealed some characteristics and behaviors of yeast hubs and non-hubs and their dynamical changes during the brewing process. We found that just a small portion of the proteins (12.8 to 21.6%) is responsible for the functional changes of the proteins in the sake brewing process. The changes in the number of edges and hubs of the yeast protein interaction networks increase in the first stages of the process and then decrease in the final stages.
Protein profile in HBx transfected cells: a comparative iTRAQ-coupled 2D LC-MS/MS analysis.
Feng, Huixing; Li, Xi; Niu, Dandan; Chen, Wei Ning
2010-06-16
The x protein of HBV (HBx) has been implicated in the development of hepatocellular carcinoma (HCC), with a possible link to individual genotypes. Nevertheless, the underlying mechanism remains obscure. In this study, we aim to identify the HBx-induced protein profile in HepG2 cells by LC-MS/MS proteomics analysis. Our results indicated that proteins were differentially expressed in HepG2 cells transfected by HBx of various genotypes. Proteins associated with the cytoskeleton were found to be either up-regulated (MACF1, HMGB1, Annexin A2) or down-regulated (Lamin A/C). These may in turn result in the decrease of focal adhesion and increase of cell migration in response to HBx. Levels of other cellular proteins with reported impact on the function of extracellular matrix (ECM) proteins and cell migration, including Ca²⁺-binding proteins (S100A11, S100A6, and S100A4) and proteasome protein (PSMA3), were affected by HBx. The differential protein profile identified in this study was also supported by our functional assay, which indicated that cell migration was enhanced by HBx. Our preliminary study provided a new platform to establish a comprehensive cellular protein profile by LC-MS/MS proteomics analysis. Further downstream functional assays, including our reported cell migration assay, should provide new insights into the association between HCC and HBx. Copyright 2009 Elsevier B.V. All rights reserved.
Liu, Hua-Wei; Liang, Chao-Qiong; Liu, Peng-Fei; Luo, Lai-Xin; Li, Jian-Qiang
2015-12-15
Since it was first reported in 1935, Cucumber green mottle mosaic virus (CGMMV) has become a serious pathogen in a range of cucurbit crops. The virus is generally transmitted by propagation materials, and to date no effective chemical or cultural methods of control have been developed to combat its spread. The current study presents a preliminary analysis of the pathogenic mechanisms from the perspective of protein expression levels in an infected cucumber host, with the objective of elucidating the infection process and potential strategies to reduce both the economic and yield losses associated with CGMMV. Isobaric tags for relative and absolute quantitation (iTRAQ) technology coupled with liquid chromatography-tandem mass spectrometric (LC-MS/MS) were used to identify the differentially expressed proteins in cucumber plants infected with CGMMV compared with mock-inoculated plants. The functions of the proteins were deduced by functional annotation and their involvement in metabolic processes explored by KEGG pathway analysis to identify their interactions during CGMMV infection, while their in vivo expression was further verified by qPCR. Infection by CGMMV altered both the expression level and absolute quantity of 38 proteins (fold change >0.6) in cucumber hosts. Of these, 23 were found to be up-regulated, while 15 were down-regulated. Gene ontology (GO) analysis revealed that 22 of the proteins had a combined function and were associated with molecular function (MF), biological process (BP) and cellular component (CC). Several other proteins had a dual function with 1, 7, and 2 proteins being associated with BP/CC, BP/MF, CC/MF, respectively. The remaining 3 proteins were only involved in MF. In addition, Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis identified 18 proteins that were involved in 13 separate metabolic pathways. 
These pathways were subsequently merged to generate three network diagrams illustrating the interactions between the different pathways, while qPCR was used to track the changes in expression levels of the proteins identified at 3 time points during CGMMV infection. Taken together these results greatly expand our understanding of the relationships between CGMMV and cucumber hosts. The results of the study indicate that CGMMV infection significantly changes the physiology of cucumbers, affecting the expression levels of individual proteins as well as entire metabolic pathways. The bioinformatic analysis also identified several pathogenesis-related (PR) proteins that could be useful in the development of disease-resistant plants.
Proteomic analysis and food-grade enzymes of Moringa oleifera Lam. flower.
Shi, Yanan; Wang, Xuefeng; Huang, Aixiang
2018-08-01
Moringa oleifera Lam. flowers are rich in protein and functional nutrients. Although the species has been studied extensively, no proteomic information is yet available for it. Shotgun 2DLC-MS/MS proteomic analysis of total protein from the flowers identified 9443 peptides corresponding to 4004 high-confidence proteins using Proteome Discoverer™ Software 2.1. Most of these proteins ranged between 40 and 70 kDa. Gene Ontology (GO) analysis indicated that the largest categories were cytoplasm (72.7%), catalytic activity (61.5%) and macromolecule metabolism (43.7%), and KEGG analysis revealed that the largest group, 129 proteins, was involved in the ribosome pathway, which directs protein synthesis (translation). Moreover, a number of commercially important food-grade enzymes were annotated: 261 proteins were annotated as carbohydrate-active enzymes, 16 as proteases, and 22 were assigned to the citrate cycle; the top proteins were assigned to the GH family, cysteine synthase and serine/threonine-protein phosphatase. These enzymes indicate that the flower is a new source with potential use in the fermentation and brewing industry, in fruit and vegetable storage, and in the development of functional peptides. Copyright © 2018 Elsevier B.V. All rights reserved.
Morais do Amaral, Alexandre; Antoniw, John; Rudd, Jason J.; Hammond-Kosack, Kim E.
2012-01-01
The Dothideomycete fungus Mycosphaerella graminicola is the causal agent of Septoria tritici blotch, a devastating disease of wheat leaves that causes dramatic decreases in yield. Infection involves an initial extended period of symptomless intercellular colonisation prior to the development of visible necrotic disease lesions. Previous functional genomics and gene expression profiling studies have implicated the production of secreted virulence effector proteins as key facilitators of the initial symptomless growth phase. In order to identify additional candidate virulence effectors, we re-analysed and catalogued the predicted protein secretome of M. graminicola isolate IPO323, which is currently regarded as the reference strain for this species. We combined several bioinformatic approaches in order to increase the probability of identifying truly secreted proteins with either a predicted enzymatic function or an as yet unknown function. An initial secretome of 970 proteins was predicted, whilst further stringent selection criteria predicted 492 proteins. Of these, 321 possess some functional annotation, the composition of which may reflect the strictly intercellular growth habit of this pathogen, leaving 171 with no functional annotation. This analysis identified a protein family encoding secreted peroxidases/chloroperoxidases (PF01328) which is expanded within all members of the family Mycosphaerellaceae. Further analyses were done on the non-annotated proteins for size and cysteine content (effector protein hallmarks), and then by studying the distribution of homologues in 17 other sequenced Dothideomycete fungi within an overall total of 91 predicted proteomes from fungal, oomycete and nematode species. This detailed M. graminicola secretome analysis provides the basis for further functional and comparative genomics studies. PMID:23236356
Ujang, Jorim Anak; Kwan, Soon Hong; Ismail, Mohd Nazri; Lim, Boon Huat; Noordin, Rahmah; Othman, Nurulhasanah
2016-01-01
Excretory-secretory (ES) proteins of E. histolytica are thought to play important roles in the host invasion, metabolism, and defence. Elucidation of the types and functions of E. histolytica ES proteins can further our understanding of the disease pathogenesis. Thus, the aim of this study is to use proteomics approach to better understand the complex ES proteins of the protozoa. E. histolytica ES proteins were prepared by culturing the trophozoites in protein-free medium. The ES proteins were identified using two mass spectrometry tools, namely, LC-ESI-MS/MS and LC-MALDI-TOF/TOF. The identified proteins were then classified according to their biological processes, molecular functions, and cellular components using the Panther classification system (PantherDB). A complementary list of 219 proteins was identified; this comprised 201 proteins detected by LC-ESI-MS/MS and 107 proteins by LC-MALDI-TOF/TOF. Of the 219 proteins, 89 were identified by both mass-spectrometry systems, while 112 and 18 proteins were detected exclusively by LC-ESI-MS/MS and LC-MALDI-TOF/TOF respectively. Biological protein functional analysis using PantherDB showed that 27% of the proteins were involved in metabolic processes. Using molecular functional and cellular component analyses, 35% of the proteins were found to be involved in catalytic activity, and 21% were associated with the cell parts. This study showed that complementary use of LC-ESI-MS/MS and LC-MALDI-TOF/TOF has improved the identification of ES proteins. The results have increased our understanding of the types of proteins excreted/secreted by the amoeba and provided further evidence of the involvement of ES proteins in intestinal colonisation and evasion of the host immune system, as well as in encystation and excystation of the parasite.
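The protein counts reported in this abstract can be cross-checked with simple set arithmetic. The short Python sketch below uses only the numbers quoted above; the variable names are illustrative.

```python
# Counts quoted in the abstract.
esi_total = 201    # proteins detected by LC-ESI-MS/MS
maldi_total = 107  # proteins detected by LC-MALDI-TOF/TOF
shared = 89        # proteins detected by both systems

# Proteins seen by only one system.
esi_only = esi_total - shared      # 112
maldi_only = maldi_total - shared  # 18

# Size of the combined (complementary) list.
combined = esi_only + maldi_only + shared  # 219
```

The derived values agree with the 112 and 18 exclusive identifications and the combined list of 219 proteins stated in the abstract.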
Westernberg, Luise; Pham, John; Lane, Jerome; Paul, Sinu; Greenbaum, Jason; Stranzl, Thomas; Lund, Gitte; Hoof, Ilka; Holm, Jens; Würtzen, Peter A; Meno, Kåre H.; Frazier, April; Schulten, Veronique; Andersen, Peter S.; Peters, Bjoern; Sette, Alessandro
2016-01-01
BACKGROUND House dust mite (HDM) allergens are a common cause of allergy and allergic asthma. A comprehensive analysis of proteins targeted by T cells, which are implicated in the development and regulation of allergic disease independent of their antibody reactivity, is still lacking. OBJECTIVE To comprehensively analyze the HDM-derived protein targets of T cell responses in HDM-allergic individuals, and investigate their correlation with IgE/IgG responses and protein function. METHODS Proteomic analysis (liquid chromatography tandem mass spectrometry) of HDM extracts identified 90 distinct protein clusters, corresponding to 29 known allergens and 61 novel proteins. Peripheral blood mononuclear cells (PBMC) from 20 HDM-allergic individuals were stimulated with HDM extracts and assayed with a set of ~2500 peptides derived from these 90 protein clusters and predicted to bind the most common HLA class II types. 2D immunoblots were made in parallel to elucidate IgE and IgG reactivity and putative function analyses were performed in silico according to gene ontology (GO) annotations. RESULTS Analysis of T cell reactivity revealed a large number of T cell epitopes. Overall response magnitude and frequency was comparable for known and novel proteins, with 15 antigens (nine of which were novel) dominating the total T cell response. Most of the known allergens that were dominant at the T cell level were also IgE-reactive, as expected, while few novel dominant T cell antigens were IgE reactive. Among known allergens, hydrolase activity and detectable IgE/IgG reactivity are strongly correlated, while no protein function correlates with immunogenicity of novel proteins. A total of 106 epitopes accounted for half of the total T-cell response, underlining the heterogeneity of T cell responses to HDM allergens. 
CONCLUSIONS AND CLINICAL RELEVANCE Herein, we define the T cell targets for both known allergens and novel proteins, which may inform future diagnostics and immunotherapeutics for allergy to HDM. PMID:27684489
Multi-Harmony: detecting functional specificity from sequence alignment
Brandt, Bernd W.; Feenstra, K. Anton; Heringa, Jaap
2010-01-01
Many protein families contain sub-families with functional specialization, such as binding different ligands or being involved in different protein–protein interactions. A small number of amino acids generally determine functional specificity. The identification of these residues can aid the understanding of protein function and help find targets for experimental analysis. Here, we present multi-Harmony, an interactive web server for detecting sub-type-specific sites in proteins starting from a multiple sequence alignment. Combining our Sequence Harmony (SH) and multi-Relief (mR) methods in one web server allows simultaneous analysis and comparison of specificity residues; furthermore, both methods have been significantly improved and extended. SH has been extended to cope with more than two sub-groups. mR has been changed from a sampling implementation to a deterministic one, making it more consistent and user friendly. For both methods Z-scores are reported. The multi-Harmony web server produces a dynamic output page, which includes interactive connections to the Jalview and Jmol applets, thereby allowing interactive analysis of the results. Multi-Harmony is available at http://www.ibi.vu.nl/programs/shmrwww. PMID:20525785
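As a rough illustration of what detecting sub-type-specific sites from an alignment involves, the sketch below scores each alignment column by the Jensen-Shannon divergence between the residue distributions of two sub-groups. This is a simpler stand-in and not the Sequence Harmony or multi-Relief formula; the function names and the toy alignment are assumptions made for the example.

```python
import math
from collections import Counter

def column_distribution(column):
    """Residue frequencies for one alignment column (gap characters '-' are ignored)."""
    residues = [r for r in column if r != "-"]
    counts = Counter(residues)
    total = sum(counts.values())
    return {r: n / total for r, n in counts.items()} if total else {}

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint distributions."""
    score = 0.0
    for r in set(p) | set(q):
        pr, qr = p.get(r, 0.0), q.get(r, 0.0)
        m = 0.5 * (pr + qr)
        if pr > 0:
            score += 0.5 * pr * math.log2(pr / m)
        if qr > 0:
            score += 0.5 * qr * math.log2(qr / m)
    return score

def specificity_scores(group_a, group_b):
    """Score every column of two aligned sub-groups (lists of equal-length strings)."""
    scores = []
    for i in range(len(group_a[0])):
        p = column_distribution([s[i] for s in group_a])
        q = column_distribution([s[i] for s in group_b])
        scores.append(js_divergence(p, q))
    return scores

# Hypothetical toy alignment: column 0 is conserved, column 1 separates the groups.
group_a = ["AK", "AK", "AK"]
group_b = ["AE", "AE", "AE"]
scores = specificity_scores(group_a, group_b)
```

Columns that score close to 1 are candidates for specificity-determining positions under this simplified measure; a real analysis would also need significance estimates such as the Z-scores the abstract mentions.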
COGNAT: a web server for comparative analysis of genomic neighborhoods.
Klimchuk, Olesya I; Konovalov, Kirill A; Perekhvatov, Vadim V; Skulachev, Konstantin V; Dibrova, Daria V; Mulkidjanian, Armen Y
2017-11-22
In prokaryotic genomes, functionally coupled genes can be organized in conserved gene clusters enabling their coordinated regulation. Such clusters could contain one or several operons, which are groups of co-transcribed genes. Those genes that evolved from a common ancestral gene by speciation (i.e. orthologs) are expected to have similar genomic neighborhoods in different organisms, whereas those copies of the gene that are responsible for dissimilar functions (i.e. paralogs) could be found in dissimilar genomic contexts. Comparative analysis of genomic neighborhoods facilitates the prediction of co-regulated genes and helps to discern different functions in large protein families. We intended, building on the attribution of gene sequences to the clusters of orthologous groups of proteins (COGs), to provide a method for visualization and comparative analysis of genomic neighborhoods of evolutionary related genes, as well as a respective web server. Here we introduce the COmparative Gene Neighborhoods Analysis Tool (COGNAT), a web server for comparative analysis of genomic neighborhoods. The tool is based on the COG database, as well as the Pfam protein families database. As an example, we show the utility of COGNAT in identifying a new type of membrane protein complex that is formed by paralog(s) of one of the membrane subunits of the NADH:quinone oxidoreductase of type 1 (COG1009) and a cytoplasmic protein of unknown function (COG3002). This article was reviewed by Drs. Igor Zhulin, Uri Gophna and Igor Rogozin.
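The idea of comparing genomic neighborhoods can be sketched in a few lines. The example below is not COGNAT's implementation: it assumes each genome is given as an ordered list of COG identifiers, takes the genes within a fixed window around a target COG, and compares two neighborhoods with the Jaccard index. The window size, the function names, and the placeholder identifiers `COG_A` to `COG_D` are assumptions; only COG1009 and COG3002 come from the abstract.

```python
def neighborhood(genome, target, flank=2):
    """COG IDs within `flank` genes of the first occurrence of `target`."""
    if target not in genome:
        return set()
    i = genome.index(target)
    window = genome[max(0, i - flank): i + flank + 1]
    return {cog for cog in window if cog != target}

def jaccard(a, b):
    """Jaccard similarity of two sets (defined as 0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical gene orders; COG_A..COG_D are placeholder identifiers.
genome_x = ["COG_A", "COG3002", "COG1009", "COG_B", "COG_C"]
genome_y = ["COG_D", "COG1009", "COG3002", "COG_B"]

nx = neighborhood(genome_x, "COG1009", flank=1)
ny = neighborhood(genome_y, "COG1009", flank=1)
similarity = jaccard(nx, ny)
```

A high similarity across many genomes would be consistent with a conserved gene cluster, whereas paralogs with different functions would be expected to score lower; this sketch ignores strand, gene orientation, and circular chromosomes.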
A Survey of Computational Intelligence Techniques in Protein Function Prediction
Tiwari, Arvind Kumar; Srivastava, Rajeev
2014-01-01
During the past, there was a massive growth of knowledge of unknown proteins with the advancement of high throughput microarray technologies. Protein function prediction is the most challenging problem in bioinformatics. In the past, the homology based approaches were used to predict the protein function, but they failed when a new protein was different from the previous one. Therefore, to alleviate the problems associated with homology based traditional approaches, numerous computational intelligence techniques have been proposed in the recent past. This paper presents a state-of-the-art comprehensive review of various computational intelligence techniques for protein function predictions using sequence, structure, protein-protein interaction network, and gene expression data used in wide areas of applications such as prediction of DNA and RNA binding sites, subcellular localization, enzyme functions, signal peptides, catalytic residues, nuclear/G-protein coupled receptors, membrane proteins, and pathway analysis from gene expression datasets. This paper also summarizes the result obtained by many researchers to solve these problems by using computational intelligence techniques with appropriate datasets to improve the prediction performance. The summary shows that ensemble classifiers and integration of multiple heterogeneous data are useful for protein function prediction. PMID:25574395
Determining Protease Activity In Vivo by Fluorescence Cross-Correlation Analysis
Kohl, Tobias; Haustein, Elke; Schwille, Petra
2005-01-01
To date, most biochemical approaches to unravel protein function have focused on purified proteins in vitro. Whereas they analyze enzyme performance under assay conditions, they do not necessarily tell us what is relevant within a living cell. Ideally, cellular functions should be examined in situ. In particular, association/dissociation reactions are ubiquitous, but so far there is no standard technique permitting online analysis of these processes in vivo. Featuring single-molecule sensitivity combined with intrinsic averaging, fluorescence correlation spectroscopy is a minimally invasive technique ideally suited to monitor proteins. Moreover, endogenous fluorescence-based assays can be established by genetically encoding fusions of autofluorescent proteins and cellular proteins, thus avoiding the disadvantages of in vitro protein labeling and subsequent delivery to cells. Here, we present an in vivo protease assay as a model system: Green and red autofluorescent proteins were connected by Caspase-3-sensitive and insensitive protein linkers to create double-labeled protease substrates. Then, dual-color fluorescence cross-correlation spectroscopy was employed to study the protease reaction in situ. Allowing assessment of multiple dynamic parameters simultaneously, this method provided internal calibration and improved experimental resolution for quantifying protein stability. This approach, which is easily extended to reversible protein-protein interactions, seems very promising for elucidating intracellular protein functions. PMID:16055538
Andersen, Tonni Grube; Nintemann, Sebastian J.; Marek, Magdalena; Halkier, Barbara A.; Schulz, Alexander; Burow, Meike
2016-01-01
When investigating interactions between two proteins with complementary reporter tags in yeast two-hybrid or split GFP assays, it remains troublesome to discriminate true- from false-negative results and challenging to compare the level of interaction across experiments. This leads to decreased sensitivity and renders analysis of weak or transient interactions difficult to perform. In this work, we describe the development of reporters that can be chemically induced to dimerize independently of the investigated interactions and thus alleviate these issues. We incorporated our reporters into the widely used split ubiquitin-, bimolecular fluorescence complementation (BiFC)- and Förster resonance energy transfer (FRET)-based methods and investigated different protein-protein interactions in yeast and plants. We demonstrate the functionality of this concept by the analysis of weakly interacting proteins from specialized metabolism in the model plant Arabidopsis thaliana. Our results illustrate that chemically induced dimerization can function as a built-in control for split-based systems that is easily implemented and allows for direct evaluation of functionality. PMID:27282591
Practical analysis of specificity-determining residues in protein families.
Chagoyen, Mónica; García-Martín, Juan A; Pazos, Florencio
2016-03-01
Determining the residues that are important for the molecular activity of a protein is a topic of broad interest in biomedicine and biotechnology. This knowledge can help understanding the protein's molecular mechanism as well as to fine-tune its natural function eventually with biotechnological or therapeutic implications. Some of the protein residues are essential for the function common to all members of a family of proteins, while others explain the particular specificities of certain subfamilies (like binding on different substrates or cofactors and distinct binding affinities). Owing to the difficulty in experimentally determining them, a number of computational methods were developed to detect these functional residues, generally known as 'specificity-determining positions' (or SDPs), from a collection of homologous protein sequences. These methods are mature enough for being routinely used by molecular biologists in directing experiments aimed at getting insight into the functional specificity of a family of proteins and eventually modifying it. In this review, we summarize some of the recent discoveries achieved through SDP computational identification in a number of relevant protein families, as well as the main approaches and software tools available to perform this type of analysis. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Naqvi, Ahmad Abu Turab; Shahbaaz, Mohd; Ahmad, Faizan; Hassan, Md Imtaiyaz
2015-01-01
Syphilis is a globally occurring venereal disease, and its infection is propagated through sexual contact. The causative agent of syphilis, Treponema pallidum ssp. pallidum, a Gram-negative spirochaete, is an obligate human parasite. The genome of T. pallidum ssp. pallidum SS14 strain (RefSeq NC_010741.1) encodes 1,027 proteins, of which 444 proteins are known as hypothetical proteins (HPs), i.e., proteins of unknown function. Here, we performed functional annotation of HPs of T. pallidum ssp. pallidum using various databases, domain architecture predictors, protein function annotators and clustering tools. We have analyzed the sequences of 444 HPs of T. pallidum ssp. pallidum and subsequently predicted the function of 207 HPs with a high level of confidence. However, functions of 237 HPs are predicted with less accuracy. We found various enzymes, transporters, and binding proteins in the annotated group of HPs that may be possible molecular targets, facilitating survival of the pathogen. Our comprehensive analysis helps to understand the mechanism of pathogenesis and may provide many novel potential therapeutic interventions.
Kristiansen, Lars V; Velasquez, Emma; Romani, Susana; Baars, Sigrid; Berezin, Vladimir; Bock, Elisabeth; Hortsch, Michael; Garcia-Alonso, Luis
2005-01-01
L1- and NCAM-type cell adhesion molecules represent distinct protein families that function as specific receptors for different axon guidance cues. However, both L1 and NCAM proteins promote axonal growth by inducing neuronal tyrosine kinase activity and are coexpressed in subsets of axon tracts in arthropods and vertebrates. We have studied the functional requirements for the Drosophila L1- and NCAM-type proteins, Neuroglian (Nrg) and Fasciclin II (FasII), during postembryonic sensory axon guidance. The rescue of the Neuroglian loss-of-function (LOF) phenotype by transgenically expressed L1- and NCAM-type proteins demonstrates a functional interchangeability between these proteins in Drosophila photoreceptor pioneer axons, where both proteins are normally coexpressed. In contrast, the ectopic expression of Fasciclin II in mechanosensory neurons causes a strong enhancement of the axonal misguidance phenotype. Moreover, our findings demonstrate that this functionally redundant specificity to mediate axon guidance has been conserved in their vertebrate homologs, L1-CAM and NCAM.
Zhan, Shaohua; Zhang, Wenhao; Xiong, Feng; Ge, Wei
2017-01-01
Phagocytosis and autophagy in macrophages have been shown to be essential to both innate and adaptive immunity. Lysosomes are the main catabolic subcellular organelles responsible for degradation and recycling of both extracellular and intracellular material, which are the final steps in phagocytosis and autophagy. However, the molecular mechanisms underlying lysosomal functions after infection remain obscure. In this study, we conducted a quantitative proteomics analysis of the changes in constitution and glycosylation of proteins in lysosomes derived from murine RAW 264.7 macrophage cells treated with different types of pathogens comprising examples of bacteria (Listeria monocytogenes, L. m), DNA viruses (herpes simplex virus type-1, HSV-1) and RNA viruses (vesicular stomatitis virus, VSV). In total, 3,704 lysosome-related proteins and 300 potential glycosylation sites on 193 proteins were identified. Comparative analysis showed that the aforementioned pathogens induced distinct alterations in the proteome of the lysosome, which is closely associated with the immune functions of macrophages, such as toll-like receptor activation, inflammation and antigen-presentation. The most significant changes in proteins and fluctuations in glycosylation were also determined. Furthermore, Western blot analysis showed that the changes in expression of these proteins were undetectable at the whole cell level. Thus, our study provides unique insights into the function of lysosomes in macrophage activation and immune responses. PMID:28088779
2008-01-25
limitations and plans for improvement Perhaps, one of PIPA’s main limitations is that all of its currently integrated resources to predict protein function...are planning on expanding PIPA’s function prediction capabilities by incorporating comparative analysis approaches, e.g., phylogenetic tree analysis...
Havugimana, Pierre C; Hu, Pingzhao; Emili, Andrew
2017-10-01
Elucidation of the networks of physical (functional) interactions present in cells and tissues is fundamental for understanding the molecular organization of biological systems, the mechanistic basis of essential and disease-related processes, and for functional annotation of previously uncharacterized proteins (via guilt-by-association or -correlation). After a decade in the field, we felt it timely to document our own experiences in the systematic analysis of protein interaction networks. Areas covered: Researchers worldwide have contributed innovative experimental and computational approaches that have driven the rapidly evolving field of 'functional proteomics'. These include mass spectrometry-based methods to characterize macromolecular complexes on a global-scale and sophisticated data analysis tools - most notably machine learning - that allow for the generation of high-quality protein association maps. Expert commentary: Here, we recount some key lessons learned, with an emphasis on successful workflows, and challenges, arising from our own and other groups' ongoing efforts to generate, interpret and report proteome-scale interaction networks in increasingly diverse biological contexts.
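The guilt-by-association idea mentioned above can be illustrated with a minimal majority-vote sketch: an uncharacterized protein is assigned the annotation that is most common among its annotated interaction partners. The function name, the protein labels, and the toy network are hypothetical, and real workflows would weight interactions by confidence rather than count them equally.

```python
from collections import Counter

def predict_function(protein, interactions, annotations):
    """Return the most common annotation among the annotated partners of `protein`.

    `interactions` is a list of undirected (a, b) pairs and `annotations` maps
    protein -> function label. Returns None when no partner is annotated.
    Ties are broken arbitrarily in this sketch.
    """
    partners = set()
    for a, b in interactions:
        if a == protein:
            partners.add(b)
        elif b == protein:
            partners.add(a)
    votes = Counter(annotations[p] for p in partners if p in annotations)
    if not votes:
        return None
    return votes.most_common(1)[0][0]

# Hypothetical toy network: X is uncharacterized, P1..P3 are annotated.
interactions = [("X", "P1"), ("X", "P2"), ("P3", "X"), ("P1", "P2")]
annotations = {"P1": "DNA repair", "P2": "DNA repair", "P3": "photosynthesis"}
prediction = predict_function("X", interactions, annotations)
```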
A domain-centric solution to functional genomics via dcGO Predictor
2013-01-01
Background Computational/manual annotations of protein functions are one of the first routes to making sense of a newly sequenced genome. Protein domain predictions form an essential part of this annotation process. This is due to the natural modularity of proteins with domains as structural, evolutionary and functional units. Sometimes two, three, or more adjacent domains (called supra-domains) are the operational unit responsible for a function, e.g. via a binding site at the interface. These supra-domains have contributed to functional diversification in higher organisms. Traditionally functional ontologies have been applied to individual proteins, rather than families of related domains and supra-domains. We expect, however, that to some extent functional signals can be carried by protein domains and supra-domains, and consequently used in function prediction and functional genomics. Results Here we present a domain-centric Gene Ontology (dcGO) perspective. We generalize a framework for automatically inferring ontological terms associated with domains and supra-domains from full-length sequence annotations. This general framework has been applied specifically to primary protein-level annotations from UniProtKB-GOA, generating GO term associations with SCOP domains and supra-domains. The resulting 'dcGO Predictor' can be used to provide functional annotation to protein sequences. The functional annotation of sequences in the Critical Assessment of Function Annotation (CAFA) has been used as a valuable opportunity to validate our method and to be assessed by the community. The functional annotation of all completely sequenced genomes has demonstrated the potential for domain-centric GO enrichment analysis to yield functional insights into newly sequenced or yet-to-be-annotated genomes.
This generalized framework we have presented has also been applied to other domain classifications such as InterPro and Pfam, and other ontologies such as mammalian phenotype and disease ontology. The dcGO and its predictor are available at http://supfam.org/SUPERFAMILY/dcGO including an enrichment analysis tool. Conclusions As functional units, domains offer a unique perspective on function prediction regardless of whether proteins are multi-domain or single-domain. The 'dcGO Predictor' holds great promise for contributing to a domain-centric functional understanding of genomes in the next generation sequencing era. PMID:23514627
Analysis of sDMA modifications of PIWI proteins
Honda, Shozo; Kirino, Yoriko; Kirino, Yohei
2015-01-01
Summary Arginine methylation is an important post-translational protein modification that modulates protein function for a wide range of biological processes. PIWI proteins, a subclade of the Argonaute family proteins, contain evolutionarily conserved symmetrical dimethylarginines (sDMAs). It has become increasingly apparent that the sDMAs of PIWI proteins serve as binding elements for TUDOR-domain containing proteins and that sDMA-dependent protein interactions play crucial roles in the biogenesis and function of PIWI-interacting RNAs (piRNAs). We describe a method for detecting PIWI sDMAs and purifying PIWI/piRNA complexes using anti-sDMA antibodies. PMID:24178562
Gong, Wei; He, Kun; Covington, Mike; Dinesh-Kumar, S. P.; Snyder, Michael; Harmer, Stacey L.; Zhu, Yu-Xian; Deng, Xing Wang
2009-01-01
We used our collection of Arabidopsis transcription factor (TF) ORFeome clones to construct protein microarrays containing as many as 802 TF proteins. These protein microarrays were used for both protein-DNA and protein-protein interaction analyses. For protein-DNA interaction studies, we examined AP2/ERF family TFs and their cognate cis-elements. By careful comparison of the DNA-binding specificity of 13 TFs on the protein microarray with previous non-microarray data, we showed that protein microarrays provide an efficient and high throughput tool for genome-wide analysis of TF-DNA interactions. This microarray protein-DNA interaction analysis allowed us to derive a comprehensive view of DNA-binding profiles of AP2/ERF family proteins in Arabidopsis. It also revealed four TFs that bound the EE (evening element) and had the expected phased gene expression under clock-regulation, thus providing a basis for further functional analysis of their roles in clock regulation of gene expression. We also developed procedures for detecting protein interactions using this TF protein microarray and discovered four novel partners that interact with HY5, which can be validated by yeast two-hybrid assays. Thus, plant TF protein microarrays offer an attractive high-throughput alternative to traditional techniques for TF functional characterization on a global scale. PMID:19802365
Floris, Matteo; Orsini, Massimiliano; Thanaraj, Thangavel Alphonse
2008-10-02
It is often the case that mammalian genes are alternatively spliced; the resulting alternate transcripts often encode protein isoforms that differ in amino acid sequences. Changes among the protein isoforms can alter the cellular properties of proteins. The effect can range from a subtle modulation to a complete loss of function. (i) We examined human splice-mediated protein isoforms (as extracted from a manually curated data set, and from a computationally predicted data set) for differences in the annotation for protein signatures (Pfam domains and PRINTS fingerprints) and we characterized the differences & their effects on protein functionalities. An important question addressed relates to the extent of protein isoforms that may lack any known function in the cell. (ii) We present a database that reports differences in protein signatures among human splice-mediated protein isoform sequences. (i) Characterization: The work points to distinct sets of alternatively spliced genes with varying degrees of annotation for the splice-mediated protein isoforms. Protein molecular functions seen to be often affected are those that relate to: binding, catalytic, transcription regulation, structural molecule, transporter, motor, and antioxidant; and the processes that are often affected are nucleic acid binding, signal transduction, and protein-protein interactions. Signatures are often included/excluded and truncated in length among protein isoforms; truncation is seen as the predominant type of change. Analysis points to the following novel aspects: (a) Analysis using data from the manually curated Vega indicates that one in 8.9 genes can lead to a protein isoform of no "known" function; and one in 18 expressed protein isoforms can be such an "orphan" isoform; the corresponding numbers as seen with computationally predicted ASD data set are: one in 4.9 genes and one in 9.8 isoforms. 
(b) When swapping of signatures occurs, it is often between those of same functional classifications. (c) Pfam domains can occur in varying lengths, and PRINTS fingerprints can occur with varying number of constituent motifs among isoforms - since such a variation is seen in large number of genes, it could be a general mechanism to modulate protein function. (ii) The reported resource (at http://www.bioinformatica.crs4.org/tools/dbs/splivap/) provides the community ability to access data on splice-mediated protein isoforms (with value-added annotation such as association with diseases) through changes in protein signatures.
Functional equivalency inferred from "authoritative sources" in networks of homologous proteins.
Natarajan, Shreedhar; Jakobsson, Eric
2009-06-12
A one-on-one mapping of protein functionality across different species is a critical component of comparative analysis. This paper presents a heuristic algorithm for discovering the Most Likely Functional Counterparts (MoLFunCs) of a protein, based on simple concepts from network theory. A key feature of our algorithm is utilization of the user's knowledge to assign high confidence to selected functional identification. We show use of the algorithm to retrieve functional equivalents for 7 membrane proteins, from an exploration of almost 40 genomes from multiple online resources. We verify the functional equivalency of our dataset through a series of tests that include sequence, structure and function comparisons. Comparison is made to the OMA methodology, which also identifies one-on-one mapping between proteins from different species. Based on that comparison, we believe that incorporation of user's knowledge as a key aspect of the technique adds value to purely statistical formal methods.
Functional Equivalency Inferred from “Authoritative Sources” in Networks of Homologous Proteins
Natarajan, Shreedhar; Jakobsson, Eric
2009-01-01
A one-on-one mapping of protein functionality across different species is a critical component of comparative analysis. This paper presents a heuristic algorithm for discovering the Most Likely Functional Counterparts (MoLFunCs) of a protein, based on simple concepts from network theory. A key feature of our algorithm is utilization of the user's knowledge to assign high confidence to selected functional identification. We show use of the algorithm to retrieve functional equivalents for 7 membrane proteins, from an exploration of almost 40 genomes from multiple online resources. We verify the functional equivalency of our dataset through a series of tests that include sequence, structure and function comparisons. Comparison is made to the OMA methodology, which also identifies one-on-one mapping between proteins from different species. Based on that comparison, we believe that incorporation of user's knowledge as a key aspect of the technique adds value to purely statistical formal methods. PMID:19521530
Upadhyay, Atul Kumar; Sowdhamini, Ramanathan
2016-01-01
3D-domain swapping is one of the mechanisms of protein oligomerization and the proteins exhibiting this phenomenon have many biological functions. These proteins, which undergo domain swapping, have acquired much attention owing to their involvement in human diseases, such as conformational diseases, amyloidosis, serpinopathies, proteinopathies, etc. Early realisation of proteins in the whole human genome that retain a tendency to domain swap will enable many aspects of disease control management. Predictive models were developed by using machine learning approaches with an average accuracy of 78% (85.6% of sensitivity, 87.5% of specificity and an MCC value of 0.72) to predict putative domain swapping in protein sequences. These models were applied to many complete genomes with special emphasis on the human genome. Nearly 44% of the protein sequences in the human genome were predicted positive for domain swapping. Enrichment analysis was performed on the positively predicted sequences from human genome for their domain distribution, disease association and functional importance based on Gene Ontology (GO). Enrichment analysis was also performed to infer a better understanding of the functional importance of these sequences. Finally, we developed hinge region prediction, in the given putative domain swapped sequence, by using important physicochemical properties of amino acids.
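For readers unfamiliar with the performance measures quoted above, the sketch below computes sensitivity, specificity, accuracy, and the Matthews correlation coefficient (MCC) from a binary confusion matrix using their standard definitions. The function name and the counts are made up for illustration and are not the study's data.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative.
    """
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical counts: 50 domain-swapping and 50 non-swapping test sequences.
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
```

Unlike accuracy, MCC stays informative when the two classes are very different in size, which is why it is often reported alongside sensitivity and specificity.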
Verma, Amit K; Diwan, Danish; Raut, Sandeep; Dobriyal, Neha; Brown, Rebecca E; Gowda, Vinita; Hines, Justin K; Sahi, Chandan
2017-06-07
Heat shock proteins of 70 kDa (Hsp70s) partner with structurally diverse Hsp40s (J proteins), generating distinct chaperone networks in various cellular compartments that perform myriad housekeeping and stress-associated functions in all organisms. Plants, being sessile, need to constantly maintain their cellular proteostasis in response to external environmental cues. In these situations, the Hsp70:J protein machines may play an important role in fine-tuning cellular protein quality control. Although ubiquitous, the functional specificity and complexity of the plant Hsp70:J protein network has not been studied. Here, we analyzed the J protein network in the cytosol of Arabidopsis thaliana and, using yeast genetics, show that the functional specificities of most plant J proteins in fundamental chaperone functions are conserved across long evolutionary timescales. Detailed phylogenetic and functional analysis revealed that increased number, regulatory differences, and neofunctionalization in J proteins together contribute to the emerging functional diversity and complexity in the Hsp70:J protein network in higher plants. Based on the data presented, we propose that higher plants have orchestrated their "chaperome," especially their J protein complement, according to their specialized cellular and physiological stipulations. Copyright © 2017 Verma et al.
Coagulation parameters and platelet function analysis in patients with acromegaly.
Colak, A; Yılmaz, H; Temel, Y; Demirpence, M; Simsek, N; Karademirci, İ; Bozkurt, U; Yasar, E
2016-01-01
Acromegaly is associated with increased cardiovascular morbidity and mortality. The data about the evaluation of coagulation and fibrinolysis in acromegalic patients are very limited and to our knowledge, platelet function analysis has never been investigated. So, we aimed to investigate the levels of protein C, protein S, fibrinogen, antithrombin III and platelet function analysis in patients with acromegaly. Thirty-nine patients with active acromegaly and 35 healthy subjects were included in the study. Plasma glucose and lipid profile, fibrinogen levels, GH and IGF-1 levels and protein C, protein S and antithrombin III activities were measured in all study subjects. Also, platelet function analysis was evaluated with collagen/ADP- and collagen/epinephrine-closure times. Demographic characteristics of the patient and control groups were similar. As expected, fasting blood glucose levels and serum GH and IGF-1 levels were significantly higher in the patient group compared with the control group (p glucose: 0.002, p GH: 0.006, p IGF-1: 0.001, respectively). But lipid parameters were similar between the two groups. While serum fibrinogen and antithrombin III levels were found to be significantly higher in the acromegaly group (p fibrinogen: 0.005 and p antithrombin III: 0.001), protein S and protein C activity values were significantly lower in the patient group (p protein S: 0.001, p protein C: 0.001). Also significantly enhanced platelet function (measured by collagen/ADP- and collagen/epinephrine-closure times) was demonstrated in acromegaly (p col-ADP: 0.002, p col-epinephrine: 0.002). The results did not change when we excluded six patients with type 2 diabetes in the acromegaly group. There was a negative correlation between serum GH levels and protein S (r: -0.25, p: 0.04) and protein C (r: -0.26, p: 0.04) values.
Likewise, there was a negative correlation between IGF-1 levels and protein C values (r: -0.39, p: 0.002), protein S values (r: -0.39, p: 0.001), collagen/ADP-closure times (r: -0.28, p: 0.02) and collagen/epinephrine-closure times (r: -0.26, p: 0.04). Also, we observed a positive correlation between IGF-1 levels and fibrinogen levels (r: 0.31, p: 0.01). Acromegaly was found to be associated with increased tendency to coagulation and enhanced platelet activity. This hypercoagulable state might increase the risk for cardiovascular and cerebrovascular events in acromegaly.
NASA Astrophysics Data System (ADS)
Rauf, Muhammad; Saeed, Nasir A.; Habib, Imran; Ahmed, Moddassir; Shahzad, Khurram; Mansoor, Shahid; Ali, Rashid
2017-02-01
Structure prediction can provide information about function and active sites of protein which helps to design new functional proteins. H+-pyrophosphatase is transmembrane protein involved in establishing proton motive force for active transport of Na+ across membrane by Na+/H+ antiporters. A full length novel H+-pyrophosphatase gene was isolated from halophytic grass Leptochloa fusca using RT-PCR and RACE method. Full length LfVP1 gene sequence of 2292 nucleotides encodes protein of 764 amino acids. DNA and protein sequences were used for characterization using bioinformatics tools. Various important potential sites were predicted by PROSITE webserver. Primary structural analysis showed LfVP1 as stable protein and Grand average hydropathy (GRAVY) indicated that LfVP1 protein has good hydrosolubility. Secondary structure analysis showed that LfVP1 protein sequence contains significant proportion of alpha helix and random coil. Protein membrane topology suggested the presence of 14 transmembrane domains and presence of catalytic domain in TM3. Three dimensional structure from LfVP1 protein sequence also indicated the presence of 14 transmembrane domains and hydrophobicity surface model showed amino acid hydrophobicity. Ramachandran plot showed that 98% amino acid residues were predicted in the favored region.
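The GRAVY value mentioned above is simply the mean hydropathy of all residues in a sequence. The sketch below computes it with the Kyte-Doolittle scale, which is the scale conventionally used for GRAVY; the abstract does not say which tool the authors used, so treat the function as an illustrative assumption rather than their procedure. The example sequences are toy inputs, not LfVP1.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(sequence):
    """Grand average of hydropathy: sum of residue hydropathies / sequence length.

    Raises KeyError on non-standard residue codes and ValueError on empty input.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)

# Toy sequences for illustration only.
hydrophobic = gravy("AAAA")   # all alanine
balanced = gravy("IR")        # isoleucine (+4.5) and arginine (-4.5) cancel
```

Positive values indicate a more hydrophobic sequence on average and negative values a more hydrophilic one.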
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family
Danisman, Selahattin; de Folter, Stefan; Immink, Richard G. H.
2013-01-01
Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein–protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein–protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family. PMID:24129704
Topology-function conservation in protein-protein interaction networks.
Davis, Darren; Yaveroğlu, Ömer Nebil; Malod-Dognin, Noël; Stojmirovic, Aleksandar; Pržulj, Nataša
2015-05-15
Proteins underlie the functioning of a cell, and the wiring of proteins in a protein-protein interaction network (PIN) relates to their biological functions. Proteins with similar wiring in the PIN (topology around them) have been shown to have similar functions. This property has been successfully exploited for predicting protein functions. Topological similarity is also used to guide network alignment algorithms that find similarly wired proteins between PINs of different species; these similarities are used to transfer annotation across PINs, e.g. from model organisms to human. To refine these functional predictions and annotation transfers, we need to gain insight into the variability of the topology-function relationships. For example, a function may be significantly associated with specific topologies, while another function may be weakly associated with several different topologies. Also, the topology-function relationships may differ between different species. To improve our understanding of topology-function relationships and of their conservation among species, we develop a statistical framework that is built upon canonical correlation analysis. Using the graphlet degrees to represent the wiring around proteins in PINs and gene ontology (GO) annotations to describe their functions, our framework: (i) characterizes statistically significant topology-function relationships in a given species, and (ii) uncovers the functions that have conserved topology in PINs of different species, which we term topologically orthologous functions. We apply our framework to PINs of yeast and human, identifying seven biological process and two cellular component GO terms to be topologically orthologous for the two organisms. © The Author 2015. Published by Oxford University Press.
Functional and Structural Analysis of the Conserved EFhd2 Protein
Acosta, Yancy Ferrer; Rodríguez Cruz, Eva N.; Vaquer, Ana del C.; Vega, Irving E.
2013-01-01
EFhd2 is a novel protein conserved from C. elegans to H. sapiens. This novel protein was originally identified in cells of the immune and central nervous systems. However, it is most abundant in the central nervous system, where it has been found associated with pathological forms of the microtubule-associated protein tau. The physiological or pathological roles of EFhd2 are poorly understood. In this study, a functional and structural analysis was carried out to characterize the molecular requirements for EFhd2’s calcium binding activity. The results showed that mutations of a conserved aspartate on either EF-hand motif disrupted the calcium binding activity, indicating that these motifs work as a pair to form a functional calcium binding domain. Furthermore, characterization of an identified single-nucleotide polymorphism (SNP) that introduced a missense mutation indicates the importance of a conserved phenylalanine for EFhd2 calcium binding activity. Structural analysis revealed that EFhd2 is predominantly composed of alpha helix and random coil structures and that this novel protein is thermostable. EFhd2’s thermostability depends on its N-terminus. In the absence of the N-terminus, calcium binding restored EFhd2’s thermal stability. Overall, these studies contribute to our understanding of EFhd2 functional and structural properties, and introduce it into the family of canonical EF-hand domain containing proteins. PMID:22973849
Algorithm, applications and evaluation for protein comparison by Ramanujan Fourier transform.
Zhao, Jian; Wang, Jiasong; Hua, Wei; Ouyang, Pingkai
2015-12-01
The amino acid sequence of a protein determines its chemical properties, chain conformation and biological functions. Protein sequence comparison is of great importance to identify similarities of protein structures and infer their functions. Many properties of a protein correspond to the low-frequency signals within the sequence. Low frequency modes in protein sequences are linked to the secondary structures, membrane protein types, and sub-cellular localizations of the proteins. In this paper, we present Ramanujan Fourier transform (RFT) with a fast algorithm to analyze the low-frequency signals of protein sequences. The RFT method is applied to similarity analysis of protein sequences with the Resonant Recognition Model (RRM). The results show that the proposed fast RFT method on protein comparison is more efficient than commonly used discrete Fourier transform (DFT). RFT can detect common frequencies as significant feature for specific protein families, and the RFT spectrum heat-map of protein sequences demonstrates the information conservation in the sequence comparison. The proposed method offers a new tool for pattern recognition, feature extraction and structural analysis on protein sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
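The building block of the Ramanujan Fourier transform is the Ramanujan sum c_q(n), which can be written out directly. The sketch below is a naive O(N·q) projection of a numeric signal onto c_q, not the fast algorithm the paper proposes. The 1/N normalization, the 1-based indexing, and the function names are assumptions made for illustration. Mapping residues to numbers (for example with the values used in the Resonant Recognition Model) is left to the caller.

```python
from math import cos, gcd, pi
from typing import Sequence


def ramanujan_sum(q: int, n: int) -> int:
    """c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) == 1.

    The exact value is always an integer, so the floating-point sum is rounded.
    """
    if q < 1:
        raise ValueError("q must be a positive integer")
    total = sum(cos(2 * pi * k * n / q) for k in range(1, q + 1) if gcd(k, q) == 1)
    return round(total)


def rft_projection(signal: Sequence[float], q: int) -> float:
    """Project a finite signal x(1..N) onto c_q (assumed 1/N normalization)."""
    if not signal:
        raise ValueError("signal must not be empty")
    n_points = len(signal)
    return sum(x * ramanujan_sum(q, n) for n, x in enumerate(signal, start=1)) / n_points


if __name__ == "__main__":
    # Hypothetical period-2 signal: all of its weight lands on q = 2.
    alternating = [(-1) ** n for n in range(1, 7)]
    print([rft_projection(alternating, q) for q in (1, 2, 3)])
```

Because c_q(n) is periodic in n with period q, a signal with period q projects strongly onto c_q, which is the sense in which low q picks out low-frequency, short-period structure.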
Conformational diversity analysis reveals three functional mechanisms in proteins
Fornasari, María Silvina
2017-01-01
Protein motions are a key feature to understand biological function. Recently, a large-scale analysis of protein conformational diversity showed a positively skewed distribution with a peak at 0.5 Å C-alpha root-mean-square-deviation (RMSD). To understand this distribution in terms of structure-function relationships, we studied a well curated and large dataset of ~5,000 proteins with experimentally determined conformational diversity. We searched for global behaviour patterns studying how structure-based features change among the available conformer population for each protein. This procedure allowed us to describe the RMSD distribution in terms of three main protein classes sharing given properties. The largest of these protein subsets (~60%), which we call “rigid” (average RMSD = 0.83 Å), has no disordered regions, shows low conformational diversity, the largest tunnels and smaller and buried cavities. The two additional subsets contain disordered regions, but with differential sequence composition and behaviour. Partially disordered proteins have on average 67% of their conformers with disordered regions, average RMSD = 1.1 Å, the highest number of hinges and the longest disordered regions. In contrast, malleable proteins have on average only 25% of disordered conformers and average RMSD = 1.3 Å, flexible cavities affected in size by the presence of disordered regions and show the highest diversity of cognate ligands. Proteins in each set are mostly non-homologous to each other, share no given fold class, nor functional similarity but do share features derived from their conformer population. These shared features could represent conformational mechanisms related with biological functions. PMID:28192432
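The C-alpha RMSD values quoted in this abstract can be made concrete with a short sketch. It assumes the conformers are already optimally superposed and have the same residues in the same order; a real pipeline would first fit the structures, which is not shown here. Taking the maximum pairwise RMSD as the measure of a protein's conformational diversity is also an assumption, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two pre-superposed C-alpha traces."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


def max_pairwise_rmsd(conformers: List[Sequence[Coord]]) -> float:
    """Largest RMSD over all conformer pairs (assumed diversity measure)."""
    if len(conformers) < 2:
        raise ValueError("need at least two conformers")
    return max(rmsd(a, b) for a, b in combinations(conformers, 2))


if __name__ == "__main__":
    # Hypothetical three-residue conformers, coordinates in angstroms.
    first = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    second = [(0.0, 0.0, 0.5), (3.8, 0.0, 0.5), (7.6, 0.0, 0.5)]
    print(rmsd(first, second))
```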
Structure and function of seed storage proteins in faba bean (Vicia faba L.).
Liu, Yujiao; Wu, Xuexia; Hou, Wanwei; Li, Ping; Sha, Weichao; Tian, Yingying
2017-05-01
The protein subunit is the most important basic unit of protein, and its study can unravel the structure and function of seed storage proteins in faba bean. In this study, we identified six specific protein subunits in faba bean (cv. Qinghai 13) by combining liquid chromatography (LC), liquid chromatography-electrospray ionization tandem mass spectrometry (LC-ESI-MS/MS) and bioinformatics. The results suggested a diversity of seed storage proteins in faba bean, and a total of 16 proteins (four GroEL molecular chaperones and 12 plant-specific proteins) were identified from 97-, 96-, 64-, 47-, 42-, and 38-kD-specific protein subunits in faba bean based on the peptide sequence. We also analyzed the composition and abundance of the amino acids, the physicochemical characteristics, secondary structure, three-dimensional structure, transmembrane domain, and possible subcellular localization of these identified proteins in faba bean seed, and finally predicted their function and structure. The three-dimensional structures were generated based on homologous modeling, and the protein function was analyzed based on the annotation from the non-redundant protein database (NR database, NCBI) and function analysis of optimal modeling. The objective of this study was to identify the seed storage proteins in faba bean and confirm the structure and function of these proteins. Our results can be useful for the study of protein nutrition and for achieving breeding goals for optimal protein quality in faba bean.
2014-01-01
Background Bacteroides spp. form a significant part of our gut microbiome and are well known for optimized metabolism of diverse polysaccharides. Initial analysis of the archetypal Bacteroides thetaiotaomicron genome identified 172 glycosyl hydrolases and a large number of uncharacterized proteins associated with polysaccharide metabolism. Results BT_1012 from Bacteroides thetaiotaomicron VPI-5482 is a protein of unknown function and a member of a large protein family consisting entirely of uncharacterized proteins. Initial sequence analysis predicted that this protein has two domains, one at the N-terminus and one at the C-terminus. A PSI-BLAST search found over 150 full-length homologs and over 90 half-size homologs consisting only of the N-terminal domain. The experimentally determined three-dimensional structure of the BT_1012 protein confirms its two-domain architecture, and structural analysis of both domains suggests their specific functions. The N-terminal domain is a putative catalytic domain with significant similarity to known glycoside hydrolases. The C-terminal domain has a beta-sandwich fold typically found in C-terminal domains of other glycosyl hydrolases; however, these domains are typically involved in substrate binding. We describe the structure of the BT_1012 protein and discuss its sequence-structure relationship and their possible functional implications. Conclusions Structural and sequence analyses of the BT_1012 protein identify it as a glycosyl hydrolase, expanding an already impressive catalog of enzymes involved in polysaccharide metabolism in Bacteroides spp. Based on this we have renamed the Pfam families representing the two domains found in the BT_1012 protein, PF13204 and PF12904, as putative glycoside hydrolase and glycoside hydrolase-associated C-terminal domain respectively. PMID:24742328
Analysis of Nuclear Lamina Proteins in Myoblast Differentiation by Functional Complementation.
Tapia, Olga; Gerace, Larry
2016-01-01
We describe straightforward methodology for structure-function mapping of nuclear lamina proteins in myoblast differentiation, using populations of C2C12 myoblasts in which the endogenous lamina components are replaced with ectopically expressed mutant versions of the proteins. The procedure involves bulk isolation of C2C12 cell populations expressing the ectopic proteins by lentiviral transduction, followed by depletion of the endogenous proteins using siRNA, and incubation of cells under myoblast differentiation conditions. Similar methodology may be applied to mouse embryo fibroblasts or to other cell types as well, for the identification and characterization of sequences of lamina proteins involved in functions that can be measured biochemically or cytologically.
Bhadra, Pratiti; Pal, Debnath
2017-04-01
Dynamics is integral to the function of proteins, yet the use of molecular dynamics (MD) simulation as a technique remains under-explored for molecular function inference. This is more important in the context of genomics projects where novel proteins are determined with limited evolutionary information. Recently we developed a method to match the query protein's flexible segments to infer function using a novel approach combining analysis of residue fluctuation-graphs and auto-correlation vectors derived from coarse-grained (CG) MD trajectory. The method was validated on a diverse dataset with sequence identity between proteins as low as 3%, with high function-recall rates. Here we share its implementation as a publicly accessible web service, named DynFunc (Dynamics Match for Function) to query protein function from ≥1 µs long CG dynamics trajectory information of protein subunits. Users are provided with the custom-developed coarse-grained molecular mechanics (CGMM) forcefield to generate the MD trajectories for their protein of interest. On upload of trajectory information, the DynFunc web server identifies specific flexible regions of the protein linked to putative molecular function. Our unique application does not use evolutionary information to infer molecular function from MD information and can, therefore, work for all proteins, including moonlighting and the novel ones, whenever structural information is available. Our pipeline is expected to be of utility to all structural biologists working with novel proteins and interested in moonlighting functions. Copyright © 2017 Elsevier Ltd. All rights reserved.
Recent advances in proteomics of cereals.
Bansal, Monika; Sharma, Madhu; Kanwar, Priyanka; Goyal, Aakash
Cereals contribute a major part of human nutrition and are considered as an integral source of energy for human diets. With genomic databases already available in cereals such as rice, wheat, barley, and maize, the focus has now moved to proteome analysis. Proteomics studies involve the development of appropriate databases based on developing suitable separation and purification protocols, identification of protein functions, and can confirm their functional networks based on already available data from other sources. Tremendous progress has been made in the past decade in generating huge data-sets for covering interactions among proteins, protein composition of various organs and organelles, quantitative and qualitative analysis of proteins, and to characterize their modulation during plant development, biotic, and abiotic stresses. Proteomics platforms have been used to identify and improve our understanding of various metabolic pathways. This article gives a brief review of efforts made by different research groups on comparative descriptive and functional analysis of proteomics applications achieved in the cereal science so far.
Torres, Matthew P; Dewhurst, Henry; Sundararaman, Niveda
2016-11-01
Post-translational modifications (PTMs) regulate protein behavior through modulation of protein-protein interactions, enzymatic activity, and protein stability essential in the translation of genotype to phenotype in eukaryotes. Currently, less than 4% of all eukaryotic PTMs are reported to have biological function - a statistic that continues to decrease with an increasing rate of PTM detection. Previously, we developed SAPH-ire (Structural Analysis of PTM Hotspots) - a method for the prioritization of PTM function potential that has been used effectively to reveal novel PTM regulatory elements in discrete protein families (Dewhurst et al., 2015). Here, we apply SAPH-ire to the set of eukaryotic protein families containing experimental PTM and 3D structure data - capturing 1,325 protein families with 50,839 unique PTM sites organized into 31,747 modified alignment positions (MAPs), of which 2010 (∼6%) possess known biological function. Here, we show that using an artificial neural network model (SAPH-ire NN) trained to identify MAP hotspots with biological function results in prediction outcomes that far surpass the use of single hotspot features, including nearest neighbor PTM clustering methods. We find the greatest enhancement in prediction for positions with PTM counts of five or less, which represent 98% of all MAPs in the eukaryotic proteome and 90% of all MAPs found to have biological function. Analysis of the top 1092 MAP hotspots revealed 267 of truly unknown function (containing 5443 distinct PTMs). Of these, 165 hotspots could be mapped to human KEGG pathways for normal and/or disease physiology. Many high-ranking hotspots were also found to be disease-associated pathogenic sites of amino acid substitution despite the lack of observable PTM in the human protein family member. 
Taken together, these experiments demonstrate that the functional relevance of a PTM can be predicted very effectively by neural network models, revealing a large but testable body of potential regulatory elements that impact hundreds of different biological processes important in eukaryotic biology and human health. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc. PMID:27697855
Roles of Apicomplexan protein kinases at each life cycle stage.
Kato, Kentaro; Sugi, Tatsuki; Iwanaga, Tatsuya
2012-06-01
Inhibitors of cellular protein kinases have been reported to inhibit the development of Apicomplexan parasites, suggesting that the functions of protozoan protein kinases are critical for their life cycle. However, the specific roles of these protein kinases cannot be determined using only these inhibitors without molecular analysis, including gene disruption. In this report, we describe the functions of Apicomplexan protein kinases in each parasite life stage and the potential of pre-existing protein kinase inhibitors as Apicomplexan drugs against, mainly, Plasmodium and Toxoplasma. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Strübbe, Gero; Popp, Christian; Schmidt, Alexander; Pauli, Andrea; Ringrose, Leonie; Beisel, Christian; Paro, Renato
2011-01-01
The maintenance of specific gene expression patterns during cellular proliferation is crucial for the identity of every cell type and the development of tissues in multicellular organisms. Such a cellular memory function is conveyed by the complex interplay of the Polycomb and Trithorax groups of proteins (PcG/TrxG). These proteins exert their function at the level of chromatin by establishing and maintaining repressed (PcG) and active (TrxG) chromatin domains. Past studies indicated that a core PcG protein complex is potentially associated with cell type or even cell stage-specific sets of accessory proteins. In order to better understand the dynamic aspects underlying PcG composition and function we have established an inducible version of the biotinylation tagging approach to purify Polycomb and associated factors from Drosophila embryos. This system enabled fast and efficient isolation of Polycomb containing complexes under near physiological conditions, thereby preserving substoichiometric interactions. Novel interacting proteins were identified by highly sensitive mass spectrometric analysis. We found many TrxG related proteins, suggesting a previously unrecognized extent of molecular interaction of the two counteracting chromatin regulatory protein groups. Furthermore, our analysis revealed an association of PcG protein complexes with the cohesin complex and showed that Polycomb-dependent silencing of a transgenic reporter depends on cohesin function. PMID:21415365
Roche, John P.; Alsharif, Peter; Graf, Ethan R.
2015-01-01
At synapses, the release of neurotransmitter is regulated by molecular machinery that aggregates at specialized presynaptic release sites termed active zones. The complement of active zone proteins at each site is a determinant of release efficacy and can be remodeled to alter synapse function. The small GTPase Rab3 was previously identified as playing a novel role that controls the distribution of active zone proteins to individual release sites at the Drosophila neuromuscular junction. Rab3 has been extensively studied for its role in the synaptic vesicle cycle; however, the mechanism by which Rab3 controls active zone development remains unknown. To explore this mechanism, we conducted a mutational analysis to determine the molecular and structural requirements of Rab3 function at Drosophila synapses. We find that GTP-binding is required for Rab3 to traffic to synapses and distribute active zone components across release sites. Conversely, the hydrolytic activity of Rab3 is unnecessary for this function. Through a structure-function analysis we identify specific residues within the effector-binding switch regions that are required for Rab3 function and determine that membrane attachment is essential. Our findings suggest that Rab3 controls the distribution of active zone components via a vesicle docking mechanism that is consistent with standard Rab protein function. PMID:26317909
Structure-functional prediction and analysis of cancer mutation effects in protein kinases.
Dixit, Anshuman; Verkhivker, Gennady M
2014-01-01
A central goal of cancer research is to discover and characterize the functional effects of mutated genes that contribute to tumorigenesis. In this study, we provide a detailed structural classification and analysis of functional dynamics for members of protein kinase families that are known to harbor cancer mutations. We also present a systematic computational analysis that combines sequence and structure-based prediction models to characterize the effect of cancer mutations in protein kinases. We focus on the differential effects of activating point mutations that increase protein kinase activity and kinase-inactivating mutations that decrease activity. Mapping of cancer mutations onto the conformational mobility profiles of known crystal structures demonstrated that activating mutations could reduce a steric barrier for the movement from the basal "low" activity state to the "active" state. According to our analysis, the mechanism of activating mutations reflects a combined effect of partial destabilization of the kinase in its inactive state and a concomitant stabilization of its active-like form, which is likely to drive tumorigenesis at some level. Ultimately, the analysis of the evolutionary and structural features of the major cancer-causing mutational hotspot in kinases can also aid in the correlation of kinase mutation effects with clinical outcomes.
Detecting coupled collective motions in protein by independent subspace analysis
NASA Astrophysics Data System (ADS)
Sakuraba, Shun; Joti, Yasumasa; Kitao, Akio
2010-11-01
Protein dynamics evolves in a high-dimensional space, comprising anharmonic, strongly correlated motional modes. Such correlation often plays an important role in analyzing protein function. In order to identify significantly correlated collective motions, here we employ independent subspace analysis based on the subspace joint approximate diagonalization of eigenmatrices algorithm for the analysis of molecular dynamics (MD) simulation trajectories. From the 100 ns MD simulation of T4 lysozyme, we extract several independent subspaces in each of which collective modes are significantly correlated, and identify the other modes as independent. This method successfully detects the modes along which long-tailed non-Gaussian probability distributions are obtained. Based on the time cross-correlation analysis, we identified a series of events among domain motions and more localized motions in the protein, indicating the connection between the functionally relevant phenomena which have been independently revealed by experiments.
Structural and functional analyses of genes encoding VQ proteins in apple.
Dong, Qinglong; Zhao, Shuang; Duan, Dingyue; Tian, Yi; Wang, Yanpeng; Mao, Ke; Zhou, Zongshan; Ma, Fengwang
2018-07-01
Recent studies with Arabidopsis and soybean have shown that a class of valine-glutamine (VQ) motif-containing proteins interacts with some WRKY transcription factors. However, little is known about the evolution, structures, and functions of those proteins in apple. Here, we examined their features and identified 49 apple VQ genes. Our evolutionary analysis revealed that the proteins could be clustered into nine groups together with their homologues in 33 species. Historically, the main characteristics of proteins in Groups I, V, VI, VII, IX, and X were thought to have been generated before the monocot-dicot split, whereas those in Groups II, III + IV, and VIII were generated after that split. In the structural analysis, apple MdVQ proteins appeared to bind only with Group I and IIc MdWRKY proteins. Meanwhile, MdVQ1, MdVQ10, MdVQ15, and MdVQ36 interacted with multiple MdVQ proteins to form heterodimers but MdVQ15 formed a homodimer. The functional analysis indicated that overexpression of some apple MdVQs in Arabidopsis and tobacco plants affected their vegetative and reproductive growth. These results provide important information about the characteristics of apple MdVQ genes and can serve as a solid foundation for further studies about the role of WRKY-VQ interactions in regulating apple developmental and defense mechanisms. Copyright © 2018 Elsevier B.V. All rights reserved.
cncRNAs: Bi-functional RNAs with protein coding and non-coding functions
Kumari, Pooja; Sampath, Karuna
2015-01-01
For many decades, the major function of mRNA was thought to be to provide protein-coding information embedded in the genome. The advent of high-throughput sequencing has led to the discovery of pervasive transcription of eukaryotic genomes and opened the world of RNA-mediated gene regulation. Many regulatory RNAs have been found to be incapable of protein coding and are hence termed non-coding RNAs (ncRNAs). However, studies in recent years have shown that several previously annotated non-coding RNAs have the potential to encode proteins, and conversely, some coding RNAs have regulatory functions independent of the protein they encode. Such bi-functional RNAs, with both protein coding and non-coding functions, which we term ‘cncRNAs’, have emerged as new players in cellular systems. Here, we describe the functions of some cncRNAs identified from bacteria to humans. Because the functions of many RNAs across genomes remain unclear, we propose that RNAs be classified as coding, non-coding or both only after careful analysis of their functions. PMID:26498036
Function of alternative splicing
Kelemen, Olga; Convertini, Paolo; Zhang, Zhaiyi; Wen, Yuan; Shen, Manli; Falaleeva, Marina; Stamm, Stefan
2017-01-01
Almost all polymerase II transcripts undergo alternative pre-mRNA splicing. Here, we review the functions of alternative splicing events that have been experimentally determined. The overall function of alternative splicing is to increase the diversity of mRNAs expressed from the genome. Alternative splicing changes proteins encoded by mRNAs, which has profound functional effects. Experimental analysis of these protein isoforms showed that alternative splicing regulates binding between proteins, between proteins and nucleic acids as well as between proteins and membranes. Alternative splicing regulates the localization of proteins, their enzymatic properties and their interaction with ligands. In most cases, changes caused by individual splicing isoforms are small. However, cells typically coordinate numerous changes in ‘splicing programs’, which can have strong effects on cell proliferation, cell survival and properties of the nervous system. Due to its widespread usage and molecular versatility, alternative splicing emerges as a central element in gene regulation that interferes with almost every biological function analyzed. PMID:22909801
Vamparys, Lydie; Laurent, Benoist; Carbone, Alessandra; Sacquin-Mora, Sophie
2016-10-01
Protein-protein interactions play a key part in most biological processes and understanding their mechanism is a fundamental problem leading to numerous practical applications. The prediction of protein binding sites in particular is of paramount importance since proteins now represent a major class of therapeutic targets. Among other methods, docking simulations between two proteins known to interact can be a useful tool for the prediction of likely binding patches on a protein surface. From the analysis of the protein interfaces generated by a massive cross-docking experiment using the 168 proteins of the Docking Benchmark 2.0, where all possible protein pairs, and not only experimental ones, have been docked together, we show that it is also possible to predict a protein's binding residues without having any prior knowledge regarding its potential interaction partners. Evaluating the performance of cross-docking predictions using the area under the specificity-sensitivity ROC curve (AUC) leads to an AUC value of 0.77 for the complete benchmark (compared to the 0.5 AUC value obtained for random predictions). Furthermore, a new clustering analysis performed on the binding patches that are scattered on the protein surface shows that their distribution and growth depend on the protein's functional group. Finally, in several cases, the binding-site predictions resulting from the cross-docking simulations lead to the identification of an alternate interface, which corresponds to the interaction with a biomolecular partner that is not included in the original benchmark. Proteins 2016; 84:1408-1421. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
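The AUC figures in this abstract (0.77 against a random baseline of 0.5) can be read through the rank interpretation of the ROC curve. AUC is the probability that a randomly chosen true binding residue receives a higher score than a randomly chosen non-binding residue, with ties counted as one half. The sketch below computes that quantity by direct pair counting. The per-residue scores and labels are hypothetical, and the paper's own scoring scheme is not reproduced.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    positives = [s for s, is_pos in zip(scores, labels) if is_pos]
    negatives = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Each positive/negative pair scores 1 for a correct ranking, 0.5 for a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical binding propensities for four surface residues.
    propensity = [0.9, 0.8, 0.3, 0.1]
    is_interface = [True, False, True, False]
    print(roc_auc(propensity, is_interface))
```

Pair counting is quadratic in the number of residues, which is acceptable for a single protein; a sort-based rank formula gives the same value more cheaply on large benchmarks.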
Interactions between late acting proteins required for peptidoglycan synthesis during sporulation
Fay, Allison; Meyer, Pablo; Dworkin, Jonathan
2010-01-01
The requirement of peptidoglycan synthesis for growth complicates the analysis of interactions between proteins involved in this pathway. In particular, the later steps that involve membrane-linked substrates have proven largely recalcitrant to in vivo analysis. Here we have taken advantage of the peptidoglycan synthesis that occurs during sporulation in Bacillus subtilis to examine the interactions between SpoVE, a non-essential, sporulation-specific homolog of the well-conserved and essential SEDS proteins, and SpoVD, a non-essential class B penicillin binding protein (PBP). We found that localization of SpoVD is dependent on SpoVE and that SpoVD protects SpoVE from in vivo proteolysis. Co-immunoprecipitations and Fluorescence Resonance Energy Transfer experiments indicated that SpoVE and SpoVD interact and co-affinity purification in E. coli demonstrated that this interaction is direct. Finally, we generated a functional protein consisting of a SpoVE-SpoVD fusion and found that a loss-of-function point mutation in either part of the fusion resulted in a loss of function of the entire fusion that was not complemented by a wild type protein. Thus, SpoVE has a direct and functional interaction with SpoVD and this conclusion will facilitate understanding the essential function SpoVE and related SEDS proteins such as FtsW and RodA play in bacterial growth and division. PMID:20417640
Analysis of Protein Kinetics Using Fluorescence Recovery After Photobleaching (FRAP).
Giakoumakis, Nickolaos Nikiforos; Rapsomaniki, Maria Anna; Lygerou, Zoi
2017-01-01
Fluorescence recovery after photobleaching (FRAP) is a cutting-edge live-cell functional imaging technique that enables the exploration of protein dynamics in individual cells and thus permits the elucidation of protein mobility, function, and interactions at a single-cell level. During a typical FRAP experiment, fluorescent molecules in a defined region of interest within the cell are bleached by a short and powerful laser pulse, while the recovery of the fluorescence in the region is monitored over time by time-lapse microscopy. FRAP experimental setup and image acquisition involve a number of steps that need to be carefully executed to avoid technical artifacts. Equally important is the subsequent computational analysis of FRAP raw data, to derive quantitative information on protein diffusion and binding parameters. Here we present an integrated in vivo and in silico protocol for the analysis of protein kinetics using FRAP. We focus on the most commonly encountered challenges and technical or computational pitfalls and their troubleshooting so that valid and robust insight into protein dynamics within living cells is gained.
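The computational step described above, deriving quantitative parameters from raw FRAP data, might look like the following minimal sketch. It is not the protocol's own pipeline: it assumes a background-corrected curve, approximates the plateau by the mean of the last three points, and reads the half-time off by linear interpolation instead of fitting a diffusion or binding model. All names and the synthetic curve are hypothetical.

```python
import math


def frap_summary(times, intensities, pre_bleach):
    """Estimate mobile fraction and recovery half-time from one FRAP curve.

    times: seconds after the bleach pulse (ascending, first point = just after bleach)
    intensities: background-corrected mean fluorescence in the bleached region
    pre_bleach: mean fluorescence in the same region before bleaching
    """
    f0 = intensities[0]
    # Plateau approximated by the mean of the last three points (an assumption).
    f_inf = sum(intensities[-3:]) / 3.0
    mobile_fraction = (f_inf - f0) / (pre_bleach - f0)
    half_level = f0 + 0.5 * (f_inf - f0)
    # Linear interpolation at the first crossing of the half-recovery level.
    for i in range(1, len(times)):
        lo, hi = intensities[i - 1], intensities[i]
        if lo < half_level <= hi:
            frac = (half_level - lo) / (hi - lo)
            t_half = times[i - 1] + frac * (times[i] - times[i - 1])
            return mobile_fraction, t_half
    raise ValueError("curve never reaches half of its plateau")


# Synthetic single-exponential recovery (tau = 5 s, 80% mobile), not real data.
tau, f_start, f_plateau = 5.0, 0.2, 0.84
ts = [0.5 * k for k in range(121)]  # 0 to 60 s in 0.5 s steps
curve = [f_start + (f_plateau - f_start) * (1 - math.exp(-t / tau)) for t in ts]
mobile, t_half = frap_summary(ts, curve, pre_bleach=1.0)
```

For a single-exponential recovery the half-time equals tau times ln 2, which gives a quick sanity check on the interpolated value.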
Determining protein function and interaction from genome analysis
Eisenberg, David; Marcotte, Edward M.; Thompson, Michael J.; Pellegrini, Matteo; Yeates, Todd O.
2004-08-03
A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
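The two ideas in the abstract above can be sketched on toy data. This is an illustrative simplification, not the patented method: homology is assumed to be already resolved into domain-family labels, profiles are linked only when they match exactly, and every protein, genome and family name is hypothetical.

```python
from collections import defaultdict


def phylogenetic_profiles(presence, genomes):
    """Map each protein to a tuple of 0/1 flags, one per genome."""
    return {
        protein: tuple(1 if g in found_in else 0 for g in genomes)
        for protein, found_in in presence.items()
    }


def linked_by_profile(profiles):
    """Group proteins that share an identical, informative profile."""
    groups = defaultdict(list)
    for protein, profile in profiles.items():
        groups[profile].append(protein)
    # Profiles present in all or in no genomes carry no co-occurrence signal.
    return [
        sorted(members)
        for profile, members in groups.items()
        if len(members) > 1 and 0 < sum(profile) < len(profile)
    ]


def rosetta_stone_links(domain_content):
    """Link single-domain proteins whose families co-occur in one fused chain.

    domain_content: protein name -> set of domain-family labels.
    """
    singles = {p: next(iter(d)) for p, d in domain_content.items() if len(d) == 1}
    links = set()
    for fused, domains in domain_content.items():
        if len(domains) < 2:
            continue
        parts = sorted(p for p, fam in singles.items() if fam in domains)
        for i, a in enumerate(parts):
            for b in parts[i + 1:]:
                if singles[a] != singles[b]:
                    links.add((a, b, fused))
    return links


genomes = ["g1", "g2", "g3", "g4"]
presence = {
    "P1": {"g1", "g3"},
    "P2": {"g1", "g3"},
    "P3": {"g1", "g2", "g3", "g4"},
    "P4": {"g1", "g2", "g3", "g4"},
    "P5": {"g2"},
}
profile_links = linked_by_profile(phylogenetic_profiles(presence, genomes))

domain_content = {
    "A_prime": {"famA"},
    "B_prime": {"famB"},
    "AB": {"famA", "famB"},
    "C": {"famC"},
}
fusion_links = rosetta_stone_links(domain_content)
```

Here P1 and P2 are linked because they are present and absent in the same genomes, while A_prime and B_prime are linked because the fused chain AB contains a homolog of each.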
Surface energetics and protein-protein interactions: analysis and mechanistic implications
Peri, Claudio; Morra, Giulia; Colombo, Giorgio
2016-01-01
Understanding protein-protein interactions (PPI) at the molecular level is a fundamental task in the design of new drugs, the prediction of protein function and the clarification of the mechanisms of (dis)regulation of biochemical pathways. In this study, we use a novel computational approach to investigate the energetics of amino acid networks located on the surface of proteins, isolated and in complex with their respective partners. Interestingly, the analysis of individual proteins identifies patches of surface residues that, when mapped on the structure of their respective complexes, reveal regions of residue-pair couplings that extend across the binding interfaces, forming continuous motifs. An enhanced effect is visible across the proteins of the dataset forming larger quaternary assemblies. The method indicates the presence of energetic signatures in the isolated proteins that are retained in the bound form, which we hypothesize to determine binding orientation upon complex formation. We propose our method, BLUEPRINT, as a complement to different approaches ranging from the ab-initio characterization of PPIs, to protein-protein docking algorithms, for the physico-chemical and functional investigation of protein-protein interactions. PMID:27050828
Large-scale De Novo Prediction of Physical Protein-Protein Association*
Elefsinioti, Antigoni; Saraç, Ömer Sinan; Hegele, Anna; Plake, Conrad; Hubner, Nina C.; Poser, Ina; Sarov, Mihail; Hyman, Anthony; Mann, Matthias; Schroeder, Michael; Stelzl, Ulrich; Beyer, Andreas
2011-01-01
Information about the physical association of proteins is extensively used for studying cellular processes and disease mechanisms. However, complete experimental mapping of the human interactome will remain prohibitively difficult in the near future. Here we present a map of predicted human protein interactions that distinguishes functional association from physical binding. Our network classifies more than 5 million protein pairs predicting 94,009 new interactions with high confidence. We experimentally tested a subset of these predictions using yeast two-hybrid analysis and affinity purification followed by quantitative mass spectrometry. Thus we identified 462 new protein-protein interactions and confirmed the predictive power of the network. These independent experiments address potential issues of circular reasoning and are a distinctive feature of this work. Analysis of the physical interactome unravels subnetworks mediating between different functional and physical subunits of the cell. Finally, we demonstrate the utility of the network for the analysis of molecular mechanisms of complex diseases by applying it to genome-wide association studies of neurodegenerative diseases. This analysis provides new evidence implying TOMM40 as a factor involved in Alzheimer's disease. The network provides a high-quality resource for the analysis of genomic data sets and genetic association studies in particular. Our interactome is available via the hPRINT web server at: www.print-db.org. PMID:21836163
Odronitz, Florian; Kollmar, Martin
2006-01-01
Background Annotation of protein sequences of eukaryotic organisms is crucial for the understanding of their function in the cell. Manual annotation is still by far the most accurate way to correctly predict genes. The classification of protein sequences, their phylogenetic relation and the assignment of function involves information from various sources. This often leads to a collection of heterogeneous data, which is hard to track. Cytoskeletal and motor proteins consist of large and diverse superfamilies comprising up to several dozen members per organism. To date, there is no integrated tool available to assist in the manual large-scale comparative genomic analysis of protein families. Description Pfarao (Protein Family Application for Retrieval, Analysis and Organisation) is a database-driven online working environment for the analysis of manually annotated protein sequences and their relationship. Currently, the system can store and interrelate a wide range of information about protein sequences, species, phylogenetic relations and sequencing projects as well as links to literature and domain predictions. Sequences can be imported from multiple sequence alignments that are generated during the annotation process. A web interface allows users to conveniently browse the database and to compile tabular and graphical summaries of its content. Conclusion We implemented a protein sequence-centric web application to store, organize, interrelate, and present heterogeneous data that is generated in manual genome annotation and comparative genomics. The application has been developed for the analysis of cytoskeletal and motor proteins (CyMoBase) but can easily be adapted for any protein. PMID:17134497
Shahbaaz, Mohd; Ahmad, Faizan; Imtaiyaz Hassan, Md
2015-06-01
Haemophilus influenzae is a small pleomorphic Gram-negative bacterium that causes several chronic diseases, including bacteremia, meningitis, cellulitis, epiglottitis, septic arthritis, pneumonia, and empyema. Here we extensively analyzed the sequenced genome of H. influenzae strain Rd KW20 using protein family databases, protein structure prediction, pathways and genome context methods to assign a precise function to proteins whose functions are unknown. These proteins are termed hypothetical proteins (HPs), for which no experimental information is available. Function prediction of these proteins would help to understand precisely the biochemical pathways and mechanism of pathogenesis of Haemophilus influenzae. During the extensive analysis of the H. influenzae genome, we found the presence of eight HPs showing lyase activity. Subsequently, we modeled and analyzed the three-dimensional structure of all these HPs to determine their functions more precisely. We found these HPs possess cystathionine-β-synthase, cyclase, carboxymuconolactone decarboxylase, pseudouridine synthase A and C, D-tagatose-1,6-bisphosphate aldolase and aminodeoxychorismate lyase-like features, indicating their corresponding functions in H. influenzae. Lyases are actively involved in the regulation of biosynthesis of various hormones, metabolic pathways, signal transduction, and DNA repair. Lyases are also considered key players in various biological processes. These enzymes are critically essential for the survival and pathogenesis of H. influenzae and, therefore, these enzymes may be considered as a potential target for structure-based rational drug design. Our structure-function relationship analysis will be useful to search and design potential lead molecules based on the structure of these lyases, for drug design and discovery.
Yu, Yanbao; Leng, Taohua; Yun, Dong; Liu, Na; Yao, Jun; Dai, Ying; Yang, Pengyuan; Chen, Xian
2013-01-01
Emerging evidence indicates that blood platelets function in multiple biological processes including immune response, bone metastasis and liver regeneration in addition to their known roles in hemostasis and thrombosis. Global elucidation of the platelet proteome will provide the molecular base of these platelet functions. Here, we set up a high throughput platform for maximum exploration of the rat/human platelet proteome using integrated proteomics technologies, and then applied it to identify the largest number of the proteins expressed in both rat and human platelets. After stringent statistical filtration, a total of 837 unique proteins matched with at least two unique peptides were precisely identified, making it the first comprehensive protein database so far for rat platelets. Meanwhile, quantitative analyses of the thrombin-stimulated platelets offered great insights into the biological functions of platelet proteins and therefore confirmed our global profiling data. A comparative proteomic analysis between rat and human platelets was also conducted, which revealed not only a significant similarity, but also an across-species evolutionary link that the orthologous proteins representing ‘core proteome’, and the ‘evolutionary proteome’ is actually a relatively static proteome. PMID:20443191
Paparelli, Laura; Corthout, Nikky; Pavie, Benjamin; Annaert, Wim; Munck, Sebastian
2016-01-01
The spatial distribution of proteins within the cell affects their capability to interact with other molecules and directly influences cellular processes and signaling. At the plasma membrane, multiple factors drive protein compartmentalization into specialized functional domains, leading to the formation of clusters in which intermolecule interactions are facilitated. Therefore, quantifying protein distributions is a necessity for understanding their regulation and function. The recent advent of super-resolution microscopy has opened up the possibility of imaging protein distributions at the nanometer scale. In parallel, new spatial analysis methods have been developed to quantify distribution patterns in super-resolution images. In this chapter, we provide an overview of super-resolution microscopy and summarize the factors influencing protein arrangements on the plasma membrane. Finally, we highlight methods for analyzing clusterization of plasma membrane proteins, including examples of their applications.
Mallik, Mrinmay Kumar
2018-02-07
Biological networks can be analyzed using "Centrality Analysis" to identify the more influential nodes and interactions in the network. This study was undertaken to create and visualize a biological network comprising protein-protein interactions (PPIs) amongst proteins which are preferentially over-expressed in the glioma cancer stem cell component (GCSC) of glioblastomas as compared to the glioma non-stem cancer cell (GNSC) component and then to analyze this network through centrality analyses (CA) in order to identify the essential proteins in this network and their interactions. In addition, this study proposes a new centrality analysis method pertaining exclusively to transcription factors (TFs) and interactions amongst them. Moreover, the relevant molecular functions, biological processes and biochemical pathways amongst these proteins were sought through enrichment analysis. A protein interaction network was created using a list of proteins which have been shown to be preferentially expressed or over-expressed in GCSCs isolated from glioblastomas as compared to the GNSCs. This list comprising 38 proteins, created using manual literature mining, was submitted to the Reactome FIViz tool, a web based application integrated into Cytoscape, an open source software platform for visualizing and analyzing molecular interaction networks and biological pathways to produce the network. This network was subjected to centrality analyses utilizing ranked lists of six centrality measures using the FIViz application and (for the first time) a dedicated centrality analysis plug-in, CytoNCA. The interactions exclusively amongst the transcription factors were analyzed through a newly proposed centrality analysis method called "Gene Expression Associated Degree Centrality Analysis (GEADCA)". Enrichment analysis was performed using the "network function analysis" tool on Reactome.
The CA was able to identify a small set of proteins with consistently high centrality ranks that is indicative of their strong influence in the protein protein interaction network. Similarly the newly proposed GEADCA helped identify the transcription factors with high centrality values indicative of their key roles in transcriptional regulation. The enrichment studies provided a list of molecular functions, biological processes and biochemical pathways associated with the constructed network. The study shows how pathway based databases may be used to create and analyze a relevant protein interaction network in glioma cancer stem cells and identify the essential elements within it to gather insights into the molecular interactions that regulate the properties of glioma stem cells. How these insights may be utilized to help the development of future research towards formulation of new management strategies have been discussed from a theoretical standpoint. Copyright © 2017 Elsevier Ltd. All rights reserved.
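As an illustration of what a centrality ranking involves, the sketch below computes normalized degree centrality, one standard measure, on a toy undirected interaction network. It is not the study's GEADCA method and does not reproduce the FIViz or CytoNCA tools; the node names and edges are hypothetical placeholders, not proteins from the study.

```python
def degree_centrality(edges):
    """Normalized degree centrality for an undirected interaction network."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    # Divide by the maximum possible number of partners, n - 1.
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


def rank_nodes(centrality):
    """Order nodes from most to least central; ties are broken by name."""
    return sorted(centrality, key=lambda node: (-centrality[node], node))


# Hypothetical toy network with placeholder node names.
toy_edges = [("TF1", "TF2"), ("TF1", "P3"), ("TF1", "P4"), ("TF2", "P3")]
toy_centrality = degree_centrality(toy_edges)
toy_ranking = rank_nodes(toy_centrality)
```

Combining several such measures, as the abstract describes, amounts to computing one ranked list per measure and looking for nodes that stay near the top of all of them.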
Global analyses of Ceratocystis cacaofunesta mitochondria: from genome to proteome.
Ambrosio, Alinne Batista; do Nascimento, Leandro Costa; Oliveira, Bruno V; Teixeira, Paulo José P L; Tiburcio, Ricardo A; Toledo Thomazella, Daniela P; Leme, Adriana F P; Carazzolle, Marcelo F; Vidal, Ramon O; Mieczkowski, Piotr; Meinhardt, Lyndel W; Pereira, Gonçalo A G; Cabrera, Odalys G
2013-02-11
The ascomycete fungus Ceratocystis cacaofunesta is the causal agent of wilt disease in cacao, which results in significant economic losses in the affected producing areas. Despite the economic importance of the Ceratocystis complex of species, no genomic data are available for any of its members. Given that mitochondria play important roles in fungal virulence and the susceptibility/resistance of fungi to fungicides, we performed the first functional analysis of this organelle in Ceratocystis using integrated "omics" approaches. The C. cacaofunesta mitochondrial genome (mtDNA) consists of a single, 103,147-bp circular molecule, making this the second largest mtDNA among the Sordariomycetes. Bioinformatics analysis revealed the presence of 15 conserved genes and 37 intronic open reading frames in C. cacaofunesta mtDNA. Here, we predicted the mitochondrial proteome (mtProt) of C. cacaofunesta, which is comprised of 1,124 polypeptides - 52 proteins that are mitochondrially encoded and 1,072 that are nuclearly encoded. Transcriptome analysis revealed 33 probable novel genes. Comparisons among the Gene Ontology results of the predicted mtProt of C. cacaofunesta, Neurospora crassa and Saccharomyces cerevisiae revealed no significant differences. Moreover, C. cacaofunesta mitochondria were isolated, and the mtProt was subjected to mass spectrometric analysis. The experimental proteome validated 27% of the predicted mtProt. Our results confirmed the existence of 110 hypothetical proteins and 7 novel proteins of which 83 and 1, respectively, had putative mitochondrial localization. The present study provides the first partial genomic analysis of a species of the Ceratocystis genus and the first predicted mitochondrial protein inventory of a phytopathogenic fungus. 
In addition to the known mitochondrial role in pathogenicity, our results demonstrated that the global function analysis of this organelle is similar in pathogenic and non-pathogenic fungi, suggesting that its relevance in the lifestyle of these organisms should be based on a small number of specific proteins and/or with respect to differential gene regulation. In this regard, particular interest should be directed towards mitochondrial proteins with unknown function and the novel protein that might be specific to this species. Further functional characterization of these proteins could enhance our understanding of the role of mitochondria in phytopathogenicity.
Global analyses of Ceratocystis cacaofunesta mitochondria: from genome to proteome
2013-01-01
Background The ascomycete fungus Ceratocystis cacaofunesta is the causal agent of wilt disease in cacao, which results in significant economic losses in the affected producing areas. Despite the economic importance of the Ceratocystis complex of species, no genomic data are available for any of its members. Given that mitochondria play important roles in fungal virulence and the susceptibility/resistance of fungi to fungicides, we performed the first functional analysis of this organelle in Ceratocystis using integrated “omics” approaches. Results The C. cacaofunesta mitochondrial genome (mtDNA) consists of a single, 103,147-bp circular molecule, making this the second largest mtDNA among the Sordariomycetes. Bioinformatics analysis revealed the presence of 15 conserved genes and 37 intronic open reading frames in C. cacaofunesta mtDNA. Here, we predicted the mitochondrial proteome (mtProt) of C. cacaofunesta, which is comprised of 1,124 polypeptides - 52 proteins that are mitochondrially encoded and 1,072 that are nuclearly encoded. Transcriptome analysis revealed 33 probable novel genes. Comparisons among the Gene Ontology results of the predicted mtProt of C. cacaofunesta, Neurospora crassa and Saccharomyces cerevisiae revealed no significant differences. Moreover, C. cacaofunesta mitochondria were isolated, and the mtProt was subjected to mass spectrometric analysis. The experimental proteome validated 27% of the predicted mtProt. Our results confirmed the existence of 110 hypothetical proteins and 7 novel proteins of which 83 and 1, respectively, had putative mitochondrial localization. Conclusions The present study provides the first partial genomic analysis of a species of the Ceratocystis genus and the first predicted mitochondrial protein inventory of a phytopathogenic fungus. 
In addition to the known mitochondrial role in pathogenicity, our results demonstrated that the global function analysis of this organelle is similar in pathogenic and non-pathogenic fungi, suggesting that its relevance in the lifestyle of these organisms should be based on a small number of specific proteins and/or with respect to differential gene regulation. In this regard, particular interest should be directed towards mitochondrial proteins with unknown function and the novel protein that might be specific to this species. Further functional characterization of these proteins could enhance our understanding of the role of mitochondria in phytopathogenicity. PMID:23394930
Identification and proteomic analysis of osteoblast-derived exosomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ge, Min; Ke, Ronghu; Cai, Tianyi
Exosomes are nanometer-sized vesicles with the function of intercellular communication, and they are released by various cell types. To reveal the knowledge about the exosomes from osteoblast, and explore the potential functions of osteogenesis, we isolated microvesicles from supernatants of mouse Mc3t3 by ultracentrifugation, characterized exosomes by electron microscopy and immunoblotting and presented the protein profile by proteomic analysis. The result demonstrated that microvesicles were between 30 and 100 nm in diameter, round shape with cup-like concavity and expressed exosomal marker tumor susceptibility gene (TSG) 101 and flotillin (Flot) 1. We identified a total number of 1069 proteins among which 786 proteins overlap with ExoCarta database. Gene Ontology analysis indicated that exosomes mostly derived from plasma membrane and mainly involved in protein localization and intracellular signaling. The Ingenuity Pathway Analysis showed pathways are mostly involved in exosome biogenesis, formation, uptake and osteogenesis. Among the pathways, eukaryotic initiation factor 2 pathways played an important role in osteogenesis. Our study identified osteoblast-derived exosomes, unveiled the content of them, presented potential osteogenesis-related proteins and pathways and provided a rich proteomics data resource that will be valuable for further studies of the functions of individual proteins in bone diseases. - Highlights: • We for the first time identified exosomes from mouse osteoblast. • Osteoblasts-derived exosomes contain osteoblast peculiar proteins. • Proteins from osteoblasts-derived exosomes are intently involved in EIF2 pathway. • EIF2α from the EIF2 pathway plays an important role in osteogenesis.
Sun, Chia-Tsen; Chiang, Austin W T; Hwang, Ming-Jing
2017-10-27
Proteome-scale bioinformatics research is increasingly conducted as the number of completely sequenced genomes increases, but analysis of protein domains (PDs) usually relies on similarity in their amino acid sequences and/or three-dimensional structures. Here, we present results from a bi-clustering analysis on presence/absence data for 6,580 unique PDs in 2,134 species with a sequenced genome, thus covering a complete set of proteins, for the three superkingdoms of life, Bacteria, Archaea, and Eukarya. Our analysis revealed eight distinctive PD clusters, which, following an analysis of enrichment of Gene Ontology functions and CATH classification of protein structures, were shown to exhibit structural and functional properties that are taxa-characteristic. For example, the largest cluster is ubiquitous in all three superkingdoms, constituting a set of 1,472 persistent domains created early in evolution and retained in living organisms and characterized by basic cellular functions and ancient structural architectures, while an Archaea and Eukarya bi-superkingdom cluster suggests its PDs may have existed in the ancestor of the two superkingdoms, and others are single superkingdom- or taxa (e.g. Fungi)-specific. These results help increase our appreciation of PD diversity and our knowledge of how PDs are used in species, yielding implications on species evolution.
Berjón-Otero, Mónica; Lechuga, Ana; Mehla, Jitender; Uetz, Peter; Salas, Margarita; Redrejo-Rodríguez, Modesto
2017-07-26
Tectiviridae comprises a group of tail-less, icosahedral, membrane-containing bacteriophages that can be divided into two groups by their hosts, either Gram-negative or Gram-positive bacteria. While the first group is composed of PRD1 and nearly identical well characterized lytic viruses, the second one includes more variable temperate phages, like GIL16 or Bam35, whose hosts are Bacillus cereus and related Gram-positive bacteria. In the genome of Bam35, nearly half of the 32 annotated open reading frames (ORFs) have no homologs in databases (ORFans), being putative proteins of unknown function, which hinders the understanding of their biology. With the aim of increasing the knowledge of the viral proteome, we carried out a comprehensive yeast two-hybrid analysis among all the putative proteins encoded by the Bam35 genome. The resulting protein interactome comprises 76 unique interactions among 24 proteins, of which 12 have an unknown function. These results suggested that the P17 protein is the minor capsid protein of Bam35 and P24 is the penton protein, being the latter also supported by iterative threading protein modeling. Moreover, the inner membrane transglycosylase protein P26 could have an additional structural role. We also detected interactions involving non-structural proteins, such as the DNA binding protein P1 and the genome terminal protein (P4), which was confirmed by co-immunoprecipitation of recombinant proteins. Altogether, our results provide a functional view of the Bam35 viral proteome, with a focus on the composition and organization of the viral particle. IMPORTANCE Tail-less viruses of the family Tectiviridae can infect commensal and pathogenic Gram-positive and Gram-negative bacteria. Moreover, they have been proposed to be at the evolutionary origin of several groups of large eukaryotic DNA viruses and self-replicating plasmids.
However, due to their ancient origin and complex diversity, many tectiviral proteins are ORFans of unknown function. Comprehensive protein-protein interaction (PPI) analysis among viral proteins can eventually disclose biological mechanisms and thus provide new insights into protein function unattainable by studying proteins one by one. Here we comprehensively describe intraviral PPIs among tectivirus Bam35 proteins using multi-vector yeast two-hybrid screening that was further supported by co-immunoprecipitation assays and protein structural models. This approach allowed us to propose new functions for known proteins and hypothesize on the biological role and localization within the viral particle of some viral ORFan proteins that will be helpful for understanding the biology of Gram-positive tectivirus. Copyright © 2017 American Society for Microbiology.
Berjón-Otero, Mónica; Lechuga, Ana; Mehla, Jitender; Uetz, Peter
2017-01-01
ABSTRACT The family Tectiviridae comprises a group of tailless, icosahedral, membrane-containing bacteriophages that can be divided into two groups by their hosts, either Gram-negative or Gram-positive bacteria. While the first group is composed of PRD1 and nearly identical well-characterized lytic viruses, the second one includes more variable temperate phages, like GIL16 or Bam35, whose hosts are Bacillus cereus and related Gram-positive bacteria. In the genome of Bam35, nearly half of the 32 annotated open reading frames (ORFs) have no homologs in databases (ORFans), being putative proteins of unknown function, which hinders the understanding of their biology. With the aim of increasing knowledge about the viral proteome, we carried out a comprehensive yeast two-hybrid analysis of all the putative proteins encoded by the Bam35 genome. The resulting protein interactome comprised 76 unique interactions among 24 proteins, of which 12 have an unknown function. These results suggest that the P17 protein is the minor capsid protein of Bam35 and P24 is the penton protein, with the latter finding also being supported by iterative threading protein modeling. Moreover, the inner membrane transglycosylase protein P26 could have an additional structural role. We also detected interactions involving nonstructural proteins, such as the DNA-binding protein P1 and the genome terminal protein (P4), which was confirmed by coimmunoprecipitation of recombinant proteins. Altogether, our results provide a functional view of the Bam35 viral proteome, with a focus on the composition and organization of the viral particle. IMPORTANCE Tailless viruses of the family Tectiviridae can infect commensal and pathogenic Gram-positive and Gram-negative bacteria. Moreover, they have been proposed to be at the evolutionary origin of several groups of large eukaryotic DNA viruses and self-replicating plasmids. 
However, due to their ancient origin and complex diversity, many tectiviral proteins are ORFans of unknown function. Comprehensive protein-protein interaction (PPI) analysis of viral proteins can eventually disclose biological mechanisms and thus provide new insights into protein function unattainable by studying proteins one by one. Here we comprehensively describe intraviral PPIs among tectivirus Bam35 proteins determined using multivector yeast two-hybrid screening, and these PPIs were further supported by the results of coimmunoprecipitation assays and protein structural models. This approach allowed us to propose new functions for known proteins and hypothesize about the biological role of the localization of some viral ORFan proteins within the viral particle that will be helpful for understanding the biology of tectiviruses infecting Gram-positive bacteria. PMID:28747494
Magnetic capture of polydopamine-encapsulated Hela cells for the analysis of cell surface proteins.
Liu, Yiying; Yan, Guoquan; Gao, Mingxia; Zhang, Xiangmin
2018-02-10
A novel method to characterize cell surface proteins and complexes has been developed. Polydopamine (PDA)-encapsulated Hela cells were prepared for plasma membrane proteome research. Owing to the PDA protection, the encapsulated cells could be maintained for more than two weeks. Amino-group-functionalized magnetic nanoparticles were also used for cell capture by reaction with the PDA coatings. Plasma membrane fragments were isolated and enriched with the assistance of an external magnetic field after disruption of the coated cells by ultrasonic treatment. Plasma membrane proteins (PMPs) and complexes were well preserved on the fragments and identified by a shotgun proteomic analytical strategy. 385 PMPs and 1411 non-PMPs were identified using the method. 85.2% of these PMPs were lipid-raft associated proteins. Ingenuity Pathway Analysis was employed for bio-information extraction from the identified proteins. It was found that 653 non-PMPs had interactions with 140 PMPs. Among them, epidermal growth factor receptor and its complexes, and a series of important pathways including the STAT3 pathway, were observed. All these results demonstrated that the new approach is of great importance for research on the physiological function and mechanism of plasma membrane proteins. This work developed a novel strategy for the proteomic analysis of cell surface proteins. According to the results, 73.3% of total identified proteins were lipid-raft associated proteins, which implies that the proposed method is of great potential in the identification of lipid-raft associated proteins. In addition, a series of protein-protein interactions and pathways related to Hela cells were pointed out. All these results demonstrated that our proposed approach is of great importance and could well be applied to the physiological function and mechanism research of plasma membrane proteins. Copyright © 2017 Elsevier B.V. All rights reserved.
Molecular analysis of Hsp70 mechanisms in plants and their function in response to stress.
Usman, Magaji G; Rafii, Mohd Y; Martini, Mohammad Y; Yusuff, Oladosu A; Ismail, Mohd R; Miah, Gous
2017-04-01
Studying strategies for improving abiotic stress tolerance is imperative, and research in this field will increase our understanding of response mechanisms to abiotic stresses such as heat. Hsp70 is an essential regulator that tends to maintain internal cell stability, for example through the proper folding of proteins and the breakdown of unfolded proteins. Hsp70 binds protein substrates to assist in their movement and regulation and to prevent aggregation under physical and/or chemical stress. This review reports the molecular mechanism of heat shock protein 70 kDa (Hsp70) action and its structural and functional analysis, research progress on the interaction of Hsp70 with other proteins and their interaction mechanisms, and the involvement of Hsp70 in abiotic stress responses as an adaptive defense mechanism.
Label-free Quantitative Protein Profiling of vastus lateralis Muscle During Human Aging
Théron, Laëtitia; Gueugneau, Marine; Coudy, Cécile; Viala, Didier; Bijlsma, Astrid; Butler-Browne, Gillian; Maier, Andrea; Béchet, Daniel; Chambon, Christophe
2014-01-01
Sarcopenia corresponds to the loss of muscle mass occurring during aging, and is associated with a loss of muscle functionality. Proteomics links the functional changes in muscle with protein expression patterns. To better understand the mechanisms involved in muscle aging, we performed a proteomic analysis of Vastus lateralis muscle in mature and older women. For this, a shotgun proteomic method was applied to identify soluble proteins in muscle, using a combination of high performance liquid chromatography and mass spectrometry. Label-free protein profiling was then conducted to quantify proteins and compare profiles from mature and older women. This analysis showed that 35 of the 366 identified proteins were linked to aging in muscle. Most of the proteins were under-represented in older compared with mature women. We built a functional interaction network linking the proteins differentially expressed between mature and older women. The results revealed that the main differences between mature and older women were defined by proteins involved in energy metabolism and proteins from the myofilament and cytoskeleton. This is the first time that label-free quantitative proteomics has been applied to the study of aging mechanisms in human skeletal muscle. This approach highlights new elements for elucidating the alterations observed during aging and may lead to novel sarcopenia biomarkers. PMID:24217021
Nakano, Shogo; Asano, Yasuhisa
2015-02-03
Development of software and methods for design of complete sequences of functional proteins could contribute to studies of protein engineering and protein evolution. To this end, we developed the INTMSAlign software, and used it to design functional proteins and evaluate their usefulness. The software could assign both consensus and correlation residues of target proteins. We generated three protein sequences with S-selective hydroxynitrile lyase (S-HNL) activity, which we call designed S-HNLs; these proteins folded as efficiently as the native S-HNL. Sequence and biochemical analysis of the designed S-HNLs suggested that accumulation of neutral mutations occurs during the process of S-HNLs evolution from a low-activity form to a high-activity (native) form. Taken together, our results demonstrate that our software and the associated methods could be applied not only to design of complete sequences, but also to predictions of protein evolution, especially within families such as esterases and S-HNLs.
Murine colon proteome and characterization of the protein pathways
2012-01-01
Background: Most current proteomic research on the colon focuses on proteome alterations due to pathological disorders (e.g., colorectal cancer) rather than on the normal healthy state. As a result, information regarding the normal whole-tissue colon proteome is lacking. Results: We report here a detailed murine (mouse) whole-tissue colon protein reference dataset composed of 1237 confidently identified proteins (FDR < 2), with comprehensive insight into peptide properties, cellular and subcellular localization, functional network and GO annotation analysis, and relative abundances. The presented dataset covers a wide range of pI and Mw, from 3–12 and 4–600 kDa, respectively. GRAVY index scoring predicted 19.5% membranous and 80.5% globular proteins. GO hierarchies and functional network analysis illustrated protein functions together with the relevance and implication of several candidates in malignancy, such as mitogen-activated protein kinase (Mapk8, 9) in colorectal cancer, fibroblast growth factor receptor (Fgfr2) and glutathione S-transferase (Gstp1) in prostate cancer, and cell division control protein (Cdc42) and Ras-related protein (Rac1, 2) in pancreatic cancer. Protein abundances calculated with three different algorithms (NSAF, PAF and emPAI) provide a relative quantification under normal conditions as guidance. Conclusions: This high-confidence colon proteome catalogue will not only serve as a useful reference for further experiments characterizing differentially expressed proteins induced by disease conditions, but will also aid in better understanding the ontology and functional absorptive mechanism of the colon. PMID:22929016
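The abstract names NSAF and emPAI as abundance measures without defining them. As a rough illustration only, the sketch below uses the commonly cited definitions (NSAF as spectral count divided by protein length, normalized over all proteins; emPAI as 10 raised to the ratio of observed to observable peptides, minus 1). The protein names and counts are hypothetical and are not taken from the dataset; PAF is omitted because its exact form is not given here.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor for each protein.

    Assumed definition: SAF = spectral count / protein length,
    NSAF = SAF / sum of SAF over all proteins in the sample.
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def empai(n_observed, n_observable):
    """Exponentially modified protein abundance index.

    Assumed definition: emPAI = 10 ** (observed / observable peptides) - 1.
    """
    return 10 ** (n_observed / n_observable) - 1


# Hypothetical example: two proteins with the same count-to-length ratio
counts = {"protA": 10, "protB": 30}
lengths = {"protA": 100, "protB": 300}
abundances = nsaf(counts, lengths)
```

Because NSAF values sum to 1 within a sample, they are suited to relative comparison between proteins rather than absolute quantification.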
Fast and anisotropic flexibility-rigidity index for protein flexibility and fluctuation analysis
NASA Astrophysics Data System (ADS)
Opron, Kristopher; Xia, Kelin; Wei, Guo-Wei
2014-06-01
Protein structural fluctuation, typically measured by Debye-Waller factors, or B-factors, is a manifestation of protein flexibility, which strongly correlates to protein function. The flexibility-rigidity index (FRI) is a newly proposed method for the construction of atomic rigidity functions required in the theory of continuum elasticity with atomic rigidity, which is a new multiscale formalism for describing excessively large biomolecular systems. The FRI method analyzes protein rigidity and flexibility and is capable of predicting protein B-factors without resorting to matrix diagonalization. A fundamental assumption used in the FRI is that protein structures are uniquely determined by various internal and external interactions, while the protein functions, such as stability and flexibility, are solely determined by the structure. As such, one can predict protein flexibility without resorting to the protein interaction Hamiltonian. Consequently, bypassing the matrix diagonalization, the original FRI has a computational complexity of O(N^2). This work introduces a fast FRI (fFRI) algorithm for the flexibility analysis of large macromolecules. The proposed fFRI further reduces the computational complexity to O(N). Additionally, we propose anisotropic FRI (aFRI) algorithms for the analysis of protein collective dynamics. The aFRI algorithms permit adaptive Hessian matrices, from a completely global 3N × 3N matrix to completely local 3 × 3 matrices. These 3 × 3 matrices, despite being calculated locally, also contain non-local correlation information. Eigenvectors obtained from the proposed aFRI algorithms are able to demonstrate collective motions. Moreover, we investigate the performance of FRI by employing four families of radial basis correlation functions. Both parameter optimized and parameter-free FRI methods are explored. 
Furthermore, we compare the accuracy and efficiency of FRI with some established approaches to flexibility analysis, namely, normal mode analysis and Gaussian network model (GNM). The accuracy of the FRI method is tested using four sets of proteins, three sets of relatively small-, medium-, and large-sized structures and an extended set of 365 proteins. A fifth set of proteins is used to compare the efficiency of the FRI, fFRI, aFRI, and GNM methods. Intensive validation and comparison indicate that the FRI, particularly the fFRI, is orders of magnitude more efficient and about 10% more accurate overall than some of the most popular methods in the field. The proposed fFRI is able to predict B-factors for α-carbons of the HIV virus capsid (313 236 residues) in less than 30 seconds on a single processor using only one core. Finally, we demonstrate the application of FRI and aFRI to protein domain analysis.
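The abstract does not give the FRI formulas, so the following is only a minimal sketch under stated assumptions: the rigidity index of atom i is taken to be a weighted sum of a radial basis correlation function of the distances to all other atoms, and the flexibility index is its reciprocal. The exponential kernel and the values of `eta` and `kappa` are illustrative choices, not the authors' optimized parameters.

```python
import math


def flexibility_index(coords, eta=3.0, kappa=1.0, weights=None):
    """Return one flexibility index per atom (requires at least two atoms).

    Assumed form: mu_i = sum_{j != i} w_j * exp(-(r_ij / eta) ** kappa)
    and f_i = 1 / mu_i. eta (length scale) and kappa are hypothetical values.
    """
    n = len(coords)
    if weights is None:
        weights = [1.0] * n
    flexibility = []
    for i in range(n):
        rigidity = 0.0
        for j in range(n):
            if i == j:
                continue
            # Euclidean distance between atoms i and j
            r = math.sqrt(sum((a - b) ** 2 for a, b in zip(coords[i], coords[j])))
            rigidity += weights[j] * math.exp(-((r / eta) ** kappa))
        flexibility.append(1.0 / rigidity)
    return flexibility


# Hypothetical example: three atoms on a line, 3.8 apart
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
flex = flexibility_index(chain)
```

The double loop over atom pairs has the O(N^2) cost that the abstract attributes to the original FRI, and no matrix is diagonalized. In this toy chain the middle atom has more close neighbours, so it receives a lower flexibility index than the two ends.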
Zhang, Peng; Shen, Yu; Guo, Jin-Song; Li, Chun; Wang, Han; Chen, You-Peng; Yan, Peng; Yang, Ji-Xiang; Fang, Fang
2015-07-10
In this work, proteins in extracellular polymeric substances extracted from anaerobic, anoxic and aerobic sludges of a wastewater treatment plant (WWTP) were analyzed to probe their origins and functions. Extracellular proteins in WWTP sludges were identified using shotgun proteomics, and 130, 108 and 114 proteins in anaerobic, anoxic and aerobic samples were classified, respectively. Most proteins originated from the cell and cell part categories, and their principal molecular functions were catalytic activity and binding activity. The results showed that the main roles of extracellular proteins in activated sludges were binding of multivalent cations and organic molecules, as well as catalysis and degradation. Proteins with catalytic activity were more widespread in anaerobic sludge than in anoxic and aerobic sludges. The structural difference between anaerobic and aerobic sludges could be associated with their catalytic-activity proteins. The results also suggest a relation between the macro characteristics of activated sludges and the micro functions of extracellular proteins in the biological wastewater treatment process.
webPIPSA: a web server for the comparison of protein interaction properties
Richter, Stefan; Wenzel, Anne; Stein, Matthias; Gabdoulline, Razif R.; Wade, Rebecca C.
2008-01-01
Protein molecular interaction fields are key determinants of protein functionality. PIPSA (Protein Interaction Property Similarity Analysis) is a procedure to compare and analyze protein molecular interaction fields, such as the electrostatic potential. PIPSA may assist in protein functional assignment, classification of proteins, the comparison of binding properties and the estimation of enzyme kinetic parameters. webPIPSA is a web server that enables the use of PIPSA to compare and analyze protein electrostatic potentials. While PIPSA can be run with downloadable software (see http://projects.eml.org/mcm/software/pipsa), webPIPSA extends and simplifies a PIPSA run. This allows non-expert users to perform PIPSA for their protein datasets. With input protein coordinates, the superposition of protein structures, as well as the computation and analysis of electrostatic potentials, is automated. The results are provided as electrostatic similarity matrices from an all-pairwise comparison of the proteins which can be subjected to clustering and visualized as epograms (tree-like diagrams showing electrostatic potential differences) or heat maps. webPIPSA is freely available at: http://pipsa.eml.org. PMID:18420653
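The abstract describes an all-pairwise comparison that yields electrostatic similarity matrices but does not state the similarity measure. The sketch below assumes a Hodgkin-type similarity index, one common way to compare two potentials sampled on the same grid points; the function names and the toy potentials are hypothetical and do not represent the webPIPSA interface.

```python
def hodgkin_index(phi_a, phi_b):
    """Hodgkin-type similarity of two potentials sampled on the same grid.

    SI = 2 * sum(a * b) / (sum(a * a) + sum(b * b)); ranges from -1 to 1.
    Assumes at least one of the two potentials is non-zero.
    """
    if len(phi_a) != len(phi_b):
        raise ValueError("potentials must be sampled on the same grid points")
    cross = sum(a * b for a, b in zip(phi_a, phi_b))
    norm = sum(a * a for a in phi_a) + sum(b * b for b in phi_b)
    return 2.0 * cross / norm


def similarity_matrix(potentials):
    """All-pairwise similarity indices, keyed by (name, name) pairs."""
    names = list(potentials)
    return {
        (p, q): hodgkin_index(potentials[p], potentials[q])
        for p in names
        for q in names
    }


# Hypothetical potentials for three superimposed proteins
potentials = {
    "protA": [1.0, 2.0, 3.0],
    "protB": [1.0, 2.0, 3.0],
    "protC": [-1.0, -2.0, -3.0],
}
matrix = similarity_matrix(potentials)
```

A matrix of this kind is the sort of input that could then be clustered or drawn as a heat map, as the abstract describes.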
Hayashi, Shigehiko; Uchida, Yoshihiro; Hasegawa, Taisuke; Higashi, Masahiro; Kosugi, Takahiro; Kamiya, Motoshi
2017-05-05
Many remarkable molecular functions of proteins use their characteristic global and slow conformational dynamics through coupling of local chemical states in reaction centers with global conformational changes of proteins. To theoretically examine the functional processes of proteins in atomic detail, a methodology of quantum mechanical/molecular mechanical (QM/MM) free-energy geometry optimization is introduced. In the methodology, a geometry optimization of a local reaction center is performed with a quantum mechanical calculation on a free-energy surface constructed with conformational samples of the surrounding protein environment obtained by a molecular dynamics simulation with a molecular mechanics force field. Geometry optimizations on extensive free-energy surfaces by a QM/MM reweighting free-energy self-consistent field method designed to be variationally consistent and computationally efficient have enabled examinations of the multiscale molecular coupling of local chemical states with global protein conformational changes in functional processes and analysis and design of protein mutants with novel functional properties.
Organization and Regulation of Soybean SUMOylation System under Abiotic Stress Conditions
Li, Yanjun; Wang, Guixin; Xu, Zeqian; Li, Jing; Sun, Mengwei; Guo, Jingsong; Ji, Wei
2017-01-01
Covalent attachment of the small ubiquitin-related modifier, SUMO, to substrate proteins can alter the target proteins' function, location, and protein-protein interactions, and plays a significant role in plants under stress conditions. Despite this importance, information about SUMOylation in the major legume crop, soybean, remains obscure. In this study, we performed a bioinformatics analysis of the entire soybean genome and identified 40 genes belonging to six families involved in the cascade of enzymatic reactions of the soybean SUMOylation system. The cis-acting element analysis revealed that promoters of SUMO pathway genes contained different combinations of stress- and development-related cis-regulatory elements. RNA-seq data analysis showed that SUMO pathway components exhibited versatile tissue-specific expression patterns, indicating coordinated functioning during plant growth and development. qRT-PCR analysis of 13 SUMO pathway members indicated that the majority of the SUMO pathway members were transcriptionally up-regulated by NaCl, heat and ABA stimuli during the 24 h period of treatment. Furthermore, SUMOylation dynamics in soybean roots under abiotic stress treatment were analyzed by western blot and were characterized by regulation of SUMOylated proteins. Collectively, this study defined the organization of the soybean SUMOylation system and implied an essential function for SUMOylation in soybean abiotic stress responses. PMID:28878795
Cell-free protein synthesis: applications in proteomics and biotechnology.
He, Mingyue
2008-01-01
Protein production is one of the key steps in biotechnology and functional proteomics. Expression of proteins in heterologous hosts (such as in E. coli) is generally lengthy and costly. Cell-free protein synthesis is thus emerging as an attractive alternative. In addition to the simplicity and speed for protein production, cell-free expression allows generation of functional proteins that are difficult to produce by in vivo systems. Recent exploitation of cell-free systems enables novel development of technologies for rapid discovery of proteins with desirable properties from very large libraries. This article reviews the recent development in cell-free systems and their application in the large scale protein analysis.
Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M; Wilkins, Angela D; Venner, Eric; Morgan, Daniel H; Lichtarge, Olivier
2016-01-04
The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Production of coconut protein powder from coconut wet processing waste and its characterization.
Naik, Aduja; Raghavendra, S N; Raghavarao, K S M S
2012-07-01
Virgin coconut oil (VCO) has been gaining popularity in recent times. During its production, byproducts such as coconut skim milk and insoluble protein are obtained, which at present are underutilized or discarded into the environment. This study deals with the utilization of these byproducts to obtain a value-added product, namely, coconut protein powder. When coconut milk was subjected to centrifugation, three phases were obtained: a fat phase (coconut cream), an aqueous phase (coconut skim milk), and a solid phase (insoluble protein). The coconut skim milk and insoluble protein were mixed and homogenized before spray drying to obtain a dehydrated protein powder. The proximate analysis of the powder showed high protein content (33% w/w) and low fat content (3% w/w). Protein solubility was studied as a function of pH and ionic content of the solvent. Functional properties such as water hydration capacity, fat absorption capacity, emulsifying properties, wettability, and dispersibility of coconut protein powder were evaluated along with morphological characterization, polyphenol content, and color analysis. Coconut protein powder was shown to have good emulsifying properties and hence has potential applications in emulsified foods. Sensory analysis showed high overall quality of the product, indicating that coconut protein powder could be a useful food ingredient.
Zhao, Wei; Niu, Ke; Zhao, Jian; Jin, Yi-ming; Sui, Ting-ting; Wang, Wen
2013-09-01
Human astrovirus (HAstV) is one of the leading causes of acute viral diarrhea in infants. HAstV-induced epithelial cell apoptosis plays an important role in the pathogenesis of HAstV infection. Our previous study indicated that nsP1a/4, the C-terminal protein of the HAstV non-structural protein nsP1a, was the major apoptosis functional protein and probably contained the main apoptosis domains. In order to screen for astrovirus-encoded apoptotic proteins, nsP1a/4 and six truncated proteins, which possessed different functional domains of the nsP1a/4 protein, were cloned into the green fluorescent protein (GFP) vector pEGFP-N3. At 24-72 h after transfection, fusion protein expression in BHK21 cells was analyzed by fluorescence microscopy and Western blot. The results indicated that all seven fusion proteins were observed in BHK21 cells 24 h after transfection. Western blot analysis showed that the level of fusion protein expressed in transfected BHK21 cells was significantly higher at 72 h than at 48 h. The successful expression of deletion mutants of the nsP1a/4 protein is an important foundation for gaining further insights into the function of the apoptosis domains of nsP1a/4, and it also provides a research platform for further confirming the molecular pathogenic mechanism of human astrovirus.
Ayyar, Vivaswath S; Almon, Richard R; DuBois, Debra C; Sukumaran, Siddharth; Qu, Jun; Jusko, William J
2017-05-08
Corticosteroids (CS) are anti-inflammatory agents that cause extensive pharmacogenomic and proteomic changes in multiple tissues. An understanding of the proteome-wide effects of CS in liver and its relationships to altered hepatic and systemic physiology remains incomplete. Here, we report the application of a functional pharmacoproteomic approach to gain integrated insight into the complex nature of CS responses in liver in vivo. An in-depth functional analysis was performed using rich pharmacodynamic (temporal-based) proteomic data measured over 66h in rat liver following a single dose of methylprednisolone (MPL). Data mining identified 451 differentially regulated proteins. These proteins were analyzed on the basis of temporal regulation, cellular localization, and literature-mined functional information. Of the 451 proteins, 378 were clustered into six functional groups based on major clinically-relevant effects of CS in liver. MPL-responsive proteins were highly localized in the mitochondria (20%) and cytosol (24%). Interestingly, several proteins were related to hepatic stress and signaling processes, which appear to be involved in secondary signaling cascades and in protecting the liver from CS-induced oxidative damage. Consistent with known adverse metabolic effects of CS, several rate-controlling enzymes involved in amino acid metabolism, gluconeogenesis, and fatty-acid metabolism were altered by MPL. In addition, proteins involved in the metabolism of endogenous compounds, xenobiotics, and therapeutic drugs including cytochrome P450 and Phase-II enzymes were differentially regulated. Proteins related to the inflammatory acute-phase response were up-regulated in response to MPL. Functionally-similar proteins showed large diversity in their temporal profiles, indicating complex mechanisms of regulation by CS. Clinical use of corticosteroid (CS) therapy is frequent and chronic. 
However, current knowledge on the proteome-level effects of CS in liver and other tissues is sparse. While transcriptomic regulation following methylprednisolone (MPL) dosing has been temporally examined in rat liver, proteomic assessments are needed to better characterize the tissue-specific functional aspects of MPL actions. This study describes a functional pharmacoproteomic analysis of dynamic changes in MPL-regulated proteins in liver and provides biological insight into how steroid-induced perturbations on a molecular level may relate to both adverse and therapeutic responses presented clinically. Copyright © 2017 Elsevier B.V. All rights reserved.
Jaiswal, Mamta; Dvorsky, Radovan; Ahmadian, Mohammad Reza
2013-02-08
The diffuse B-cell lymphoma (Dbl) family of the guanine nucleotide exchange factors is a direct activator of the Rho family proteins. The Rho family proteins are involved in almost every cellular process that ranges from fundamental (e.g. the establishment of cell polarity) to highly specialized processes (e.g. the contraction of vascular smooth muscle cells). Abnormal activation of the Rho proteins is known to play a crucial role in cancer, infectious and cognitive disorders, and cardiovascular diseases. However, the existence of 74 Dbl proteins and 25 Rho-related proteins in humans, which are largely uncharacterized, has led to increasing complexity in identifying specific upstream pathways. Thus, we comprehensively investigated sequence-structure-function-property relationships of 21 representatives of the Dbl protein family regarding their specificities and activities toward 12 Rho family proteins. The meta-analysis approach provides an unprecedented opportunity to broadly profile functional properties of Dbl family proteins, including catalytic efficiency, substrate selectivity, and signaling specificity. Our analysis has provided novel insights into the following: (i) understanding of the relative differences of various Rho protein members in nucleotide exchange; (ii) comparing and defining individual and overall guanine nucleotide exchange factor activities of a large representative set of the Dbl proteins toward 12 Rho proteins; (iii) grouping the Dbl family into functionally distinct categories based on both their catalytic efficiencies and their sequence-structural relationships; (iv) identifying conserved amino acids as fingerprints of the Dbl and Rho protein interaction; and (v) defining amino acid sequences conserved within, but not between, Dbl subfamilies. Therefore, the characteristics of such specificity-determining residues identified the regions or clusters conserved within the Dbl subfamilies.
Rebelling for a Reason: Protein Structural “Outliers”
Arumugam, Gandhimathi; Nair, Anu G.; Hariharaputran, Sridhar; Ramanathan, Sowdhamini
2013-01-01
Analysis of structural variation in domain superfamilies can reveal constraints in protein evolution which aids protein structure prediction and classification. Structure-based sequence alignment of distantly related proteins, organized in PASS2 database, provides clues about structurally conserved regions among different functional families. Some superfamily members show large structural differences which are functionally relevant. This paper analyses the impact of structural divergence on function for multi-member superfamilies, selected from the PASS2 superfamily alignment database. Functional annotations within superfamilies, with structural outliers or ‘rebels’, are discussed in the context of structural variations. Overall, these data reinforce the idea that functional similarities cannot be extrapolated from mere structural conservation. The implication for fold-function prediction is that the functional annotations can only be inherited with very careful consideration, especially at low sequence identities. PMID:24073209
Wang, Guo-Bao; Zheng, Qin; Shen, Yun-Wang; Wu, Xiao-Feng
2016-02-01
The insect brain plays crucial roles in the regulation of growth and development and in all types of behavior. We used sodium dodecyl sulfate polyacrylamide gel electrophoresis and high-performance liquid chromatography-electrospray ionization tandem mass spectrometry (ESI-MS/MS) in a shotgun approach to identify the proteome of the silkworm brain, to investigate its protein composition and to understand the biological functions of its proteins. A total of 2210 proteins with molecular weights in the range of 5.64-1539.82 kDa and isoelectric points in the range of 3.78-12.55 were identified. These proteins were annotated according to Gene Ontology Annotation into the categories of molecular function, biological process and cellular component. We characterized two categories of proteins: one includes behavior-related proteins involved in the regulation of behaviors, such as locomotion, reproduction and learning; the other consists of proteins related to the development or function of the nervous system. The identified proteins were classified into 283 different pathways according to KEGG analysis, including the PI3K-Akt signaling pathway, which plays a crucial role in mediating survival signals in a wide range of neuronal cell types. This extensive protein profile provides a basis for further understanding of the physiological functions in the silkworm brain. © 2014 Institute of Zoology, Chinese Academy of Sciences.
Popescu, Sorina C.; Popescu, George V.; Bachan, Shawn; Zhang, Zimei; Seay, Montrell; Gerstein, Mark; Snyder, Michael; Dinesh-Kumar, S. P.
2007-01-01
Calmodulins (CaMs) are the most ubiquitous calcium sensors in eukaryotes. A number of CaM-binding proteins have been identified through classical methods, and many proteins have been predicted to bind CaMs based on their structural homology with known targets. However, multicellular organisms typically contain many CaM-like (CML) proteins, and a global identification of their targets and specificity of interaction is lacking. In an effort to develop a platform for large-scale analysis of proteins in plants we have developed a protein microarray and used it to study the global analysis of CaM/CML interactions. An Arabidopsis thaliana expression collection containing 1,133 ORFs was generated and used to produce proteins with an optimized medium-throughput plant-based expression system. Protein microarrays were prepared and screened with several CaMs/CMLs. A large number of previously known and novel CaM/CML targets were identified, including transcription factors, receptor and intracellular protein kinases, F-box proteins, RNA-binding proteins, and proteins of unknown function. Multiple CaM/CML proteins bound many binding partners, but the majority of targets were specific to one or a few CaMs/CMLs indicating that different CaM family members function through different targets. Based on our analyses, the emergent CaM/CML interactome is more extensive than previously predicted. Our results suggest that calcium functions through distinct CaM/CML proteins to regulate a wide range of targets and cellular activities. PMID:17360592
Naqvi, Ahmad Abu Turab; Shahbaaz, Mohd; Ahmad, Faizan; Hassan, Md. Imtaiyaz
2015-01-01
Syphilis is a globally occurring venereal disease, and its infection is propagated through sexual contact. The causative agent of syphilis, Treponema pallidum ssp. pallidum, a Gram-negative spirochaete, is an obligate human parasite. The genome of the T. pallidum ssp. pallidum SS14 strain (RefSeq NC_010741.1) encodes 1,027 proteins, of which 444 are known as hypothetical proteins (HPs), i.e., proteins of unknown function. Here, we performed functional annotation of the HPs of T. pallidum ssp. pallidum using various databases, domain architecture predictors, protein function annotators and clustering tools. We analyzed the sequences of the 444 HPs and predicted the function of 207 HPs with a high level of confidence. However, the functions of the remaining 237 HPs are predicted with less accuracy. In the annotated group of HPs we found various enzymes, transporters and binding proteins that may facilitate the survival of the pathogen and are therefore possible molecular targets. Our comprehensive analysis helps to clarify the mechanism of pathogenesis and may point to novel potential therapeutic interventions. PMID:25894582
Proteomic Characterization of a Mouse Model of Familial Danish Dementia
Vitale, Monica; Renzone, Giovanni; Matsuda, Shuji; Scaloni, Andrea; D'Adamio, Luciano; Zambrano, Nicola
2012-01-01
A dominant mutation in the ITM2B/BRI2 gene causes familial Danish dementia (FDD) in humans. To model FDD in animal systems, a knock-in approach was recently implemented in mice expressing a wild-type and mutant allele, which bears the FDD-associated mutation. Since these FDDKI mice show behavioural alterations and impaired synaptic function, we characterized their synaptosomal proteome via two-dimensional differential in-gel electrophoresis. After identification by nanoliquid chromatography coupled to electrospray-linear ion trap tandem mass spectrometry, the differentially expressed proteins were classified according to their gene ontology descriptions and their predicted functional interactions. The Dlg4/Psd95 scaffold protein and additional signalling proteins, including protein phosphatases, were revealed by STRING analysis as potential players in the altered synaptic function of FDDKI mice. Immunoblotting analysis finally demonstrated the actual downregulation of the synaptosomal scaffold protein Dlg4/Psd95 and of the dual-specificity phosphatase Dusp3 in the synaptosomes of FDDKI mice. PMID:22619496
High throughput protein production screening
Beernink, Peter T [Walnut Creek, CA; Coleman, Matthew A [Oakland, CA; Segelke, Brent W [San Ramon, CA
2009-09-08
Methods, compositions, and kits for the cell-free production and analysis of proteins are provided. The invention allows for the production of proteins from prokaryotic sequences or eukaryotic sequences, including human cDNAs, using PCR and IVT methods and detecting the proteins through fluorescence or immunoblot techniques. This invention can be used to identify optimized PCR and IVT conditions, codon usages and mutations. The methods are readily automated and can be used for high throughput analysis of protein expression levels, interactions, and functional states.
Revealing time bunching effect in single-molecule enzyme conformational dynamics.
Lu, H Peter
2011-04-21
In this perspective, we focus our discussion on how single-molecule spectroscopy and statistical analysis are able to reveal hidden properties of enzymes, taking the study of T4 lysozyme as an example. Protein conformational fluctuations and dynamics play a crucial role in biomolecular functions, such as in enzymatic reactions. Single-molecule spectroscopy is a powerful approach to analyze protein conformational dynamics under physiological conditions, providing dynamic perspectives on a molecular-level understanding of protein structure-function mechanisms. Using single-molecule fluorescence spectroscopy, we have probed T4 lysozyme conformational motions under the hydrolysis reaction of a polysaccharide of E. coli B cell walls by monitoring the fluorescence resonant energy transfer (FRET) between a donor-acceptor probe pair tethered to T4 lysozyme domains involving open-close hinge-bending motions. Based on the single-molecule spectroscopic results, molecular dynamics simulation, a random walk model analysis, and a novel 2D statistical correlation analysis, we have revealed a time bunching effect in protein conformational motion dynamics that is critical to enzymatic functions. The bunching effect implies that conformational motion times tend to bunch in a finite and narrow time window. We show that convoluted multiple Poisson rate processes give rise to the bunching effect in the enzymatic reaction dynamics. Evidently, the bunching effect is likely common in protein conformational dynamics involved in conformation-gated protein functions. In this perspective, we will also discuss a new approach of 2D regional correlation analysis capable of analyzing fluctuation dynamics of complex multiple correlated and anti-correlated fluctuations under a non-correlated noise background. Using this new method, we are able to map out any defined segments along the fluctuation trajectories and determine whether they are correlated, anti-correlated, or non-correlated; after which, a cross correlation analysis can be applied for each specific segment to obtain a detailed fluctuation dynamics analysis.
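The statement that convolved Poisson rate processes produce bunching can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the analysis used in the perspective: it assumes a conformational motion made of n sequential steps, each with an exponentially distributed waiting time at the same rate (the step counts, rate and function names are illustrative). The total time is then a sum of exponentials, and its relative spread shrinks roughly as 1/sqrt(n), so the times cluster in a narrow window.

```python
import random
import statistics


def total_times(n_steps, rate, n_samples, seed=0):
    """Sample total durations of n_steps sequential exponential (Poisson) steps."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(rate) for _ in range(n_steps))
        for _ in range(n_samples)
    ]


def coefficient_of_variation(samples):
    """Relative spread: standard deviation divided by the mean."""
    return statistics.pstdev(samples) / statistics.fmean(samples)


if __name__ == "__main__":
    # Step counts are illustrative assumptions; more steps give a narrower window.
    for n in (1, 4, 16):
        cv = coefficient_of_variation(total_times(n, rate=1.0, n_samples=20000))
        print(f"steps={n:2d}  relative spread={cv:.2f}")
```

A single Poisson step gives a broad exponential distribution of times, whereas a chain of many steps gives a peaked distribution, which is one simple way to read "bunching" in a finite time window.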
Zeeshan, Mohammad; Kaur, Inderjeet; Joy, Joseph; Saini, Ekta; Paul, Gourab; Kaushik, Abhinav; Dabral, Surbhi; Mohmmed, Asif; Gupta, Dinesh; Malhotra, Pawan
2017-02-03
Plasmodium falciparum undergoes a tightly regulated developmental process in human erythrocytes, and recent studies suggest an important regulatory role of post-translational modifications (PTMs). As compared with Plasmodium phosphoproteome, little is known about other PTMs in the parasite. In the present study, we performed a global analysis of asexual blood stages of Plasmodium falciparum to identify arginine-methylated proteins. Using two different methyl arginine-specific antibodies, we immunoprecipitated the arginine-methylated proteins from the stage-specific parasite lysates and identified 843 putative arginine-methylated proteins by LC-MS/MS. Motif analysis of the protein sequences unveiled that the methylation sites are associated with the previously known methylation motifs such as GRx/RGx, RxG, GxxR, or WxxxR. We identified Plasmodium homologues of known arginine-methylated proteins in trypanosomes, yeast, and human. Hydrophilic interaction liquid chromatography (HILIC) was performed on the immunoprecipitates from the trophozoite stage to enrich arginine-methylated peptides. Mass spectrometry analysis of immunoprecipitated and HILIC fractions identified 55 arginine-methylated peptides having 62 methylated arginine sites. Functional classification revealed that the arginine-methylated proteins are involved in RNA metabolism, protein synthesis, intracellular protein trafficking, proteolysis, protein folding, chromatin organization, hemoglobin metabolic process, and several other functions. Summarily, the findings suggest that protein methylation of arginine residues is a widespread phenomenon in Plasmodium, and the PTM may play an important regulatory role in a diverse set of biological pathways, including host-pathogen interactions.
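The motif analysis mentioned above can be sketched as a simple pattern scan. This is a minimal illustration, assuming that "x" stands for any residue and that overlapping matches should all be reported; the function name and the example sequence are hypothetical and are not taken from the study.

```python
import re

# Motif name -> (regular expression, offset of the arginine inside the match).
# "." plays the role of "x" (any residue).
MOTIFS = {
    "GRx": (r"GR.", 1),
    "RGx": (r"RG.", 0),
    "RxG": (r"R.G", 0),
    "GxxR": (r"G..R", 3),
    "WxxxR": (r"W...R", 4),
}


def find_arginine_motifs(sequence):
    """Return (motif, 1-based arginine position) pairs, overlaps included."""
    hits = []
    for name, (pattern, r_offset) in MOTIFS.items():
        # The lookahead makes finditer report overlapping occurrences too.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, match.start() + r_offset + 1))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the scan.
    for motif, position in find_arginine_motifs("MGRGGWAAARK"):
        print(motif, position)
```

A scan like this only flags candidate sites by sequence context; whether a given arginine is actually methylated still has to come from the mass spectrometry evidence.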
Global profiling of lysine acetylation in human histoplasmosis pathogen Histoplasma capsulatum.
Xie, Longxiang; Fang, Wenjie; Deng, Wanyan; Yu, Zhaoxiao; Li, Juan; Chen, Min; Liao, Wanqing; Xie, Jianping; Pan, Weihua
2016-04-01
Histoplasma capsulatum is the causative agent of human histoplasmosis, which can cause respiratory and systemic mycosis in immune-compromised individuals. Lysine acetylation, a protein posttranslational modification, is widespread in both eukaryotes and prokaryotes. Although increasing evidence suggests that lysine acetylation may play critical roles in fungal physiology, very little is known about its extent and function in H. capsulatum. To comprehensively profile protein lysine acetylation in H. capsulatum, we performed a global acetylome analysis through peptide prefractionation, antibody enrichment, and LC-MS/MS analysis, identifying 775 acetylation sites on 456 acetylated proteins; functional analysis showed their involvement in different biological processes. We defined six types of acetylation site motifs, and the results imply that a lysine residue with tyrosine at the -1 and +1 positions, histidine at the +1 position, and phenylalanine (F) at the +1 and +2 positions is a preferred substrate of lysine acetyltransferase. Moreover, some virulence factor candidates, including calmodulin and DnaK, are acetylated. In conclusion, our data set may serve as an important resource for the elucidation of associations between functional protein lysine acetylation and virulence in H. capsulatum. Copyright © 2016 Elsevier Ltd. All rights reserved.
Lott, Kaylen; Li, Jun; Fisk, John C.; Wang, Hao; Aletta, John M.; Qu, Jun; Read, Laurie K.
2013-01-01
Arginine methylation is a common posttranslational modification with reported functions in transcription, RNA processing and translation, and DNA repair. Trypanosomes encode five protein arginine methyltransferases, suggesting that arginine methylation exerts widespread impacts on the biology of these organisms. Here, we performed a global proteomic analysis of T. brucei to identify arginine methylated proteins and their sites of modification. Using an approach entailing two-dimensional chromatographic separation, and alternating electron transfer dissociation and collision induced dissociation, we identified 1332 methylarginines in 676 proteins. The resulting data set represents the largest compilation of arginine methylated proteins in any organism to date. Functional classification revealed numerous arginine methylated proteins involved in flagellar function, RNA metabolism, DNA replication and repair, and intracellular protein trafficking. Thus, arginine methylation has the potential to impact aspects of T. brucei gene expression, cell biology, and pathogenesis. Interestingly, pathways with known methylated proteins in higher eukaryotes were identified in this study, but often different components of the pathway were methylated in trypanosomes. Methylarginines were often identified in glycine rich contexts, although exceptions to this rule were detected. Collectively, these data inform on a multitude of aspects of trypanosome biology and serve as a guide for the identification of homologous arginine methylated proteins in higher eukaryotes. PMID:23872088
Vijay, Sonam; Rawat, Manmeet; Sharma, Arun
2014-01-01
Salivary gland proteins of Anopheles mosquitoes offer attractive targets to understand interactions with sporozoites, blood feeding behavior, homeostasis, and immunological evaluation of malaria vectors and parasite interactions. To date limited studies have been carried out to elucidate salivary proteins of An. stephensi salivary glands. The aim of the present study was to provide detailed analytical attributives of functional salivary gland proteins of urban malaria vector An. stephensi. A proteomic approach combining one-dimensional electrophoresis (1DE), ion trap liquid chromatography mass spectrometry (LC/MS/MS), and computational bioinformatic analysis was adopted to provide the first direct insight into identification and functional characterization of known salivary proteins and novel salivary proteins of An. stephensi. Computational studies by online servers, namely, MASCOT and OMSSA algorithms, identified a total of 36 known salivary proteins and 123 novel proteins analysed by LC/MS/MS. This first report describes a baseline proteomic catalogue of 159 salivary proteins belonging to various categories of signal transduction, regulation of blood coagulation cascade, and various immune and energy pathways of An. stephensi sialotranscriptome by mass spectrometry. Our results may serve as basis to provide a putative functional role of proteins in concept of blood feeding, biting behavior, and other aspects of vector-parasite host interactions for parasite development in anopheline mosquitoes. PMID:25126571
Nucleotide sequence and phylogenetic analysis of Cucurbit yellow stunting disorder virus RNA 2.
Livieratos, Ioannis C; Coutts, Robert H A
2002-06-01
The complete nucleotide sequence of Cucurbit yellow stunting disorder virus (CYSDV) RNA 2, a whitefly (Bemisia tabaci)-transmitted closterovirus with a bi-partite genome, is reported. CYSDV RNA 2 is 7,281 nucleotides long and contains the closterovirus hallmark gene array with a similar arrangement to the prototype member of the genus Crinivirus, Lettuce infectious yellows virus (LIYV). CYSDV RNA 2 contains open reading frames (ORFs) potentially encoding in a 5' to 3' direction for proteins of 5 kDa (ORF 1; hydrophobic protein), 62 kDa (ORF 2; heat shock protein 70 homolog, HSP70h), 59 kDa (ORF 3; protein of unknown function), 9 kDa (ORF 4; protein of unknown function), 28.5 kDa (ORF 5; coat protein, CP), 53 kDa (ORF 6; coat protein minor, CPm), and 26.5 kDa (ORF 7; protein of unknown function). Pairwise comparisons of CYSDV RNA 2-encoded proteins (HSP70h, p59 and CPm) among the closteroviruses showed that CYSDV is closely related to LIYV. Phylogenetic analysis based on the amino acid sequence of the HSP70h, indicated that CYSDV clusters with other members of the genus Crinivirus, and it is related to Little cherry virus-1 (LChV-1), but is distinct from the aphid- or mealybug-transmitted closteroviruses.
Bhagavat, Raghu; Sankar, Santhosh; Srinivasan, Narayanaswamy; Chandra, Nagasuma
2018-03-06
Protein-ligand interactions form the basis of most cellular events. Identifying ligand binding pockets in proteins will greatly facilitate rationalizing and predicting protein function. Ligand binding sites are unknown for many proteins of known three-dimensional (3D) structure, creating a gap in our understanding of protein structure-function relationships. To bridge this gap, we detect pockets in proteins of known 3D structures, using computational techniques. This augmented pocketome (PocketDB) consists of 249,096 pockets, which is about seven times larger than what is currently known. We deduce possible ligand associations for about 46% of the newly identified pockets. The augmented pocketome, when subjected to clustering based on similarities among pockets, yielded 2,161 site types, which are associated with 1,037 ligand types, together providing fold-site-type-ligand-type associations. The PocketDB resource facilitates a structure-based function annotation, delineation of the structural basis of ligand recognition, and provides functional clues for domains of unknown functions, allosteric proteins, and druggable pockets. Copyright © 2018 Elsevier Ltd. All rights reserved.
Sperm proteins in teleostean and chondrostean (sturgeon) fishes.
Li, Ping; Hulak, Martin; Linhart, Otomar
2009-11-01
Sperm proteins in the seminal plasma and spermatozoa of teleostean and chondrostean have evolved adaptations due to the changes in the reproductive environment. Analysis of the composition and functions of these proteins provides new insights into sperm motility and fertilising abilities, thereby creating possibilities for improving artificial reproduction and germplasm resource conservation technologies (e.g. cryopreservation). Seminal plasma proteins are involved in the protection of spermatozoa during storage in the reproductive system, whereas all spermatozoa proteins contribute to the swimming and fertilising abilities of sperm. Compared to mammalian species, little data are available on fish sperm proteins and their functions. We review here the current state of the art in this field and focus on relevant subjects that require attention. Future research should concentrate on protein functions and their mode of action in fish species, especially on the role of spermatozoa surface proteins during fertilisation and on a description of sturgeon sperm proteins.
USDA-ARS's Scientific Manuscript database
Beef is a source of high quality protein for the human population, and beef tenderness has significant influence on beef palatability, consumer expectation and industry profitability. To further elucidate the factors affecting beef tenderness, functional proteomics and bioinformatics interactome ana...
Koromyslova, Anna D; Chugunov, Anton O; Efremov, Roman G
2014-04-28
Molecular surfaces are the key players in biomolecular recognition and interactions. Nowadays, it is trivial to visualize a molecular surface and surface-distributed properties in three-dimensional space. However, such a representation tends to be biased and ambiguous when a thorough analysis is required. We present a new method, protein surface topography (PST), to create and manipulate 2D spherical projection maps of entire protein surfaces. It permits visualization and thoughtful analysis of surface properties. PST helps to easily portray conformational transitions, analyze proteins' properties and their dynamic behavior, improve docking performance, and reveal common patterns and dissimilarities in molecular surfaces of related bioactive peptides. This paper describes basic usage of PST with examples of conformational transitions in small G-proteins, mapping of the caspase-1 intersubunit interface, and intrinsic "complementarity" in the conotoxin-acetylcholine binding protein complex. We suggest that PST is a beneficial approach for structure-function studies of bioactive peptides and small proteins.
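The idea of a 2D spherical projection of a surface can be sketched in a few lines. This is a generic illustration and not the PST implementation, whose projection details are not given in the abstract: it assumes surface points are supplied as (x, y, z) tuples and maps each one to longitude and latitude about the centroid of the points.

```python
import math


def spherical_projection(points):
    """Map 3D points to (longitude_deg, latitude_deg, radius) about their centroid."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    projected = []
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        lon = math.degrees(math.atan2(dy, dx))
        # Clamp the ratio to guard against rounding slightly outside [-1, 1].
        lat = math.degrees(math.asin(max(-1.0, min(1.0, dz / r)))) if r > 0 else 0.0
        projected.append((lon, lat, r))
    return projected


if __name__ == "__main__":
    # Six points on a unit sphere, used only as a toy "surface".
    toy_surface = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    for lon, lat, r in spherical_projection(toy_surface):
        print(f"lon={lon:7.1f}  lat={lat:6.1f}  r={r:.2f}")
```

Any surface property (for example a hypothetical per-point hydrophobicity value) could then be plotted against longitude and latitude as a flat map, and the radius itself can serve as a relief coordinate.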
Watching proteins function with picosecond X-ray crystallography and molecular dynamics simulations.
NASA Astrophysics Data System (ADS)
Anfinrud, Philip
2006-03-01
Time-resolved electron density maps of myoglobin, a ligand-binding heme protein, have been stitched together into movies that unveil with < 2-Å spatial resolution and 150-ps time-resolution the correlated protein motions that accompany and/or mediate ligand migration within the hydrophobic interior of a protein. A joint analysis of all-atom molecular dynamics (MD) calculations and picosecond time-resolved X-ray structures provides single-molecule insights into mechanisms of protein function. Ensemble-averaged MD simulations of the L29F mutant of myoglobin following ligand dissociation reproduce the direction, amplitude, and timescales of crystallographically-determined structural changes. This close agreement with experiments at comparable resolution in space and time validates the individual MD trajectories, which identify and structurally characterize a conformational switch that directs dissociated ligands to one of two nearby protein cavities. This unique combination of simulation and experiment unveils functional protein motions and illustrates at an atomic level relationships among protein structure, dynamics, and function. In collaboration with Friedrich Schotte and Gerhard Hummer, NIH.
Identifying the missing proteins in human proteome by biological language model.
Dong, Qiwen; Wang, Kai; Liu, Xuan
2016-12-23
With the rapid development of high-throughput sequencing technology, proteomics research has become an active field in the post-genomic era. It is necessary to identify all the natively encoded protein sequences for further function and pathway analysis. Toward that end, the Human Proteome Organization launched the Human Proteome Project in 2011. However, many proteins are hard to detect by experimental methods, which has become one of the bottlenecks in the Human Proteome Project. Given the difficulty of detecting these missing proteins with wet-lab approaches, here we use a bioinformatics method to pre-filter the missing proteins. Since there is an analogy between biological sequences and natural language, n-gram models from the field of Natural Language Processing were used to filter the missing proteins. The dataset used in this study contains 616 missing proteins from the "uncertain" category of the neXtProt database. The n-gram model identified 102 proteins that have a high probability of being native human proteins. We performed a detailed analysis of the predicted structure and function of these missing proteins and also compared the high-probability proteins with other mass spectrometry datasets. The evaluation shows that the results reported here are in good agreement with those obtained from other well-established databases. The analysis shows that 102 proteins may be native gene-coding proteins and that some of the missing proteins are membrane or natively disordered proteins, which are hard to detect by experimental methods.
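The n-gram idea can be sketched as follows. This is a minimal illustration, not the model from the study: the abstract does not state the n-gram order, the smoothing scheme, or the decision threshold, so the trigram order, the add-one smoothing and the toy training sequences below are all assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


class NGramModel:
    """Residue-level n-gram model with add-one smoothing (assumed scheme)."""

    def __init__(self, n=3):
        self.n = n
        self.ngram_counts = Counter()
        self.context_counts = Counter()

    def train(self, sequences):
        for seq in sequences:
            for i in range(len(seq) - self.n + 1):
                gram = seq[i:i + self.n]
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1

    def log_prob(self, gram):
        # P(last residue | preceding n-1 residues), smoothed over 20 residues.
        numerator = self.ngram_counts[gram] + 1
        denominator = self.context_counts[gram[:-1]] + len(AMINO_ACIDS)
        return math.log(numerator / denominator)

    def score(self, seq):
        """Average log-probability per n-gram; higher means more 'protein-like'."""
        grams = [seq[i:i + self.n] for i in range(len(seq) - self.n + 1)]
        if not grams:
            raise ValueError("sequence is shorter than the n-gram order")
        return sum(self.log_prob(g) for g in grams) / len(grams)


if __name__ == "__main__":
    model = NGramModel(n=3)
    model.train(["MKKLLAAMKKLLAA", "MKKLLAA"])  # toy stand-in for known proteins
    print(model.score("MKKLL"), model.score("WWWWW"))
```

In practice the model would be trained on a large set of confirmed protein sequences, and candidates scoring above some chosen threshold would be kept as likely native proteins.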
Analysis of Membrane Protein Topology in the Plant Secretory Pathway.
Guo, Jinya; Miao, Yansong; Cai, Yi
2017-01-01
Topology of membrane proteins provides important information for the understanding of protein function and intermolecular associations. Integral membrane proteins are generally transported from the endoplasmic reticulum (ER) to the Golgi and downstream compartments in the plant secretory pathway. Here, we describe a simple method to study membrane protein topology along the plant secretory pathway by transiently coexpressing a fluorescent protein (XFP)-tagged membrane protein and an ER export inhibitor protein, ARF1 (T31N), in tobacco BY-2 protoplasts. By fractionation, microsome isolation, and trypsin digestion, membrane protein topology could be easily detected by either direct confocal microscopy imaging or western-blot analysis using specific XFP antibodies. A similar strategy in determining membrane protein topology could be widely adopted and applied to protein analysis in a broad range of eukaryotic systems, including yeast cells and mammalian cells.
FunSimMat: a comprehensive functional similarity database
Schlicker, Andreas; Albrecht, Mario
2008-01-01
Functional similarity based on Gene Ontology (GO) annotation is used in diverse applications like gene clustering, gene expression data analysis, protein interaction prediction and evaluation. However, there exists no comprehensive resource of functional similarity values although such a database would facilitate the use of functional similarity measures in different applications. Here, we describe FunSimMat (Functional Similarity Matrix, http://funsimmat.bioinf.mpi-inf.mpg.de/), a large new database that provides several different semantic similarity measures for GO terms. It offers various precomputed functional similarity values for proteins contained in UniProtKB and for protein families in Pfam and SMART. The web interface allows users to efficiently perform both semantic similarity searches with GO terms and functional similarity searches with proteins or protein families. All results can be downloaded in tab-delimited files for use with other tools. An additional XML–RPC interface gives automatic online access to FunSimMat for programs and remote services. PMID:17932054
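To make "semantic similarity between GO terms" concrete, one well-known measure is Resnik similarity: the information content of the most informative ancestor that two terms share. The sketch below is an illustration only; it does not use the FunSimMat interface, the abstract does not say which measures the database offers, and the toy ontology and annotation counts are hypothetical rather than real GO data.

```python
import math

# Hypothetical toy ontology: term -> set of direct parents (not real GO terms).
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "ion_binding": {"binding"},
    "calcium_binding": {"ion_binding"},
    "zinc_binding": {"ion_binding"},
}

# Hypothetical numbers of proteins annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0,
    "binding": 10,
    "catalysis": 50,
    "ion_binding": 10,
    "calcium_binding": 20,
    "zinc_binding": 10,
}


def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC = -log(p), where p is the share of annotations at or below the term."""
    total = sum(DIRECT_COUNTS.values())
    covered = sum(c for t, c in DIRECT_COUNTS.items() if term in ancestors(t))
    return -math.log(covered / total)


def resnik(term_a, term_b):
    """Information content of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)


if __name__ == "__main__":
    print(resnik("calcium_binding", "zinc_binding"))
    print(resnik("calcium_binding", "catalysis"))
```

Terms that share a specific, rarely used ancestor score high, while terms that meet only at the root score zero; protein-level functional similarity is typically built by combining such term-level scores over each protein's annotations.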
Duncan, R; Horne, D; Strong, J E; Leone, G; Pon, R T; Yeung, M C; Lee, P W
1991-06-01
We have been investigating structure-function relationships in the reovirus cell attachment protein sigma 1 using various deletion mutants and protease analysis. In the present study, a series of deletion mutants were constructed which lacked 90, 44, 30, 12, or 4 amino acids from the C-terminus of the 455-amino acid-long reovirus type 3 (T3) sigma 1 protein. The full-length and truncated sigma 1 proteins were expressed in an in vitro transcription/translation system and assayed for L cell binding activity. It was found that the removal of as few as four amino acids from the C-terminus drastically affected the cell binding function of the sigma 1 protein. The C-terminal-truncated proteins were further characterized using trypsin, chymotrypsin, and monoclonal and polyclonal antibodies. Our results indicated that the C-terminal portions of the mutant proteins were misfolded, leading to a loss in cell binding function. The N-terminal fibrous tail of the proteins was unaffected by the deletions as was sigma 1 oligomerization, further illustrating the discrete structural and functional roles of the N- and C-terminal domains of sigma 1. In an attempt to identify smaller, functional peptides, full-length sigma 1 expressed in vitro was digested with trypsin and subsequently with chymotrypsin under various conditions. The results clearly demonstrated the highly stable nature of the C-terminal globular head of sigma 1, even when separated from the N-terminal fibrous tail. We concluded that: (1) the C-terminal globular head of sigma 1 exists as a compact, protease-resistant oligomeric structure; (2) an intact C-terminus is required for proper head folding and generation of the conformationally dependent cell binding domain.
Yu, Heguo; Diao, Hua; Wang, Chunmei; Lin, Yan; Yu, Fudong; Lu, Hui; Xu, Wei; Li, Zheng; Shi, Huijuan; Zhao, Shimin; Zhou, Yuchuan; Zhang, Yonglian
2015-01-01
Male infertility is a medical condition that has been on the rise globally. Lysine acetylation of human sperm, an essential posttranslational modification involved in the etiology of sperm abnormality, is not fully understood. Therefore, we first generated a qualified pan-anti-acetyllysine monoclonal antibody to characterize the global lysine acetylation of uncapacitated normal human sperm with a proteomics approach. With enrichment ratios of up to 31%, 973 lysine-acetylated sites matching 456 human sperm proteins, including 671 novel lysine acetylation sites and 205 novel lysine-acetylated proteins, were identified. These proteins exhibited conserved motifs XXXKYXXX, XXXKFXXX, and XXXKHXXX, were annotated to function in multiple metabolic processes, and were localized predominantly in the mitochondrion and cytoplasmic fractions. Between the uncapacitated and capacitated sperm, different acetylation profiles were observed for functional proteins involved in sperm capacitation, sperm-egg recognition, sperm-egg plasma fusion, and fertilization, indicating that acetylation of functional proteins may be required during sperm capacitation. Bioinformatics analysis revealed association of acetylated proteins with diseases and drugs. Novel acetylation of voltage-dependent anion channel proteins was also found. With clinical sperm samples, we observed differences in lysine acetyltransferase and lysine deacetylase expression between normal sperm and abnormal sperm of asthenospermia or necrospermia. Furthermore, with sperm samples impaired by epigallocatechin gallate to mimic asthenospermia, we observed that inhibition of sperm motility was partly through the blockade of voltage-dependent anion channel 2 Lys-74 acetylation combined with reduced ATP levels and mitochondrial membrane potential. Taken together, we obtained a qualified pan-anti-acetyllysine monoclonal antibody, analyzed the acetylproteome of uncapacitated human sperm, and revealed associations between functional protein acetylation and sperm functions. PMID:25680958
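The motifs reported in this abstract (XXXKYXXX, XXXKFXXX, XXXKHXXX) place tyrosine, phenylalanine, or histidine immediately after the lysine. The following is a minimal Python sketch of how candidate sites of this shape could be located in a sequence. It assumes that X stands for any residue and that three flanking residues are required on each side; the function name and the example sequence are hypothetical, and this is not the authors' motif-discovery procedure.

```python
import re

# Zero-width lookahead so that overlapping 8-residue windows are all reported.
# Window shape: 3 residues, K, one of Y/F/H, 3 residues (XXXK[YFH]XXX).
MOTIF = re.compile(r"(?=([A-Z]{3}K([YFH])[A-Z]{3}))")


def find_k_motif_sites(sequence):
    """Return (1-based lysine position, 8-residue window, residue after K)."""
    hits = []
    for match in MOTIF.finditer(sequence.upper()):
        # The window starts at match.start() (0-based); K is its 4th residue,
        # so the 1-based lysine position is match.start() + 4.
        lysine_position = match.start() + 4
        hits.append((lysine_position, match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    example = "MAAAKYGGGSSSKFLLLAKAHHH"  # made-up sequence for illustration
    for position, window, next_residue in find_k_motif_sites(example):
        print(position, window, next_residue)
```

A lysine too close to either end of the sequence is skipped because the full eight-residue window cannot be formed.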
Rahaman, Siti Nurulnabila A; Mat Yusop, Jastina; Mohamed-Hussein, Zeti-Azura; Ho, Kok Lian; Teh, Aik-Hong; Waterman, Jitka; Ng, Chyan Leong
2016-03-01
C1ORF123 is a human hypothetical protein found in open reading frame 123 of chromosome 1. The protein belongs to the DUF866 protein family comprising eukaryote-conserved proteins with unknown function. Recent proteomic and bioinformatic analyses identified the presence of C1ORF123 in brain, frontal cortex and synapses, as well as its involvement in endocrine function and polycystic ovary syndrome (PCOS), indicating the importance of its biological role. In order to provide a better understanding of the biological function of the human C1ORF123 protein, the characterization and analysis of recombinant C1ORF123 (rC1ORF123), including overexpression and purification, verification by mass spectrometry and a Western blot using anti-C1ORF123 antibodies, crystallization and X-ray diffraction analysis of the protein crystals, are reported here. The rC1ORF123 protein was crystallized by the hanging-drop vapor-diffusion method with a reservoir solution composed of 20% PEG 3350, 0.2 M magnesium chloride hexahydrate, 0.1 M sodium citrate pH 6.5. The crystals diffracted to 1.9 Å resolution and belonged to an orthorhombic space group with unit-cell parameters a = 59.32, b = 65.35, c = 95.05 Å. The calculated Matthews coefficient (VM) value of 2.27 Å³ Da⁻¹ suggests that there are two molecules per asymmetric unit, with an estimated solvent content of 45.7%.
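The relationship between the Matthews coefficient and solvent content quoted above can be checked with a short calculation. The sketch below assumes the conventional approximation, solvent fraction ≈ 1 − 1.23/VM, and the usual definition VM = V_cell / (Z × MW); the function names are hypothetical, and the abstract does not state which constants the authors used, so small rounding differences from the reported 45.7% are expected.

```python
def orthorhombic_cell_volume(a, b, c):
    """Unit-cell volume in cubic angstroms; all angles are 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume, molecules_per_cell, molecular_weight_da):
    """VM in cubic angstroms per dalton.

    molecules_per_cell is the total number of molecules in the unit cell
    (symmetry operators times molecules per asymmetric unit).
    """
    return cell_volume / (molecules_per_cell * molecular_weight_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, assuming the conventional constant 1.23."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    volume = orthorhombic_cell_volume(59.32, 65.35, 95.05)
    print(f"cell volume: {volume:.0f} cubic angstroms")
    print(f"solvent fraction for VM = 2.27: {solvent_fraction(2.27):.3f}")
```

With VM = 2.27 the approximation gives a solvent fraction of about 0.458, in line with the reported estimate. The molecular weight and the number of symmetry operators are not given in the abstract, so `matthews_coefficient` is shown only for completeness.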
Mutsuddi, Mousumi; Mukherjee, Ashim; Shen, Baohe; Manley, James L; Nambu, John R
2010-01-01
The Drosophila Dichaete gene encodes a member of the Sox family of high mobility group (HMG) domain proteins that have crucial gene regulatory functions in diverse developmental processes. The subcellular localization and transcriptional regulatory activities of Sox proteins can be regulated by several post-translational modifications. To identify genes that functionally interact with Dichaete, we undertook a genetic modifier screen based on a Dichaete gain-of-function phenotype in the adult eye. Mutations in several genes, including decapentaplegic, engrailed and pelle, behaved as dominant modifiers of this eye phenotype. Further analysis of pelle mutants revealed that loss of pelle function results in alterations in the distinctive cytoplasmic distribution of Dichaete protein within the developing oocyte, as well as defects in the elaboration of individual egg chambers. The death domain-containing region of the Pelle protein kinase was found to associate with both Dichaete and mouse Sox2 proteins, and Pelle can phosphorylate Dichaete protein in vitro. Overall, these findings reveal that maternal functions of pelle are essential for proper localization of Dichaete protein in the oocyte and normal egg chamber formation. Dichaete appears to be a novel phosphorylation substrate for Pelle and may function in a Pelle-dependent signaling pathway during oogenesis.
A Parametric Rosetta Energy Function Analysis with LK Peptides on SAM Surfaces.
Lubin, Joseph H; Pacella, Michael S; Gray, Jeffrey J
2018-05-08
Although structures have been determined for many soluble proteins and an increasing number of membrane proteins, experimental structure determination methods are limited for complexes of proteins and solid surfaces. An economical alternative or complement to experimental structure determination is molecular simulation. Rosetta is one software suite that models protein-surface interactions, but Rosetta is normally benchmarked on soluble proteins. For surface interactions, the validity of the energy function is uncertain because it is a combination of independent parameters from energy functions developed separately for solution proteins and mineral surfaces. Here, we assess the performance of the RosettaSurface algorithm and test the accuracy of its energy function by modeling the adsorption of leucine/lysine (LK)-repeat peptides on methyl- and carboxy-terminated self-assembled monolayers (SAMs). We investigated how RosettaSurface predictions for this system compare with the experimental results, which showed that on both surfaces, LK-α peptides folded into helices and LK-β peptides held extended structures. Utilizing this model system, we performed a parametric analysis of Rosetta's Talaris energy function and determined that adjusting solvation parameters offered improved predictive accuracy. Simultaneously increasing lysine carbon hydrophilicity and the hydrophobicity of the surface methyl head groups yielded computational predictions most closely matching the experimental results. De novo models still should be interpreted skeptically unless bolstered in an integrative approach with experimental data.
He, Qianru; Man, Lili; Ji, Yuhua; Zhang, Shuqiang; Jiang, Maorong; Ding, Fei; Gu, Xiaosong
2012-06-01
Peripheral sensory and motor nerves have different functions and different approaches to regeneration, especially their distinct ability to accurately reinnervate terminal nerve pathways. To understand the molecular aspects underlying these differences, a proteomics technique coupling isobaric tags for relative and absolute quantitation (iTRAQ) with online two-dimensional liquid chromatography tandem mass spectrometry (2D LC-MS/MS) was used to investigate the protein profile of sensory and motor nerve samples from rats. A total of 1472 proteins were identified in either sensory or motor nerve. Of these, 100 proteins showed differential expression between the two nerves, and some of them were validated by quantitative real-time RT-PCR, Western blot analysis, and immunohistochemistry. In terms of functional categorization, the differentially expressed proteins in sensory and motor nerves, belonging to a broad range of classes, were related to a diverse array of biological functions, which included cell adhesion, cytoskeleton, neuronal plasticity, neurotrophic activity, calcium-binding, signal transduction, transport, enzyme catalysis, lipid metabolism, DNA-binding, synaptosome function, actin-binding, ATP-binding, extracellular matrix, and commitment to other lineages. The proteins expressed at relatively higher levels in either sensory or motor nerve were tentatively discussed in combination with their specific molecular characteristics. It is anticipated that the database generated in this study will provide a solid foundation for further comprehensive investigation of functional differences between sensory and motor nerves, including the specificity of their regeneration.
Suplatov, Dmitry; Panin, Nikolay; Kirilin, Evgeny; Shcherbakova, Tatyana; Kudryavtsev, Pavel; Švedas, Vytas
2014-01-01
Protein stability provides advantageous development of novel properties and can be crucial in affording tolerance to mutations that introduce functionally preferential phenotypes. Consequently, understanding the determining factors for protein stability is important for the study of structure-function relationship and design of novel protein functions. Thermal stability has been extensively studied in connection with practical application of biocatalysts. However, little work has been done to explore the mechanism of pH-dependent inactivation. In this study, bioinformatic analysis of the Ntn-hydrolase superfamily was performed to identify functionally important subfamily-specific positions in protein structures. Furthermore, the involvement of these positions in pH-induced inactivation was studied. The conformational mobility of penicillin acylase in Escherichia coli was analyzed through molecular modeling in neutral and alkaline conditions. Two functionally important subfamily-specific residues, Gluβ482 and Aspβ484, were found. Ionization of these residues at alkaline pH promoted the collapse of a buried network of stabilizing interactions that consequently disrupted the functional protein conformation. The subfamily-specific position Aspβ484 was selected as a hotspot for mutation to engineer enzyme variant tolerant to alkaline medium. The corresponding Dβ484N mutant was produced and showed 9-fold increase in stability at alkaline conditions. Bioinformatic analysis of subfamily-specific positions can be further explored to study mechanisms of protein inactivation and to design more stable variants for the engineering of homologous Ntn-hydrolases with improved catalytic properties. PMID:24959852
Molecular weights and subunit structure of LamB proteins.
Nakae, T; Ishii, J N
1982-01-01
Phage lambda-receptor proteins of Escherichia coli, LamB proteins, form oligomeric aggregates to build transmembrane diffusion pores selective for maltose and maltodextrins. The molecular weights (MW) of functional oligomers as well as dissociated monomers were determined by sedimentation equilibrium analysis in homogeneous non-ionic surfactant and deuterium oxide and in 6 M guanidine-HCl, respectively. The MWs of oligomers and monomers were found to be 135 600 and 45 900, respectively. Thus, functional LamB proteins consisted of three identical subunits.
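The trimer conclusion follows from dividing the oligomer molecular weight by the monomer molecular weight. A minimal Python sketch of that arithmetic, using only the two values quoted in the abstract (the function name is hypothetical):

```python
def estimate_subunit_count(oligomer_mw, monomer_mw):
    """Return the raw MW ratio and the nearest whole number of subunits."""
    ratio = oligomer_mw / monomer_mw
    return ratio, round(ratio)


if __name__ == "__main__":
    ratio, subunits = estimate_subunit_count(135_600, 45_900)
    print(f"ratio = {ratio:.2f}, nearest integer = {subunits}")
```

The ratio is about 2.95, which rounds to three subunits, consistent with the stated conclusion.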
Kirschnek, S; Vier, J; Gautam, S; Frankenberg, T; Rangelova, S; Eitz-Ferrer, P; Grespi, F; Ottina, E; Villunger, A; Häcker, H; Häcker, G
2011-11-01
Neutrophils enter the peripheral blood from the bone marrow and die after a short time. Molecular analysis of spontaneous neutrophil apoptosis is difficult as these cells die rapidly and cannot be easily manipulated. We use conditional Hoxb8 expression to generate mouse neutrophils and test the regulation of apoptosis by extensive manipulation of B-cell lymphoma protein 2 (Bcl-2)-family proteins. Spontaneous apoptosis was preceded by downregulation of anti-apoptotic Bcl-2 proteins. Loss of the pro-apoptotic Bcl-2 homology domain (BH3)-only protein Bcl-2-interacting mediator of cell death (Bim) gave some protection, but only neutrophils deficient in both BH3-only proteins, Bim and Noxa, were strongly protected against apoptosis. Function of Noxa was at least in part neutralization of induced myeloid leukemia cell differentiation protein (Mcl-1) in neutrophils and progenitors. Loss of Bim and Noxa preserved neutrophil function in culture, and apoptosis-resistant cells remained in circulation in mice. Apoptosis regulated by Bim- and Noxa-driven loss of Mcl-1 is thus the final step in neutrophil differentiation, required for the termination of neutrophil function and neutrophil-dependent inflammation.
Introduction to Protein Structure through Genetic Diseases
ERIC Educational Resources Information Center
Schneider, Tanya L.; Linton, Brian R.
2008-01-01
An illuminating way to learn about protein function is to explore high-resolution protein structures. Analysis of the proteins involved in genetic diseases has been used to introduce students to protein structure and the role that individual mutations can play in the onset of disease. Known mutations can be correlated to changes in protein…
Bravo-Alonso, Irene; Navarrete, Rosa; Arribas-Carreira, Laura; Perona, Almudena; Abia, David; Couce, María Luz; García-Cazorla, Angels; Morais, Ana; Domingo, Rosario; Ramos, María Antonia; Swanson, Michael A; Van Hove, Johan L K; Ugarte, Magdalena; Pérez, Belén; Pérez-Cerdá, Celia; Rodríguez-Pombo, Pilar
2017-06-01
The rapid analysis of genomic data is providing effective mutational confirmation in patients with clinical and biochemical hallmarks of a specific disease. This is the case for nonketotic hyperglycinemia (NKH), a Mendelian disorder causing seizures in neonates and early-infants, primarily due to mutations in the GLDC gene. However, understanding the impact of missense variants identified in this gene is a major challenge for the application of genomics into clinical practice. Herein, a comprehensive functional and structural analysis of 19 GLDC missense variants identified in a cohort of 26 NKH patients was performed. Mutant cDNA constructs were expressed in COS7 cells followed by enzymatic assays and Western blot analysis of the GCS P-protein to assess the residual activity and mutant protein stability. Structural analysis, based on molecular modeling of the 3D structure of GCS P-protein, was also performed. We identify hypomorphic variants that produce attenuated phenotypes with improved prognosis of the disease. Structural analysis allows us to interpret the effects of mutations on protein stability and catalytic activity, providing molecular evidence for clinical outcome and disease severity. Moreover, we identify an important number of mutants whose loss-of-functionality is associated with instability and, thus, are potential targets for rescue using folding therapeutic approaches. © 2017 Wiley Periodicals, Inc.
Structure-Functional Prediction and Analysis of Cancer Mutation Effects in Protein Kinases
Dixit, Anshuman; Verkhivker, Gennady M.
2014-01-01
A central goal of cancer research is to discover and characterize the functional effects of mutated genes that contribute to tumorigenesis. In this study, we provide a detailed structural classification and analysis of functional dynamics for members of protein kinase families that are known to harbor cancer mutations. We also present a systematic computational analysis that combines sequence and structure-based prediction models to characterize the effect of cancer mutations in protein kinases. We focus on the differential effects of activating point mutations that increase protein kinase activity and kinase-inactivating mutations that decrease activity. Mapping of cancer mutations onto the conformational mobility profiles of known crystal structures demonstrated that activating mutations could reduce a steric barrier for the movement from the basal “low” activity state to the “active” state. According to our analysis, the mechanism of activating mutations reflects a combined effect of partial destabilization of the kinase in its inactive state and a concomitant stabilization of its active-like form, which is likely to drive tumorigenesis at some level. Ultimately, the analysis of the evolutionary and structural features of the major cancer-causing mutational hotspot in kinases can also aid in the correlation of kinase mutation effects with clinical outcomes. PMID:24817905
Enrichment and proteomic analysis of plasma membrane from rat dorsal root ganglions
2009-01-01
Background: Dorsal root ganglion (DRG) neurons are primary sensory neurons that conduct neuronal impulses related to pain, touch and temperature senses. The plasma membrane (PM) of DRG cells plays important roles in their functions, and PM proteins are the main performers of these functions. However, mainly because the very small amount of DRG tissue makes PM sample collection difficult, few proteomic analyses of the PM have been reported, and the subject demands further investigation. Results: By using aqueous polymer two-phase partition in combination with high salt and high pH washing, PMs were efficiently enriched, as demonstrated by western blot analysis. A total of 954 non-redundant proteins were identified from the plasma membrane-enriched preparation with CapLC-MS/MS analysis subsequent to protein separation by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) or shotgun digestion. Of the identified proteins, 205 (21.5%) were unambiguously assigned as PM proteins, including a large number of signal proteins, receptors, ion channels and transporters. Conclusion: The aqueous polymer two-phase partition is a simple, rapid and relatively inexpensive method. It is well suited to the purification of PMs from small amounts of tissue. Therefore, aqueous two-phase partition is a reasonable preferred method for enriching the DRG PM. Proteomic analysis showed that the DRG PM was rich in proteins involved in fundamental biological processes, including material exchange, energy transformation and information transmission. These data should further our understanding of the fundamental DRG functions. PMID:19889238
Wan, Cen; Lees, Jonathan G; Minneci, Federico; Orengo, Christine A; Jones, David T
2017-10-01
Accurate gene or protein function prediction is a key challenge in the post-genome era. Most current methods perform well on molecular function prediction, but struggle to provide useful annotations relating to biological process functions due to the limited power of sequence-based features in that functional domain. In this work, we systematically evaluate the predictive power of temporal transcription expression profiles for protein function prediction in Drosophila melanogaster. Our results show significantly better performance on predicting protein function when transcription expression profile-based features are integrated with sequence-derived features, compared with the sequence-derived features alone. We also observe that the combination of expression-based and sequence-based features leads to further improvement of accuracy on predicting all three domains of gene function. Based on the optimal feature combinations, we then propose a novel multi-classifier-based function prediction method for Drosophila melanogaster proteins, FFPred-fly+. Interpreting our machine learning models also allows us to identify some of the underlying links between biological processes and developmental stages of Drosophila melanogaster.
Expanded microbial genome coverage and improved protein family annotation in the COG database
Galperin, Michael Y.; Makarova, Kira S.; Wolf, Yuri I.; Koonin, Eugene V.
2015-01-01
Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign to them tentative functions. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on complete microbial genomes, which allowed reliable assignment of orthologs and paralogs for most genes; (ii) orthology-based approach, which used the function(s) of the characterized member(s) of the protein family (COG) to assign function(s) to the entire set of carefully identified orthologs and describe the range of potential functions when there were more than one; and (iii) careful manual curation of the annotation of the COGs, aimed at detailed prediction of the biological function(s) for each COG while avoiding annotation errors and overprediction. Here we present an update of the COGs, the first since 2003, and a comprehensive revision of the COG annotations and expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level. This re-analysis of the COGs shows that the original COG assignments had an error rate below 0.5% and allows an assessment of the progress in functional genomics in the past 12 years. During this time, functions of many previously uncharacterized COGs have been elucidated and tentative functional assignments of many COGs have been validated, either by targeted experiments or through the use of high-throughput methods. A particularly important development is the assignment of functions to several widespread, conserved proteins many of which turned out to participate in translation, in particular rRNA maturation and tRNA modification. 
The new version of the COGs is expected to become an important tool for microbial genomics. PMID:25428365
MIPS: a database for genomes and protein sequences
Mewes, H. W.; Frishman, D.; Güldener, U.; Mannhaupt, G.; Mayer, K.; Mokrejs, M.; Morgenstern, B.; Münsterkötter, M.; Rudd, S.; Weil, B.
2002-01-01
The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project-specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz–Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91–93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155–158; Barker et al. (2001) Nucleic Acids Res., 29, 29–32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de). PMID:11752246
Xu, Han; Wang, Tao; Zhang, Yanming
2015-01-01
Porcine circovirus type 2 (PCV2) is the essential infectious agent responsible for causing porcine circovirus-associated diseases in pigs. To date, eleven RNAs and five viral proteins of PCV2 have been detected. Here, we identified a novel viral gene within the PCV2 genome, termed ORF5, that exists at both the transcriptional and translational level during productive infection of PCV2 in porcine alveolar macrophages 3D4/2 (PAMs). Northern blot analysis was used to demonstrate that the ORF5 gene measures 180 bp in length and overlaps completely with ORF1 when read in the same direction. Site-directed mutagenesis was used to show that the ORF5 protein is not essential for PCV2 replication. To investigate the biological functions of the novel protein, we constructed a recombinant eukaryotic expression plasmid capable of expressing PCV2 ORF5. The results show that the GFP-tagged PCV2 ORF5 protein localizes to the endoplasmic reticulum (ER), is degraded via the proteasome, inhibits PAM growth and prolongs the S-phase of the cell cycle. Further studies show that the GFP-tagged PCV2 ORF5 protein induces ER stress and activates NF-κB, which was further confirmed by a significant upregulation in IL-6, IL-8 and COX-2 expression. In addition, five cellular proteins (GPNMB, CYP1A1, YWHAB, ZNF511 and SRSF3) were found to interact with ORF5 via yeast two-hybrid assay. These findings provide novel information on the identification and functional analysis of the PCV2 ORF5 protein and are likely to be of benefit in elucidating the molecular mechanisms of PCV2 pathogenicity. However, additional experiments are needed to validate the expression and function of the ORF5 protein during PCV2 infection in vitro before any definitive conclusion can be drawn. PMID:26035722
Antennal proteome comparison of sexually mature drone and forager honeybees.
Feng, Mao; Song, Feifei; Aleku, Dereje Woltedji; Han, Bin; Fang, Yu; Li, Jianke
2011-07-01
Honeybees have evolved an intricate system of chemical communication to regulate their complex social interactions. Specific proteins involved in odorant detection most likely supported this chemical communication. Odorant reception takes place mainly in the antennae within hairlike structures called olfactory sensilla. Antennal proteomes of sexually mature drone and forager worker bees (an age group of bees assigned to perform field tasks) were compared using two-dimensional electrophoresis, mass spectrometry, quantitative real-time polymerase chain reaction, and bioinformatics. Sixty-one differentially expressed proteins were identified, of which 67% were highly upregulated in the drones' antennae whereas only 33% were upregulated in the worker bees' antennae. The antennae of the worker bees strongly expressed proteins of carbohydrate and energy metabolism and molecular transporters, signifying a strong demand for metabolic energy and odorant binding proteins for their foraging activities and other olfactory responses, while proteins related to fatty acid metabolism, antioxidation, and protein folding were strongly upregulated in the drones' antennae, indicating their importance for the detection and degradation of sex pheromones during queen identification for mating. On the basis of both groups of altered antenna proteins, carbohydrate metabolism and energy production and molecular transporters comprised more than 80% of the functional enrichment analysis and 45% of the constructed biological interaction networks (BIN), respectively. This suggests these two protein families play crucial roles in the antennal olfactory function of sexually mature drone and forager worker bees. Several key node proteins in the BIN were validated at the transcript level. This first global proteomic comparative analysis of antennae reveals sex-biased protein expression in both bees, indicating that odorant response mechanisms are sex-specific because of natural selection for different olfactory functions. To the best of our knowledge, this result further provides extensive insight into the expression of the proteins in the antennae of drone and worker honeybees and adds vital information to the previous findings. It also provides a new angle for future detailed functional analysis of the antennae of the honeybee castes.
Deng, Ning; Li, Zhenye; Pan, Chao; Duan, Huilong
2015-01-01
The study of complex proteomes places greater demands on mass spectrometry-based quantification methods. In this paper, we present a mass spectrometry label-free quantification tool for complex proteomes, called freeQuant, which effectively integrates quantification with functional analysis. freeQuant consists of two well-integrated modules: label-free quantification and functional analysis with biomedical knowledge. freeQuant supports label-free quantitative analysis that makes full use of tandem mass spectrometry (MS/MS) spectral count, protein sequence length, shared peptides, and ion intensity. It adopts spectral count for quantitative analysis and builds a new method for shared peptides to accurately evaluate the abundance of isoforms. For proteins with low abundance, MS/MS total ion count coupled with spectral count is included to ensure accurate protein quantification. Furthermore, freeQuant supports large-scale functional annotation of complex proteomes. Mitochondrial proteomes from the mouse heart, the mouse liver, and the human heart were used to evaluate the usability and performance of freeQuant. The evaluation showed that the quantitative algorithms implemented in freeQuant can improve the accuracy of quantification with a better dynamic range.
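To illustrate how spectral count and protein sequence length can be combined in label-free quantification, the sketch below computes the normalized spectral abundance factor (NSAF), a widely used length-normalized measure. This is an assumption chosen for illustration and is not freeQuant's algorithm, which the abstract says also accounts for shared peptides and ion intensity; the protein names and numbers are made up.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor per protein.

    spectral_counts: dict of protein id -> MS/MS spectral count
    lengths: dict of protein id -> sequence length in residues
    Each protein's count is divided by its length, then all values are
    scaled so that they sum to 1 within the sample.
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra were assigned to any protein")
    return {p: value / total for p, value in saf.items()}


if __name__ == "__main__":
    counts = {"protA": 10, "protB": 10}     # hypothetical spectral counts
    lengths = {"protA": 100, "protB": 200}  # hypothetical sequence lengths
    for protein, value in nsaf(counts, lengths).items():
        print(protein, round(value, 3))
```

With equal spectral counts, the shorter protein receives the higher abundance estimate, which is the reason for normalizing by length.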
2013-01-01
Background Protein-protein interactions (PPIs) play a key role in understanding the mechanisms of cellular processes. The availability of interactome data has catalyzed the development of computational approaches to elucidate functional behaviors of proteins on a system level. Gene Ontology (GO) and its annotations are a significant resource for functional characterization of proteins. Because of wide coverage, GO data have often been adopted as a benchmark for protein function prediction on the genomic scale. Results We propose a computational approach, called M-Finder, for functional association pattern mining. This method employs semantic analytics to integrate the genome-wide PPIs with GO data. We also introduce an interactive web application tool that visualizes a functional association network linked to a protein specified by a user. The proposed approach comprises two major components. First, the PPIs that have been generated by high-throughput methods are weighted in terms of their functional consistency using GO and its annotations. We assess two advanced semantic similarity metrics which quantify the functional association level of each interacting protein pair. We demonstrate that these measures outperform the other existing methods by evaluating their agreement to other biological features, such as sequence similarity, the presence of common Pfam domains, and core PPIs. Second, the information flow-based algorithm is employed to discover a set of proteins functionally associated with the protein in a query and their links efficiently. This algorithm reconstructs a functional association network of the query protein. The output network size can be flexibly determined by parameters. Conclusions M-Finder provides a useful framework to investigate functional association patterns with any protein. This software will also allow users to perform further systematic analysis of a set of proteins for any specific function. 
It is available online at http://bionet.ecs.baylor.edu/mfinder PMID:24565382
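The first component described above, weighting each interaction by the functional consistency of its two proteins, can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain Jaccard index over GO term sets as a stand-in for the more advanced semantic similarity metrics the abstract refers to, and the function names, protein names and annotations are made up.

```python
def jaccard(terms_a, terms_b):
    """Jaccard index of two GO term sets (0.0 when both are empty)."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weight_interactions(ppis, annotations):
    """Assign each interacting pair a weight from its shared GO annotations.

    ppis:        iterable of (protein, protein) pairs
    annotations: dict mapping protein -> set of GO term identifiers
    """
    return {
        (u, v): jaccard(annotations.get(u, ()), annotations.get(v, ()))
        for u, v in ppis
    }


# Hypothetical proteins and annotations, for illustration only.
annotations = {
    "A": {"GO:0006281", "GO:0006260"},
    "B": {"GO:0006281"},
    "C": {"GO:0015979"},
}
ppis = [("A", "B"), ("A", "C")]
weights = weight_interactions(ppis, annotations)
```

Pairs that share annotations receive a higher weight than pairs with no terms in common; a real semantic similarity measure would also take the structure of the ontology into account.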
Analysis of In Vivo Chromatin and Protein Interactions of Arabidopsis Transcript Elongation Factors.
Pfab, Alexander; Antosz, Wojciech; Holzinger, Philipp; Bruckmann, Astrid; Griesenbeck, Joachim; Grasser, Klaus D
2017-01-01
A central step to elucidate the function of proteins commonly comprises the analysis of their molecular interactions in vivo. For nuclear regulatory proteins this involves determining protein-protein interactions as well as mapping of chromatin binding sites. Here, we present two protocols to identify protein-protein and chromatin interactions of transcript elongation factors (TEFs) in Arabidopsis. The first protocol (Subheading 3.1) describes protein affinity-purification coupled to mass spectrometry (AP-MS) that utilizes suspension cultured cells as experimental system. This approach provides an unbiased view of proteins interacting with epitope-tagged TEFs. The second protocol (Subheading 3.2) depicts details about a chromatin immunoprecipitation (ChIP) procedure to characterize genomic binding sites of TEFs. These methods should be valuable tools for the analysis of a broad variety of nuclear proteins.
Suplatov, Dmitry; Kirilin, Eugeny; Arbatsky, Mikhail; Takhaveev, Vakil; Švedas, Vytas
2014-01-01
The new web-server pocketZebra implements the power of bioinformatics and geometry-based structural approaches to identify and rank subfamily-specific binding sites in proteins by functional significance, and select particular positions in the structure that determine selective accommodation of ligands. A new scoring function has been developed to annotate binding sites by the presence of the subfamily-specific positions in diverse protein families. pocketZebra web-server has multiple input modes to meet the needs of users with different experience in bioinformatics. The server provides on-site visualization of the results as well as off-line version of the output in annotated text format and as PyMol sessions ready for structural analysis. pocketZebra can be used to study structure–function relationship and regulation in large protein superfamilies, classify functionally important binding sites and annotate proteins with unknown function. The server can be used to engineer ligand-binding sites and allosteric regulation of enzymes, or implemented in a drug discovery process to search for potential molecular targets and novel selective inhibitors/effectors. The server, documentation and examples are freely available at http://biokinet.belozersky.msu.ru/pocketzebra and there are no login requirements. PMID:24852248
Lysine acetylome profiling uncovers novel histone deacetylase substrate proteins in Arabidopsis.
Hartl, Markus; Füßl, Magdalena; Boersema, Paul J; Jost, Jan-Oliver; Kramer, Katharina; Bakirbas, Ahmet; Sindlinger, Julia; Plöchinger, Magdalena; Leister, Dario; Uhrig, Glen; Moorhead, Greg Bg; Cox, Jürgen; Salvucci, Michael E; Schwarzer, Dirk; Mann, Matthias; Finkemeier, Iris
2017-10-23
Histone deacetylases have central functions in regulating stress defenses and development in plants. However, the knowledge about the deacetylase functions is largely limited to histones, although these enzymes were found in diverse subcellular compartments. In this study, we determined the proteome-wide signatures of the RPD3/HDA1 class of histone deacetylases in Arabidopsis. Relative quantification of the changes in the lysine acetylation levels was determined on a proteome-wide scale after treatment of Arabidopsis leaves with deacetylase inhibitors apicidin and trichostatin A. We identified 91 new acetylated candidate proteins other than histones, which are potential substrates of the RPD3/HDA1-like histone deacetylases in Arabidopsis, of which at least 30 function in nucleic acid binding. Furthermore, our analysis revealed that histone deacetylase 14 (HDA14) is the first organellar-localized RPD3/HDA1 class protein found to reside in the chloroplasts and that the majority of its protein targets have functions in photosynthesis. Finally, the analysis of HDA14 loss-of-function mutants revealed that the activation state of RuBisCO is controlled by lysine acetylation of RuBisCO activase under low-light conditions. © 2017 The Authors. Published under the terms of the CC BY 4.0 license.
Collell, Rosa; Closa-Monasterolo, Ricardo; Ferré, Natalia; Luque, Veronica; Koletzko, Berthold; Grote, Veit; Janas, Roman; Verduci, Elvira; Escribano, Joaquín
2016-06-01
Protein intake may modulate cardiac structure and function in pathological conditions, but there is a lack of knowledge on potential effects in healthy infants. Secondary analysis of an ongoing randomized clinical trial comparing two groups of infants receiving a higher (HP) or lower (LP) protein content formula in the first year of life, and compared with an observational group of breastfed (BF) infants. Growth and dietary intake were assessed periodically from birth to 2 y. Insulin-like growth factor 1 (IGF-1) axis parameters were analyzed at 6 mo in a blood sample. At 2 y, cardiac mass and function were assessed by echocardiography. HP infants (n = 50) showed a higher BMI z-score at 2 y compared with LP (n = 47) or BF (n = 44). Cardiac function parameters were increased in the HP group compared with the LP and were directly related to the protein intake during the first 6 mo of life. Moreover, there was an increase in free IGF-1 in the HP group at 6 mo. A moderate increase in protein supply during the first year of life is associated with higher cardiac function parameters at 2 y. IGF-1 axis modifications may, at least in part, underlie these effects.
Pillai, Viju Vijayan; Weber, Darren M; Phinney, Brett S; Selvaraj, Vimal
2017-01-01
The oviductal microenvironment is a site for key events that involve gamete maturation, fertilization and early embryo development. Secretions into the oviductal lumen by either the lining epithelium or by transudation of plasma constituents are known to contain elements conducive for reproductive success. Although previous studies have identified some of these factors involved in reproduction, knowledge of secreted proteins in the oviductal fluid remains rudimentary with limited definition of function even in extensively studied species like cattle. In this study, we used a shotgun proteomics approach followed by bioinformatics sequence prediction to identify secreted proteins present in the bovine oviductal fluid (ex vivo) and secretions from the bovine oviductal epithelial cells (in vitro). From a total of 2087 proteins identified, 266 proteins could be classified as secreted, 109 (41%) of which were common for both in vivo and in vitro conditions. Pathway analysis indicated different classes of proteins that included growth factors, metabolic regulators, immune modulators, enzymes, and extracellular matrix components. Functional analysis revealed mechanisms in the oviductal lumen linked to immune homeostasis, gamete maturation, fertilization and early embryo development. These results point to several novel components that work together with known elements mediating functional homeostasis, and highlight the diversity of machinery associated with oviductal physiology and early events in cattle fertility.
Liu, Mengjie; Duan, Liangwei; Wang, Meifang; Zeng, Hongmei; Liu, Xinqi; Qiu, Dewen
2016-01-01
The protein elicitor MoHrip2, which was extracted from Magnaporthe oryzae as an exocrine protein, triggers the tobacco immune system and enhances blast resistance in rice. However, the detailed mechanisms by which MoHrip2 acts as an elicitor remain unclear. Here, we investigated the structure of MoHrip2 to elucidate its functions based on molecular structure. The three-dimensional structure of MoHrip2 was obtained. Overall, the crystal structure formed a β-barrel structure and showed high similarity to the pathogenesis-related (PR) thaumatin superfamily protein thaumatin-like xylanase inhibitor (TL-XI). To investigate the functional regions responsible for MoHrip2 elicitor activities, the full-length protein and eight truncated proteins were expressed in Escherichia coli and were evaluated for elicitor activity in tobacco. Biological function analysis showed that MoHrip2 triggered the defense system against Botrytis cinerea in tobacco. Moreover, only MoHrip2M14 and other fragments containing the 14 amino acid residues in the middle region of the protein showed the elicitor activity of inducing a hypersensitive response and resistance-related pathways, which was similar to that of full-length MoHrip2. These results revealed that the central 14 amino acid residues were essential for anti-pathogenic activity.
2014-01-01
Background Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, as additionally supported from analysis in this paper that such maps preserve functional co-localization of the protein structure space. Methods Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. Results We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership. 
Conclusions This work opens exciting avenues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools. PMID:25080993
Raleigh, David R; Marchiando, Amanda M; Zhang, Yong; Shen, Le; Sasaki, Hiroyuki; Wang, Yingmin; Long, Manyuan; Turner, Jerrold R
2010-04-01
In vitro studies have demonstrated that occludin and tricellulin are important for tight junction barrier function, but in vivo data suggest that loss of these proteins can be overcome. The presence of a heretofore unknown, yet related, protein could explain these observations. Here, we report marvelD3, a novel tight junction protein that, like occludin and tricellulin, contains a conserved four-transmembrane MARVEL (MAL and related proteins for vesicle trafficking and membrane link) domain. Phylogenetic tree reconstruction; analysis of RNA and protein tissue distribution; immunofluorescent and electron microscopic examination of subcellular localization; characterization of intracellular trafficking, protein interactions, dynamic behavior, and siRNA knockdown effects; and description of remodeling after in vivo immune activation show that marvelD3, occludin, and tricellulin have distinct but overlapping functions at the tight junction. Although marvelD3 is able to partially compensate for occludin or tricellulin loss, it cannot fully restore function. We conclude that marvelD3, occludin, and tricellulin define the tight junction-associated MARVEL protein family. The data further suggest that these proteins are best considered as a group with both redundant and unique contributions to epithelial function and tight junction regulation.
Vedula, Pavan; Cruz, Lissette A; Gutierrez, Natasha; Davis, Justin; Ayee, Brian; Abramczyk, Rachel; Rodriguez, Alexis J
2016-06-30
Quantifying multi-molecular complex assembly in specific cytoplasmic compartments is crucial to understand how cells use assembly/disassembly of these complexes to control function. Currently, biophysical methods like Fluorescence Resonance Energy Transfer and Fluorescence Correlation Spectroscopy provide quantitative measurements of direct protein-protein interactions, while traditional biochemical approaches such as sub-cellular fractionation and immunoprecipitation remain the main approaches used to study multi-protein complex assembly/disassembly dynamics. In this article, we validate and quantify multi-protein adherens junction complex assembly in situ using light microscopy and Fluorescence Covariance Analysis. Utilizing specific fluorescently-labeled protein pairs, we quantified various stages of adherens junction complex assembly, the multiprotein complex regulating epithelial tissue structure and function following de novo cell-cell contact. We demonstrate: minimal cadherin-catenin complex assembly in the perinuclear cytoplasm and subsequent localization to the cell-cell contact zone, assembly of adherens junction complexes, acto-myosin tension-mediated anchoring, and adherens junction maturation following de novo cell-cell contact. Finally applying Fluorescence Covariance Analysis in live cells expressing fluorescently tagged adherens junction complex proteins, we also quantified adherens junction complex assembly dynamics during epithelial monolayer formation.
HMPAS: Human Membrane Protein Analysis System
2013-01-01
Background Membrane proteins perform essential roles in diverse cellular functions and are regarded as major pharmaceutical targets. The significance of membrane proteins has led to the development of dozens of resources related to membrane proteins. However, most of these resources are built for specific well-known membrane protein groups, making it difficult to find common and specific features of various membrane protein groups. Methods We collected human membrane proteins from the dispersed resources and predicted novel membrane protein candidates by using ortholog information and our membrane protein classifiers. The membrane proteins were classified according to the type of interaction with the membrane, subcellular localization, and molecular function. We also made a new feature dataset to characterize the membrane proteins in various aspects including membrane protein topology, domain, biological process, disease, and drug. Moreover, protein structure and ICD-10-CM based integrated disease and drug information was newly included. To analyze the comprehensive information of membrane proteins, we implemented analysis tools to identify novel sequence and functional features of the classified membrane protein groups and to extract features from protein sequences. Results We constructed HMPAS with 28,509 collected known membrane proteins and 8,076 newly predicted candidates. This system provides integrated information of human membrane proteins individually and in groups organized by 45 subcellular locations and 1,401 molecular functions. As a case study, we identified associations between the membrane proteins and diseases and show that membrane proteins are promising targets for diseases related to the nervous system and circulatory system. A web-based interface of this system was constructed to allow researchers not only to retrieve organized information of individual proteins but also to use the tools to analyze the membrane proteins. 
Conclusions HMPAS provides comprehensive information about human membrane proteins including specific features of certain membrane protein groups. In this system, users can acquire the information of individual proteins and specified groups focused on their conserved sequence features, involved cellular processes, and diseases. HMPAS may contribute as a valuable resource for the inference of novel cellular mechanisms and pharmaceutical targets associated with the human membrane proteins. HMPAS is freely available at http://fcode.kaist.ac.kr/hmpas. PMID:24564858
Kolpakova, E; Frengen, E; Stokke, T; Olsnes, S
2000-01-01
Acidic fibroblast growth factor (aFGF) intracellular binding protein (FIBP) is a protein found mainly in the nucleus that might be involved in the intracellular function of aFGF. Here we present a comparative analysis of the deduced amino acid sequences of human, murine and Drosophila FIBP analogues and demonstrate that FIBP is an evolutionarily conserved protein. The human gene spans more than 5 kb, comprising ten exons and nine introns, and maps to chromosome 11q13.1. Two slightly different splice variants found in different tissues were isolated and characterized. Sequence analysis of the region surrounding the translation start revealed a CpG island, a classical feature of widely expressed genes. Functional studies of the promoter region with a luciferase reporter system suggested a strong transcriptional activity residing within 600 bp of the 5' flanking region. PMID:11104667
Wang, Qiangjun; Zhao, Xiaowei; Zhang, Zijun; Zhao, Huiling; Huang, Dongwei; Cheng, Guanglong; Yang, Yongxin
2017-04-01
Lactation performance of dairy cattle is susceptible to heat stress. The liver is one of the most crucial organs affected by high temperature in dairy cows. However, the physiological adaption by the liver to hot summer conditions has not been well elucidated in lactating dairy cows. In the present study, proteomic analysis of the liver in dairy cows in spring and hot summer was performed using a label-free method. In total, 127 differentially expressed proteins were identified; most of the upregulated proteins were involved in protein metabolic processes and responses to stimuli, whereas most of the downregulated proteins were related to oxidation-reduction. Pathway analysis indicated that 3 upregulated heat stress proteins (HSP90α, HSP90β, and endoplasmin) were enriched in the NOD-like receptor signaling pathway, whereas several downregulated NADH dehydrogenase proteins were involved in the oxidative phosphorylation pathway. The protein-protein interaction network indicated that several upregulated HSPs (HSP90α, HSP90β, and GRP78) were involved in more interactions than other proteins and were thus considered as central hub nodes. Our findings provide novel insights into the physiological adaption of liver function in lactating dairy cows to natural high temperature. Copyright © 2017. Published by Elsevier Ltd.
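Identifying hub nodes as the proteins involved in the most interactions amounts to ranking nodes by degree. The following is a minimal sketch under that assumption; the abstract does not say which software or centrality measure was used, and the function names and the toy edge list (with placeholder node names) are hypothetical.

```python
from collections import Counter


def node_degrees(edges):
    """Count the distinct interaction partners of each node.

    Edges are treated as undirected; duplicate edges and self-loops
    are ignored.
    """
    unique_edges = {frozenset((u, v)) for u, v in edges if u != v}
    degrees = Counter()
    for edge in unique_edges:
        for node in edge:
            degrees[node] += 1
    return degrees


def hubs(edges, top=1):
    """Return the `top` nodes with the highest degree."""
    return [node for node, _ in node_degrees(edges).most_common(top)]


# Toy network, not data from the study; ("b", "a") duplicates ("a", "b").
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "a")]
degrees = node_degrees(edges)
central = hubs(edges, top=1)
```

Here node a has three partners and is reported as the single hub; with tied degrees the order among the tied nodes is not defined in this sketch.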
Proteomic analysis and cross species comparison of casein fractions from the milk of dairy animals
Wang, Xiaxia; Zhao, Xiaowei; Huang, Dongwei; Pan, Xiaocheng; Qi, Yunxia; Yang, Yongxin; Zhao, Huiling; Cheng, Guanglong
2017-01-01
Casein micelles contribute to the physicochemical properties of milk and may also influence its functionality. At present, however, there is an incomplete understanding of the casein micelle associated proteins and their diversity among the milk obtained from different species. Therefore, milk samples were collected from seven groups of dairy animals, casein fractions were prepared by ultracentrifugation, and their constituent proteins were identified by liquid chromatography tandem mass spectrometry. A total of 193 distinct proteins were identified among all the casein micelle preparations. Protein interaction analysis indicated that caseins could interact with major whey proteins, including β-lactoglobulin, α-lactalbumin, lactoferrin, and serum albumin, and then whey proteins interacted with other proteins. Pathway analysis found that the peroxisome proliferator-activated receptor signaling pathway is shared among the studied animals. Additionally, the galactose metabolism pathway was found to be commonly involved for proteins derived from camel and horse milk. According to the similarity of casein micelle proteomes, two major sample clusters were classified into ruminant animals (Holstein and Jersey cows, buffaloes, yaks, and goats) and non-ruminants (camels and horses). Our results provide new insights into the protein profile associated with casein micelles and the functionality of the casein micelle from the studied animals. PMID:28240229
The hypothetical protein Atu4866 from Agrobacterium tumefaciens adopts a streptavidin-like fold
Ai, Xuanjun; Semesi, Anthony; Yee, Adelinda; Arrowsmith, Cheryl H.; Choy, Wing-Yiu; Li, Shawn S.C.
2008-01-01
Atu4866 is a 79-residue conserved hypothetical protein of unknown function from Agrobacterium tumefaciens. Protein sequence alignments show that it shares ≥60% sequence identity with 20 other hypothetical proteins of bacterial origin. However, the structures and functions of these proteins remain unknown so far. To gain insight into the function of this family of proteins, we have determined the structure of Atu4866 as a target of a structural genomics project using solution NMR spectroscopy. Our results reveal that Atu4866 adopts a streptavidin-like fold featuring a β-barrel/sandwich formed by eight antiparallel β-strands. Further structural analysis identified a continuous patch of conserved residues on the surface of Atu4866 that may constitute a potential ligand-binding site. PMID:18042676
Choi, Elliot H; Suh, Susie; Sander, Christopher L; Hernandez, Christian J Ortiz; Bulman, Elizabeth R; Khadka, Nimesh; Dong, Zhiqian; Shi, Wuxian; Palczewski, Krzysztof; Kiser, Philip D
2018-04-12
RPE65 is the essential trans-cis isomerase of the classical retinoid (visual) cycle. Mutations in RPE65 give rise to severe retinal dystrophies, most of which are associated with loss of protein function and recessive inheritance. The only known exception is a c.1430G>A (D477G) mutation that gives rise to dominant retinitis pigmentosa with delayed onset and choroidal and macular involvement. Position 477 is distant from functionally critical regions of RPE65. Hence, the mechanism of D477G pathogenicity remains unclear, although protein misfolding and aggregation mechanisms have been suggested. We characterized a D477G knock-in mouse model which exhibited mild age-dependent changes in retinal structure and function. Immunoblot analysis of protein extracts from the eyes of the knock-in mice demonstrated the presence of ubiquitinated RPE65 and reduced RPE65 expression. We observed an accumulation of retinyl esters in the knock-in mice as well as a delay in rhodopsin regeneration kinetics and diminished electroretinography responses, indicative of RPE65 functional impairment induced by the D477G mutation in vivo. However, a cell line expressing D477G RPE65 revealed protein expression levels, cellular localization, and retinoid isomerase activity comparable to cells expressing wild-type protein. Structural analysis of an RPE65 chimera suggested that the D477G mutation does not perturb protein folding or tertiary structure. Instead, the mutation generates an aggregation-prone surface that could induce cellular toxicity through abnormal complex formation as suggested by crystal packing analysis. These results indicate that a toxic gain-of-function induced by the D477G RPE65 substitution may play a role in the pathogenesis of this form of dominant retinitis pigmentosa.
Query3d: a new method for high-throughput analysis of functional residues in protein structures.
Ausiello, Gabriele; Via, Allegra; Helmer-Citterich, Manuela
2005-12-01
The identification of local similarities between two protein structures can provide clues of a common function. Many different methods exist for searching for similar subsets of residues in proteins of known structure. However, the lack of functional and structural information on single residues, together with the low level of integration of this information in comparison methods, is a limitation that prevents these methods from being fully exploited in high-throughput analyses. Here we describe Query3d, a program that is both a structural DBMS (Database Management System) and a local comparison method. The method conserves a copy of all the residues of the Protein Data Bank annotated with a variety of functional and structural information. New annotations can be easily added from a variety of methods and known databases. The algorithm makes it possible to create complex queries based on the residues' function and then to compare only subsets of the selected residues. Functional information is also essential to speed up the comparison and the analysis of the results. With Query3d, users can easily obtain statistics on how many and which residues share certain properties in all proteins of known structure. At the same time, the method also finds their structural neighbours in the whole PDB. Programs and data can be accessed through the PdbFun web interface.
Query3d: a new method for high-throughput analysis of functional residues in protein structures
Ausiello, Gabriele; Via, Allegra; Helmer-Citterich, Manuela
2005-01-01
Background The identification of local similarities between two protein structures can provide clues of a common function. Many different methods exist for searching for similar subsets of residues in proteins of known structure. However, the lack of functional and structural information on single residues, together with the low level of integration of this information in comparison methods, is a limitation that prevents these methods from being fully exploited in high-throughput analyses. Results Here we describe Query3d, a program that is both a structural DBMS (Database Management System) and a local comparison method. The method conserves a copy of all the residues of the Protein Data Bank annotated with a variety of functional and structural information. New annotations can be easily added from a variety of methods and known databases. The algorithm makes it possible to create complex queries based on the residues' function and then to compare only subsets of the selected residues. Functional information is also essential to speed up the comparison and the analysis of the results. Conclusion With Query3d, users can easily obtain statistics on how many and which residues share certain properties in all proteins of known structure. At the same time, the method also finds their structural neighbours in the whole PDB. Programs and data can be accessed through the PdbFun web interface. PMID:16351754
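The idea of querying annotated residues and then comparing only the selected subset can be sketched with a small in-memory collection. This is an illustration only and not Query3d's data model or query language, which the abstract does not describe; the Residue class, the select and count_by_name functions, the annotation labels and the structure identifiers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Residue:
    structure: str          # placeholder identifier, not a real PDB entry
    chain: str
    number: int
    name: str               # three-letter residue name
    annotations: frozenset  # functional or structural labels


def select(residues, name=None, required=()):
    """Return residues of the given type that carry all required labels."""
    wanted = frozenset(required)
    return [
        r for r in residues
        if (name is None or r.name == name) and wanted <= r.annotations
    ]


def count_by_name(residues):
    """Count how many selected residues there are of each residue type."""
    counts = {}
    for r in residues:
        counts[r.name] = counts.get(r.name, 0) + 1
    return counts


# Toy records, for illustration only.
residues = [
    Residue("toy1", "A", 57, "HIS", frozenset({"binding_site", "conserved"})),
    Residue("toy1", "A", 102, "ASP", frozenset({"binding_site"})),
    Residue("toy2", "B", 41, "HIS", frozenset({"surface"})),
    Residue("toy2", "B", 195, "SER", frozenset({"binding_site", "conserved"})),
]
binding = select(residues, required={"binding_site"})
binding_his = select(residues, name="HIS", required={"binding_site"})
```

Restricting a later structural comparison to a list such as binding, rather than to every residue, is what makes annotation-based filtering useful for speeding up the search.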
Quintela, T; Marcelino, H; Deery, M J; Feret, R; Howard, J; Lilley, K S; Albuquerque, T; Gonçalves, I; Duarte, A C; Santos, C R A
2016-01-01
The choroid plexus (CP) epithelium is a unique structure in the brain that forms an interface between the peripheral blood and the cerebrospinal fluid (CSF), which is mostly produced by the CP itself. Because the CP transcriptome is regulated by the sex hormone background, the present study compared gene/protein expression profiles in the CP and CSF from male and female rats aiming to better understand sex-related differences in CP functions and brain physiology. We used data previously obtained by cDNA microarrays to compare the CP transcriptome between male and female rats, and complemented these data with the proteomic analysis of the CSF of castrated and sham-operated males and females. Microarray analysis showed that 17 128 and 17 002 genes are expressed in the male and female CP, which allowed the functional annotation of 141 and 134 pathways, respectively. Among the most expressed genes, canonical pathways associated with mitochondrial dysfunctions and oxidative phosphorylation were the most prominent, whereas the most relevant molecular and cellular functions annotated were protein synthesis, cellular growth and proliferation, cell death and survival, molecular transport, and protein trafficking. No significant differences were found between males and females regarding these pathways. Seminal functions of the CP differentially regulated between sexes were circadian rhythm signalling, as well as several canonical pathways related to stem cell differentiation, metabolism and the barrier function of the CP. The proteomic analysis identified five down-regulated proteins in the CSF samples from male rats compared to females and seven proteins exhibiting marked variation in the CSF of gonadectomised males compared to sham animals, whereas no differences were found between sham and ovariectomised females. These data clearly show sex-related differences in CP gene expression and CSF protein composition that may impact upon neurological diseases. 
© 2015 British Society for Neuroendocrinology.
Chu, Xin-Ling; Feng, Ming-Guang; Ying, Sheng-Hua
2016-02-01
Protein ubiquitination is an evolutionarily conserved post-translational modification process in eukaryotes, and it plays an important role in many biological processes. Aspergillus nidulans, a model filamentous fungus, contributes to our understanding of cellular physiology, metabolism and genetics, but its ubiquitination is not completely revealed. In this study, the ubiquitination sites in the proteome of A. nidulans were identified using highly sensitive mass spectrometry combined with immuno-affinity enrichment of the ubiquitinated peptides. A total of 4816 ubiquitination sites were identified in 1913 ubiquitinated proteins, accounting for 18.1% of total proteins in A. nidulans. Bioinformatic analysis suggested that the ubiquitinated proteins were associated with a number of biological functions and displayed various sub-cellular localisations. Meanwhile, seven motifs were revealed from the ubiquitinated peptides, and were significantly over-represented in the different pathways. Comparison of the enriched functional catalogues indicated that ubiquitination functions divergently during growth of A. nidulans and Saccharomyces cerevisiae. Additionally, the proteins in the A. nidulans-specific sub-category (cell growth/morphogenesis) were subjected to protein interaction analysis, which demonstrated that ubiquitination is involved in the comprehensive protein interactions. This study presents a first proteomic view of ubiquitination in the filamentous fungus, and provides an initial framework for exploring the physiological roles of ubiquitination in A. nidulans.
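The counts reported in this abstract imply two further figures that are not stated in it: the approximate size of the proteome used as the denominator, and the average number of sites per ubiquitinated protein. The short calculation below derives both from the reported numbers only; the variable names are arbitrary, and the results are rounded estimates rather than values given by the authors.

```python
# Figures reported in the abstract.
sites = 4816                  # ubiquitination sites identified
ubiquitinated = 1913          # proteins carrying at least one site
fraction_of_proteome = 0.181  # 18.1% of total proteins

# Implied proteome size: 1913 is 18.1% of the total.
implied_total_proteins = round(ubiquitinated / fraction_of_proteome)

# Average number of identified sites per ubiquitinated protein.
sites_per_protein = round(sites / ubiquitinated, 2)
```

The implied total is roughly 10,600 proteins, and each ubiquitinated protein carries about 2.5 identified sites on average.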
Sadaf, Aiman; Du, Yang; Santillan, Claudia; Mortensen, Jonas S.; Molist, Iago; Seven, Alpay B.; Hariharan, Parameswaran; Skiniotis, Georgios; Loland, Claus J.; Kobilka, Brian K.; Guan, Lan; Byrne, Bernadette
2017-01-01
The critical contribution of membrane proteins in normal cellular function makes their detailed structure and functional analysis essential. Detergents, amphipathic agents with the ability to maintain membrane proteins in a soluble state in aqueous solution, have key roles in membrane protein manipulation. Structural and functional stability is a prerequisite for biophysical characterization. However, many conventional detergents are limited in their ability to stabilize membrane proteins, making development of novel detergents for membrane protein manipulation an important research area. The architecture of a detergent hydrophobic group, that directly interacts with the hydrophobic segment of membrane proteins, is a key factor in dictating their efficacy for both membrane protein solubilization and stabilization. In the current study, we developed two sets of maltoside-based detergents with four alkyl chains by introducing dendronic hydrophobic groups connected to a trimaltoside head group, designated dendronic trimaltosides (DTMs). Representative DTMs conferred enhanced stabilization to multiple membrane proteins compared to the benchmark conventional detergent, DDM. One DTM (i.e., DTM-A6) clearly outperformed DDM in stabilizing human β2 adrenergic receptor (β2AR) and its complex with Gs protein. A further evaluation of this DTM led to a clear visualization of β2AR-Gs complex via electron microscopic analysis. Thus, the current study not only provides novel detergent tools useful for membrane protein study, but also suggests that the dendronic architecture has a role in governing detergent efficacy for membrane protein stabilization. PMID:29619178
Concepts of Protein Sorting or Targeting Signals and Membrane Topology in Undergraduate Teaching
ERIC Educational Resources Information Center
Tang, Bor Luen; Teng, Felicia Yu Hsuan
2005-01-01
The process of protein biogenesis culminates in its correct targeting to specific subcellular locations where it serves a function. Contemporary molecular and cell biology investigations often involve the exogenous expression of epitope- or fluorescent protein-tagged recombinant molecules as well as subsequent analysis of protein-protein…
The CTD2 Center at Emory University used high-throughput protein-protein interaction (PPI) mapping for Hippo signaling pathway profiling to rapidly unveil promising PPIs as potential therapeutic targets and advance functional understanding of signaling circuitry in cells.
Highly branched penta-saccharide-bearing amphiphiles for membrane protein studies
Ehsan, Muhammad; Du, Yang; Scull, Nicola J.; Tikhonova, Elena; Tarrasch, Jeffrey; Mortensen, Jonas S.; Loland, Claus J.; Skiniotis, Georgios; Guan, Lan; Byrne, Bernadette; Kobilka, Brian K.; Chae, Pil Seok
2016-01-01
Detergents are essential tools for membrane protein manipulation. Micelles formed by detergent molecules have the ability to encapsulate the hydrophobic domains of membrane proteins. The resulting protein-detergent complexes (PDCs) are compatible with the polar environments of aqueous media, making structural and functional analysis feasible. Although a number of novel agents have been developed to overcome the limitations of conventional detergents, most of them have traditional head groups such as glucoside or maltoside. In this study, we introduce a class of amphiphiles, the PSA’Es with a novel highly branched penta-saccharide hydrophilic group. The PSA’Es conferred markedly increased stability to a diverse range of membrane proteins compared to conventional detergents, indicating a positive role for the new hydrophilic group in maintaining the native protein integrity. In addition, PDCs formed by PSA’Es were smaller and more suitable for electron microscopic analysis than those formed by DDM, indicating that the new agents have significant potential for the structure-function studies of membrane proteins. PMID:26966956
Crystal structure of the YDR533c S. cerevisiae protein, a class II member of the Hsp31 family.
Graille, Marc; Quevillon-Cheruel, Sophie; Leulliot, Nicolas; Zhou, Cong-Zhao; Li de la Sierra Gallay, Ines; Jacquamet, Lilian; Ferrer, Jean-Luc; Liger, Dominique; Poupon, Anne; Janin, Joel; van Tilbeurgh, Herman
2004-05-01
The ORF YDR533c from Saccharomyces cerevisiae codes for a 25.5 kDa protein of unknown biochemical function. Transcriptome analysis of yeast has shown that this gene is activated in response to various stress conditions together with proteins belonging to the heat shock family. In order to clarify its biochemical function, we determined the crystal structure of YDR533c to 1.85 Å resolution by the single anomalous diffraction method. The protein possesses an alpha/beta hydrolase fold and a putative Cys-His-Glu catalytic triad common to a large enzyme family containing proteases, amidotransferases, lipases, and esterases. The protein has strong structural resemblance with the E. coli Hsp31 protein and the intracellular protease I from Pyrococcus horikoshii, which are considered class I and class III members of the Hsp31 family, respectively. Detailed structural analysis strongly suggests that the YDR533c protein crystal structure is the first one of a class II member of the Hsp31 family.
Fleischmann, M; Clark, M W; Forrester, W; Wickens, M; Nishimoto, T; Aebi, M
1991-07-01
Mutations in the PRP20 gene of yeast show a pleiotropic phenotype, in which both mRNA metabolism and nuclear structure are affected. srm1 mutants, defective in the same gene, influence the signal transduction pathway for the pheromone response. The yeast PRP20/SRM1 protein is highly homologous to the RCC1 protein of man, hamster and frog. In mammalian cells, this protein is a negative regulator for initiation of chromosome condensation. We report the analysis of two, independently isolated, recessive temperature-sensitive prp20 mutants. They have identical G to A transitions, leading to the alteration of a highly conserved glycine residue to glutamic acid. By immunofluorescence microscopy the PRP20 protein was localized in the nucleus. Expression of the RCC1 protein can complement the temperature-sensitive phenotype of prp20 mutants, demonstrating the functional similarity of the yeast and mammalian proteins.
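The glycine-to-glutamic-acid change described above can be checked against the standard genetic code. The sketch below is only an illustration: it assumes the G to A transition is read on the coding strand, and it uses a deliberately partial codon table (only the codons needed here). The abstract does not state which glycine codon is affected, so the function name and the enumeration are hypothetical.

```python
# Partial standard genetic code (DNA, coding strand): only the codons
# reachable from a glycine codon by a single G->A change are listed.
CODON_TABLE = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu",
    "AGT": "Ser", "AGC": "Ser", "AGA": "Arg", "AGG": "Arg",
}

GLY_CODONS = ("GGT", "GGC", "GGA", "GGG")


def g_to_a_outcomes(codon):
    """List (position, mutant codon, amino acid) for each single G->A change."""
    outcomes = []
    for i, base in enumerate(codon):
        if base == "G":
            mutant = codon[:i] + "A" + codon[i + 1:]
            outcomes.append((i + 1, mutant, CODON_TABLE[mutant]))
    return outcomes


# Glycine codons that a single G->A transition converts to glutamic acid.
gly_to_glu = sorted(
    (codon, mutant)
    for codon in GLY_CODONS
    for _, mutant, amino_acid in g_to_a_outcomes(codon)
    if amino_acid == "Glu"
)
print(gly_to_glu)
```

Under these assumptions only a second-position change in GGA or GGG yields glutamic acid; the same change in GGT or GGC would give aspartic acid instead.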
FunRich proteomics software analysis, let the fun begin!
Benito-Martin, Alberto; Peinado, Héctor
2015-08-01
Protein MS analysis is the preferred method for unbiased protein identification. It is normally applied to a large number of both small-scale and high-throughput studies. However, user-friendly computational tools for protein analysis are still needed. In this issue, Mathivanan and colleagues (Proteomics 2015, 15, 2597-2601) report the development of FunRich software, an open-access software that facilitates the analysis of proteomics data, providing tools for functional enrichment and interaction network analysis of genes and proteins. FunRich is a reinterpretation of proteomic software, a standalone tool combining ease of use with customizable databases, free access, and graphical representations.
Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A
2017-04-01
Functional sites define the diversity of protein functions and are a central object of research into the structural and functional organization of proteins. The mechanisms underlying the emergence of protein functional sites and their variability during evolution include duplication, shuffling, insertion and deletion of exons in genes. The study of the correlation between site structure and exon structure serves as the basis for an in-depth understanding of site organization. In this regard, the development of software resources that allow the mutual projection of the exon structure of genes onto the primary and tertiary structures of the encoded proteins remains a pressing problem. Previously, we developed the SitEx system, which provides information about protein and gene sequences with mapped exon borders and the amino acid positions of protein functional sites. The database included information on proteins with known 3D structure. However, data on orthologs were not available. Therefore, we added the projection of site positions onto the exon structures of orthologs in SitEx 2.0. We implemented a database search using site conservation variability and site discontinuity across the exon structure. Inclusion of the information on orthologs expanded the possibilities of using SitEx for solving problems regarding the analysis of the structural and functional organization of proteins. Database URL: http://www-bionet.sscc.ru/sitex/
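The idea of projecting functional-site positions onto exon structure can be illustrated with a small sketch. This is not SitEx code; it assumes a simplified gene model in which only the coding length of each exon (in nucleotides, in transcript order) is known, and the function names are hypothetical. A residue whose codon straddles an exon border is reported for both exons, and a site is treated as discontinuous when its residues map to more than one exon.

```python
from bisect import bisect_left
from itertools import accumulate


def exons_for_residue(residue, exon_coding_lengths):
    """Return 1-based exon indices contributing bases to a residue's codon.

    residue is a 1-based amino acid position; exon_coding_lengths lists the
    number of coding nucleotides in each exon, in order along the CDS.
    """
    ends = list(accumulate(exon_coding_lengths))  # last CDS position of each exon
    first_nt = 3 * (residue - 1) + 1
    last_nt = 3 * residue
    if residue < 1 or last_nt > ends[-1]:
        raise ValueError("residue lies outside the coding sequence")
    first_exon = bisect_left(ends, first_nt)  # 0-based exon holding first_nt
    last_exon = bisect_left(ends, last_nt)    # 0-based exon holding last_nt
    return list(range(first_exon + 1, last_exon + 2))


def site_exons(site_residues, exon_coding_lengths):
    """Return the sorted set of exons covered by all residues of a site."""
    covered = set()
    for residue in site_residues:
        covered.update(exons_for_residue(residue, exon_coding_lengths))
    return sorted(covered)


# Hypothetical gene: three exons with 10, 14 and 12 coding nucleotides.
exon_lengths = [10, 14, 12]
split_codon = exons_for_residue(4, exon_lengths)   # codon 10..12 crosses a border
site = site_exons([1, 4, 9], exon_lengths)         # hypothetical functional site
is_discontinuous = len(site) > 1
print(split_codon, site, is_discontinuous)
```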
Ju, Jung Won; Kim, Ho-Cheol; Shin, Hyun-Il; Kim, Yu Jung; Kim, Dong-Myung
2015-01-01
Progress towards genetic sequencing of human parasites has provided the groundwork for a post-genomic approach to develop novel antigens for the diagnosis and treatment of parasite infections. To fully utilize the genomic data, however, high-throughput methodologies are required for functional analysis of the proteins encoded in the genomic sequences. In this study, we investigated cell-free expression and in situ immobilization of parasite proteins as a novel platform for the discovery of antigenic proteins. PCR-amplified parasite DNA was immobilized on microbeads that were also functionalized to capture synthesized proteins. When the microbeads were incubated in a reaction mixture for cell-free synthesis, proteins expressed from the microbead-immobilized DNA were instantly immobilized on the same microbeads, providing a physical linkage between the genetic information and encoded proteins. This approach of in situ expression and isolation enables streamlined recovery and analysis of cell-free synthesized proteins and also allows facile identification of the genes coding antigenic proteins through direct PCR of the microbead-bound DNA. PMID:26599101
Petyuk, Vladislav A.; Qian, Wei-Jun; Hinault, Charlotte; Gritsenko, Marina A.; Singhal, Mudita; Monroe, Matthew E.; Camp, David G.; Kulkarni, Rohit N.; Smith, Richard D.
2009-01-01
The pancreatic islets of Langerhans, and especially the insulin-producing beta cells, play a central role in the maintenance of glucose homeostasis. Alterations in the expression of multiple proteins in the islets that contribute to the maintenance of islet function are likely to underlie the pathogenesis of type 2 diabetes. To identify proteins that constitute the islet proteome, we provide the first comprehensive proteomic characterization of pancreatic islets for mouse, the most commonly used animal model in diabetes research. Using strong cation exchange fractionation coupled with reversed phase LC-MS/MS we report the confident identification of 17,350 different tryptic peptides covering 2,612 proteins having at least two unique peptides per protein. The dataset also identified ~60 post-translationally modified peptides including oxidative modifications and phosphorylation. While many of the identified phosphorylation sites corroborate those previously known, the oxidative modifications observed on cysteinyl residues reveal potentially novel information suggesting a role for oxidative stress in islet function. Comparative analysis with 15 available proteomic datasets from other mouse tissues and cells revealed a set of 133 proteins predominantly expressed in pancreatic islets. This unique set of proteins, in addition to those with known functions such as peptide hormones secreted from the islets, contains several proteins with as yet unknown functions. The mouse islet protein and peptide database accessible at http://ncrr.pnl.gov, provides an important reference resource for the research community to facilitate research in the diabetes and metabolism fields. PMID:18570455
Isaac, Arnold Emerson; Sinha, Sitabhra
2015-10-01
The representation of proteins as networks of interacting amino acids, referred to as protein contact networks (PCN), and their subsequent analyses using graph theoretic tools, can provide novel insights into the key functional roles of specific groups of residues. We have characterized the networks corresponding to the native states of 66 proteins (belonging to different families) in terms of their core-periphery organization. The resulting hierarchical classification of the amino acid constituents of a protein arranges the residues into successive layers - having higher core order - with increasing connection density, ranging from a sparsely linked periphery to a densely intra-connected core (distinct from the earlier concept of protein core defined in terms of the three-dimensional geometry of the native state, which has least solvent accessibility). Our results show that residues in the inner cores are more conserved than those at the periphery. Underlining the functional importance of the network core, we see that the receptor sites for known ligand molecules of most proteins occur in the innermost core. Furthermore, the association of residues with structural pockets and cavities in binding or active sites increases with the core order. From mutation sensitivity analysis, we show that the probability of deleterious or intolerant mutations also increases with the core order. We also show that stabilization centre residues are in the innermost cores, suggesting that the network core is critically important in maintaining the structural stability of the protein. A publicly available Web resource for performing core-periphery analysis of any protein whose native state is known has been made available by us at http://www.imsc.res.in/~sitabhra/proteinKcore/index.html.
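The layering by core order described above corresponds to the standard k-core decomposition of a graph. The sketch below is a minimal pure-Python version of that general technique, not the authors' implementation: it assumes the contact network is already available as a list of residue pairs (how contacts are defined, for example by a distance cutoff, is left out), and the residue labels are hypothetical.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core order} by repeatedly peeling the minimum-degree node."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-contacts
            adjacency[u].add(v)
            adjacency[v].add(u)
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # The node with the smallest remaining degree is peeled next.
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])  # core order never decreases while peeling
        core[node] = k
        remaining.remove(node)
        for neighbour in adjacency[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core


# Hypothetical contact network: residues A-D form a dense cluster,
# E and F hang off it as a sparsely linked periphery.
contacts = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"),
]
core = core_numbers(contacts)
innermost = sorted(n for n, k in core.items() if k == max(core.values()))
print(core, innermost)
```

The linear scan for the minimum-degree node keeps the sketch short; for networks of realistic size a bucket-based peeling order would be the usual choice.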
Deineko, Viktor
2006-01-01
Human multisynthetase complex auxiliary component, protein p43 is an endothelial monocyte-activating polypeptide II precursor. In this study, comprehensive sequence analysis of N-terminus has been performed to identify structural domains, motifs, sites of post-translation modification and other functionally important parameters. The spatial structure model of full-chain protein p43 is obtained.
Ayub, Gohar; Waheed, Yasir
2016-06-01
The 2014 Ebola outbreak was one of the largest that have occurred; it started in Guinea and spread to Nigeria, Liberia and Sierra Leone. Phylogenetic analysis of the current virus species indicated that this outbreak is the result of a divergent lineage of the Zaire ebolavirus. The L protein of Ebola virus (EBOV) is the catalytic subunit of the RNA‑dependent RNA polymerase complex, which, with VP35, is key for the replication and transcription of viral RNA. Earlier sequence analysis demonstrated that the L protein of all non‑segmented negative‑sense (NNS) RNA viruses consists of six domains containing conserved functional motifs. The aim of the present study was to analyze the presence of these motifs in 2014 EBOV isolates, highlight their function and how they may contribute to the overall pathogenicity of the isolates. For this purpose, 81 2014 EBOV L protein sequences were aligned with 475 other NNS RNA viruses, including Paramyxoviridae and Rhabdoviridae viruses. Phylogenetic analysis of all EBOV outbreak L protein sequences was also performed. Analysis of the amino acid substitutions in the 2014 EBOV outbreak was conducted using sequence analysis. The alignment demonstrated the presence of previously conserved motifs in the 2014 EBOV isolates and novel residues. Notably, all the mutations identified in the 2014 EBOV isolates were tolerant, they were pathogenic with certain examples occurring within previously determined functional conserved motifs, possibly altering viral pathogenicity, replication and virulence. The phylogenetic analysis demonstrated that all sequences with the exception of the 2014 EBOV sequences were clustered together. The 2014 EBOV outbreak has acquired a great number of mutations, which may explain the reasons behind this unprecedented outbreak. Certain residues critical to the function of the polymerase remain conserved and may be targets for the development of antiviral therapeutic agents.
2013-01-01
Despite its prominence for characterization of complex mixtures, LC–MS/MS frequently fails to identify many proteins. Network-based analysis methods, based on protein–protein interaction networks (PPINs), biological pathways, and protein complexes, are useful for recovering non-detected proteins, thereby enhancing analytical resolution. However, network-based analysis methods do come in varied flavors for which the respective efficacies are largely unknown. We compare the recovery performance and functional insights from three distinct instances of PPIN-based approaches, viz., Proteomics Expansion Pipeline (PEP), Functional Class Scoring (FCS), and Maxlink, in a test scenario of valproic acid (VPA)-treated mice. We find that the most comprehensive functional insights, as well as best non-detected protein recovery performance, are derived from FCS utilizing real biological complexes. This outstrips other network-based methods such as Maxlink or Proteomics Expansion Pipeline (PEP). From FCS, we identified known biological complexes involved in epigenetic modifications, neuronal system development, and cytoskeletal rearrangements. This is congruent with the observed phenotype where adult mice showed an increase in dendritic branching to allow the rewiring of visual cortical circuitry and an improvement in their visual acuity when tested behaviorally. In addition, PEP also identified a novel complex, comprising YWHAB, NR1, NR2B, ACTB, and TJP1, which is functionally related to the observed phenotype. Although our results suggest different network analysis methods can produce different results, on the whole, the findings are mutually supportive. More critically, the non-overlapping information each provides can provide greater holistic understanding of complex phenotypes. PMID:23557376
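The general idea of using an interaction network to recover proteins missed by LC-MS/MS can be sketched with simple neighbour counting. This is an assumption-laden illustration and not a description of how PEP, FCS, or Maxlink are actually implemented: undetected proteins are ranked by how many distinct detected proteins they interact with, and all protein names and edges are hypothetical.

```python
def rank_undetected(ppi_edges, detected):
    """Rank proteins absent from `detected` by their links to detected proteins.

    Returns a list of (protein, link count), highest count first, with ties
    broken alphabetically.
    """
    detected = set(detected)
    links = {}
    for a, b in ppi_edges:
        # Consider the edge in both directions: candidate -> detected partner.
        for candidate, partner in ((a, b), (b, a)):
            if candidate not in detected and partner in detected:
                links.setdefault(candidate, set()).add(partner)
    return sorted(
        ((protein, len(partners)) for protein, partners in links.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical network: P1-P3 were detected, P4 and P5 were not.
edges = [("P1", "P4"), ("P2", "P4"), ("P3", "P4"), ("P1", "P5"), ("P4", "P5")]
ranking = rank_undetected(edges, detected={"P1", "P2", "P3"})
print(ranking)
```

A real pipeline would also need a significance test against the candidate's overall connectivity, since highly connected proteins accumulate links to any seed set by chance.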
Schokraie, Elham; Hotz-Wagenblatt, Agnes; Warnken, Uwe; Mali, Brahim; Frohme, Marcus; Förster, Frank; Dandekar, Thomas; Hengherr, Steffen; Schill, Ralph O; Schnölzer, Martina
2010-03-03
Tardigrades are small, multicellular invertebrates which are able to survive times of unfavourable environmental conditions using their well-known capability to undergo cryptobiosis at any stage of their life cycle. Milnesium tardigradum has become a powerful model system for the analysis of cryptobiosis. While some genetic information is already available for Milnesium tardigradum the proteome is still to be discovered. Here we present to the best of our knowledge the first comprehensive study of Milnesium tardigradum on the protein level. To establish a proteome reference map we developed optimized protocols for protein extraction from tardigrades in the active state and for separation of proteins by high resolution two-dimensional gel electrophoresis. Since only limited sequence information of M. tardigradum on the genome and gene expression level is available to date in public databases we initiated in parallel a tardigrade EST sequencing project to allow for protein identification by electrospray ionization tandem mass spectrometry. 271 out of 606 analyzed protein spots could be identified by searching against the publicly available NCBInr database as well as our newly established tardigrade protein database corresponding to 144 unique proteins. Another 150 spots could be identified in the tardigrade clustered EST database corresponding to 36 unique contigs and ESTs. Proteins with annotated function were further categorized in more detail by their molecular function, biological process and cellular component. For the proteins of unknown function more information could be obtained by performing a protein domain annotation analysis. Our results include proteins like protein member of different heat shock protein families and LEA group 3, which might play important roles in surviving extreme conditions. 
The proteome reference map of Milnesium tardigradum provides the basis for further studies in order to identify and characterize the biochemical mechanisms of tolerance to extreme desiccation. The optimized proteomics workflow will enable application of sensitive quantification techniques to detect differences in protein expression, which are characteristic of the active and anhydrobiotic states of tardigrades.
Schokraie, Elham; Hotz-Wagenblatt, Agnes; Warnken, Uwe; Mali, Brahim; Frohme, Marcus; Förster, Frank; Dandekar, Thomas; Hengherr, Steffen; Schill, Ralph O.; Schnölzer, Martina
2010-01-01
Background Tardigrades are small, multicellular invertebrates which are able to survive times of unfavourable environmental conditions using their well-known capability to undergo cryptobiosis at any stage of their life cycle. Milnesium tardigradum has become a powerful model system for the analysis of cryptobiosis. While some genetic information is already available for Milnesium tardigradum the proteome is still to be discovered. Principal Findings Here we present to the best of our knowledge the first comprehensive study of Milnesium tardigradum on the protein level. To establish a proteome reference map we developed optimized protocols for protein extraction from tardigrades in the active state and for separation of proteins by high resolution two-dimensional gel electrophoresis. Since only limited sequence information of M. tardigradum on the genome and gene expression level is available to date in public databases we initiated in parallel a tardigrade EST sequencing project to allow for protein identification by electrospray ionization tandem mass spectrometry. 271 out of 606 analyzed protein spots could be identified by searching against the publicly available NCBInr database as well as our newly established tardigrade protein database corresponding to 144 unique proteins. Another 150 spots could be identified in the tardigrade clustered EST database corresponding to 36 unique contigs and ESTs. Proteins with annotated function were further categorized in more detail by their molecular function, biological process and cellular component. For the proteins of unknown function more information could be obtained by performing a protein domain annotation analysis. Our results include proteins like protein member of different heat shock protein families and LEA group 3, which might play important roles in surviving extreme conditions. 
Conclusions The proteome reference map of Milnesium tardigradum provides the basis for further studies in order to identify and characterize the biochemical mechanisms of tolerance to extreme desiccation. The optimized proteomics workflow will enable application of sensitive quantification techniques to detect differences in protein expression, which are characteristic of the active and anhydrobiotic states of tardigrades. PMID:20224743
Introduction to bioinformatics.
Can, Tolga
2014-01-01
Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: (1) collect statistics from biological data, (2) build a computational model, (3) solve a computational modeling problem, and (4) test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.
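As a small illustration of the "searching sequence patterns" subproblem mentioned above, the sketch below scans a protein sequence for the widely used N-glycosylation sequon pattern N-{P}-[ST]-{P} (Asn, any residue except Pro, Ser or Thr, any residue except Pro). The example sequence is made up, the function name is hypothetical, and a pattern match only flags a candidate site, not a confirmed modification.

```python
import re

# Lookahead makes the search report overlapping matches as well.
SEQUON = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sequons(sequence):
    """Return (1-based start position, matched motif) for each candidate site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(sequence)]


# Hypothetical peptide: NLTA and NGSA match, NPSQ is rejected because of Pro.
hits = find_sequons("MKNLTAANPSQNGSA")
print(hits)
```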
Adhesion, invasion and evasion: the many functions of the surface proteins of Staphylococcus aureus
Foster, Timothy J.; Geoghegan, Joan A.; Ganesh, Vannakambadi K.; Höök, Magnus
2014-01-01
Staphylococcus aureus is an important opportunistic pathogen and persistently colonizes about 20% of the human population. Its surface is ‘decorated’ with proteins that are covalently anchored to the cell wall peptidoglycan. Structural and functional analysis has identified four distinct classes of surface proteins, of which microbial surface component recognizing adhesive matrix molecules (MSCRAMMs) are the largest class. These surface proteins have numerous functions, including adhesion to and invasion of host cells and tissues, evasion of immune responses and biofilm formation. Thus, cell wall-anchored proteins are essential virulence factors for the survival of S. aureus in the commensal state and during invasive infections, and targeting them with vaccines could combat S. aureus infections. PMID:24336184
Dubovenko, Alexey; Nikolsky, Yuri; Rakhmatulin, Eugene; Nikolskaya, Tatiana
2017-01-01
Analysis of NGS and other sequencing data, gene variants, gene expression, proteomics, and other high-throughput (OMICs) data is challenging because of its biological complexity and high level of technical and biological noise. One way to deal with both problems is to perform analysis with a high fidelity annotated knowledgebase of protein interactions, pathways, and functional ontologies. This knowledgebase has to be structured in a computer-readable format and must include software tools for managing experimental data, analysis, and reporting. Here, we present MetaCore™ and Key Pathway Advisor (KPA), an integrated platform for functional data analysis. On the content side, MetaCore and KPA encompass a comprehensive database of molecular interactions of different types, pathways, network models, and ten functional ontologies covering human, mouse, and rat genes. The analytical toolkit includes tools for gene/protein list enrichment analysis, statistical "interactome" tool for the identification of over- and under-connected proteins in the dataset, and a biological network analysis module made up of network generation algorithms and filters. The suite also features Advanced Search, an application for combinatorial search of the database content, as well as a Java-based tool called Pathway Map Creator for drawing and editing custom pathway maps. Applications of MetaCore and KPA include molecular mode of action of disease research, identification of potential biomarkers and drug targets, pathway hypothesis generation, analysis of biological effects for novel small molecule compounds and clinical applications (analysis of large cohorts of patients, and translational and personalized medicine).
Bhattacharyya, Moitrayee; Vishveshwara, Saraswathi
2009-01-01
Background The genome of a wide variety of prokaryotes contains the luxS gene homologue, which encodes for the protein S-ribosylhomocysteinelyase (LuxS). This protein is responsible for the production of the quorum sensing molecule, AI-2 and has been implicated in a variety of functions such as flagellar motility, metabolic regulation, toxin production and even in pathogenicity. A high structural similarity is present in the LuxS structures determined from a few species. In this study, we have modelled the structures from several other species and have investigated their dimer interfaces. We have attempted to correlate the interface features of LuxS with the phenotypic nature of the organisms. Results The protein structure networks (PSN) are constructed and graph theoretical analysis is performed on the structures obtained from X-ray crystallography and on the modelled ones. The interfaces, which are known to contain the active site, are characterized from the PSNs of these homodimeric proteins. The key features presented by the protein interfaces are investigated for the classification of the proteins in relation to their function. From our analysis, structural interface motifs are identified for each class in our dataset, which showed distinctly different pattern at the interface of LuxS for the probiotics and some extremophiles. Our analysis also reveals potential sites of mutation and geometric patterns at the interface that was not evident from conventional sequence alignment studies. Conclusion The structure network approach employed in this study for the analysis of dimeric interfaces in LuxS has brought out certain structural details at the side-chain interaction level, which were elusive from the conventional structure comparison methods. The results from this study provide a better understanding of the relation between the luxS gene and its functional role in the prokaryotes. 
This study also makes it possible to explore the potential direction towards the design of inhibitors of LuxS and thus towards a wide range of antimicrobials. PMID:19243584
Panda, Subhamay; Kumari, Leena
2017-01-01
Serine proteases are a group of enzymes that hydrolyse the peptide bonds in proteins. In mammals, these enzymes help in the regulation of several major physiological functions such as digestion, blood clotting, responses of the immune system, reproductive functions and the complement system. Serine proteases obtained from the venom of the Octopodidae family are a relatively unexplored area of research. In the present work, we applied a comparative composite molecular modeling technique. Our key aim was to propose the first molecular model structure of the unexplored serine protease 5 derived from the big blue octopus. The other objective of this study was to analyze the distribution of negatively and positively charged amino acids over the modeled structure, the distribution of secondary structural elements, the hydrophobicity of the molecular surface and the electrostatic potential with the aid of different bioinformatic tools. In the present study, the molecular model was generated with the help of the I-TASSER suite. Afterwards the refined structural model was validated with standard methods. For functional annotation of the protein molecule we used the Protein Information Resource (PIR) database. Serine protease 5 of the big blue octopus was analyzed with different bioinformatic algorithms for the distribution of negatively and positively charged amino acids over the modeled structure, the distribution of secondary structural elements, the hydrophobicity of the molecular surface and the electrostatic potential. The functionally critical amino acids and ligand-binding site (LBS) of the modeled protein were determined using the COACH program. The molecular model data, together with other pertinent post-model analysis data, provide molecular insight into the proteolytic activity of serine protease 5, which helps in the clear understanding of the procoagulant and anticoagulant characteristics of this natural lead molecule. 
Our approach was to investigate the octopus venom protein as a whole or a part of their structure that may result in the development of new lead molecule.
Carvalho, Henrique F; Barbosa, Arménio J M; Roque, Ana C A; Iranzo, Olga; Branco, Ricardo J F
2017-01-01
Recent advances in de novo protein design have gained considerable insight from the intrinsic dynamics of proteins, based on the integration of molecular dynamics simulations protocols on the state-of-the-art de novo protein design protocols used nowadays. With this protocol we illustrate how to set up and run a molecular dynamics simulation followed by a functional protein dynamics analysis. New users will be introduced to some useful open-source computational tools, including the GROMACS molecular dynamics simulation software package and ProDy for protein structural dynamics analysis.
Lang, Tiange; Yin, Kangquan; Liu, Jinyu; Cao, Kunfang; Cannon, Charles H; Du, Fang K
2014-01-01
Predicting protein domains is essential for understanding a protein's function at the molecular level. However, up till now, there has been no direct and straightforward method for predicting protein domains in species without a reference genome sequence. In this study, we developed a functionality with a set of programs that can predict protein domains directly from genomic sequence data without a reference genome. Using whole genome sequence data, the programming functionality mainly comprised DNA assembly in combination with next-generation sequencing (NGS) assembly methods and traditional methods, peptide prediction and protein domain prediction. The proposed new functionality avoids problems associated with de novo assembly due to micro reads and small single repeats. Furthermore, we applied our functionality for the prediction of leucine rich repeat (LRR) domains in four species of Ficus with no reference genome, based on NGS genomic data. We found that the LRRNT_2 and LRR_8 domains are related to plant transpiration efficiency, as indicated by the stomata index, in the four species of Ficus. The programming functionality established in this study provides new insights for protein domain prediction, which is particularly timely in the current age of NGS data expansion.
Shedding new light on opsin evolution
Porter, Megan L.; Blasic, Joseph R.; Bok, Michael J.; Cameron, Evan G.; Pringle, Thomas; Cronin, Thomas W.; Robinson, Phyllis R.
2012-01-01
Opsin proteins are essential molecules in mediating the ability of animals to detect and use light for diverse biological functions. Therefore, understanding the evolutionary history of opsins is key to understanding the evolution of light detection and photoreception in animals. As genomic data have appeared and rapidly expanded in quantity, it has become possible to analyse opsins that functionally and histologically are less well characterized, and thus to examine opsin evolution strictly from a genetic perspective. We have incorporated these new data into a large-scale, genome-based analysis of opsin evolution. We use an extensive phylogeny of currently known opsin sequence diversity as a foundation for examining the evolutionary distributions of key functional features within the opsin clade. This new analysis illustrates the lability of opsin protein-expression patterns, site-specific functionality (i.e. counterion position) and G-protein binding interactions. Further, it demonstrates the limitations of current model organisms, and highlights the need for further characterization of many of the opsin sequence groups with unknown function. PMID:22012981
Semantic integration to identify overlapping functional modules in protein interaction networks
Cho, Young-Rae; Hwang, Woochang; Ramanathan, Murali; Zhang, Aidong
2007-01-01
Background The systematic analysis of protein-protein interactions can enable a better understanding of cellular organization, processes and functions. Functional modules can be identified from the protein interaction networks derived from experimental data sets. However, these analyses are challenging because of the presence of unreliable interactions and the complex connectivity of the network. The integration of protein-protein interactions with the data from other sources can be leveraged for improving the effectiveness of functional module detection algorithms. Results We have developed novel metrics, called semantic similarity and semantic interactivity, which use Gene Ontology (GO) annotations to measure the reliability of protein-protein interactions. The protein interaction networks can be converted into a weighted graph representation by assigning the reliability values to each interaction as a weight. We presented a flow-based modularization algorithm to efficiently identify overlapping modules in the weighted interaction networks. The experimental results show that the semantic similarity and semantic interactivity of interacting pairs were positively correlated with functional co-occurrence. The effectiveness of the algorithm for identifying modules was evaluated using functional categories from the MIPS database. We demonstrated that our algorithm had higher accuracy compared to other competing approaches. Conclusion The integration of protein interaction networks with GO annotation data and the capability of detecting overlapping modules substantially improve the accuracy of module identification. PMID:17650343
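The weighting step described in this abstract can be illustrated with a minimal sketch. It assumes a plain Jaccard overlap of GO term sets as a stand-in for the authors' semantic similarity metric, whose exact definition is not given here. The function names, protein names, and GO identifiers are made up for illustration.

```python
# Sketch: weight protein-protein interactions by overlap of GO annotations.
# Assumption: Jaccard similarity stands in for the paper's semantic similarity.


def jaccard(a: set, b: set) -> float:
    """Return |intersection| / |union|, or 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weight_interactions(edges, annotations):
    """Map each (p, q) edge to a reliability weight in [0, 1]."""
    return {
        (p, q): jaccard(annotations.get(p, set()), annotations.get(q, set()))
        for p, q in edges
    }


# Hypothetical toy data (not from the paper).
annotations = {
    "P1": {"GO:0001", "GO:0002"},
    "P2": {"GO:0002", "GO:0003"},
    "P3": {"GO:0009"},
}
edges = [("P1", "P2"), ("P1", "P3")]
weights = weight_interactions(edges, annotations)
```

The resulting dictionary is one possible weighted-graph representation; a module-detection algorithm would then consume these weights.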
Ahmadi, Homa; Ramezani, Mohammad; Yazdian-Robati, Rezvan; Behnam, Behzad; Razavi Azarkhiavi, Kamal; Hashem Nia, Azadeh; Mokhtarzadeh, Ahad; Matbou Riahi, Maryam; Razavi, Bibi Marjan; Abnous, Khalil
2017-09-25
Recently, carbon nanotubes (CNTs) have shown promising potential in different biomedical applications, but their safe use in humans and probable toxicities remain challenging. The aim of this study was to determine the acute toxicity of functionalized single walled carbon nanotubes (SWCNTs). In this project, PEGylated and Tween functionalized SWCNTs were prepared. BALB/c mice were randomly divided into nine groups, including PEGylated SWCNTs (75, 150 μg/mouse) and PEG, Tween 80 suspended SWCNTs, Tween 80 and a control group (intact mice). One or 7 days after intravenous injection, the mice were killed and serum and livers were collected. The oxidative stress markers, biochemical and histopathological changes were studied. Subsequently, a proteomics approach was used to investigate the alterations of protein expression profiles in the liver. Results showed that there were no significant differences in malondialdehyde (MDA), glutathione (GSH) levels and biochemical enzymes (ALT and AST) between groups, while the histopathological observations of livers showed some injuries. The results of proteomics analysis revealed that indolethylamine N-Methyltransferase (INMT), glycine N-Methyltransferase (GNMT), selenium binding protein (Selenbp), thioredoxin peroxidase (TPx), TNF receptor associated protein 1 (Trap1), peroxiredoxin-6 (Prdx6), electron transport flavoprotein (Etf-α), regucalcin (Rgn) and ATP5b proteins were differentially expressed in functionalized SWCNTs groups. Western blot analyses confirmed that the changes in Prdx6 were consistent with 2-DE gel analysis. In summary, acute toxicological study on two functionalized SWCNTs did not show any significant toxicity at selected doses. Proteomics analysis also showed that following exposure to functionalized SWCNTs, the expression of some proteins with antioxidant activity and detoxifying properties was increased in liver tissue. Copyright © 2017 Elsevier B.V. All rights reserved.
Desdouits, Nathan; Nilges, Michael; Blondel, Arnaud
2015-02-01
Protein conformation has been recognized as the key feature determining biological function, as it determines the position of the essential groups specifically interacting with substrates. Hence, the shape of the cavities or grooves at the protein surface appears to drive those functions. However, only a few studies describe the geometrical evolution of protein cavities during molecular dynamics simulations (MD), usually with a crude representation. To unveil the dynamics of cavity geometry evolution, we developed an approach combining cavity detection and Principal Component Analysis (PCA). This approach was applied to four systems subjected to MD (lysozyme, sperm whale myoglobin, Dengue envelope protein and EF-CaM complex). PCA on cavities allows us to perform efficient analysis and classification of the geometry diversity explored by a cavity. Additionally, it reveals correlations between the evolutions of the cavities and structures, and can even suggest how to modify the protein conformation to induce a given cavity geometry. It also helps to perform fast and consensual clustering of conformations according to cavity geometry. Finally, using this approach, we show that both carbon monoxide (CO) location and transfer among the different xenon sites of myoglobin are correlated with few cavity evolution modes of high amplitude. This correlation illustrates the link between ligand diffusion and the dynamic network of internal cavities. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Lee, Jinoo; Valkova, Nelly; White, Mark P; Kültz, Dietmar
2006-09-01
We used dogfish shark (Squalus acanthias) as a model for proteome analysis of six different tissues to evaluate tissue-specific protein expression on a global scale and to deduce specific functions and the relatedness of multiple tissues from their proteomes. Proteomes of heart, brain, kidney, intestine, gill, and rectal gland were separated by two-dimensional gel electrophoresis (2DGE), gel images were matched using Delta 2D software and then evaluated for tissue-specific proteins. Sixty-one proteins (4%) were found to be in only a single type of tissue and 535 proteins (36%) were equally abundant in all six tissues. Relatedness between tissues was assessed based on tissue-specific expression patterns of all 1465 consistently resolved protein spots. This analysis revealed that tissues with osmoregulatory function (kidney, intestine, gill, rectal gland) were more similar in their overall proteomes than non-osmoregulatory tissues (heart, brain). Sixty-one proteins were identified by MALDI-TOF/TOF mass spectrometry and biological functions characteristic of osmoregulatory tissues were derived from gene ontology and molecular pathway analysis. Our data demonstrate that the molecular machinery for energy and urea metabolism and the Rho-GTPase/cytoskeleton pathway are enriched in osmoregulatory tissues of sharks. Our work provides a strong rationale for further study of the contribution of these mechanisms to the osmoregulation of marine sharks.
Harper, Angela F; Leuthaeuser, Janelle B; Babbitt, Patricia C; Morris, John H; Ferrin, Thomas E; Poole, Leslie B; Fetrow, Jacquelyn S
2017-02-01
Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially; MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method's novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily.
The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences. PMID:28187133
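The self-identification criterion described in this abstract can be sketched as a simple set comparison. This is only an illustration under stated assumptions: `search` is a hypothetical stand-in for a real GenBank search, and the toy database and labels are invented. It is not the MISST implementation.

```python
# Sketch of the self-identification test: a group passes when a database
# search seeded with the group returns exactly the group's own members.
# Assumption: `search` is a hypothetical stand-in for the real GenBank search.


def is_self_identifying(group, search):
    """True if searching with `group` retrieves the group and nothing else."""
    hits = set(search(group))
    return hits == set(group)


def make_toy_search(database):
    """Build a toy search: a sequence is a hit if it shares a label with any query."""
    def search(group):
        labels = {database[name] for name in group}
        return [name for name, label in database.items() if label in labels]
    return search


# Hypothetical toy database mapping sequence name -> functional-site label.
database = {"s1": "Prx1", "s2": "Prx1", "s3": "Prx6", "s4": "Prx6"}
search = make_toy_search(database)
```

In this toy setting, `{"s1", "s2"}` passes, `{"s1"}` alone fails because the search also retrieves `s2` (a cue to agglomerate), and a mixed group such as `{"s1", "s3"}` fails because it retrieves members of both families (a cue to divide).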
Domżalska, Lucyna; Kędracka-Krok, Sylwia; Jankowska, Urszula; Grzyb, Małgorzata; Sobczak, Mirosław; Rybczyński, Jan J; Mikuła, Anna
2017-05-01
Using cyto-morphological analysis of somatic embryogenesis (SE) in the tree fern Cyathea delgadii as a guide, we performed a comparative proteomic analysis in stipe explants undergoing direct SE. Plant material was cultured on hormone-free medium supplemented with 2% sucrose. Phenol extracted proteins were separated using two-dimensional gel electrophoresis (2-DE) and mass spectrometry was performed for protein identification. A total of 114 differentially regulated proteins were identified during early SE, i.e. when the first cell divisions started and several-cell pro-embryos were formed. Proteins were assigned to seven functional categories: carbohydrate metabolism, protein metabolism, cell organization, defense and stress responses, amino acid metabolism, purine metabolism, and fatty acid metabolism. Carbohydrate and protein metabolism were found to be the most sensitive SE functions with the greatest number of alterations in the intensity of spots in gel. Differences, especially in non-enzymatic and structural protein abundance, are indicative of cell organization, including cytoskeleton rearrangement and changes in cell wall components. The highest induced changes concern those enzymes related to fatty acid metabolism. Global analysis of the proteome reveals several proteins that can represent markers for the first 16 days of SE induction and expression in fern. The findings of this research improve the understanding of molecular processes involved in direct SE in C. delgadii. Copyright © 2017 Elsevier B.V. All rights reserved.
Plasma proteomic analysis reveals altered protein abundances in cardiovascular disease.
Lygirou, Vasiliki; Latosinska, Agnieszka; Makridakis, Manousos; Mullen, William; Delles, Christian; Schanstra, Joost P; Zoidakis, Jerome; Pieske, Burkert; Mischak, Harald; Vlahou, Antonia
2018-04-17
Cardiovascular disease (CVD) describes the pathological conditions of the heart and blood vessels. Despite the large number of studies on CVD and its etiology, its key modulators remain largely unknown. To this end, we performed a comprehensive proteomic analysis of blood plasma, with the aim of identifying disease-associated changes, placing them in the context of existing knowledge, and generating a well-characterized dataset for further use in CVD multi-omics integrative analysis. LC-MS/MS was employed to analyze plasma from 32 subjects (19 cases of various CVD phenotypes and 13 controls) in two steps: discovery (13 cases and 8 controls) and test (6 cases and 5 controls) set analysis. Following label-free quantification, the detected proteins were correlated to existing plasma proteomics datasets (plasma proteome database; PPD) and functionally annotated (Cytoscape, Ingenuity Pathway Analysis). Differential expression was defined based on identification confidence (≥ 2 peptides per protein), statistical significance (Mann-Whitney p value ≤ 0.05) and a minimum of twofold change. Peptides detected in at least 50% of samples per group were considered, resulting in a total of 3796 identified proteins (838 proteins based on ≥ 2 peptides). Pathway annotation confirmed the functional relevance of the findings (representation of complement cascade, fibrin clot formation, platelet degranulation, etc.). Correlation of the relative abundance of the proteins identified in the discovery set with their reported concentrations in the PPD was significant, confirming the validity of the quantification method. The discovery set analysis revealed 100 differentially expressed proteins between cases and controls, 39 of which were verified (≥ twofold change) in the test set.
These included proteins already studied in the context of CVD (such as apolipoprotein B, alpha-2-macroglobulin), as well as novel findings (such as low density lipoprotein receptor related protein 2 [LRP2], protein SZT2) for which a mechanism of action is suggested. This proteomic study provides a comprehensive dataset to be used for integrative and functional studies in the field. The observed protein changes reflect known CVD-related processes (e.g. lipid uptake, inflammation) as well as novel hypotheses for further investigation, including a potential pleiotropic role of LRP2 and links of SZT2 to CVD.
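The three criteria for differential expression stated in this abstract (at least 2 peptides per protein, Mann-Whitney p value ≤ 0.05, at least twofold change) can be expressed as a small filter. The sketch below assumes p values and group means are already computed. The record type, field names, and example values are hypothetical and not taken from the study.

```python
# Sketch: apply the three stated criteria for differential expression.
# Assumption: p values and group means are precomputed; field names are hypothetical.
from dataclasses import dataclass


@dataclass
class ProteinResult:
    name: str
    n_peptides: int      # peptides supporting the identification
    p_value: float       # Mann-Whitney p value, cases vs. controls
    mean_case: float     # mean abundance in cases
    mean_control: float  # mean abundance in controls


def is_differential(r: ProteinResult, min_fold: float = 2.0) -> bool:
    """True if the protein meets all three thresholds from the abstract."""
    if r.n_peptides < 2 or r.p_value > 0.05:
        return False
    if r.mean_case <= 0 or r.mean_control <= 0:
        return False  # fold change undefined; handled conservatively here
    ratio = r.mean_case / r.mean_control
    # Accept changes in either direction (up or down by at least min_fold).
    return ratio >= min_fold or ratio <= 1.0 / min_fold


# Hypothetical example records (not data from the study).
results = [
    ProteinResult("A", 3, 0.01, 40.0, 10.0),  # passes all criteria
    ProteinResult("B", 1, 0.01, 40.0, 10.0),  # too few peptides
    ProteinResult("C", 4, 0.20, 40.0, 10.0),  # not significant
    ProteinResult("D", 2, 0.05, 5.0, 10.0),   # twofold decrease, passes
]
hits = [r.name for r in results if is_differential(r)]
```

Treating a twofold decrease the same as a twofold increase is a design choice of this sketch; the abstract only states "a minimum of twofold change".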
van Herwijnen, Martijn J.C.; Zonneveld, Marijke I.; Goerdayal, Soenita; Nolte – 't Hoen, Esther N.M.; Garssen, Johan; Stahl, Bernd; Maarten Altelaar, A.F.; Redegeld, Frank A.; Wauben, Marca H.M.
2016-01-01
Breast milk contains several macromolecular components with distinctive functions, whereby milk fat globules and casein micelles mainly provide nutrition to the newborn, and whey contains molecules that can stimulate the newborn's developing immune system and gastrointestinal tract. Although extracellular vesicles (EV) have been identified in breast milk, their physiological function and composition have not been addressed in detail. EV are submicron sized vehicles released by cells for intercellular communication via selectively incorporated lipids, nucleic acids, and proteins. Because of the difficulty in separating EV from other milk components, an in-depth analysis of the proteome of human milk-derived EV is lacking. In this study, an extensive LC-MS/MS proteomic analysis was performed of EV that had been purified from breast milk of seven individual donors using a recently established, optimized density-gradient-based EV isolation protocol. A total of 1963 proteins were identified in milk-derived EV, including EV-associated proteins like CD9, Annexin A5, and Flotillin-1, with a remarkable overlap between the different donors. Interestingly, 198 of the identified proteins are not present in the human EV database Vesiclepedia, indicating that milk-derived EV harbor proteins not yet identified in EV of different origin. Similarly, the proteome of milk-derived EV was compared with that of other milk components. For this, data from 38 published milk proteomic studies were combined in order to construct the total milk proteome, which consists of 2698 unique proteins. Remarkably, 633 proteins identified in milk-derived EV have not yet been identified in human milk to date. Interestingly, these novel proteins include proteins involved in regulation of cell growth and controlling inflammatory signaling pathways, suggesting that milk-derived EV could support the newborn's developing gastrointestinal tract and immune system.
Overall, this study provides an expansion of the whole milk proteome and illustrates that milk-derived EV are macromolecular components with a unique functional proteome. PMID:27601599
2013-01-01
Background The body of disease mutations with known phenotypic relevance continues to increase and is expected to do so even faster with the advent of new experimental techniques such as whole-genome sequencing coupled with disease association studies. However, genomic association studies are limited by the molecular complexity of the phenotype being studied and the population size needed to have adequate statistical power. One way to circumvent this problem, which is critical for the study of rare diseases, is to study the molecular patterns emerging from functional studies of existing disease mutations. Current gene-centric analyses to study mutations in coding regions are limited by their inability to account for the functional modularity of the protein. Previous studies of the functional patterns of known human disease mutations have shown a significant tendency to cluster at protein domain positions, namely position-based domain hotspots of disease mutations. However, the limited number of known disease mutations remains the main factor hindering the advancement of mutation studies at a functional level. In this paper, we address this problem by incorporating mutations known to be disruptive of phenotypes in other species. Focusing on two evolutionarily distant organisms, human and yeast, we describe the first inter-species analysis of mutations of phenotypic relevance at the protein domain level. Results The results of this analysis reveal that phenotypic mutations from yeast cluster at specific positions on protein domains, a characteristic previously revealed to be displayed by human disease mutations. We found over one hundred domain hotspots in yeast with approximately 50% in the exact same domain position as known human disease mutations. 
Conclusions We describe an analysis using protein domains as a framework for transferring functional information by studying domain hotspots in human and yeast and relating phenotypic changes in yeast to diseases in human. This first-of-a-kind study of phenotypically relevant yeast mutations in relation to human disease mutations demonstrates the utility of a multi-species analysis for advancing the understanding of the relationship between genetic mutations and phenotypic changes at the organismal level. PMID:23819456
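The idea of comparing domain hotspots across species can be illustrated with a short sketch. It assumes a fixed count threshold defines a "hotspot", which is a simplification: the abstract does not state the study's actual statistical criterion. The domain names, positions, and counts below are hypothetical.

```python
# Sketch: find domain positions hit repeatedly by mutations in two species.
# Assumption: a fixed count threshold defines a "hotspot"; the study's actual
# statistical criterion is not given in the abstract.
from collections import Counter


def hotspots(mutations, min_count=2):
    """Return (domain, position) pairs mutated at least `min_count` times."""
    counts = Counter(mutations)
    return {site for site, n in counts.items() if n >= min_count}


# Hypothetical (domain, aligned domain position) records, not real data.
human = [("DOM_A", 10), ("DOM_A", 10), ("DOM_B", 5), ("DOM_B", 5), ("DOM_C", 1)]
yeast = [("DOM_A", 10), ("DOM_A", 10), ("DOM_D", 7), ("DOM_D", 7)]

# Hotspots found at the exact same domain position in both species.
shared = hotspots(human) & hotspots(yeast)
fraction_shared = len(shared) / len(hotspots(yeast))
```

Keying each mutation by its position within the domain, rather than within the gene, is what lets records from different proteins and species fall on the same site.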
Verma, Ved Vrat; Gupta, Rani; Goel, Manisha
2015-09-14
γ-glutamyltranspeptidase (GGT) is a bi-substrate enzyme conserved in all three domains of life. It catalyzes the cleavage and transfer of γ-glutamyl moiety of glutathione to either water (hydrolysis) or substrates like peptides (transpeptidation). GGTs exhibit great variability in their enzyme kinetics although the mechanism of catalysis is conserved. Recently, GGT has been shown to be a virulence factor in microbes like Helicobacter pylori and Bacillus anthracis. In mammalian cells also, GGT inhibition prior to chemotherapy has been shown to sensitize tumors to the therapy. Therefore, lately both bacterial and eukaryotic GGTs have emerged as potential drug targets, but the efforts directed towards finding suitable inhibitors have not yielded any significant results yet. We propose that delineating the residues responsible for the functional diversity associated with these proteins could help in design of species/clade specific inhibitors. In the present study, we have carried out phylogenetic analysis on a set of 47 GGT-like proteins to address the functional diversity. These proteins segregate into various subfamilies, forming separate clades on the tree. Sequence conservation and motif prediction studies show that even though most of the highly conserved residues have been characterized biochemically in previous studies, a significant number of novel putative sites and motifs are discovered that vary in a clade specific manner. Many of the putative sites predicted during the functional divergence type I and type II analysis, lie close to the known catalytic residues and line the walls of the substrate binding cavity, reinforcing their role in modulating the substrate specificity, catalytic rates and stability of this protein. The study offers interesting insights into the evolution of GGT-like proteins in pathogenic vs. non-pathogenic bacteria, archaea and eukaryotes. Our analysis delineates residues that are highly specific to each GGT subfamily. 
We propose that these sites not only explain the differences in stability and catalytic variability of various GGTs but can also aid in design of specific inhibitors against particular GGTs. Thus, apart from the commonly used in-silico inhibitor screening approaches, evolutionary analysis identifying the functional divergence hotspots in GGT proteins could augment the structure based drug design approaches.
Pokharel, Yuba Raj; Saarela, Jani; Szwajda, Agnieszka; Rupp, Christian; Rokka, Anne; Lal Kumar Karna, Shibendra; Teittinen, Kaisa; Corthals, Garry; Kallioniemi, Olli; Wennerberg, Krister; Aittokallio, Tero; Westermarck, Jukka
2015-12-01
High content protein interaction screens have revolutionized our understanding of protein complex assembly. However, one of the major challenges in translation of high content protein interaction data is identification of those interactions that are functionally relevant for a particular biological question. To address this challenge, we developed a relevance ranking platform (RRP), which consists of modular functional and bioinformatic filters to provide relevance rank among the interactome proteins. We demonstrate the versatility of RRP to enable a systematic prioritization of the most relevant interaction partners from high content data, highlighted by the analysis of cancer relevant protein interactions for oncoproteins Pin1 and PME-1. We validated the importance of selected interactions by demonstration of PTOV1 and CSKN2B as novel regulators of Pin1 target c-Jun phosphorylation and reveal previously unknown interacting proteins that may mediate PME-1 effects via PP2A-inhibition. The RRP framework is modular and can be modified to answer versatile research problems depending on the nature of the biological question under study. Based on comparison of RRP to other existing filtering tools, the presented data indicate that RRP offers added value especially for the analysis of interacting proteins for which there is no sufficient prior knowledge available. Finally, we encourage the use of RRP in combination with either SAINT or CRAPome computational tools for selecting the candidate interactors that fulfill both important requirements: functional relevance and high confidence interaction detection. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
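A ranking built from modular filters, as this abstract describes, might be sketched as follows. The sketch assumes equal-weight boolean filters and ranks by the number of filters passed. The real RRP scoring scheme is not described in the abstract, and the filter and feature names here are hypothetical.

```python
# Sketch: rank interactors by how many modular filters they pass.
# Assumption: equal-weight boolean filters; the real RRP scoring is not
# described in the abstract, and all names below are hypothetical.


def rank_by_relevance(candidates, filters):
    """Return candidate names sorted by number of filters passed (descending)."""
    scores = {
        name: sum(1 for f in filters if f(features))
        for name, features in candidates.items()
    }
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical candidate interactors with made-up feature flags.
candidates = {
    "X": {"functional_hit": True, "coexpressed": True},
    "Y": {"functional_hit": False, "coexpressed": True},
    "Z": {"functional_hit": False, "coexpressed": False},
}
filters = [
    lambda f: f["functional_hit"],  # stand-in for a functional filter
    lambda f: f["coexpressed"],     # stand-in for a bioinformatic filter
]
ranking = rank_by_relevance(candidates, filters)
```

Because each filter is an independent function, filters can be added, removed, or swapped without touching the ranking code, which mirrors the modularity the abstract emphasizes.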
2014-01-01
Background Sm proteins are multimeric RNA-binding factors, found in all three domains of life. Eukaryotic Sm proteins, together with their associated RNAs, form small ribonucleoprotein (RNP) complexes important in multiple aspects of gene regulation. Comprehensive knowledge of the RNA components of Sm RNPs is critical for understanding their functions. Results We developed a multi-targeting RNA-immunoprecipitation sequencing (RIP-seq) strategy to reliably identify Sm-associated RNAs from Drosophila ovaries and cultured human cells. Using this method, we discovered three major categories of Sm-associated transcripts: small nuclear (sn)RNAs, small Cajal body (sca)RNAs and mRNAs. Additional RIP-PCR analysis showed both ubiquitous and tissue-specific interactions. We provide evidence that the mRNA-Sm interactions are mediated by snRNPs, and that one of the mechanisms of interaction is via base pairing. Moreover, the Sm-associated mRNAs are mature, indicating a splicing-independent function for Sm RNPs. Conclusions This study represents the first comprehensive analysis of eukaryotic Sm-containing RNPs, and provides a basis for additional functional analyses of Sm proteins and their associated snRNPs outside of the context of pre-mRNA splicing. Our findings expand the repertoire of eukaryotic Sm-containing RNPs and suggest new functions for snRNPs in mRNA metabolism. PMID:24393626
USDA-ARS's Scientific Manuscript database
Polygalacturonase-inhibiting proteins (PGIPs) are plant cell wall glycoproteins that can inhibit fungal endopolygalacturonases (PGs). Inhibiting by PGIPs directly reduces potential PG activity in specific plant pathogenic fungi, reducing their aggressiveness. Here, we isolated and functionally chara...
Suk, Hyung; Knipe, David M
2015-06-01
The herpes simplex virus 1 virion protein 16 (VP16) tegument protein forms a transactivation complex with the cellular proteins host cell factor 1 (HCF-1) and octamer-binding transcription factor 1 (Oct-1) upon entry into the host cell. VP16 has also been shown to interact with a number of virion tegument proteins and viral glycoprotein H to promote viral assembly, but no comprehensive study of the VP16 proteome has been performed at early times postinfection. We therefore performed a proteomic analysis of VP16-interacting proteins at 3 h postinfection. We confirmed the interaction of VP16 with HCF-1 and a large number of cellular Mediator complex proteins, but most surprisingly, we found that the major viral protein associating with VP16 is the infected cell protein 4 (ICP4) immediate-early (IE) transactivator protein. These results raise the potential for a new function for VP16 in associating with the IE ICP4 and playing a role in transactivation of early and late gene expression, in addition to its well-documented function in transactivation of IE gene expression. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Pang, Siew Wai; Lahiri, Chandrajit; Poh, Chit Laa; Tan, Kuan Onn
2018-05-01
Paraneoplastic Ma Family (PNMA) comprises a growing number of family members that share relatively conserved protein sequences, are encoded by the human genome, and are localized to several human chromosomes, including the X-chromosome. Based on sequence analysis, PNMA family members share sequence homology to the Gag protein of LTR retrotransposon, and several family members with aberrant protein expressions have been reported to be closely associated with the human Paraneoplastic Disorder (PND). In addition, gene mutations of specific members of PNMA family are known to be associated with human mental retardation or 3-M syndrome consisting of restrictive post-natal growth or dwarfism, and development of skeletal abnormalities. Other than sequence homology, the physiological function of many members in this family remains unclear. However, several members of this family have been characterized, including cell signalling events mediated by these proteins that are associated with apoptosis, and cancer in different cell types. Furthermore, while certain PNMA family members show restricted gene expression in the human brain and testis, other PNMA family members exhibit broader gene expression or preferential and selective protein interaction profiles, suggesting functional divergence within the family. Functional analysis of some members of this family has identified protein domains that are required for subcellular localization, protein-protein interactions, and cell signalling events, which are the focus of this review paper. Copyright © 2018 Elsevier Inc. All rights reserved.
Liao, Jiang-Lin; Zhou, Hui-Wen; Huang, Ying-Jin
2014-01-01
Rice yield and quality are adversely affected by high temperatures, and these effects are more pronounced at the ‘milky stage’ of the rice grain ripening phase. Identifying the functional proteins involved in the response of rice to high temperature stress may provide the basis for improving heat tolerance in rice. In the present study, a comparative proteomic analysis of paired, genetically similar heat-tolerant and heat-sensitive rice lines was conducted. Two-dimensional electrophoresis (2-DE) revealed a total of 27 differentially expressed proteins in rice grains, predominantly from the heat-tolerant lines. The protein profiles clearly indicated variations in protein expression between the heat-tolerant and heat-sensitive rice lines. Matrix-assisted laser desorption/ionization time-of-flight/time-of-flight mass spectrometry (MALDI-TOF/TOF MS) analysis revealed that 25 of the 27 differentially displayed proteins were homologous to known functional proteins. These homologous proteins were involved in biosynthesis, energy metabolism, oxidation, heat shock metabolism, and the regulation of transcription. Seventeen of the 25 genes encoding the differentially displayed proteins were mapped to rice chromosomes according to the co-segregating conditions between the simple sequence repeat (SSR) markers and the target genes in recombinant inbred lines (RILs). The proteins identified in the present study provide a basis to elucidate further the molecular mechanisms underlying the adaptation of rice to high temperature stress. PMID:24376254
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gao, B.; Sugiman-Marangos, S; Junop, M
2009-01-01
The Actinobacteria phylum represents one of the largest and most diverse groups of bacteria, encompassing many important and well-characterized organisms including Streptomyces, Bifidobacterium, Corynebacterium and Mycobacterium. Members of this phylum are remarkably diverse in terms of life cycle, morphology, physiology and ecology. Recent comparative genomic analysis of 19 actinobacterial species determined that only 5 genes of unknown function uniquely define this large phylum [1]. The cellular functions of these actinobacteria-specific proteins (ASP) are not known.
Hwang, Hyundoo; Barnes, Dawn E; Matsunaga, Yohei; Benian, Guy M; Ono, Shoichiro; Lu, Hang
2016-01-29
The sarcomere, the fundamental unit of muscle contraction, is a highly-ordered complex of hundreds of proteins. Despite decades of genetics work, the functional relationships and the roles of those sarcomeric proteins in animal behaviors remain unclear. In this paper, we demonstrate that optogenetic activation of the motor neurons that induce muscle contraction can facilitate quantitative studies of muscle kinetics in C. elegans. To increase the throughput of the study, we trapped multiple worms in parallel in a microfluidic device and illuminated for photoactivation of channelrhodopsin-2 to induce contractions in body wall muscles. Using image processing, the change in body size was quantified over time. A total of five parameters including rate constants for contraction and relaxation were extracted from the optogenetic assay as descriptors of sarcomere functions. To potentially relate the genes encoding the sarcomeric proteins functionally, a hierarchical clustering analysis was conducted on the basis of those parameters. Because it assesses physiological output different from conventional assays, this method provides a complement to the phenotypic analysis of C. elegans muscle mutants currently performed in many labs; the clusters may provide new insights and drive new hypotheses for functional relationships among the many sarcomere components.
Jing, Lan; Guo, Dandan; Hu, Wenjie; Niu, Xiaofan
2017-03-11
Many plant pathogen secretory proteins are known to be elicitors or pathogenic factors, which play an important role in the host-pathogen interaction process. Bioinformatics approaches make possible the large scale prediction and analysis of secretory proteins from the Puccinia helianthi transcriptome. The internet-based software SignalP v4.1, TargetP v1.01, Big-PI predictor, TMHMM v2.0 and ProtComp v9.0 were utilized to predict the signal peptides and the signal peptide-dependent secreted proteins among the 35,286 ORFs of the P. helianthi transcriptome. 908 ORFs (accounting for 2.6% of the total proteins) were identified as putative secretory proteins containing signal peptides. The length of the majority of proteins ranged from 51 to 300 amino acids (aa), while the signal peptides were from 18 to 20 aa long. Signal peptidase I (SpI) cleavage sites were found in 463 of these putative secretory signal peptides. 55 proteins contained the lipoprotein signal peptide recognition site of signal peptidase II (SpII). Out of 908 secretory proteins, 581 (63.8%) have functions related to signal recognition and transduction, metabolism, transport and catabolism. Additionally, 143 putative secretory proteins were categorized into 27 functional groups based on Gene Ontology terms, including 14 groups in biological process, seven in cellular component, and six in molecular function. Gene ontology analysis of the secretory proteins revealed an enrichment of hydrolase activity. Pathway associations were established for 82 (9.0%) secretory proteins. A number of cell wall degrading enzymes and three homologous proteins specific to Phytophthora sojae effectors were also identified, which may be involved in the pathogenicity of the sunflower rust pathogen. This investigation proposes a new approach for identifying elicitors and pathogenic factors.
The eventual identification and characterization of 908 extracellularly secreted proteins will advance our understanding of the molecular mechanisms of interactions between sunflower and rust pathogen and will enhance our ability to intervene in disease states.
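The filtering logic described in this abstract can be pictured as a chain of per-protein checks. The following is a minimal sketch, assuming the outputs of the five tools have already been parsed into one record per ORF; the field names, the location codes and the exact combination of criteria are hypothetical illustrations and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class OrfPrediction:
    """Parsed predictions for one ORF (all field names are hypothetical)."""
    orf_id: str
    has_signal_peptide: bool   # assumed to come from a SignalP-style predictor
    targetp_location: str      # assumed code, "S" meaning secretory pathway
    tm_helices: int            # assumed count from a TMHMM-style predictor
    gpi_anchored: bool         # assumed flag from a GPI-anchor predictor
    protcomp_location: str     # assumed label, e.g. "extracellular"


def is_putative_secreted(p: OrfPrediction) -> bool:
    """Keep ORFs with a signal peptide and no sign of membrane retention."""
    return (
        p.has_signal_peptide
        and p.targetp_location == "S"
        and p.tm_helices == 0
        and not p.gpi_anchored
        and p.protcomp_location == "extracellular"
    )


def filter_secretome(predictions):
    """Return the IDs of ORFs that pass every check."""
    return [p.orf_id for p in predictions if is_putative_secreted(p)]


# Toy records, invented for illustration only.
example = [
    OrfPrediction("orf1", True, "S", 0, False, "extracellular"),
    OrfPrediction("orf2", True, "S", 2, False, "membrane"),
    OrfPrediction("orf3", False, "M", 0, False, "mitochondrial"),
]
secreted = filter_secretome(example)
```

Requiring zero transmembrane helices is a simplification; real pipelines may need to allow for a signal peptide being mistaken for a helix.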
Rahaman, Siti Nurulnabila A.; Mat Yusop, Jastina; Mohamed-Hussein, Zeti-Azura; Ho, Kok Lian; Teh, Aik-Hong; Waterman, Jitka; Ng, Chyan Leong
2016-01-01
C1ORF123 is a human hypothetical protein found in open reading frame 123 of chromosome 1. The protein belongs to the DUF866 protein family comprising eukaryote-conserved proteins with unknown function. Recent proteomic and bioinformatic analyses identified the presence of C1ORF123 in brain, frontal cortex and synapses, as well as its involvement in endocrine function and polycystic ovary syndrome (PCOS), indicating the importance of its biological role. In order to provide a better understanding of the biological function of the human C1ORF123 protein, the characterization and analysis of recombinant C1ORF123 (rC1ORF123), including overexpression and purification, verification by mass spectrometry and a Western blot using anti-C1ORF123 antibodies, crystallization and X-ray diffraction analysis of the protein crystals, are reported here. The rC1ORF123 protein was crystallized by the hanging-drop vapor-diffusion method with a reservoir solution comprised of 20% PEG 3350, 0.2 M magnesium chloride hexahydrate, 0.1 M sodium citrate pH 6.5. The crystals diffracted to 1.9 Å resolution and belonged to an orthorhombic space group with unit-cell parameters a = 59.32, b = 65.35, c = 95.05 Å. The calculated Matthews coefficient (V_M) value of 2.27 Å³ Da⁻¹ suggests that there are two molecules per asymmetric unit, with an estimated solvent content of 45.7%. PMID:26919524
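The crystallographic figures quoted above can be related with a short calculation. This is a minimal sketch, assuming the usual approximation that the solvent fraction is about 1 - 1.23 / V_M; the function names are illustrative, and the molecular weight and number of molecules per cell are not given in the abstract, so the Matthews coefficient itself is only shown as a general formula.

```python
def orthorhombic_cell_volume(a, b, c):
    """Unit-cell volume in cubic angstroms; all angles are 90 degrees."""
    return a * b * c


def matthews_coefficient(cell_volume, n_molecules_in_cell, mol_weight_da):
    """V_M = cell volume / (molecules per cell * molecular weight in Da)."""
    return cell_volume / (n_molecules_in_cell * mol_weight_da)


def solvent_fraction(v_m):
    """Common approximation: solvent fraction is roughly 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / v_m


# Unit-cell parameters and V_M as reported in the abstract.
volume = orthorhombic_cell_volume(59.32, 65.35, 95.05)
solvent = solvent_fraction(2.27)
```

With the reported V_M of 2.27 this approximation gives a solvent fraction of roughly 0.46, in line with the stated 45.7%; the small difference may simply reflect rounding of V_M.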
Wang, Yong-Qiang; Yang, Yong; Fei, Zhangjun; Yuan, Hui; Fish, Tara; Thannhauser, Theodore W; Mazourek, Michael; Kochian, Leon V; Wang, Xiaowu; Li, Li
2013-02-01
Chromoplasts are unique plastids that accumulate massive amounts of carotenoids. To gain a general and comparative characterization of chromoplast proteins, this study performed proteomic analysis of chromoplasts from six carotenoid-rich crops: watermelon, tomato, carrot, orange cauliflower, red papaya, and red bell pepper. Stromal and membrane proteins of chromoplasts were separated by 1D gel electrophoresis and analysed using nLC-MS/MS. Between 953 and 2262 proteins were identified from chromoplasts of the different crop species. Approximately 60% of the identified proteins were predicted to be plastid localized. Functional classification using MapMan bins revealed large numbers of proteins involved in protein metabolism, transport, amino acid metabolism, lipid metabolism, and redox in chromoplasts from all six species. Seventeen core carotenoid metabolic enzymes were identified. Phytoene synthase, phytoene desaturase, ζ-carotene desaturase, 9-cis-epoxycarotenoid dioxygenase, and carotenoid cleavage dioxygenase 1 were found in almost all crops, suggesting their relative abundance among the carotenoid pathway enzymes. Chromoplasts from different crops contained abundant amounts of ATP synthase and adenine nucleotide translocator, which indicates an important role of ATP production and transport in chromoplast development. Distinctive abundant proteins were observed in chromoplasts from different crops, including capsanthin/capsorubin synthase and fibrillins in pepper, superoxide dismutase in watermelon, carrot, and cauliflower, and glutathione-S-transferase in papaya. The comparative analysis of chromoplast proteins among six crop species offers new insights into the general metabolism and function of chromoplasts as well as the uniqueness of chromoplasts in specific crop species. This work provides reference datasets for future experimental study of chromoplast biogenesis, development, and regulation in plants.
Prediction of scaffold proteins based on protein interaction and domain architectures.
Oh, Kimin; Yi, Gwan-Su
2016-07-28
Scaffold proteins are crucial regulators of various cellular functions that act by assembling multiple proteins involved in signaling and metabolic pathways. Identification of scaffold proteins and the study of their molecular mechanisms can open a new aspect of cellular systemic regulation, and the results can be applied in the fields of medicine and engineering. Although the regulatory roles of dozens of scaffold proteins have been highlighted, only one computational approach has so far been used to find scaffold proteins from interactomes. That approach was limited in finding diverse types of scaffold proteins because its criteria were restricted to the classical scaffold proteins. In this paper, we suggest a systematic approach to predict scaffold proteins on a large scale from interactomes and to characterize their roles comprehensively. We classified a total of 10,419 basic scaffold protein candidates from protein interactomes into three classes according to the structural evidence for scaffolding, such as domain architectures, domain interactions and protein complexes. Finally, we defined 2716 highly reliable scaffold protein candidates and characterized their functional features. To assess the accuracy of our prediction, gold standard positive and negative data sets were constructed. We prepared 158 gold standard positive data and 844 gold standard negative data based on functional information from the Gene Ontology consortium. The precision, sensitivity and specificity of our testing were 80.3%, 51.0%, and 98.5%, respectively. Through function enrichment analysis of the highly reliable scaffold proteins, we confirmed significantly enriched functions that are related to scaffold protein binding. We also identified functional associations between scaffold proteins and their recruited proteins. Furthermore, we found that the disease association of scaffold proteins is higher than that of kinases.
In conclusion, we predicted a larger number of scaffold proteins and analyzed their functional characteristics. A deeper understanding of the roles of scaffold proteins from this study will provide a greater opportunity to find therapeutic or engineering applications of scaffold proteins using their functional characteristics.
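For readers less familiar with the three evaluation measures quoted in this abstract, the sketch below shows how they are conventionally computed from a confusion matrix. The counts used here are hypothetical and chosen only to make the arithmetic easy to follow; they are not the counts behind the reported 80.3%, 51.0% and 98.5%.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (precision, sensitivity, specificity) from confusion counts."""
    precision = tp / (tp + fp)      # fraction of predicted positives that are real
    sensitivity = tp / (tp + fn)    # fraction of real positives that are found
    specificity = tn / (tn + fp)    # fraction of real negatives that are rejected
    return precision, sensitivity, specificity


# Hypothetical counts for illustration only.
precision, sensitivity, specificity = classification_metrics(
    tp=40, fp=10, tn=90, fn=60
)
```

A method can combine high specificity with modest sensitivity, as reported here, when it is tuned to make few false positive calls at the cost of missing some true scaffold proteins.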
Podder, Avijit; Jatana, Nidhi; Latha, N
2014-09-21
Dopamine receptors (DR) are among the major neurotransmitter receptors present in the human brain. Malfunctioning of these receptors is well established to trigger many neurological and psychiatric disorders. Considering that proteins function collectively in networks in most biological processes, the present study aims to depict the interactions among all dopamine receptors following a systems biology approach. To capture comprehensive interactions of candidate proteins associated with human dopamine receptors, we performed a protein-protein interaction network (PPIN) analysis of all five receptors and their protein partners by mapping them onto the human interactome, and constructed a human Dopamine Receptors Interaction Network (DRIN). We explored the topology of dopamine receptors as a molecular network, revealing their characteristics and the role of central network elements. In addition, a sub-network analysis was performed to determine the major functional clusters in the human DRIN that govern key neurological pathways. Interacting proteins in a pathway were also characterized and prioritized based on their affinity for utmost drug molecules. The vulnerability of different networks to the dysfunction of diverse combinations of components was estimated under random and direct attack scenarios. To the best of our knowledge, the current study is the first to place all five dopamine receptors together in a common interaction network and to examine the functionality of the interacting proteins collectively. Our study pinpointed distinctive topological and functional properties of human dopamine receptors that have helped in identifying potential therapeutic drug targets in the dopamine interaction network. Copyright © 2014 Elsevier Ltd. All rights reserved.
Protein Composition of Trypanosoma brucei Mitochondrial Membranes
Acestor, Nathalie; Panigrahi, Aswini K.; Ogata, Yuko; Anupama, Atashi; Stuart, Kenneth D.
2010-01-01
Mitochondria consist of four compartments (outer membrane, intermembrane space, inner membrane and matrix), each harboring specific functions and structures. In this study, we used mass spectrometry (LC-MS/MS) to characterize the protein composition of Trypanosoma brucei mitochondrial membranes, which were enriched by different biochemical fractionation techniques. The analyses identified 202 proteins that contain one or more transmembrane domain(s) and/or positive GRAVY scores. Of these, various criteria were used to assign 72 proteins to mitochondrial membranes with high confidence, and 106 with moderate to low confidence. The sub-cellular localization of a selected subset of 13 membrane-assigned proteins was confirmed by tagging and immunofluorescence analysis. While most proteins assigned to the mitochondrial membrane have putative roles in metabolic, energy generating, and transport processes, ~50% have no known function. These studies result in a comprehensive profile of the composition and sub-organellar location of proteins in the T. brucei mitochondrion, thus providing useful information on mitochondrial functions. PMID:19834910
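The GRAVY score mentioned above is the grand average of hydropathy: the mean Kyte-Doolittle hydropathy value over all residues of a sequence, with positive values indicating an overall hydrophobic protein. A minimal sketch follows; the function name and the example peptides are illustrative and not taken from the study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean hydropathy of a protein sequence given in one-letter code."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Toy peptides: one hydrophobic, one charged.
hydrophobic_score = gravy("IVLA")
charged_score = gravy("KDER")
```

This sketch raises a KeyError on non-standard residue codes such as X or U, which a real pipeline would need to handle explicitly.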
Gomez, Sandra; Adalid-Peralta, Laura; Palafox-Fonseca, Hector; Cantu-Robles, Vito Adrian; Soberón, Xavier; Sciutto, Edda; Fragoso, Gladis; Bobes, Raúl J; Laclette, Juan P; Yauner, Luis del Pozo; Ochoa-Leyva, Adrián
2015-05-19
Excretory/Secretory (ES) proteins play an important role in the host-parasite interactions. Experimental identification of ES proteins is time-consuming and expensive. Alternative bioinformatics approaches are cost-effective and can be used to prioritize the experimental analysis of therapeutic targets for parasitic diseases. Here we predicted and functionally annotated the ES proteins in T. solium genome using an integration of bioinformatics tools. Additionally, we developed a novel measurement to evaluate the potential antigenicity of T. solium secretome using sequence length and number of antigenic regions of ES proteins. This measurement was formalized as the Abundance of Antigenic Regions (AAR) value. AAR value for secretome showed a similar value to that obtained for a set of experimentally determined antigenic proteins and was different to the calculated value for the non-ES proteins of T. solium genome. Furthermore, we calculated the AAR values for known helminth secretomes and they were similar to that obtained for T. solium. The results reveal the utility of AAR value as a novel genomic measurement to evaluate the potential antigenicity of secretomes. This comprehensive analysis of T. solium secretome provides functional information for future experimental studies, including the identification of novel ES proteins of therapeutic, diagnosis and immunological interest. PMID:25989346
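The abstract states only that the AAR value is built from sequence length and the number of antigenic regions. The sketch below assumes one plausible reading, namely length divided by the number of predicted antigenic regions, averaged over a protein set, so that a lower value means more densely packed antigenic regions. Both this formula and the toy numbers are assumptions for illustration and should be checked against the full paper.

```python
def aar(sequence_length, n_antigenic_regions):
    """Assumed AAR: residues per predicted antigenic region.

    Returns None when no antigenic region was predicted.
    """
    if n_antigenic_regions == 0:
        return None
    return sequence_length / n_antigenic_regions


def mean_aar(proteins):
    """Average the assumed AAR over (length, n_regions) pairs, skipping None."""
    values = [v for v in (aar(length, n) for length, n in proteins) if v is not None]
    if not values:
        raise ValueError("no protein has a predicted antigenic region")
    return sum(values) / len(values)


# Hypothetical (length, antigenic-region count) pairs.
toy_secretome = [(300, 10), (200, 5), (120, 0)]
toy_mean = mean_aar(toy_secretome)
```

Under this reading, two protein sets can be compared by their mean values, which matches the way the abstract contrasts secretome and non-ES proteins.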
Arunima, Aryashree; Yelamanchi, Soujanya D; Padhi, Chandrashekhar; Jaiswal, Sangeeta; Ryan, Daniel; Gupta, Bhawna; Sathe, Gajanan; Advani, Jayshree; Gowda, Harsha; Prasad, T S Keshava; Suar, Mrutyunjay
2017-10-01
Salmonella Enteritidis causes food-borne gastroenteritis by the two type three secretion systems (TTSS). TTSS-1 mediates invasion through intestinal lining, and TTSS-2 facilitates phagocytic survival. The pathogens' ability to infect effectively under TTSS-1-deficient background in host's phagocytes is poorly understood. Therefore, pathobiological understanding of TTSS-1-defective nontyphoidal Salmonellosis is highly important. We performed a comparative global proteomic analysis of the isogenic TTSS-1 mutant of Salmonella Enteritidis (M1511) and its wild-type isolate P125109. Our results showed 43 proteins were differentially expressed. Functional annotation further revealed that differentially expressed proteins belong to pathogenesis, tRNA and ncRNA metabolic processes. Three proteins, tryptophan subunit alpha chain, citrate lyase subunit alpha, and hypothetical protein 3202, were selected for in vitro analysis based on their functional annotations. Deletion mutants generated for the above proteins in the M1511 strain showed reduced intracellular survival inside macrophages in vitro. In sum, this study provides mass spectrometry-based evidence for seven hypothetical proteins, which will be subject of future investigations. Our study identifies proteins influencing virulence of Salmonella in the host. The study complements and further strengthens previously published research on proteins involved in enteropathogenesis of Salmonella and extends their role in noninvasive Salmonellosis.
Future directions of electron crystallography.
Fujiyoshi, Yoshinori
2013-01-01
In biological science, there are still many interesting and fundamental yet difficult questions, such as those in neuroscience, remaining to be answered. Structural and functional studies of membrane proteins, which are key molecules of signal transduction in neural and other cells, are essential for understanding the molecular mechanisms of many fundamental biological processes. Technological and instrumental advancements of electron microscopy have facilitated comprehension of structural studies of biological components, such as membrane proteins. While X-ray crystallography has been the main method of structure analysis of proteins including membrane proteins, electron crystallography is now an established technique to analyze structures of membrane proteins in the lipid bilayer, which is close to their natural biological environment. By utilizing cryo-electron microscopes with helium-cooled specimen stages, structures of membrane proteins were analyzed at a resolution better than 3 Å. Such high-resolution structural analysis of membrane proteins by electron crystallography opens up the new research field of structural physiology. Considering the fact that the structures of integral membrane proteins in their native membrane environment without artifacts from crystal contacts are critical in understanding their physiological functions, electron crystallography will continue to be an important technology for structural analysis. In this chapter, I will present several examples to highlight important advantages and to suggest future directions of this technique.
MutationAligner: a resource of recurrent mutation hotspots in protein domains in cancer
Gauthier, Nicholas Paul; Reznik, Ed; Gao, Jianjiong; Sumer, Selcuk Onur; Schultz, Nikolaus; Sander, Chris; Miller, Martin L.
2016-01-01
The MutationAligner web resource, available at http://www.mutationaligner.org, enables discovery and exploration of somatic mutation hotspots identified in protein domains in currently (mid-2015) more than 5000 cancer patient samples across 22 different tumor types. Using multiple sequence alignments of protein domains in the human genome, we extend the principle of recurrence analysis by aggregating mutations in homologous positions across sets of paralogous genes. Protein domain analysis enhances the statistical power to detect cancer-relevant mutations and links mutations to the specific biological functions encoded in domains. We illustrate how the MutationAligner database and interactive web tool can be used to explore, visualize and analyze mutation hotspots in protein domains across genes and tumor types. We believe that MutationAligner will be an important resource for the cancer research community by providing detailed clues for the functional importance of particular mutations, as well as for the design of functional genomics experiments and for decision support in precision medicine. MutationAligner is slated to be periodically updated to incorporate additional analyses and new data from cancer genomics projects. PMID:26590264
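The core idea of aggregating mutations at homologous positions can be illustrated with a small example. This is a minimal sketch, assuming each gene's copy of a domain is available as a gapped row of a multiple sequence alignment and that mutations are given as positions within that domain copy; the gene names, sequences and function names are invented and do not reflect the MutationAligner implementation.

```python
from collections import Counter


def residue_to_column(aligned_seq):
    """Map 1-based residue positions to 0-based alignment columns."""
    mapping = {}
    residue = 0
    for col, ch in enumerate(aligned_seq):
        if ch != "-":          # gaps do not consume a residue number
            residue += 1
            mapping[residue] = col
    return mapping


def aggregate_mutations(alignment, mutations):
    """Count mutations per alignment column across all aligned genes."""
    col_maps = {gene: residue_to_column(seq) for gene, seq in alignment.items()}
    counts = Counter()
    for gene, pos in mutations:
        col = col_maps.get(gene, {}).get(pos)
        if col is not None:    # ignore positions outside the aligned domain
            counts[col] += 1
    return counts


# Hypothetical domain alignment and (gene, domain position) mutations.
alignment = {"GENE_A": "MK-LV", "GENE_B": "MKTLV", "GENE_C": "-KTLI"}
mutations = [("GENE_A", 3), ("GENE_B", 4), ("GENE_C", 3), ("GENE_B", 1)]
hotspots = aggregate_mutations(alignment, mutations)
```

In this toy case three mutations that sit at different residue numbers in three genes fall into the same alignment column, which is the kind of pooled signal that per-gene recurrence analysis would miss.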
Zhuravlev, Pavel I; Papoian, Garegin A
2010-08-01
Energy landscape theories have provided a common ground for understanding the protein folding problem, which once seemed to be overwhelmingly complicated. At the same time, the native state was found to be an ensemble of interconverting states with frustration playing a more important role compared to the folding problem. The landscape of the folded protein - the native landscape - is glassier than the folding landscape; hence, a general description analogous to the folding theories is difficult to achieve. On the other hand, the native basin phase volume is much smaller, allowing a protein to fully sample its native energy landscape on the biological timescales. Current computational resources may also be used to perform this sampling for smaller proteins, to build a 'topographical map' of the native landscape that can be used for subsequent analysis. Several major approaches to representing this topographical map are highlighted in this review, including the construction of kinetic networks, hierarchical trees and free energy surfaces with subsequent structural and kinetic analyses. In this review, we extensively discuss the important question of choosing proper collective coordinates characterizing functional motions. In many cases, the substates on the native energy landscape, which represent different functional states, can be used to obtain variables that are well suited for building free energy surfaces and analyzing the protein's functional dynamics. Normal mode analysis can provide such variables in cases where functional motions are dictated by the molecule's architecture. Principal component analysis is a more expensive way of inferring the essential variables from the protein's motions, one that requires a long molecular dynamics simulation. 
Finally, the two popular models for the allosteric switching mechanism, 'preexisting equilibrium' and 'induced fit', are interpreted within the energy landscape paradigm as extreme points of a continuum of transition mechanisms. Some experimental evidence illustrating each of these two models, as well as intermediate mechanisms, is presented and discussed.
Zauber, Henrik; Szymanski, Witold; Schulze, Waltraud X
2013-12-01
During the last decade, research on plasma membrane focused increasingly on the analysis of so-called microdomains. It has been shown that function of many membrane-associated proteins involved in signaling and transport depends on their conditional segregation within sterol-enriched membrane domains. High throughput proteomic analysis of sterol-protein interactions are often based on analyzing detergent resistant membrane fraction enriched in sterols and associated proteins, which also contain proteins from these microdomain structures. Most studies so far focused exclusively on the characterization of detergent resistant membrane protein composition and abundances. This approach has received some criticism because of its unspecificity and many co-purifying proteins. In this study, by a label-free quantitation approach, we extended the characterization of membrane microdomains by particularly studying distributions of each protein between detergent resistant membrane and detergent-soluble fractions (DSF). This approach allows a more stringent definition of dynamic processes between different membrane phases and provides a means of identification of co-purifying proteins. We developed a random sampling algorithm, called Unicorn, allowing for robust statistical testing of alterations in the protein distribution ratios of the two different fractions. Unicorn was validated on proteomic data from methyl-β-cyclodextrin treated plasma membranes and the sterol biosynthesis mutant smt1. Both, chemical treatment and sterol-biosynthesis mutation affected similar protein classes in their membrane phase distribution and particularly proteins with signaling and transport functions. PMID:24030099
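Random sampling can be used to test whether a protein's distribution ratio between two fractions changes after a treatment. The sketch below is a generic two-sided permutation test on per-replicate ratios; it is not the Unicorn algorithm, whose details are not given in the abstract, and the function name and data values are hypothetical.

```python
import random


def permutation_p_value(control, treated, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(treated) - mean(control))
    pooled = list(control) + list(treated)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)    # reassign replicates to groups at random
        diff = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if diff >= observed - 1e-12:   # small tolerance for float rounding
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical log-ratios (resistant vs soluble fraction) for one protein.
control_ratios = [0.10, 0.20, 0.15, 0.12]
treated_ratios = [1.10, 1.20, 1.15, 1.30]
p_shifted = permutation_p_value(control_ratios, treated_ratios)
p_unchanged = permutation_p_value([1.0, 1.0, 1.0], [1.0, 1.0, 1.0])
```

With only four replicates per group the smallest achievable p-value is limited by the number of distinct group assignments, so results for small designs should be read with that floor in mind.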
Proteome Analysis of Watery Saliva Secreted by Green Rice Leafhopper, Nephotettix cincticeps
Hattori, Makoto; Komatsu, Setsuko; Noda, Hiroaki; Matsumoto, Yukiko
2015-01-01
The green rice leafhopper, Nephotettix cincticeps, is a vascular bundle feeder that discharges watery and gelling saliva during the feeding process. To understand the potential functions of saliva for successful and safe feeding on host plants, we analyzed the complexity of proteinaceous components in the watery saliva of N. cincticeps. Salivary proteins were collected from a sucrose diet that adult leafhoppers had fed on through a membrane of stretched parafilm. Protein concentrates were separated using SDS-PAGE under reducing and non-reducing conditions. Six proteins were identified by a gas-phase protein sequencer and two proteins were identified using LC-MS/MS analysis with reference to expressed sequence tag (EST) databases of this species. Full-length cDNAs encoding these major proteins were obtained by rapid amplification of cDNA ends-PCR (RACE-PCR) and degenerate PCR. Furthermore, gel-free proteome analysis that was performed to cover the broad range of salivary proteins with reference to the latest RNA-sequencing data from the salivary gland of N. cincticeps, yielded 63 additional protein species. Out of 71 novel proteins identified from the watery saliva, about 60% of those were enzymes or other functional proteins, including GH5 cellulase, transferrin, carbonic anhydrases, aminopeptidase, regucalcin, and apolipoprotein. The remaining proteins appeared to be unique and species-specific. This is the first study to identify and characterize the proteins in watery saliva of Auchenorrhyncha species, especially sheath-producing, vascular bundle-feeders. PMID:25909947
The Origin and Early Evolution of Membrane Proteins
NASA Technical Reports Server (NTRS)
Pohorille, Andrew; Schweighofer, Karl; Wilson, Michael A.
2005-01-01
Membrane proteins mediate functions that are essential to all cells. These functions include transport of ions, nutrients and waste products across cell walls, capture of energy and its transduction into the form usable in chemical reactions, transmission of environmental signals to the interior of the cell, cellular growth and cell volume regulation. In the absence of membrane proteins, ancestors of cells (protocells) would have had only very limited capabilities to communicate with their environment. Thus, it is not surprising that membrane proteins are quite common even in the simplest prokaryotic cells. Considering that contemporary membrane channels are large and complex, both structurally and functionally, a question arises as to how their presumably much simpler ancestors could have emerged, performed functions and diversified in early protobiological evolution. Remarkably, despite their overall complexity, structural motifs in membrane proteins are quite simple, with α-helices being most common. This suggests that these proteins might have evolved from simple building blocks. To explain how these blocks could have organized into functional structures, we performed large-scale, accurate computer simulations of folding peptides at a water-membrane interface, their insertion into the membrane, self-assembly into higher-order structures and function. The results of these simulations, combined with analysis of structural and functional experimental data, led to the first integrated view of the origin and early evolution of membrane proteins.
Genes encoding calmodulin-binding proteins in the Arabidopsis genome
NASA Technical Reports Server (NTRS)
Reddy, Vaka S.; Ali, Gul S.; Reddy, Anireddy S N.
2002-01-01
Analysis of the recently completed Arabidopsis genome sequence indicates that approximately 31% of the predicted genes could not be assigned to functional categories, as they do not show any sequence similarity with proteins of known function from other organisms. Calmodulin (CaM), a ubiquitous and multifunctional Ca(2+) sensor, interacts with a wide variety of cellular proteins and modulates their activity/function in regulating diverse cellular processes. However, the primary amino acid sequence of the CaM-binding domain in different CaM-binding proteins (CBPs) is not conserved. One way to identify most of the CBPs in the Arabidopsis genome is by protein-protein interaction-based screening of expression libraries with CaM. Here, using a mixture of radiolabeled CaM isoforms from Arabidopsis, we screened several expression libraries prepared from flower meristem, seedlings, or tissues treated with hormones, an elicitor, or a pathogen. Sequence analysis of 77 positive clones that interact with CaM in a Ca(2+)-dependent manner revealed 20 CBPs, including 14 previously unknown CBPs. In addition, by searching the Arabidopsis genome sequence with the newly identified and known plant or animal CBPs, we identified a total of 27 CBPs. Among these, 16 CBPs are represented by families with 2-20 members in each family. Gene expression analysis revealed that CBPs and CBP paralogs are expressed differentially. Our data suggest that Arabidopsis has a large number of CBPs including several plant-specific ones. Although CaM is highly conserved between plants and animals, only a few CBPs are common to both plants and animals. Analysis of Arabidopsis CBPs revealed the presence of a variety of interesting domains. Our analyses identified several hypothetical proteins in the Arabidopsis genome as CaM targets, suggesting their involvement in Ca(2+)-mediated signaling networks.
Challenges in the Development of Functional Assays of Membrane Proteins
Tiefenauer, Louis; Demarche, Sophie
2012-01-01
Lipid bilayers are natural barriers of biological cells and cellular compartments. Membrane proteins integrated in biological membranes enable vital cell functions such as signal transduction and the transport of ions or small molecules. In order to determine the activity of a protein of interest at defined conditions, the membrane protein has to be integrated into artificial lipid bilayers immobilized on a surface. For the fabrication of such biosensors expertise is required in material science, surface and analytical chemistry, molecular biology and biotechnology. Specifically, techniques are needed for structuring surfaces in the micro- and nanometer scale, chemical modification and analysis, lipid bilayer formation, protein expression, purification and solubilization, and most importantly, protein integration into engineered lipid bilayers. Electrochemical and optical methods are suitable to detect membrane activity-related signals. The importance of structural knowledge to understand membrane protein function is obvious. Presently only a few structures of membrane proteins are solved at atomic resolution. Functional assays together with known structures of individual membrane proteins will contribute to a better understanding of vital biological processes occurring at biological membranes. Such assays will be utilized in the discovery of drugs, since membrane proteins are major drug targets.
Uversky, Vladimir N
2015-03-01
Intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs) are functional proteins or regions that do not have unique 3D structures under functional conditions. Therefore, from the viewpoint of their lack of stable 3D structure, IDPs/IDPRs are inherently unstable. Just as the structure and function of normal ordered globular proteins are determined by their amino acid sequences, the lack of unique 3D structure in IDPs/IDPRs and their disorder-based functionality are also encoded in the amino acid sequences. Because of their specific sequence features and distinctive conformational behavior, these intrinsically unstable proteins or regions have several applications in biotechnology. This review introduces some of the most characteristic features of IDPs/IDPRs (such as peculiarities of amino acid sequences of these proteins and regions, their major structural features, and peculiar responses to changes in their environment) and describes how these features can be used in biotechnology, for example for the proteome-wide analysis of the abundance of extended IDPs, for recombinant protein isolation and purification, as polypeptide nanoparticles for drug delivery, as solubilization tools, and as thermally sensitive carriers of active peptides and proteins. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Automated quantitative assessment of proteins' biological function in protein knowledge bases.
Mayr, Gabriele; Lepperdinger, Günter; Lackner, Peter
2008-01-01
Primary protein sequence data are archived in databases together with information regarding corresponding biological functions. In this respect, UniProt/Swiss-Prot is currently the most comprehensive collection and it is routinely cross-examined when trying to unravel the biological role of hypothetical proteins. Bioscientists frequently extract single entries and further evaluate those on a subjective basis. In lieu of a standardized procedure for scoring the existing knowledge regarding individual proteins, we here report a computer-assisted method, which we applied to score the present knowledge about any given Swiss-Prot entry. Applying this quantitative score allows the comparison of proteins with respect to their sequence yet highlights the comprehension of functional data. pfs analysis may also be applied for quality control of individual entries or for database management in order to rank entry listings.
Computing Prediction and Functional Analysis of Prokaryotic Propionylation.
Wang, Li-Na; Shi, Shao-Ping; Wen, Ping-Ping; Zhou, Zhi-You; Qiu, Jian-Ding
2017-11-27
Identification and systematic analysis of candidates for protein propionylation are crucial steps for understanding its molecular mechanisms and biological functions. Although several proteome-scale methods have been applied to delineate potential propionylated proteins, the majority of lysine-propionylated substrates and their role in pathological physiology still remain largely unknown. By gathering data from various databases and the literature, experimental prokaryotic propionylation data were collated and used to train a support vector machine with various features via a three-step feature selection method. A novel online tool for seeking potential lysine-propionylated sites (PropSeek) ( http://bioinfo.ncu.edu.cn/PropSeek.aspx ) was built. Independent test results of leave-one-out and n-fold cross-validation were similar to each other, showing that PropSeek is a stable and robust predictor with satisfying performance. Meanwhile, analyses of Gene Ontology, Kyoto Encyclopedia of Genes and Genomes pathways, and protein-protein interactions implied a potential role of prokaryotic propionylation in protein synthesis and metabolism.
Proteomic analysis of the bacterial cell cycle
Grünenfelder, Björn; Rummel, Gabriele; Vohradsky, Jiri; Röder, Daniel; Langen, Hanno; Jenal, Urs
2001-01-01
A global approach was used to analyze protein synthesis and stability during the cell cycle of the bacterium Caulobacter crescentus. Approximately one-fourth (979) of the estimated C. crescentus gene products were detected by two-dimensional gel electrophoresis, 144 of which showed differential cell cycle expression patterns. Eighty-one of these proteins were identified by mass spectrometry and were assigned to a wide variety of functional groups. Pattern analysis revealed that coexpression groups were functionally clustered. A total of 48 proteins were rapidly degraded in the course of one cell cycle. More than half of these unstable proteins were also found to be synthesized in a cell cycle-dependent manner, establishing a strong correlation between rapid protein turnover and the periodicity of the bacterial cell cycle. This is, to our knowledge, the first evidence for a global role of proteolysis in bacterial cell cycle control. PMID:11287652
Islam, Mohammad T; Garg, Gagan; Hancock, William S; Risk, Brian A; Baker, Mark S; Ranganathan, Shoba
2014-01-03
The chromosome-centric human proteome project (C-HPP) aims to define the complete set of proteins encoded in each human chromosome. The neXtProt database (September 2013) lists 20,128 proteins for the human proteome, of which 3831 human proteins (∼19%) are considered "missing" according to the standard metrics table (released September 27, 2013). In support of the C-HPP initiative, we have extended the annotation strategy developed for human chromosome 7 "missing" proteins into a semiautomated pipeline to functionally annotate the "missing" human proteome. This pipeline integrates a suite of bioinformatics analysis and annotation software tools to identify homologues and map putative functional signatures, gene ontology, and biochemical pathways. From sequential BLAST searches, we have primarily identified homologues from reviewed nonhuman mammalian proteins with protein evidence for 1271 (33.2%) "missing" proteins, followed by 703 (18.4%) homologues from reviewed nonhuman mammalian proteins and subsequently 564 (14.7%) homologues from reviewed human proteins. Functional annotations for 1945 (50.8%) "missing" proteins were also determined. To accelerate the identification of "missing" proteins from proteomics studies, we generated proteotypic peptides in silico. Matching these proteotypic peptides to ENCODE proteogenomic data resulted in proteomic evidence for 107 (2.8%) of the 3831 "missing" proteins, while evidence from a recent membrane proteomic study supported the existence of another 15 "missing" proteins. The chromosome-wise functional annotation of all "missing" proteins is freely available to the scientific community through our web server (http://biolinfo.org/protannotator).
Peng, Chuanhua; Wang, Xiaoping; Li, Fei; Lin, Yongjun
2012-01-01
The rice stem borer, Chilo suppressalis (Walker) (Lepidoptera: Pyralidae), is one of the most detrimental pests affecting rice crops. The use of Bacillus thuringiensis (Bt) toxins has been explored as a means to control this pest, but the potential for C. suppressalis to develop resistance to Bt toxins makes this approach problematic. Few C. suppressalis gene sequences are known, which makes in-depth study of gene function difficult. Herein, we sequenced the midgut transcriptome of the rice stem borer. In total, 37,040 contigs were obtained, with a mean size of 497 bp. As expected, the transcripts of C. suppressalis shared high similarity with arthropod genes. Gene ontology and KEGG analysis were used to classify the gene functions in C. suppressalis. Using the midgut transcriptome data, we conducted a proteome analysis to identify proteins expressed abundantly in the brush border membrane vesicles (BBMV). Of the 100 most abundant proteins that were excised and subjected to mass spectrometry analysis, 74 share high similarity with known proteins. Among these proteins, Western blot analysis showed that Aminopeptidase N and EH domain-containing protein bind the Bt toxin Cry1Ac. These data provide invaluable information about the gene sequences of C. suppressalis and the proteins that bind Cry1Ac. PMID:22666467
Malviya, N; Gupta, S; Singh, V K; Yadav, M K; Bisht, N C; Sarangi, B K; Yadav, D
2015-02-01
The DNA binding with One Finger (Dof) protein is a plant-specific transcription factor involved in the regulation of a wide range of processes. Analysis of the whole genome sequence of pigeonpea identified 38 putative Dof genes (CcDof) distributed on 8 chromosomes. A total of 17 out of 38 CcDof genes were found to be intronless. A comprehensive in silico characterization of the CcDof gene family, including gene structure, chromosome location, protein motif, phylogeny, gene duplication and functional divergence, has been attempted. The phylogenetic analysis resulted in 3 major clusters, and closely related members in the phylogenetic tree revealed a common motif distribution. The in silico cis-regulatory element analysis revealed functional diversity with a predominance of light-responsive and stress-responsive elements, indicating that these CcDof genes may be associated with photoperiodic control and with biotic and abiotic stress. The duplication pattern showed that tandem duplication is predominant over segmental duplication events. The comparative phylogenetic analysis of these Dof proteins along with 78 soybean, 36 Arabidopsis and 30 rice Dof proteins revealed 7 major clusters. Several groups of orthologs and paralogs were identified based on the phylogenetic tree constructed. Our study provides useful information for functional characterization of CcDof genes.
Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L; Zhang, Yang
2018-03-10
Homology-based transferal remains the major approach to computational protein function annotations, but it becomes increasingly unreliable when the sequence identity between query and template decreases below 30%. We propose a novel pipeline, MetaGO, to deduce Gene Ontology attributes of proteins by combining sequence homology-based annotation with low-resolution structure prediction and comparison, and partner's homology-based protein-protein network mapping. The pipeline was tested on a large-scale set of 1000 non-redundant proteins from the CAFA3 experiment. Under the stringent benchmark conditions where templates with >30% sequence identity to the query are excluded, MetaGO achieves average F-measures of 0.487, 0.408, and 0.598, for Molecular Function, Biological Process, and Cellular Component, respectively, which are significantly higher than those achieved by other state-of-the-art function annotation methods. Detailed data analysis shows that the major advantage of MetaGO lies in the new functional homolog detections from partner's homology-based network mapping and structure-based local and global structure alignments, the confidence scores of which can be optimally combined through logistic regression. These data demonstrate the power of using a hybrid model incorporating protein structure and interaction networks to deduce new functional insights beyond traditional sequence homology-based referrals, especially for proteins that lack homologous function templates. The MetaGO pipeline is available at http://zhanglab.ccmb.med.umich.edu/MetaGO/. Copyright © 2018. Published by Elsevier Ltd.
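As an illustration of the F-measure quoted in the record above, the following is a minimal sketch of a protein-centric maximum F-measure of the kind commonly used in CAFA-style evaluations. The abstract does not state exactly how its averages were computed, so the function name `f_max`, the input layout, and the threshold grid are assumptions made for this example only.

```python
def f_max(predictions, truth, thresholds=None):
    """Protein-centric maximum F-measure (illustrative sketch).

    predictions: {protein: {go_term: confidence score in [0, 1]}}
    truth:       {protein: non-empty set of true GO terms}
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, score in scored.items() if score >= t}
            hits = len(predicted & true_terms)
            # Precision is averaged only over proteins with a prediction at t.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall is averaged over all benchmark proteins.
            recalls.append(hits / len(true_terms))
        if not precisions:
            continue
        p = sum(precisions) / len(precisions)
        r = sum(recalls) / len(recalls)
        if p + r > 0:
            best = max(best, 2 * p * r / (p + r))
    return best


# Toy data with made-up GO identifiers and scores.
toy_predictions = {
    "P1": {"GO:1": 0.9, "GO:2": 0.4},
    "P2": {"GO:3": 0.8},
}
toy_truth = {"P1": {"GO:1"}, "P2": {"GO:3", "GO:4"}}
print(round(f_max(toy_predictions, toy_truth), 4))
```

Raising the threshold trades recall for precision, and the reported value is the best harmonic mean found along that trade-off.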
Guo, Deyin; Spetz, Carl; Saarma, Mart; Valkonen, Jari P T
2003-05-01
Potyviral helper-component proteinase (HCpro) is a multifunctional protein exerting its cellular functions in interaction with putative host proteins. In this study, cellular protein partners of the HCpro encoded by Potato virus A (PVA) (genus Potyvirus) were screened in a potato leaf cDNA library using a yeast two-hybrid system. Two cellular proteins were obtained that interact specifically with PVA HCpro in yeast and in the two in vitro binding assays used. Both proteins are encoded by single-copy genes in the potato genome. Analysis of the deduced amino acid sequences revealed that one (HIP1) of the two HCpro interactors is a novel RING finger protein. The sequence of the other protein (HIP2) showed no resemblance to the protein sequences available from databanks that have known biological functions.
Functional Interaction Network Construction and Analysis for Disease Discovery.
Wu, Guanming; Haw, Robin
2017-01-01
Network-based approaches project seemingly unrelated genes or proteins onto a large-scale network context, thereby providing a holistic visualization and analysis platform for genomic data generated from high-throughput experiments, reducing the dimensionality of data by using network modules, and increasing statistical power. Based on the Reactome database, the most popular and comprehensive open-source biological pathway knowledgebase, we have developed a highly reliable protein functional interaction network covering around 60% of total human genes and an app called ReactomeFIViz for Cytoscape, the most popular biological network visualization and analysis platform. In this chapter, we describe the detailed procedures on how this functional interaction network is constructed by integrating multiple external data sources, extracting functional interactions from human curated pathway databases, building a machine learning classifier called a Naïve Bayesian Classifier, predicting interactions based on the trained Naïve Bayesian Classifier, and finally constructing the functional interaction database. We also provide an example on how to use ReactomeFIViz for performing network-based data analysis for a list of genes.
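The classifier step in the record above can be pictured with a small Bernoulli naive Bayes sketch. This is not the Reactome implementation: the three binary evidence features (for example, an interaction seen in an orthologous species, co-expression, a shared annotation), the toy training pairs, and the function names are all hypothetical and chosen only to show how independent evidence sources combine into one posterior probability.

```python
import math


def train_naive_bayes(examples, labels):
    """Fit a Bernoulli naive Bayes model on 0/1 feature tuples.

    Both classes (0 and 1) must be present in `labels`.
    Returns {class: (prior, [P(feature_j = 1 | class), ...])}.
    """
    n_features = len(examples[0])
    model = {}
    for cls in (0, 1):
        rows = [x for x, y in zip(examples, labels) if y == cls]
        prior = len(rows) / len(examples)
        # Laplace smoothing keeps every probability strictly inside (0, 1).
        probs = [
            (sum(row[j] for row in rows) + 1) / (len(rows) + 2)
            for j in range(n_features)
        ]
        model[cls] = (prior, probs)
    return model


def posterior_interaction(model, features):
    """Posterior probability that a protein pair is a functional interaction."""
    log_scores = {}
    for cls, (prior, probs) in model.items():
        score = math.log(prior)
        for value, p in zip(features, probs):
            score += math.log(p if value else 1 - p)
        log_scores[cls] = score
    top = max(log_scores.values())
    weights = {cls: math.exp(s - top) for cls, s in log_scores.items()}
    return weights[1] / (weights[0] + weights[1])


# Hypothetical training pairs: label 1 = curated interaction, 0 = random pair.
toy_examples = [(1, 1, 0), (1, 0, 1), (1, 1, 1), (0, 0, 0), (0, 1, 0), (0, 0, 0)]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_model = train_naive_bayes(toy_examples, toy_labels)
print(posterior_interaction(toy_model, (1, 1, 1)))
print(posterior_interaction(toy_model, (0, 0, 0)))
```

The naive independence assumption is what lets heterogeneous data sources be combined by simple multiplication of per-feature likelihoods.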
Hierarchical Partitioning of Metazoan Protein Conservation Profiles Provides New Functional Insights
Witztum, Jonathan; Persi, Erez; Horn, David; Pasmanik-Chor, Metsada; Chor, Benny
2014-01-01
The availability of many complete, annotated proteomes enables the systematic study of the relationships between protein conservation and functionality. We explore this question based solely on the presence or absence of protein homologues (a.k.a. conservation profiles). We study 18 metazoans, from two distinct points of view: the human's and the fly's. Using the GOrilla gene ontology (GO) analysis tool, we explore functional enrichment of the “universal proteins”, those with homologues in all 17 other species, and of the “non-universal proteins”. A large number of GO terms are strongly enriched in both human and fly universal proteins. Most of these functions are known to be essential. A smaller number of GO terms, exhibiting markedly different properties, are enriched in both human and fly non-universal proteins. We further explore the non-universal proteins, whose conservation profiles are consistent with the “tree of life” (TOL consistent), as well as the TOL inconsistent proteins. Finally, we applied Quantum Clustering to the conservation profiles of the TOL consistent proteins. Each cluster is strongly associated with one or a small number of specific monophyletic clades in the tree of life. The proteins in many of these clusters exhibit strong functional enrichment associated with the “life style” of the related clades. Most previous approaches for studying function and conservation are “bottom up”, studying protein families one by one, and separately assessing the conservation of each. By way of contrast, our approach is “top down”. We globally partition the set of all proteins hierarchically, as described above, and then identify protein families enriched within different subdivisions. While supporting previous findings, our approach also provides a tool for discovering novel relations between protein conservation profiles, functionality, and evolutionary history as represented by the tree of life. PMID:24594619
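The notion of a conservation profile used in the record above can be sketched as a presence/absence vector per protein. The protein names, the four-species toy panel, and the helper names below are assumptions for illustration; the study itself used 17 other species per reference organism.

```python
def split_by_universality(profiles):
    """Partition proteins into universal and non-universal sets.

    profiles: {protein: tuple of 0/1 flags, one per other species,
               where 1 means a homologue is present in that species}
    """
    universal = {name for name, flags in profiles.items() if all(flags)}
    non_universal = set(profiles) - universal
    return universal, non_universal


def hamming(profile_a, profile_b):
    """Number of species in which two conservation profiles disagree."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Hypothetical profiles over four other species.
toy_profiles = {
    "protA": (1, 1, 1, 1),  # homologue everywhere: universal
    "protB": (1, 1, 0, 0),  # restricted to a subset of species
    "protC": (1, 1, 0, 0),  # same profile as protB
    "protD": (0, 0, 0, 1),
}
universal, non_universal = split_by_universality(toy_profiles)
print(sorted(universal), sorted(non_universal))
print(hamming(toy_profiles["protB"], toy_profiles["protD"]))
```

Proteins with identical or near-identical profiles, such as protB and protC here, are the ones a clustering step would be expected to group together.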
Defining the Human Deubiquitinating Enzyme Interaction Landscape
Sowa, Mathew E.; Bennett, Eric J.; Gygi, Steven P.; Harper, J. Wade
2009-01-01
Summary: Deubiquitinating enzymes (Dubs) function to remove covalently attached ubiquitin from proteins, thereby controlling substrate activity and/or abundance. For most Dubs, their functions, targets, and regulation are poorly understood. To systematically investigate Dub function, we initiated a global proteomic analysis of Dubs and their associated protein complexes. This was accomplished through the development of a software platform, called CompPASS, which uses unbiased metrics to assign confidence measurements to interactions from parallel non-reciprocal proteomic datasets. We identified 774 candidate interacting proteins associated with 75 Dubs. Using Gene Ontology, interactome topology classification, sub-cellular localization and functional studies, we link Dubs to diverse processes, including protein turnover, transcription, RNA processing, DNA damage, and endoplasmic reticulum-associated degradation. This work provides the first glimpse into the Dub interaction landscape, places previously unstudied Dubs within putative biological pathways, and identifies previously unknown interactions and protein complexes involved in this increasingly important arm of the ubiquitin-proteasome pathway. PMID:19615732
Stanewsky, R.; Rendahl, K. G.; Dill, M.; Saumweber, H.
1993-01-01
We have performed a genetic analysis of the 14C region of the X chromosome of Drosophila melanogaster to isolate loss of function alleles of no-on-transient A (nonA; 14C1-2; 1-52.3). NONA is a nuclear protein common to many cell types, which is present in many puffs on polytene chromosomes. Sequence data suggest that the protein contains a pair of RNA binding motifs (RRM) found in many single-strand nucleic acid binding proteins. Hypomorphic alleles of this gene, which lead to aberrant visual and courtship song behavior, still contain normally distributed nonA RNA and NONA protein in embryos, and in all available alleles NONA protein is present in puffs of third instar larval polytene chromosomes. We find that complete loss of this general nuclear protein is semilethal in hemizygous males and homozygous cell lethal in the female germline. Surviving males show more extreme defects in nervous system function than have been described for the hypomorphic alleles. Five other essential genes that reside within this region have been partially characterized. PMID:8244005
Alborghetti, Marcos R.; Furlan, Ariane S.; Kobarg, Jörg
2011-01-01
Background: The FEZ (fasciculation and elongation protein zeta) family designation was proposed by Bloom and Horvitz by genetic analysis of C. elegans unc-76. Similar human sequences were identified in the expressed sequence tag database as FEZ1 and FEZ2. The unc-76 function is necessary for normal axon fasciculation and is required for axon-axon interactions. Indeed, the loss of UNC-76 function results in defects in axonal transport. The human FEZ1 protein has been shown to rescue defects caused by unc-76 mutations in nematodes, indicating that both UNC-76 and FEZ1 are evolutionarily conserved in their function. To date, little is known about FEZ2 protein function. Methodology/Principal Findings: Using the yeast two-hybrid system we demonstrate here conserved evolutionary features among orthologs and non-conserved features between paralogs of the FEZ family of proteins, by comparing the interactome profiles of the C-terminals of human FEZ1, FEZ2 and UNC-76 from C. elegans. Furthermore, we correlate our data with an analysis of the molecular evolution of the FEZ protein family in the animal kingdom. Conclusions/Significance: We found that FEZ2 interacted with 59 proteins and that of these only 40 interacted with FEZ1. Of the 40 FEZ1 interacting proteins, 36 (90%) also interacted with UNC-76, and none of the 19 FEZ2-specific proteins interacted with FEZ1 or UNC-76. This, together with the duplication of the unc-76 gene in the ancestral line of chordates, suggests that FEZ2 is in the process of acquiring new additional functions. The results also provide an explanation for the dramatic difference between C. elegans and D. melanogaster unc-76 mutants on one hand, which cause serious defects in the nervous system, and FEZ1 -/- knockout mice on the other, which show no morphological and no strong behavioural phenotype. Likely, the ubiquitously expressed FEZ2 can completely compensate for the lack of neuronal FEZ1, since it can interact with all FEZ1 interacting proteins and an additional 19 proteins. PMID:21408165
Lymphocyte signaling: beyond knockouts.
Saveliev, Alexander; Tybulewicz, Victor L J
2009-04-01
The analysis of lymphocyte signaling was greatly enhanced by the advent of gene targeting, which allows the selective inactivation of a single gene. Although this gene 'knockout' approach is often informative, in many cases, the phenotype resulting from gene ablation might not provide a complete picture of the function of the corresponding protein. If a protein has multiple functions within a single or several signaling pathways, or stabilizes other proteins in a complex, the phenotypic consequences of a gene knockout may manifest as a combination of several different perturbations. In these cases, gene targeting to 'knock in' subtle point mutations might provide more accurate insight into protein function. However, to be informative, such mutations must be carefully based on structural and biophysical data.
Building toy models of proteins using coevolutionary information
NASA Astrophysics Data System (ADS)
Cheng, Ryan; Raghunathan, Mohit; Onuchic, Jose
2015-03-01
Recent developments in global statistical methodologies have advanced the analysis of large collections of protein sequences for coevolutionary information. Coevolution between amino acids in a protein arises from compensatory mutations that are needed to maintain the stability or function of a protein over the course of evolution. This gives rise to quantifiable correlations between amino acid positions within the multiple sequence alignment of a protein family. Here, we use Direct Coupling Analysis (DCA) to infer a Potts model Hamiltonian governing the correlated mutations in a protein family to obtain the sequence-dependent interaction energies of a toy protein model. We demonstrate that this methodology predicts residue-residue interaction energies that are consistent with experimental mutational changes in protein stabilities as well as other computational methodologies. Furthermore, we demonstrate with several examples that DCA could be used to construct a structure-based model that quantitatively agrees with experimental data on folding mechanisms. This work serves as a potential framework for generating models of proteins that are enriched by evolutionary data that can potentially be used to engineer key functional motions and interactions in protein systems. This research has been supported by the NSF INSPIRE award MCB-1241332 and by the CTBP sponsored by the NSF (Grant PHY-1427654).
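For orientation, a Potts model of the kind inferred by DCA assigns each sequence an energy built from per-site fields and pairwise couplings, commonly written as H(S) = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j). The sketch below evaluates that expression for a toy three-residue alphabet; the parameter values, the sparse dictionary layout, and the function name `potts_energy` are assumptions for illustration and are not taken from the work above.

```python
def potts_energy(sequence, fields, couplings):
    """Evaluate H(S) = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j).

    sequence:  string of residue letters, one per position
    fields:    list of {residue: h_i(residue)}, one dict per position
    couplings: {(i, j): {(residue_i, residue_j): J_ij}} with i < j;
               missing entries are treated as zero coupling
    """
    energy = -sum(fields[i].get(res, 0.0) for i, res in enumerate(sequence))
    for i in range(len(sequence)):
        for j in range(i + 1, len(sequence)):
            pair = (sequence[i], sequence[j])
            energy -= couplings.get((i, j), {}).get(pair, 0.0)
    return energy


# Made-up parameters: positions 0 and 2 are coupled and favour the A-A pair.
toy_fields = [
    {"A": 0.5, "L": 0.0},
    {"A": 0.0, "L": 0.2},
    {"A": 0.1, "L": 0.0},
]
toy_couplings = {(0, 2): {("A", "A"): 1.0}}

wild_type = potts_energy("ALA", toy_fields, toy_couplings)
mutant = potts_energy("LLA", toy_fields, toy_couplings)
print(wild_type, mutant, mutant - wild_type)
```

In this toy setting the energy difference between the mutant and the reference sequence plays the role of a predicted mutational effect, with a positive difference indicating a less favourable sequence.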
Yang, Yongxin; Zhao, Xiaowei; Yu, Shumin; Cao, Suizhong
2015-02-01
Yak (Bos grunniens) is an important natural resource in mountainous regions. To date, few studies have addressed the differences in the protein profiles of yak colostrum and milk. We used quantitative proteomics to compare the protein profiles of whey from yak colostrum and milk. Milk samples were collected from 21 yaks after calving (1 and 28 d). Whey protein profiles were generated through isobaric tag for relative and absolute quantification (iTRAQ)-labelled proteomics. We identified 183 proteins in milk whey; of these, the expression levels of 86 proteins differed significantly between the whey from colostrum and milk. Haemoglobin expression showed the greatest change; its levels were significantly higher in the whey from colostrum than in mature milk whey. Functional analysis revealed that many of the differentially expressed proteins were associated with biological regulation and response to stimuli. Further, eight differentially expressed proteins involved in the complement and coagulation cascade pathway were enriched in milk whey. These findings add to the general understanding of the protein composition of yak milk, suggest potential functions of the differentially expressed proteins, and provide novel information on the role of colostral components in calf survival. © 2014 Society of Chemical Industry.
Zhuang, Fengfeng; Nguyen, Manuel P; Shuler, Charles; Liu, Yi-Hsin
2009-04-03
Previous studies have shown that Msx proteins control gene transcription predominantly through repression mechanisms. However, gene expression studies using either the gain-of-function or the loss-of-function mutants revealed many gene targets whose expression requires functional Msx proteins. To date, investigations into the mechanisms of Msx-dependent transactivation have been hindered by the lack of a responsive promoter. Here, we demonstrate the usefulness of the mouse Hspa1b promoter in probing Msx-dependent mechanisms of gene activation. We show that Msx protein activates the Hspa1b promoter via its C-terminal domain. The activation absolutely depends on the HSEs, and physical interactions between Msx proteins and heat shock factors may play a contributing role.
Papaleo, Elena
2015-01-01
In recent years, we have observed remarkable improvements in the field of protein dynamics. Indeed, we can now study protein dynamics in atomistic detail over several timescales with a rich portfolio of experimental and computational techniques. On one side, this gives us the possibility to validate simulation methods and physical models against a broad range of experimental observables. On the other side, it also allows a complementary and comprehensive view of protein structure and dynamics. What is needed now is a better understanding of the link between the dynamic properties that we observe and the functional properties of these important cellular machines. To make progress in this direction, we need to improve the physical models used to describe proteins and solvent in molecular dynamics, as well as to strengthen the integration of experiments and simulations to overcome their respective limitations. Moreover, now that we have the means to study protein dynamics in great detail, we need new tools to understand the information embedded in the protein ensembles and in their dynamic signature. With this aim in mind, we should enrich the current tools for analysis of biomolecular simulations with attention to the effects that can be propagated over long distances and are often associated with important biological functions. In this context, approaches inspired by network analysis can make an important contribution to the analysis of molecular dynamics simulations.
Kilic, Gamze; Wang, Junfeng; Sosa-Pineda, Beatriz
2008-01-01
Matricellular proteins mediate both tissue morphogenesis and tissue homeostasis in important ways because they modulate cell-matrix and cell-cell interactions. In this study, we found that the matricellular protein osteopontin (Opn) is a novel marker of undifferentiated pancreatic precursors and pancreatic ductal tissues in mice. Our analysis also underscored a specific, dynamic profile of Opn expression in embryonic pancreatic tissues that suggests the participation of this protein’s function in processes involving cell migration, cell-cell interactions, or both. Surprisingly, our analysis of Opn-deficient pancreata did not reveal obvious alterations in the morphology or differentiation of these tissues. Therefore, in embryonic pancreatic tissues, it is possible that other proteins act redundantly to Opn or that this protein’s function is dispensable for pancreas development. Finally, the maintenance of Opn expression in pancreatic tissues of adults argues for a possible function of this protein in injury and pathologic responses. PMID:16518820
Evolutionary conservation of Ebola virus proteins predicts important functions at residue level.
Arslan, Ahmed; van Noort, Vera
2017-01-15
The recent outbreak of Ebola virus disease (EVD) resulted in a large number of human deaths. Due to this devastation, the Ebola virus has attracted renewed interest as a model for virus evolution. Recent literature on Ebola virus (EBOV) has contributed substantially to our understanding of the underlying genetics and its scope with reference to the 2014 outbreak. However, no study has yet focused on the conservation patterns of EBOV proteins. We analyzed the evolution of functional regions of EBOV and highlight the function of conserved residues in protein activities. We apply an array of computational tools to dissect the functions of EBOV proteins in detail: (i) protein sequence conservation, (ii) protein-protein interactome analysis, (iii) structural modeling and (iv) kinase prediction. Our results suggest the presence of novel post-translational modifications in EBOV proteins and their role in the modulation of protein functions and protein interactions. Moreover, on the basis of the presence of ATM recognition motifs in all EBOV proteins, we postulate a role of DNA damage response pathways and ATM kinase in EVD. The ATM kinase is put forward, for further evaluation, as a novel potential therapeutic target. http://www.biw.kuleuven.be/CSB/EBOV-PTMs CONTACT: vera.vannoort@biw.kuleuven.be Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
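The motif scan mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the commonly cited minimal ATM/ATR substrate consensus, a serine or threonine followed by glutamine (S/T-Q); the abstract does not say which motif definition or prediction tool the authors used, and the function name and the toy sequence are hypothetical.

```python
import re

def find_sq_tq_motifs(protein_seq):
    """Return (1-based position, motif) for every S-Q or T-Q dipeptide.

    Assumption: the S/T-Q dipeptide is used as a stand-in for an
    "ATM recognition motif"; real predictors use richer sequence context.
    """
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group()) for m in re.finditer(r"[ST]Q", seq)]

# Toy peptide, not a real EBOV sequence.
hits = find_sq_tq_motifs("MASQLTQKSA")
print(hits)
```

A hit only marks a candidate site; whether it is phosphorylated in vivo would still need experimental support.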
Kankeu, Cynthia; Clarke, Kylie; Van Haver, Delphi; Gevaert, Kris; Impens, Francis; Dittrich, Anna; Roderick, H Llewelyn; Passante, Egle; Huber, Heinrich J
2018-05-17
The rat cardiomyoblast cell line H9C2 has emerged as a valuable tool for studying cardiac development, mechanisms of disease and toxicology. We present here a rigorous proteomic analysis that monitored the changes in protein expression during differentiation of H9C2 cells into cardiomyocyte-like cells over time. Quantitative mass spectrometry followed by gene ontology (GO) enrichment analysis revealed that early changes in H9C2 differentiation are related to protein pathways of cardiac muscle morphogenesis and sphingolipid synthesis. These changes in the proteome were followed later in the differentiation time-course by alterations in the expression of proteins involved in cation transport and beta-oxidation. Studying the temporal profile of the H9C2 proteome during differentiation in further detail revealed eight clusters of co-regulated proteins that can be associated with early, late, continuous and transient up- and downregulation. Subsequent Reactome pathway analysis based on these eight clusters further corroborated and detailed the results of the GO analysis. Specifically, this analysis confirmed that proteins related to pathways in muscle contraction are upregulated early and transiently, and proteins relevant to extracellular matrix organization are downregulated early. In contrast, upregulation of proteins related to cardiac metabolism occurs at later time points. Finally, independent validation of the proteomics results by immunoblotting confirmed hitherto unknown regulators of cardiac structure and ionic metabolism. Our results are consistent with a 'function follows form' model of differentiation, whereby early and transient alterations of structural proteins enable subsequent changes that are relevant to the characteristic physiology of cardiomyocytes.
Li, Edward B; Truong, Dawn; Hallett, Shawn A; Mukherjee, Kusumika; Schutte, Brian C; Liao, Eric C
2017-09-01
Large-scale sequencing efforts have captured a rapidly growing catalogue of genetic variations. However, the accurate establishment of gene variant pathogenicity remains a central challenge in translating personal genomics information to clinical decisions. Interferon Regulatory Factor 6 (IRF6) gene variants are significant genetic contributors to orofacial clefts. Although approximately three hundred IRF6 gene variants have been documented, their effects on protein functions remain difficult to interpret. Here, we demonstrate the protein functions of human IRF6 missense gene variants could be rapidly assessed in detail by their abilities to rescue the irf6 -/- phenotype in zebrafish through variant mRNA microinjections at the one-cell stage. The results revealed many missense variants previously predicted by traditional statistical and computational tools to be loss-of-function and pathogenic retained partial or full protein function and rescued the zebrafish irf6 -/- periderm rupture phenotype. Through mRNA dosage titration and analysis of the Exome Aggregation Consortium (ExAC) database, IRF6 missense variants were grouped by their abilities to rescue at various dosages into three functional categories: wild type function, reduced function, and complete loss-of-function. This sensitive and specific biological assay was able to address the nuanced functional significances of IRF6 missense gene variants and overcome many limitations faced by current statistical and computational tools in assigning variant protein function and pathogenicity. Furthermore, it unlocked the possibility for characterizing yet undiscovered human IRF6 missense gene variants from orofacial cleft patients, and illustrated a generalizable functional genomics paradigm in personalized medicine.
Wendler, Sergej; Hürtgen, Daniel; Kalinowski, Jörn; Klein, Andreas; Niehaus, Karsten; Schulte, Fabian; Schwientek, Patrick; Wehlmann, Hermann; Wehmeier, Udo F; Pühler, Alfred
2013-08-20
The pseudotetrasaccharide acarbose is a medically relevant secondary metabolite produced by strains of the genera Actinoplanes and Streptomyces. In this study gene products involved in acarbose metabolism were identified by analyzing the cytosolic and extracellular proteome of Actinoplanes sp. SE50/110 cultures grown in a high-maltose minimal medium. The analysis by 2D protein gel electrophoresis of cytosolic proteins of Actinoplanes sp. SE50/110 resulted in 318 protein spots and 162 identified proteins. Nine of those were acarbose cluster proteins (Acb-proteins), namely AcbB, AcbD, AcbE, AcbK, AcbL, AcbN, AcbR, AcbV and AcbZ. The analysis of proteins in the extracellular space of Actinoplanes sp. SE50/110 cultures resulted in about 100 protein spots and 22 identified proteins. The identifications included the three acarbose gene cluster proteins AcbD, AcbE and AcbZ. After their identification, proteins were classified into functional groups. The dominant functional groups were the carbohydrate binding, carbohydrate cleavage and carbohydrate transport proteins. The other functional groups included protein cleavage, amino acid degradation, nucleic acid cleavage and a number of functionally uncharacterized proteins. In addition, signal peptide structures of extracellularly found proteins were analyzed. Of the 22 detected proteins 19 contained signal peptides, while 2 had N-terminal transmembrane helices explaining their localization. The only protein having neither of them was enolase. Under the conditions applied, the secretome of Actinoplanes sp. SE50/110 was dominated by seven proteins involved in carbohydrate metabolism (PulA, AcbE, AcbD, MalE, AglE, CbpA and Cgt). Of special interest were the identified extracellular pullulanase PulA and the two solute-binding proteins MalE and AglE. The identifications suggest that Actinoplanes sp. SE50/110 has two maltose/maltodextrin import systems. We postulate the identified MalEFG transport system of Actinoplanes sp. 
SE50/110 as the missing acarbose-metabolite importer and present a model of acarbose metabolism that is extended by the newly identified gene products. Copyright © 2012 Elsevier B.V. All rights reserved.
Suplatov, Dmitry; Kirilin, Eugeny; Arbatsky, Mikhail; Takhaveev, Vakil; Svedas, Vytas
2014-07-01
The new web-server pocketZebra implements the power of bioinformatics and geometry-based structural approaches to identify and rank subfamily-specific binding sites in proteins by functional significance, and select particular positions in the structure that determine selective accommodation of ligands. A new scoring function has been developed to annotate binding sites by the presence of the subfamily-specific positions in diverse protein families. pocketZebra web-server has multiple input modes to meet the needs of users with different experience in bioinformatics. The server provides on-site visualization of the results as well as off-line version of the output in annotated text format and as PyMol sessions ready for structural analysis. pocketZebra can be used to study structure-function relationship and regulation in large protein superfamilies, classify functionally important binding sites and annotate proteins with unknown function. The server can be used to engineer ligand-binding sites and allosteric regulation of enzymes, or implemented in a drug discovery process to search for potential molecular targets and novel selective inhibitors/effectors. The server, documentation and examples are freely available at http://biokinet.belozersky.msu.ru/pocketzebra and there are no login requirements. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation
Ojha, Sunil; Watson, Douglas S.; Bomar, Martha G.; Galande, Amit K.; Shearer, Alexander G.
2013-01-01
The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation. PMID:24386392
Rapid identification of sequences for orphan enzymes to power accurate protein annotation.
Ramkissoon, Kevin R; Miller, Jennifer K; Ojha, Sunil; Watson, Douglas S; Bomar, Martha G; Galande, Amit K; Shearer, Alexander G
2013-01-01
The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology--"orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.
Comprehensive proteomic analysis of the human spliceosome
NASA Astrophysics Data System (ADS)
Zhou, Zhaolan; Licklider, Lawrence J.; Gygi, Steven P.; Reed, Robin
2002-09-01
The precise excision of introns from pre-messenger RNA is performed by the spliceosome, a macromolecular machine containing five small nuclear RNAs and numerous proteins. Much has been learned about the protein components of the spliceosome from analysis of individual purified small nuclear ribonucleoproteins and salt-stable spliceosome `core' particles. However, the complete set of proteins that constitutes intact functional spliceosomes has yet to be identified. Here we use maltose-binding protein affinity chromatography to isolate spliceosomes in highly purified and functional form. Using nanoscale microcapillary liquid chromatography tandem mass spectrometry, we identify ~145 distinct spliceosomal proteins, making the spliceosome the most complex cellular machine so far characterized. Our spliceosomes comprise all previously known splicing factors and 58 newly identified components. The spliceosome contains at least 30 proteins with known or putative roles in gene expression steps other than splicing. This complexity may be required not only for splicing multi-intronic metazoan pre-messenger RNAs, but also for mediating the extensive coupling between splicing and other steps in gene expression.
A Systematic Analysis of a Deep Mouse Epididymal Sperm Proteome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chauvin, Theodore; Xie, Fang; Liu, Tao
Spermatozoa are highly specialized cells that, when mature, are capable of navigating the female reproductive tract and fertilizing an oocyte. The sperm cell is thought to be largely quiescent in terms of transcriptional and translational activity. As a result, once it has left the male reproductive tract, the sperm cell is essentially operating with a static population of proteins. It is therefore theoretically possible to understand the protein networks contained in a sperm cell and to deduce its cellular function capabilities. To this end we have performed a proteomic analysis of mouse sperm isolated from the cauda epididymis and have confidently identified 2,850 proteins, which is the most comprehensive sperm proteome for any species reported to date. These proteins comprise many complete cellular pathways, including those for energy production via glycolysis, β-oxidation and oxidative phosphorylation, protein folding and transport, and cell signaling systems. This proteome should prove a useful tool for assembly and testing of protein networks important for sperm function.
ITRAQ-based quantitative proteomic analysis of Cynops orientalis limb regeneration.
Tang, Jie; Yu, Yuan; Zheng, Hanxue; Yin, Lu; Sun, Mei; Wang, Wenjun; Cui, Jihong; Liu, Wenguang; Xie, Xin; Chen, Fulin
2017-09-22
Salamanders regenerate their limbs after amputation. However, the molecular mechanism of this unique regeneration remains unclear. In this study, isobaric tags for relative and absolute quantification (iTRAQ) coupled with liquid chromatography tandem mass spectrometry (LC-MS/MS) was employed to quantitatively identify differentially expressed proteins in regenerating limbs 3, 7, 14, 30 and 42 days post amputation (dpa). Of 2636 proteins detected in total, 253 proteins were differentially expressed during different regeneration stages. Among these proteins, Asporin, Cadherin-13, Keratin, Collagen alpha-1(XI) and Titin were down-regulated, whereas CAPG, Coronin-1A, Annexin A1 and Cathepsin B were up-regulated compared with the control. The identified proteins were further analyzed to obtain information about their expression patterns and functions in limb regeneration. Functional analysis indicated that the differentially expressed proteins were associated with wound healing, immune response, cellular process, metabolism and binding. This work indicated that significant proteome alterations occurred during salamander limb regeneration. The results may provide fundamental knowledge to understand the mechanism of limb regeneration.
Proteomic Analysis of the Mediator Complex Interactome in Saccharomyces cerevisiae.
Uthe, Henriette; Vanselow, Jens T; Schlosser, Andreas
2017-02-27
Here we present the most comprehensive analysis of the yeast Mediator complex interactome to date. Particularly gentle cell lysis and co-immunopurification conditions allowed us to preserve even transient protein-protein interactions and to comprehensively probe the molecular environment of the Mediator complex in the cell. Metabolic 15N-labeling thereby enabled stringent discrimination between bona fide interaction partners and nonspecifically captured proteins. Our data indicate a functional role for Mediator beyond transcription initiation. We identified a large number of Mediator-interacting proteins and protein complexes, such as RNA polymerase II, general transcription factors, a large number of transcriptional activators, the SAGA complex, chromatin remodeling complexes, histone chaperones, highly acetylated histones, as well as proteins playing a role in co-transcriptional processes, such as splicing, mRNA decapping and mRNA decay. Moreover, our data provide clear evidence that the Mediator complex interacts not only with RNA polymerase II, but also with RNA polymerases I and III, and indicate a functional role of the Mediator complex in rRNA processing and ribosome biogenesis.
2012-01-01
Background GDSL esterases/lipases are a newly discovered subclass of lipolytic enzymes that are very important and attractive research subjects because of their multifunctional properties, such as broad substrate specificity and regiospecificity. Compared with the current knowledge regarding these enzymes in bacteria, our understanding of the plant GDSL enzymes is very limited, although the GDSL gene family includes numerous members in many fully sequenced plant genomes. Only two genes from a large rice GDSL esterase/lipase gene family were previously characterised, and the majority of the members remain unknown. In the present study, we describe the rice OsGELP (Oryza sativa GDSL esterase/lipase protein) gene family at the genomic and proteomic levels, and use this knowledge to provide insights into the multifunctionality of the rice OsGELP enzymes. Results In this study, an extensive bioinformatics analysis identified 114 genes in the rice OsGELP gene family. A complete overview of this family in rice is presented, including the chromosome locations, gene structures, phylogeny, and protein motifs. Among the OsGELPs and the plant GDSL esterase/lipase proteins of known functions, 41 motifs were found that represent the core secondary structure elements or appear specifically in different phylogenetic subclades. The specification and distribution of the identified putative conserved clade-common and clade-specific peptide motifs, and their location on the predicted protein three-dimensional structure, may signify their functional roles. Potentially important regions for substrate specificity are highlighted, in accordance with the protein three-dimensional model and the location of the phylogenetically specific conserved motifs. The differential expression of some representative genes was confirmed by quantitative real-time PCR. The phylogenetic analysis, together with protein motif architectures and the expression profiling, was used to predict the possible biological functions of the rice OsGELP genes. Conclusions Our current genomic analysis, for the first time, presents fundamental information on the organization of the rice OsGELP gene family. By combining the genomic, phylogenetic, microarray expression, protein motif distribution, and protein structure analyses, we were able to create a supported basis for the functional prediction of many members in the rice GDSL esterase/lipase family. The present study provides a platform for the selection of candidate genes for further detailed functional study. PMID:22793791
Le, N; Simon, M A
1998-08-01
DRK, the Drosophila homolog of the SH2-SH3 domain adaptor protein Grb2, is required during signaling by the sevenless receptor tyrosine kinase (SEV). One role of DRK is to provide a link between activated SEV and the Ras1 activator SOS. We have investigated the possibility that DRK performs other functions by identifying additional DRK-binding proteins. We show that the phosphotyrosine-binding (PTB) domain-containing protein Disabled (DAB) binds to the DRK SH3 domains. DAB is expressed in the ommatidial clusters, and loss of DAB function disrupts ommatidial development. Moreover, reduction of DAB function attenuates signaling by a constitutively activated SEV. Our biochemical analysis suggests that DAB binds SEV directly via its PTB domain, becomes tyrosine phosphorylated upon SEV activation, and then serves as an adaptor protein for SH2 domain-containing proteins. Taken together, these results indicate that DAB is a novel component of the SEV signaling pathway.
Family-specific scaling laws in bacterial genomes.
De Lazzari, Eleonora; Grilli, Jacopo; Maslov, Sergei; Cosentino Lagomarsino, Marco
2017-07-27
Among several quantitative invariants found in evolutionary genomics, one of the most striking is the scaling of the overall abundance of proteins, or protein domains, sharing a specific functional annotation across genomes of given size. The sizes of these functional categories change, on average, as power-laws in the total number of protein-coding genes. Here, we show that such regularities are not restricted to the overall behavior of high-level functional categories, but also exist systematically at the level of single evolutionary families of protein domains. Specifically, the number of proteins within each family follows family-specific scaling laws with genome size. Functionally similar sets of families tend to follow similar scaling laws, but this is not always the case. To understand this systematically, we provide a comprehensive classification of families based on their scaling properties. Additionally, we develop a quantitative score for the heterogeneity of the scaling of families belonging to a given category or predefined group. Under the common reasonable assumption that selection is driven solely or mainly by biological function, these findings point to fine-tuned and interdependent functional roles of specific protein domains, beyond our current functional annotations. This analysis provides a deeper view on the links between evolutionary expansion of protein families and the functional constraints shaping the gene repertoire of bacterial genomes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
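A power-law scaling of the kind described above, count ≈ a · (genome size)^β, can be estimated by ordinary least squares on log-transformed values. The sketch below is a generic illustration under that assumption, not the authors' fitting procedure; the function name and the synthetic data are hypothetical.

```python
import math

def fit_power_law(genome_sizes, family_counts):
    """Fit count = a * size**beta by least squares in log-log space.

    Assumes all sizes and counts are strictly positive.
    Returns the prefactor a and the scaling exponent beta.
    """
    xs = [math.log(n) for n in genome_sizes]
    ys = [math.log(c) for c in family_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                        # slope of the log-log line
    a = math.exp(mean_y - beta * mean_x)    # intercept, back-transformed
    return a, beta

# Synthetic data generated from a = 0.01, beta = 1.5 (not real genomes).
sizes = [1000, 2000, 4000, 8000]
counts = [0.01 * s ** 1.5 for s in sizes]
a, beta = fit_power_law(sizes, counts)
print(round(a, 4), round(beta, 4))
```

Fitting each family separately and comparing the resulting exponents is one simple way to see whether families within a functional category scale alike.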
PDB2Graph: A toolbox for identifying critical amino acids map in proteins based on graph theory.
Niknam, Niloofar; Khakzad, Hamed; Arab, Seyed Shahriar; Naderi-Manesh, Hossein
2016-05-01
The integrative and cooperative nature of protein structure involves the assessment of topological and global features of constituent parts. Network concept takes complete advantage of both of these properties in the analysis concomitantly. High compatibility to structural concepts or physicochemical properties in addition to exploiting a remarkable simplification in the system has made network an ideal tool to explore biological systems. There are numerous examples in which different protein structural and functional characteristics have been clarified by the network approach. Here, we present an interactive and user-friendly Matlab-based toolbox, PDB2Graph, devoted to protein structure network construction, visualization, and analysis. Moreover, PDB2Graph is an appropriate tool for identifying critical nodes involved in protein structural robustness and function based on centrality indices. It maps critical amino acids in protein networks and can greatly aid structural biologists in selecting proper amino acid candidates for manipulating protein structures in a more reasonable and rational manner. To introduce the capability and efficiency of PDB2Graph in detail, the structural modification of Calmodulin through allosteric binding of Ca(2+) is considered. In addition, a mutational analysis for three well-identified model proteins including Phage T4 lysozyme, Barnase and Ribonuclease HI, was performed to inspect the influence of mutating important central residues on protein activity. Copyright © 2016 Elsevier Ltd. All rights reserved.
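The idea of ranking residues by centrality in a protein structure network can be sketched as follows. This is a minimal Python illustration, not the PDB2Graph (Matlab) interface: the 7 Å cutoff between C-alpha atoms, the function names, and the toy coordinates are all assumptions made for the example.

```python
import math
from collections import deque

def contact_network(ca_coords, cutoff=7.0):
    """Link residues whose C-alpha atoms lie within `cutoff` angstroms."""
    n = len(ca_coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def degree_centrality(adj):
    """Fraction of the other residues each residue is in contact with."""
    n = len(adj)
    return {v: (len(nb) / (n - 1) if n > 1 else 0.0) for v, nb in adj.items()}

def closeness_centrality(adj):
    """Reachable residues divided by the sum of shortest-path lengths (BFS)."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result

# Toy chain of five residues spaced 5 angstroms apart (not a real structure).
coords = [(5.0 * i, 0.0, 0.0) for i in range(5)]
network = contact_network(coords)
closeness = closeness_centrality(network)
print(max(closeness, key=closeness.get))
```

In a real structure the coordinates would come from a PDB file, and residues with the highest scores would be the candidates to examine by mutation.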
Bag, Susmita; Ramaiah, Sudha; Anbarasu, Anand
2015-01-07
Network study on genes and proteins offers functional basics of the complexity of gene and protein, and its interacting partners. The gene fatty acid-binding protein 4 (fabp4) is found to be highly expressed in adipose tissue, and is one of the most abundant proteins in mature adipocytes. Our investigations on functional modules of fabp4 provide useful information on the functional genes interacting with fabp4, their biochemical properties and their regulatory functions. The present study shows that there are eight candidate genes: acp1, ext2, insr, lipe, ostf1, sncg, usp15, and vim that are strongly and functionally linked with fabp4. Gene ontological analysis of network modules of fabp4 provides an explicit idea on the functional aspect of fabp4 and its interacting nodes. The hierarchical mapping on gene ontology indicates gene specific processes and functions as well as their compartmentalization in tissues. The fabp4 along with its interacting genes are involved in lipid metabolic activity and are integrated in multi-cellular processes of tissues and organs. They also have important protein/enzyme binding activity. Our study elucidated disease-associated nsSNP prediction for fabp4 and it is interesting to note that there are four rsIDs (rs1051231, rs3204631, rs140925685 and rs141169989) with disease allelic variation (T104P, T126P, G27D and G90V respectively). On the whole, our gene network analysis presents a clear insight about the interactions and functions associated with fabp4 gene network. Copyright © 2014 Elsevier Ltd. All rights reserved.
Herrera, Victoria L M; Steffen, Martin; Moran, Ann Marie; Tan, Glaiza A; Pasion, Khristine A; Rivera, Keith; Pappin, Darryl J; Ruiz-Opazo, Nelson
2016-06-14
In contrast to rat and mouse databases, the NCBI gene database lists the human dual-endothelin1/VEGFsp receptor (DEspR, formerly Dear) as a unitary transcribed pseudogene due to a stop [TGA]-codon at codon#14 in automated DNA and RNA sequences. However, re-analysis is needed given prior single gene studies detected a tryptophan [TGG]-codon#14 by manual Sanger sequencing, demonstrated DEspR translatability and functionality, and since the demonstration of actual non-translatability through expression studies, the standard-of-excellence for pseudogene designation, has not been performed. Re-analysis must meet UNIPROT criteria for demonstration of a protein's existence at the highest (protein) level, which a priori, would override DNA- or RNA-based deductions. To dissect the nucleotide sequence discrepancy, we performed Maxam-Gilbert sequencing and reviewed 727 RNA-seq entries. To comply with the highest level multiple UNIPROT criteria for determining DEspR's existence, we performed various experiments using multiple anti-DEspR monoclonal antibodies (mAbs) targeting distinct DEspR epitopes with one spanning the contested tryptophan [TGG]-codon#14, assessing: (a) DEspR protein expression, (b) predicted full-length protein size, (c) sequence-predicted protein-specific properties beyond codon#14: receptor glycosylation and internalization, (d) protein-partner interactions, and (e) DEspR functionality via DEspR-inhibition effects. Maxam-Gilbert sequencing and some RNA-seq entries demonstrate two guanines, hence a tryptophan [TGG]-codon#14 within a compression site spanning an error-prone compression sequence motif. Western blot analysis using anti-DEspR mAbs targeting distinct DEspR epitopes detect the identical glycosylated 17.5 kDa pull-down protein. Decrease in DEspR-protein size after PNGase-F digest demonstrates post-translational glycosylation, concordant with the consensus-glycosylation site beyond codon#14. 
Like other small single-transmembrane proteins, mass spectrometry analysis of anti-DEspR mAb pull-down proteins do not detect DEspR, but detect DEspR-protein interactions with proteins implicated in intracellular trafficking and cancer. FACS analyses also detect DEspR-protein in different human cancer stem-like cells (CSCs). DEspR-inhibition studies identify DEspR-roles in CSC survival and growth. Live cell imaging detects fluorescently-labeled anti-DEspR mAb targeted-receptor internalization, concordant with the single internalization-recognition sequence also located beyond codon#14. Data confirm translatability of DEspR, the full-length DEspR protein beyond codon#14, and elucidate DEspR-specific functionality. Along with detection of the tryptophan [TGG]-codon#14 within an error-prone compression site, cumulative data demonstrating DEspR protein existence fulfill multiple UNIPROT criteria, thus refuting its pseudogene designation.
Meiler, Arno; Klinger, Claudia; Kaufmann, Michael
2012-09-08
The COG database is the most popular collection of orthologous proteins from many different completely sequenced microbial genomes. Per definition, a cluster of orthologous groups (COG) within this database exclusively contains proteins that most likely achieve the same cellular function. Recently, the COG database was extended by assigning to every protein both the corresponding amino acid and its encoding nucleotide sequence resulting in the NUCOCOG database. This extended version of the COG database is a valuable resource connecting sequence features with the functionality of the respective proteins. Here we present ANCAC, a web tool and MySQL database for the analysis of amino acid, nucleotide, and codon frequencies in COGs on the basis of freely definable phylogenetic patterns. We demonstrate the usefulness of ANCAC by analyzing amino acid frequencies, codon usage, and GC-content in a species- or function-specific context. With respect to amino acids we, at least in part, confirm the cognate bias hypothesis by using ANCAC's NUCOCOG dataset as the largest one available for that purpose thus far. Using the NUCOCOG datasets, ANCAC connects taxonomic, amino acid, and nucleotide sequence information with the functional classification via COGs and provides a GUI for flexible mining for sequence-bias. Thereby, to our knowledge, it is the only tool for the analysis of sequence composition in the light of physiological roles and phylogenetic context without requirement of substantial programming-skills. PMID:22958836
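The quantities named in this abstract (codon frequencies and GC-content) can be sketched in a few lines. This is a minimal illustration and not ANCAC's implementation; the function names and the example sequence are assumptions made for the sketch.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def codon_frequencies(cds):
    """Relative frequency of each in-frame codon; trailing partial codons are ignored."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


example = "ATGGCCGCCTAA"  # hypothetical four-codon sequence: ATG GCC GCC TAA
print(gc_content(example))
print(codon_frequencies(example))
```

Aggregating such per-sequence values over all members of a COG, or over all sequences from a chosen set of species, would give the kind of function- or species-specific comparison the abstract describes.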
Improving protein complex classification accuracy using amino acid composition profile.
Huang, Chien-Hung; Chou, Szu-Yu; Ng, Ka-Lok
2013-09-01
Protein complex prediction approaches are based on the assumptions that complexes have dense protein-protein interactions and high functional similarity between their subunits. We investigated those assumptions by studying the subunits' interaction topology, sequence similarity and molecular function for human and yeast protein complexes. Inclusion of amino acids' physicochemical properties can provide better understanding of protein complex properties. Principal component analysis is carried out to determine the major features. Adopting amino acid composition profile information with the SVM classifier serves as an effective post-processing step for complexes classification. Improvement is based on primary sequence information only, which is easy to obtain. Copyright © 2013 Elsevier Ltd. All rights reserved.
Levine, Mia T; Holloway, Alisha K; Arshad, Umbreen; Begun, David J
2007-11-01
Dosage compensation refers to the equalization of X-linked gene transcription among heterogametic and homogametic sexes. In Drosophila, the dosage compensation complex (DCC) mediates the twofold hypertranscription of the single male X chromosome. Loss-of-function mutations at any DCC protein-coding gene are male lethal. Here we report a population genetic analysis suggesting that four of the five core DCC proteins--MSL1, MSL2, MSL3, and MOF--are evolving under positive selection in D. melanogaster. Within these four proteins, several domains that range in function from X chromosome localization to protein-protein interactions have elevated, D. melanogaster-specific, amino acid divergence.
The analysis of Arabidopsis thaliana overexpressing a 14kDa self-folding protein [abstract]
USDA-ARS's Scientific Manuscript database
A recent study in banana identified a 14 kDa protein that has been hypothesized to function in regulating the nucleation and growth of the needle-shaped crystals of calcium oxalate that accumulate within the tissues of this plant. To gain further insight into the functional role of this 14 kDa prote...
Structure-Based Phylogenetic Analysis of the Lipocalin Superfamily.
Lakshmi, Balasubramanian; Mishra, Madhulika; Srinivasan, Narayanaswamy; Archunan, Govindaraju
2015-01-01
Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study with 39 protein domains from the lipocalin superfamily suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. The detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster based on their mode of ligand binding, though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers for assigning putative functions to the domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families where members show poor sequence similarity but high structural similarity.
Benoit, Joshua B; Attardo, Geoffrey M; Michalkova, Veronika; Krause, Tyler B; Bohova, Jana; Zhang, Qirui; Baumann, Aaron A; Mireji, Paul O; Takáč, Peter; Denlinger, David L; Ribeiro, Jose M; Aksoy, Serap
2014-04-01
In tsetse flies, nutrients for intrauterine larval development are synthesized by the modified accessory gland (milk gland) and provided in mother's milk during lactation. Interference with at least two milk proteins has been shown to extend larval development and reduce fecundity. The goal of this study was to perform a comprehensive characterization of tsetse milk proteins using lactation-specific transcriptome/milk proteome analyses and to define functional role(s) for the milk proteins during lactation. Differential analysis of RNA-seq data from lactating and dry (non-lactating) females revealed enrichment of transcripts coding for protein synthesis machinery, lipid metabolism and secretory proteins during lactation. Among the genes induced during lactation were those encoding the previously identified milk proteins (milk gland proteins 1-3, transferrin and acid sphingomyelinase 1) and seven new genes (mgp4-10). The genes encoding mgp2-10 are organized on a 40 kb syntenic block in the tsetse genome, have similar exon-intron arrangements, and share regions of amino acid sequence similarity. Expression of mgp2-10 is female-specific and high during milk secretion. While knockdown of a single mgp failed to reduce fecundity, simultaneous knockdown of multiple variants reduced milk protein levels and lowered fecundity. The genomic localization, gene structure similarities, and functional redundancy of MGP2-10 suggest that they constitute a novel highly divergent protein family. Our data indicates that MGP2-10 function both as the primary amino acid resource for the developing larva and in the maintenance of milk homeostasis, similar to the function of the mammalian casein family of milk proteins. This study underscores the dynamic nature of the lactation cycle and identifies a novel family of lactation-specific proteins, unique to Glossina sp., that are essential to larval development. 
The specificity of MGP2-10 to tsetse and their critical role during lactation suggests that these proteins may be an excellent target for tsetse-specific population control approaches.
da Fonseca, Néli José; Lima Afonso, Marcelo Querino; Pedersolli, Natan Gonçalves; de Oliveira, Lucas Carrijo; Andrade, Dhiego Souto; Bleicher, Lucas
2017-10-28
Flaviviruses are responsible for serious diseases such as dengue, yellow fever, and zika fever. Their genomes encode a polyprotein which, after cleavage, results in three structural and seven non-structural proteins. Homologous proteins can be studied by conservation and coevolution analysis as detected in multiple sequence alignments, usually reporting positions which are strictly necessary for the structure and/or function of all members in a protein family or which are involved in a specific sub-class feature requiring the coevolution of residue sets. This study provides a complete conservation and coevolution analysis on all flaviviruses non-structural proteins, with results mapped on all well-annotated available sequences. A literature review on the residues found in the analysis enabled us to compile available information on their roles and distribution among different flaviviruses. Also, we provide the mapping of conserved and coevolved residues for all sequences currently in SwissProt as a supplementary material, so that particularities in different viruses can be easily analyzed. Copyright © 2017 Elsevier Inc. All rights reserved.
Greenwood, Edward JD; Matheson, Nicholas J; Wals, Kim; van den Boomen, Dick JH; Antrobus, Robin; Williamson, James C; Lehner, Paul J
2016-01-01
Viruses manipulate host factors to enhance their replication and evade cellular restriction. We used multiplex tandem mass tag (TMT)-based whole cell proteomics to perform a comprehensive time course analysis of >6500 viral and cellular proteins during HIV infection. To enable specific functional predictions, we categorized cellular proteins regulated by HIV according to their patterns of temporal expression. We focussed on proteins depleted with similar kinetics to APOBEC3C, and found the viral accessory protein Vif to be necessary and sufficient for CUL5-dependent proteasomal degradation of all members of the B56 family of regulatory subunits of the key cellular phosphatase PP2A (PPP2R5A-E). Quantitative phosphoproteomic analysis of HIV-infected cells confirmed Vif-dependent hyperphosphorylation of >200 cellular proteins, particularly substrates of the aurora kinases. The ability of Vif to target PPP2R5 subunits is found in primate and non-primate lentiviral lineages, and remodeling of the cellular phosphoproteome is therefore a second ancient and conserved Vif function. DOI: http://dx.doi.org/10.7554/eLife.18296.001 PMID:27690223
Baltoumas, Fotis A; Theodoropoulou, Margarita C; Hamodrakas, Stavros J
2016-06-01
A significant amount of experimental evidence suggests that G-protein coupled receptors (GPCRs) do not act exclusively as monomers but also form biologically relevant dimers and oligomers. However, the structural determinants, stoichiometry and functional importance of GPCR oligomerization remain topics of intense speculation. In this study we attempted to evaluate the nature and dynamics of GPCR oligomeric interactions. A representative set of GPCR homodimers were studied through Coarse-Grained Molecular Dynamics simulations, combined with interface analysis and concepts from network theory for the construction and analysis of dynamic structural networks. Our results highlight important structural determinants that seem to govern receptor dimer interactions. A conserved dynamic behavior was observed among different GPCRs, including receptors belonging in different GPCR classes. Specific GPCR regions were highlighted as the core of the interfaces. Finally, correlations of motion were observed between parts of the dimer interface and GPCR segments participating in ligand binding and receptor activation, suggesting the existence of mechanisms through which dimer formation may affect GPCR function. The results of this study can be used to drive experiments aimed at exploring GPCR oligomerization, as well as in the study of transmembrane protein-protein interactions in general.
GSyellow, a Multifaceted Tag for Functional Protein Analysis in Monocot and Dicot Plants.
Besbrugge, Nienke; Van Leene, Jelle; Eeckhout, Dominique; Cannoot, Bernard; Kulkarni, Shubhada R; De Winne, Nancy; Persiau, Geert; Van De Slijke, Eveline; Bontinck, Michiel; Aesaert, Stijn; Impens, Francis; Gevaert, Kris; Van Damme, Daniel; Van Lijsebettens, Mieke; Inzé, Dirk; Vandepoele, Klaas; Nelissen, Hilde; De Jaeger, Geert
2018-06-01
The ability to tag proteins has boosted the emergence of generic molecular methods for protein functional analysis. Fluorescent protein tags are used to visualize protein localization, and affinity tags enable the mapping of molecular interactions by, for example, tandem affinity purification or chromatin immunoprecipitation. To apply these widely used molecular techniques on a single transgenic plant line, we developed a multifunctional tandem affinity purification tag, named GSyellow, which combines the streptavidin-binding peptide tag with citrine yellow fluorescent protein. We demonstrated the versatility of the GSyellow tag in the dicot Arabidopsis (Arabidopsis thaliana) using a set of benchmark proteins. For proof of concept in monocots, we assessed the localization and dynamic interaction profile of the leaf growth regulator ANGUSTIFOLIA3 (AN3), fused to the GSyellow tag, along the growth zone of the maize (Zea mays) leaf. To further explore the function of ZmAN3, we mapped its DNA-binding landscape in the growth zone of the maize leaf through chromatin immunoprecipitation sequencing. Comparison with AN3 target genes mapped in the developing maize tassel or in Arabidopsis cell cultures revealed strong conservation of AN3 target genes between different maize tissues and across monocots and dicots, respectively. In conclusion, the GSyellow tag offers a powerful molecular tool for distinct types of protein functional analyses in dicots and monocots. As this approach involves transforming a single construct, it is likely to accelerate both basic and translational plant research. © 2018 American Society of Plant Biologists. All rights reserved.
Seminal plasma and sperm proteome of ring-tailed coatis (Nasua nasua, Linnaeus, 1766).
Silva, Herlon Victor Rodrigues; Rodriguez-Villamil, Paula; Magalhães, Francisco Felipe de; Nunes, Thalles Gothardo Pereira; Freitas, Luana Azevedo de; Ribeiro, Leandro Rodrigues; Silva, Alexandre Rodrigues; Moura, Arlindo A; Silva, Lúcia Daniel Machado da
2018-04-15
The ring-tailed coati is listed as a species of least concern on the International Union for Conservation of Nature (IUCN) Red List; however, there has been a sharp decline in its population. The present study was conducted to evaluate the major proteins of both seminal plasma and sperm in ring-tailed coatis. Semen samples were collected from three adult coatis and evaluated for their morphological characteristics. The samples were then centrifuged to separate spermatozoa from seminal plasma and stored in liquid nitrogen. The seminal plasma and sperm proteins were subjected to one-dimensional (1-D) sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and identified by mass spectrometry. Gene ontology and protein networks were analyzed using bioinformatics tools. Based on sperm concentration and average protein content of the semen, the concentration of protein/spermatozoon was found to be 104.69 ± 44.43 μg. The analysis of SDS-PAGE gels showed 20.3 ± 3.1 and 17 ± 2 protein bands/lane for seminal plasma and sperm, respectively. In-gel protein digestion and peptide analysis by mass spectrometry revealed 238 and 246 proteins in the seminal plasma and sperm, respectively. The gene ontology analysis revealed that the proteins of seminal plasma mainly participated in cellular (35%) and regulatory (21%) processes. According to their cellular localization, seminal plasma proteins were categorized as structural (18%), extracellular (17%), and nuclear (14%) proteins with molecular functions such as catalytic activity (43%) and binding (43%). The sperm proteins were also involved in cellular (38%) and regulatory (23%) processes, and mainly categorized as extracellular (17%), nuclear (13%), and cytoplasmic (10%) proteins. The major molecular functions of the sperm proteins were catalytic activity (44%) and binding (42%). 
These results indicated that the seminal plasma of ring-tailed coati has an array of proteins that can potentially modulate several sperm functions, from sperm protection to oocyte binding. However, further studies are necessary to interpret the roles of these major seminal plasma proteins in coatis. Copyright © 2018 Elsevier Inc. All rights reserved.
Comparison of Normal and Breast Cancer Cell lines using Proteome, Genome and Interactome data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Patwardhan, Anil J.; Strittmatter, Eric F.; Camp, David G.
2005-12-01
Normal and cancer cell line proteomes were profiled using high-throughput mass spectrometry techniques. Application of both protein-level and peptide-level sample fractionation combined with LC-MS/MS analysis enabled the confident identification of 2,235 unmodified proteins representing a broad range of functional and compartmental classes. An iterative multi-step search strategy was used to identify post-translational modifications, and it detected several proteins that are preferentially modified in cancer cells. Information regarding both unmodified and modified protein forms was combined with publicly available gene expression and protein-protein interaction data. The resulting integrated dataset revealed several functionally related proteins that are differentially regulated between normal and cancer cell lines.
Dong, Zhaoming; Zhao, Ping; Wang, Chen; Zhang, Yan; Chen, Jianping; Wang, Xin; Lin, Ying; Xia, Qingyou
2013-11-01
Silkworms (Bombyx mori) produce massive amounts of silk proteins to make cocoons during the final stages of larval development. Although the major components, fibroin and sericin, have been the focus for a long time, few researchers have realized the complexity of the silk proteome. We collected seven kinds of silk fibers spun by silkworm larvae at different developmental stages: the silks spun by newly hatched larvae, second instar day 0 larvae, third instar day 0 larvae, fourth instar day 0 larvae, and fourth instar molting larvae, the scaffold silk used to attach the cocoon to the substrate, and the cocoon silk. Analysis by liquid chromatography-tandem mass spectrometry identified 500 proteins from the seven silks. In addition to the expected fibroins, sericins, and some known protease inhibitors, we also identified further protease inhibitors, enzymes, proteins of unknown function, and other proteins. Unsurprisingly, our quantitative results showed that fibroins and sericins were the most abundant proteins in all seven silks. Apart from fibroins and sericins, protease inhibitors, enzymes, and proteins of unknown function were more abundant than other proteins. We found significant changes in silk protein composition through development, consistent with their different biological functions and complicated formation.
Structural and evolutionary analysis of Leishmania Alba proteins.
da Costa, Kauê Santana; Galúcio, João Marcos Pereira; Leonardo, Elvis Santos; Cardoso, Guelber; Leal, Élcio; Conde, Guilherme; Lameira, Jerônimo
2017-10-01
The Alba superfamily proteins share a common RNA-binding domain. These proteins participate in a variety of regulatory pathways by controlling developmental gene expression. They also interact with ribosomal subunits, translation factors, and other RNA-binding proteins. The Leishmania infantum genome encodes two Alba-domain proteins, LiAlba1 and LiAlba3. In this work, we used homology modeling, protein-protein docking, and molecular dynamics (MD) simulations to explore the details of the Alba1-Alba3-RNA complex from Leishmania infantum at the molecular level. In addition, we compared the structure of LiAlba3 with the human ribonuclease P component, Rpp20. We also mapped the ligand-binding residues on the Alba3 surface to analyze its druggability and performed mutational analyses in Alba3 using alanine scanning to identify residues involved in its function and structural stability. These results suggest that the RGG-box motif of LiAlba1 is important for protein function and stability. Finally, we discuss the function of Alba proteins in the context of pathogen adaptation to host cells. The data provided herein will facilitate further translational research regarding Alba structure and function. Copyright © 2017 Elsevier B.V. All rights reserved.
Proteome analysis of Physcomitrella patens exposed to progressive dehydration and rehydration
Cui, Suxia; Hu, Jia; Guo, Shilei; Wang, Jie; Cheng, Yali; Dang, Xinxing; Wu, Lili; He, Yikun
2012-01-01
Physcomitrella patens is an extremely dehydration-tolerant moss. However, the molecular basis of its responses to loss of cellular water remains unclear. A comprehensive proteomic analysis of dehydration- and rehydration-responsive proteins has been conducted using quantitative two-dimensional difference in-gel electrophoresis (2D-DIGE), and traditional 2-D gel electrophoresis (2-DE) combined with MALDI TOF/TOF MS. Of the 216 differentially-expressed protein spots, 112 and 104 were dehydration- and rehydration-responsive proteins, respectively. The functional categories of the most differentially-expressed proteins were seed maturation, defence, protein synthesis and quality control, and energy production. Strikingly, most of the late embryogenesis abundant (LEA) proteins were expressed at a basal level under control conditions and their synthesis was strongly enhanced by dehydration, a pattern that was confirmed by RT-PCR. Actinoporins, phosphatidylethanolamine-binding protein, arabinogalactan protein, and phospholipase are the likely dominant players in the defence system. In addition, 24 proteins of unknown function were identified as novel dehydration- or rehydration-responsive proteins. Our data indicate that Physcomitrella adopts a rapid protein response mechanism to cope with dehydration in its leafy-shoot and basal expression levels of desiccation-tolerant proteins are rapidly upgraded at high levels under stress. This mechanism appears similar to that seen in angiosperm seeds. PMID:21994173
Conserved differences in protein sequence determine the human pathogenicity of Ebolaviruses
Pappalardo, Morena; Juliá, Miguel; Howard, Mark J.; Rossman, Jeremy S.; Michaelis, Martin; Wass, Mark N.
2016-01-01
Reston viruses are the only Ebolaviruses that are not pathogenic in humans. We analyzed 196 Ebolavirus genomes and identified specificity determining positions (SDPs) in all nine Ebolavirus proteins that distinguish Reston viruses from the four human pathogenic Ebolaviruses. A subset of these SDPs will explain the differences in human pathogenicity between Reston and the other four ebolavirus species. Structural analysis was performed to identify those SDPs that are likely to have a functional effect. This analysis revealed novel functional insights in particular for Ebolavirus proteins VP40 and VP24. The VP40 SDP P85T interferes with VP40 function by altering octamer formation. The VP40 SDP Q245P affects the structure and hydrophobic core of the protein and consequently protein function. Three VP24 SDPs (T131S, M136L, Q139R) are likely to impair VP24 binding to human karyopherin alpha5 (KPNA5) and therefore inhibition of interferon signaling. Since VP24 is critical for Ebolavirus adaptation to novel hosts, and only a few SDPs distinguish Reston virus VP24 from VP24 of other Ebolaviruses, human pathogenic Reston viruses may emerge. This is of concern since Reston viruses circulate in domestic pigs and can infect humans, possibly via airborne transmission. PMID:27009368
Computation-Guided Backbone Grafting of a Discontinuous Motif onto a Protein Scaffold
DOE Office of Scientific and Technical Information (OSTI.GOV)
Azoitei, Mihai L.; Correia, Bruno E.; Ban, Yih-En Andrew
2012-02-07
The manipulation of protein backbone structure to control interaction and function is a challenge for protein engineering. We integrated computational design with experimental selection for grafting the backbone and side chains of a two-segment HIV gp120 epitope, targeted by the cross-neutralizing antibody b12, onto an unrelated scaffold protein. The final scaffolds bound b12 with high specificity and with affinity similar to that of gp120, and crystallographic analysis of a scaffold bound to b12 revealed high structural mimicry of the gp120-b12 complex structure. The method can be generalized to design other functional proteins through backbone grafting.
Smith, Everett Clinton; Smith, Stacy E; Carter, James R; Webb, Stacy R; Gibson, Kathleen M; Hellman, Lance M; Fried, Michael G; Dutch, Rebecca Ellis
2013-12-13
Paramyxovirus fusion (F) proteins promote membrane fusion between the viral envelope and host cell membranes, a critical early step in viral infection. Although mutational analyses have indicated that transmembrane (TM) domain residues can affect folding or function of viral fusion proteins, direct analysis of TM-TM interactions has proved challenging. To directly assess TM interactions, the oligomeric state of purified chimeric proteins containing the Staphylococcal nuclease (SN) protein linked to the TM segments from three paramyxovirus F proteins was analyzed by sedimentation equilibrium analysis in detergent and buffer conditions that allowed density matching. A monomer-trimer equilibrium best fit was found for all three SN-TM constructs tested, and similar fits were obtained with peptides corresponding to just the TM region of two different paramyxovirus F proteins. These findings demonstrate for the first time that class I viral fusion protein TM domains can self-associate as trimeric complexes in the absence of the rest of the protein. Glycine residues have been implicated in TM helix interactions, so the effect of mutations at Hendra F Gly-508 was assessed in the context of the whole F protein. Mutations G508I or G508L resulted in decreased cell surface expression of the fusogenic form, consistent with decreased stability of the prefusion form of the protein. Sedimentation equilibrium analysis of TM domains containing these mutations gave higher relative association constants, suggesting altered TM-TM interactions. Overall, these results suggest that trimeric TM interactions are important driving forces for protein folding, stability and membrane fusion promotion.
Vamparys, Lydie; Laurent, Benoist; Carbone, Alessandra
2016-01-01
Protein–protein interactions play a key part in most biological processes and understanding their mechanism is a fundamental problem leading to numerous practical applications. The prediction of protein binding sites in particular is of paramount importance since proteins now represent a major class of therapeutic targets. Among other methods, docking simulations between two proteins known to interact can be a useful tool for the prediction of likely binding patches on a protein surface. From the analysis of the protein interfaces generated by a massive cross‐docking experiment using the 168 proteins of the Docking Benchmark 2.0, where all possible protein pairs, and not only experimental ones, have been docked together, we show that it is also possible to predict a protein's binding residues without having any prior knowledge regarding its potential interaction partners. Evaluating the performance of cross‐docking predictions using the area under the specificity‐sensitivity ROC curve (AUC) leads to an AUC value of 0.77 for the complete benchmark (compared to the 0.5 AUC value obtained for random predictions). Furthermore, a new clustering analysis performed on the binding patches that are scattered on the protein surface shows that their distribution and growth depend on the protein's functional group. Finally, in several cases, the binding‐site predictions resulting from the cross‐docking simulations lead to the identification of an alternate interface, which corresponds to the interaction with a biomolecular partner that is not included in the original benchmark. Proteins 2016; 84:1408–1421. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc. PMID:27287388
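The AUC figure quoted above (0.77 against 0.5 for random predictions) can be illustrated with a minimal sketch. It assumes each surface residue carries a prediction score and a binary interface label; the function name `roc_auc` and the toy scores are hypothetical and are not taken from the paper. The AUC is computed as the fraction of (interface, non-interface) residue pairs that the score ranks correctly, with ties counted as half.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise-ranking identity.

    scores: prediction score per residue (higher = more likely interface).
    labels: 1 for true interface residues, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # correctly ranked pair
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical per-residue scores and interface labels (illustration only).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 1, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

A score that carries no information about the interface gives an AUC near 0.5, which is the baseline the abstract compares against.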
Yu, Jun; Luo, Xiaobin; Xu, Hua; Ma, Quan; Yuan, Jianhui; Li, Xuling; Chang, Raymond Chuen-Chung; Qu, Zhongsen; Huang, Xinfeng; Zhuang, Zhixiong; Liu, Jianjun; Yang, Xifei
2015-01-01
Alzheimer's disease (AD) is the most common neurodegenerative disease characterized by a progressive impairment of cognitive functions including spatial learning and memory. Excess copper exposure accelerates the development of AD; however, the potential mechanisms by which copper exacerbates the symptoms of AD remain unknown. In this study, we explored the effects of chronic copper exposure on cognitive function by treating 6 month-old triple AD transgenic (3xTg-AD) mice with 250 ppm copper sulfate in drinking water for 6 months, and identified several potential key molecules involved in the effects of chronic copper exposure on memory by proteomic analysis. The behavioral test showed that chronic copper exposure aggravated memory impairment of 3xTg-AD mice. Two-dimensional fluorescence difference gel electrophoresis (2D-DIGE) coupled with mass spectrometry revealed a total of 44 differentially expressed proteins (18 upregulated and 26 down-regulated) in hippocampus between the wild-type (WT) mice and non-exposed 3xTg-AD mice. A total of 40 differentially expressed proteins were revealed (20 upregulated and 20 down-regulated) in hippocampus between copper exposed and non-exposed 3xTg-AD mice. Among these differentially expressed proteins, complexin-1 and complexin-2, two memory associated proteins, were significantly decreased in hippocampus of 3xTg-AD mice compared with the WT mice. Furthermore, the expression of these two proteins was further down-regulated in 3xTg-AD mice when exposed to copper. The abnormal expression of complexin-1 and complexin-2 identified by proteomic analysis was verified by western blot analysis. Taken together, our data showed that chronic copper exposure accelerated memory impairment and altered the expression of proteins in hippocampus in 3xTg-AD mice. 
The functional analysis on the differentially expressed proteins suggested that complexin-1 and complexin-2 may be the key molecules involved in chronic copper exposure-aggravated memory impairment in AD.
Vega, Ana Isabel; Pérez-Cerdá, Celia; Abia, David; Gámez, Alejandra; Briones, Paz; Artuch, Rafael; Desviat, Lourdes R; Ugarte, Magdalena; Pérez, Belén
2011-08-01
Deficiency of phosphomannomutase (PMM2, MIM#601785) is the most common congenital disorder of glycosylation. Herein we report the genetic analysis of 22 Spanish PMM2 deficient patients and the functional analysis of 14 nucleotide changes in a prokaryotic expression system in order to elucidate their molecular pathogenesis. PMM2 activity assay revealed the presence of six protein changes with no enzymatic activity (p.R123Q, p.R141H, p.F157S, p.P184T, p.F207S and p.D209G), seven mild protein changes with residual activities ranging from 16 to 54% (p.L32R, p.V44A, p.D65Y, p.P113L, p.T118S, p.T237M and p.C241S) and one variant with normal activity (p.E197A). The results obtained from Western blot analysis, degradation time courses of 11 protein changes and structural analysis of the PMM2 protein suggest that the loss of function of most mutant proteins is based on their increased susceptibility to degradation or aggregation compared to the wild type protein, which supports considering PMM2 deficiency a conformational disease. We have identified one exclusively catalytic protein change (p.D209G), two catalytic protein changes affecting protein stability (p.R123Q and p.R141H), two protein changes disrupting the dimer interface (p.P113L and p.T118S) and several misfolding changes (p.L32R, p.V44A, p.D65Y, p.F157S, p.P184T, p.F207S, p.T237M and p.C241S). Our current work opens a promising therapeutic option using pharmacological chaperones to revert the effect of the characterized misfolding mutations identified in a wide range of PMM2 deficient patients.
Guo, Yong; Qiu, Li-Juan
2013-01-01
The Dof domain protein family is a classic plant-specific zinc-finger transcription factor family involved in a variety of biological processes. There is great diversity in the number of Dof genes in different plants. However, there are only very limited reports on the characterization of Dof transcription factors in soybean (Glycine max). In the present study, 78 putative Dof genes were identified from the whole-genome sequence of soybean. The predicted GmDof genes were non-randomly distributed within and across 19 out of 20 chromosomes and 97.4% (38 pairs) were preferentially retained duplicate paralogous genes located in duplicated regions of the genome. Soybean-specific segmental duplications contributed significantly to the expansion of the soybean Dof gene family. These Dof proteins were phylogenetically clustered into nine distinct subgroups among which the gene structure and motif compositions were considerably conserved. Comparative phylogenetic analysis of these Dof proteins revealed four major groups, similar to those reported for Arabidopsis and rice. Most of the GmDofs showed specific expression patterns based on RNA-seq data analyses. The expression patterns of some duplicate genes were partially redundant while others showed functional diversity, suggesting the occurrence of sub-functionalization during subsequent evolution. Comprehensive expression profile analysis also provided insights into the soybean-specific functional divergence among members of the Dof gene family. Cis-regulatory element analysis of these GmDof genes suggested diverse functions associated with different processes. Taken together, our results provide useful information for the functional characterization of soybean Dof genes by combining phylogenetic analysis with global gene-expression profiling.
Expanded microbial genome coverage and improved protein family annotation in the COG database.
Galperin, Michael Y; Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V
2015-01-01
Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign to them tentative functions. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on complete microbial genomes, which allowed reliable assignment of orthologs and paralogs for most genes; (ii) orthology-based approach, which used the function(s) of the characterized member(s) of the protein family (COG) to assign function(s) to the entire set of carefully identified orthologs and describe the range of potential functions when there were more than one; and (iii) careful manual curation of the annotation of the COGs, aimed at detailed prediction of the biological function(s) for each COG while avoiding annotation errors and overprediction. Here we present an update of the COGs, the first since 2003, and a comprehensive revision of the COG annotations and expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level. This re-analysis of the COGs shows that the original COG assignments had an error rate below 0.5% and allows an assessment of the progress in functional genomics in the past 12 years. During this time, functions of many previously uncharacterized COGs have been elucidated and tentative functional assignments of many COGs have been validated, either by targeted experiments or through the use of high-throughput methods. A particularly important development is the assignment of functions to several widespread, conserved proteins many of which turned out to participate in translation, in particular rRNA maturation and tRNA modification. 
The new version of the COGs is expected to become an important tool for microbial genomics. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by US Government employees and is in the public domain in the US.
Greco, Todd M.; Guise, Amanda J.; Cristea, Ileana M.
2016-01-01
In biological systems, proteins catalyze the fundamental reactions that underlie all cellular functions, including metabolic processes and cell survival and death pathways. These biochemical reactions are rarely accomplished alone. Rather, they involve a concerted effect from many proteins that may operate in a directed signaling pathway and/or may physically associate in a complex to achieve a specific enzymatic activity. Therefore, defining the composition and regulation of protein complexes is critical for understanding cellular functions. In this chapter, we describe an approach that uses quantitative mass spectrometry (MS) to assess the specificity and the relative stability of protein interactions. Isolation of protein complexes from mammalian cells is performed by rapid immunoaffinity purification, and followed by in-solution digestion and high-resolution mass spectrometry analysis. We employ complementary quantitative MS workflows to assess the specificity of protein interactions using label-free MS and statistical analysis, and the relative stability of the interactions using a metabolic labeling technique. For each candidate protein interaction, scores from the two workflows can be correlated to minimize nonspecific background and profile protein complex composition and relative stability. PMID:26867737
Comparative Analysis and Distribution of Omega-3 lcPUFA Biosynthesis Genes in Marine Molluscs
Surm, Joachim M.; Prentis, Peter J.; Pavasovic, Ana
2015-01-01
Recent research has identified marine molluscs as an excellent source of omega-3 long-chain polyunsaturated fatty acids (lcPUFAs), based on their potential for endogenous synthesis of lcPUFAs. In this study we generated a representative list of fatty acyl desaturase (Fad) and elongation of very long-chain fatty acid (Elovl) genes from major orders of Phylum Mollusca, through the interrogation of transcriptome and genome sequences, and various publicly available databases. We have identified novel and uncharacterised Fad and Elovl sequences in the following species: Anadara trapezia, Nerita albicilla, Nerita melanotragus, Crassostrea gigas, Lottia gigantea, Aplysia californica, Loligo pealeii and Chlamys farreri. Based on alignments of translated protein sequences of Fad and Elovl genes, the haeme binding motif and histidine boxes of Fad proteins, and the histidine box and seventeen important amino acids in Elovl proteins, were highly conserved. Phylogenetic analysis of aligned reference sequences was used to reconstruct the evolutionary relationships for Fad and Elovl genes separately. Multiple, well resolved clades for both the Fad and Elovl sequences were observed, suggesting that repeated rounds of gene duplication best explain the distribution of Fad and Elovl proteins across the major orders of molluscs. For Elovl sequences, one clade contained the functionally characterised Elovl5 proteins, while another clade contained proteins hypothesised to have Elovl4 function. Additional well resolved clades consisted only of uncharacterised Elovl sequences. One clade from the Fad phylogeny contained only uncharacterised proteins, while the other clade contained functionally characterised delta-5 desaturase proteins. The discovery of an uncharacterised Fad clade is particularly interesting as these divergent proteins may have novel functions. 
Overall, this paper presents a number of novel Fad and Elovl genes suggesting that many mollusc groups possess most of the required enzymes for the synthesis of lcPUFAs. PMID:26308548
Jaspard, Emmanuel; Macherel, David; Hunault, Gilles
2012-01-01
Late Embryogenesis Abundant Proteins (LEAPs) are ubiquitous proteins expected to play major roles in desiccation tolerance. Little is known about their structure - function relationships because of the scarcity of 3-D structures for LEAPs. The previous building of LEAPdb, a database dedicated to LEAPs from plants and other organisms, led to the classification of 710 LEAPs into 12 non-overlapping classes with distinct properties. Using this resource, numerous physico-chemical properties of LEAPs and amino acid usage by LEAPs have been computed and statistically analyzed, revealing distinctive features for each class. This unprecedented analysis allowed a rigorous characterization of the 12 LEAP classes, which differed also in multiple structural and physico-chemical features. Although most LEAPs can be predicted as intrinsically disordered proteins, the analysis indicates that LEAP class 7 (PF03168) and probably LEAP class 11 (PF04927) are natively folded proteins. This study thus provides a detailed description of the structural properties of this protein family opening the path toward further LEAP structure - function analysis. Finally, since each LEAP class can be clearly characterized by a unique set of physico-chemical properties, this will allow development of software to predict proteins as LEAPs. PMID:22615859
George, Kevin W; Chen, Amy; Jain, Aakriti; Batth, Tanveer S; Baidoo, Edward E K; Wang, George; Adams, Paul D; Petzold, Christopher J; Keasling, Jay D; Lee, Taek Soon
2014-08-01
The ability to rapidly assess and optimize heterologous pathway function is critical for effective metabolic engineering. Here, we develop a systematic approach to pathway analysis based on correlations between targeted proteins and metabolites and apply it to the microbial production of isopentenol, a promising biofuel. Starting with a seven-gene pathway, we performed a correlation analysis to reduce pathway complexity and identified two pathway proteins as the primary determinants of efficient isopentenol production. Aided by the targeted quantification of relevant pathway intermediates, we constructed and subsequently validated a conceptual model of isopentenol pathway function. Informed by our analysis, we assembled a strain which produced isopentenol at a titer of 1.5 g/L, or 46% of theoretical yield. Our engineering approach allowed us to accurately identify bottlenecks and determine appropriate pathway balance. Paired with high-throughput cloning techniques and analytics, this strategy should prove useful for the analysis and optimization of increasingly complex heterologous pathways. © 2014 Wiley Periodicals, Inc.
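The kind of protein-to-product correlation analysis described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the statistic used, so Pearson correlation is assumed here, and the protein names and all numbers are hypothetical placeholders rather than data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical product titers (g/L) across five strain variants.
titer = [0.2, 0.5, 0.9, 1.2, 1.5]

# Hypothetical relative abundances of two pathway proteins in the same strains.
proteins = {
    "enzyme_A": [1.0, 2.1, 3.9, 5.2, 6.0],
    "enzyme_B": [3.0, 2.9, 3.1, 3.0, 2.8],
}

# Rank proteins by the strength of their correlation with titer.
ranked = sorted(proteins, key=lambda p: abs(pearson(proteins[p], titer)), reverse=True)
for name in ranked:
    print(name, round(pearson(proteins[name], titer), 2))
```

Proteins at the top of such a ranking are candidates for the primary determinants of production; correlation alone does not establish that they are bottlenecks, which is why the abstract pairs it with quantification of pathway intermediates.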
Global Proteomics Analysis of Protein Lysine Methylation.
Cao, Xing-Jun; Garcia, Benjamin A
2016-11-01
Lysine methylation is a common protein post-translational modification dynamically mediated by protein lysine methyltransferases (PKMTs) and protein lysine demethylases (PKDMs). Beyond histone proteins, lysine methylation on non-histone proteins plays a substantial role in a variety of functions in cells and is closely associated with diseases such as cancer. A large body of evidence indicates that the dysregulation of some PKMTs leads to tumorigenesis via their non-histone substrates. However, studies on most other PKMTs have progressed slowly owing to the lack of approaches for extensive screening of lysine methylation sites. Recently, a series of publications has reported large-scale analyses of protein lysine methylation. In this unit, we introduce a protocol for the global analysis of protein lysine methylation in cells by means of immunoaffinity enrichment and mass spectrometry. © 2016 by John Wiley & Sons, Inc.
Predictive and comparative analysis of Ebolavirus proteins
Cong, Qian; Pei, Jimin; Grishin, Nick V
2015-01-01
Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus. PMID:26158395
Huang, Sheng Yu; Chen, Sung Fang; Chen, Chun Hao; Huang, Hsuan Wei; Wu, Wen Guey; Sung, Wang Chou
2014-09-02
Snake venom consists of toxin proteins with multiple disulfide linkages to generate unique structures and biological functions. Determination of these cysteine connections usually requires the purification of each protein followed by structural analysis. In this study, dimethyl labeling coupled with LC-MS/MS and RADAR algorithm was developed to identify the disulfide bonds in crude snake venom. Without any protein separation, the disulfide linkages of several cytotoxins and PLA2 could be solved, including more than 20 disulfide bonds. The results show that this method is capable of analyzing protein mixture. In addition, the approach was also used to compare native cytotoxin 3 (CTX III) and its scrambled isomer, another category of protein mixture, for unknown disulfide bonds. Two disulfide-linked peptides were observed in the native CTX III, and 10 in its scrambled form, X-CTX III. This is the first study that reports a platform for the global cysteine connection analysis on a protein mixture. The proposed method is simple and automatic, offering an efficient tool for structural and functional studies of venom proteins.
The Protein Information Resource: an integrated public resource of functional annotation of proteins
Wu, Cathy H.; Huang, Hongzhan; Arminski, Leslie; Castro-Alvear, Jorge; Chen, Yongxing; Hu, Zhang-Zhi; Ledley, Robert S.; Lewis, Kali C.; Mewes, Hans-Werner; Orcutt, Bruce C.; Suzek, Baris E.; Tsugita, Akira; Vinayaka, C. R.; Yeh, Lai-Su L.; Zhang, Jian; Barker, Winona C.
2002-01-01
The Protein Information Resource (PIR) serves as an integrated public resource of functional annotation of protein data to support genomic/proteomic research and scientific discovery. The PIR, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the PIR-International Protein Sequence Database (PSD), the major annotated protein sequence database in the public domain, containing about 250 000 proteins. To improve protein annotation and the coverage of experimentally validated data, a bibliography submission system is developed for scientists to submit, categorize and retrieve literature information. Comprehensive protein information is available from iProClass, which includes family classification at the superfamily, domain and motif levels, structural and functional features of proteins, as well as cross-references to over 40 biological databases. To provide timely and comprehensive protein data with source attribution, we have introduced a non-redundant reference protein database, PIR-NREF. The database consists of about 800 000 proteins collected from PIR-PSD, SWISS-PROT, TrEMBL, GenPept, RefSeq and PDB, with composite protein names and literature data. To promote database interoperability, we provide XML data distribution and open database schema, and adopt common ontologies. The PIR web site (http://pir.georgetown.edu/) features data mining and sequence analysis tools for information retrieval and functional identification of proteins based on both sequence and annotation information. The PIR databases and other files are also available by FTP (ftp://nbrfa.georgetown.edu/pir_databases). PMID:11752247
Busk, Peter Kamp; Lange, Lene
2013-06-01
Functional prediction of carbohydrate-active enzymes is difficult due to low sequence identity. However, similar enzymes often share a few short motifs, e.g., around the active site, even when the overall sequences are very different. To exploit this notion for functional prediction of carbohydrate-active enzymes, we developed a simple algorithm, peptide pattern recognition (PPR), that can divide proteins into groups of sequences that share a set of short conserved sequences. When this method was used on 118 glycoside hydrolase 5 proteins with 9% average pairwise identity and representing four characterized enzymatic functions, 97% of the proteins were sorted into groups correlating with their enzymatic activity. Furthermore, we analyzed 8,138 glycoside hydrolase 13 proteins including 204 experimentally characterized enzymes with 28 different functions. There was a 91% correlation between group and enzyme activity. These results indicate that the function of carbohydrate-active enzymes can be predicted with high precision by finding short, conserved motifs in their sequences. The glycoside hydrolase 61 family is important for fungal biomass conversion, but only a few proteins of this family have been functionally characterized. Interestingly, PPR divided 743 glycoside hydrolase 61 proteins into 16 subfamilies useful for targeted investigation of the function of these proteins and pinpointed three conserved motifs with putative importance for enzyme activity. Furthermore, the conserved sequences were useful for cloning of new, subfamily-specific glycoside hydrolase 61 proteins from 14 fungi. In conclusion, identification of conserved sequence motifs is a new approach to sequence analysis that can predict carbohydrate-active enzyme functions with high precision.
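The idea of grouping proteins by shared short conserved sequences can be sketched in a few lines. This is a deliberately simplified illustration and not the published PPR algorithm: the greedy seeding strategy, the function names, the parameters `k` and `min_shared`, and the toy sequences are all assumptions made for this example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def group_by_shared_kmers(seqs, k=4, min_shared=2):
    """Greedy grouping: a sequence joins the first group whose seed
    sequence shares at least `min_shared` k-mers with it; otherwise it
    seeds a new group."""
    groups = []  # list of (seed_kmer_set, [member names])
    for name, seq in seqs.items():
        ks = kmers(seq, k)
        for seed, members in groups:
            if len(seed & ks) >= min_shared:
                members.append(name)
                break
        else:
            groups.append((ks, [name]))
    return [members for _, members in groups]


# Hypothetical toy sequences: a/b share one short motif, c/d share another,
# while the flanking regions differ.
seqs = {
    "a": "MKTAYIAKQRGGW",
    "b": "LLTAYIAKQPPSS",
    "c": "MNDHEWLLPVCNN",
    "d": "QQDHEWLLPTTTT",
}
groups = group_by_shared_kmers(seqs)
print(groups)
```

The sketch shows why short shared motifs can separate sequences whose overall identity is low: grouping depends only on a handful of conserved k-mers, not on a global alignment.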
Structural Basis of Interdomain Communication in the Hsc70 Chaperone
Jiang, Jianwen; Prasad, Kondury; Lafer, Eileen M.; Sousa, Rui
2015-01-01
Hsp70 family proteins are highly conserved chaperones involved in protein folding, degradation, targeting and translocation, and protein complex remodeling. They are comprised of an N-terminal nucleotide binding domain (NBD) and a C-terminal protein substrate binding domain (SBD). ATP binding to the NBD alters SBD conformation and substrate binding kinetics, but an understanding of the mechanism of interdomain communication has been hampered by the lack of a crystal structure of an intact chaperone. We report here the 2.6 Å structure of a functionally intact bovine Hsc70 (bHsc70) and a mutational analysis of the observed interdomain interface and the immediately adjacent interdomain linker. This analysis identifies interdomain interactions critical for chaperone function and supports an allosteric mechanism in which the interdomain linker invades and disrupts the interdomain interface when ATP binds. PMID:16307916
Ohno, Yusuke; Kashio, Atsushi; Ogata, Ren; Ishitomi, Akihiro; Yamazaki, Yuki; Kihara, Akio
2012-01-01
Palmitoylation plays important roles in the regulation of protein localization, stability, and activity. The protein acyltransferases (PATs) have a common DHHC Cys-rich domain. Twenty-three DHHC proteins have been identified in humans. However, it is unclear whether all of these DHHC proteins function as PATs. In addition, their substrate specificities remain largely unknown. Here we develop a useful method to examine substrate specificities of PATs using a yeast expression system with six distinct model substrates. We identify 17 human DHHC proteins as PATs. Moreover, we classify 11 human and 5 yeast DHHC proteins into three classes (I, II, and III), based on the cellular localization of their respective substrates (class I, soluble proteins; class II, integral membrane proteins; class III, lipidated proteins). Our results may provide an important clue for understanding the function of individual DHHC proteins. PMID:23034182
Identification of new intrinsic proteins in Arabidopsis plasma membrane proteome.
Marmagne, Anne; Rouet, Marie-Aude; Ferro, Myriam; Rolland, Norbert; Alcon, Carine; Joyard, Jacques; Garin, Jérome; Barbier-Brygoo, Hélène; Ephritikhine, Geneviève
2004-07-01
Identification and characterization of anion channel genes in plants represent a goal for a better understanding of their central role in cell signaling, osmoregulation, nutrition, and metabolism. Though channel activities have been well characterized in plasma membrane by electrophysiology, the corresponding molecular entities are little documented. Indeed, the hydrophobic protein equipment of plant plasma membrane still remains largely unknown, though several proteomic approaches have been reported. To identify new putative transport systems, we developed a new proteomic strategy based on mass spectrometry analyses of a plasma membrane fraction enriched in hydrophobic proteins. We produced from Arabidopsis cell suspensions a highly purified plasma membrane fraction and characterized it in detail by immunological and enzymatic tests. Using complementary methods for the extraction of hydrophobic proteins and mass spectrometry analyses on mono-dimensional gels, about 100 proteins have been identified, 95% of which had never been found in previous proteomic studies. The inventory of the plasma membrane proteome generated by this approach contains numerous plasma membrane integral proteins, one-third displaying at least four transmembrane segments. The plasma membrane localization was confirmed for several proteins, therefore validating such proteomic strategy. An in silico analysis shows a correlation between the putative functions of the identified proteins and the expected roles for plasma membrane in transport, signaling, cellular traffic, and metabolism. This analysis also reveals 10 proteins that display structural properties compatible with transport functions and will constitute interesting targets for further functional studies.
Diray-Arce, Joann; Liu, Bin; Cupp, John D; Hunt, Travis; Nielsen, Brent L
2013-03-04
The Arabidopsis thaliana genome encodes a homologue of the full-length bacteriophage T7 gp4 protein, which is also homologous to the eukaryotic Twinkle protein. While the phage protein has both DNA primase and DNA helicase activities, in animal cells Twinkle is localized to mitochondria and has only DNA helicase activity due to sequence changes in the DNA primase domain. However, Arabidopsis and other plant Twinkle homologues retain sequence homology for both functional domains of the phage protein. The Arabidopsis Twinkle homologue has been shown by others to be dual targeted to mitochondria and chloroplasts. To determine the functional activity of the Arabidopsis protein we obtained the gene for the full-length Arabidopsis protein and expressed it in bacteria. The purified protein was shown to have both DNA primase and DNA helicase activities. Western blot and qRT-PCR analysis indicated that the Arabidopsis gene is expressed most abundantly in young leaves and shoot apex tissue, as expected if this protein plays a role in organelle DNA replication. This expression is closely correlated with the expression of organelle-localized DNA polymerase in the same tissues. Homologues from other plant species show close similarity by phylogenetic analysis. The results presented here indicate that the Arabidopsis phage T7 gp4/Twinkle homologue has both DNA primase and DNA helicase activities and may provide these functions for organelle DNA replication.
Feinauer, Christoph; Procaccini, Andrea; Zecchina, Riccardo; Weigt, Martin; Pagnani, Andrea
2014-01-01
In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to that achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partners in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code. PMID:24663061
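The Gaussian idea described in the abstract above can be illustrated with a small, self-contained sketch. It is not the authors' released implementation: the function names are hypothetical, the shrinkage toward the identity matrix is an assumed regularizer chosen only to keep the covariance invertible, and the Frobenius norm of each coupling block is used as one common way to score residue pairs. The sketch one-hot encodes an alignment, estimates a covariance matrix, inverts it, and scores each pair of columns.

```python
from itertools import combinations


def one_hot(msa, alphabet):
    """Encode each sequence as a flat 0/1 vector, one block of len(alphabet) per column."""
    index = {a: k for k, a in enumerate(alphabet)}
    q = len(alphabet)
    rows = []
    for seq in msa:
        vec = [0.0] * (len(seq) * q)
        for i, aa in enumerate(seq):
            vec[i * q + index[aa]] = 1.0
        rows.append(vec)
    return rows


def covariance(rows, shrink):
    """Empirical covariance shrunk toward the identity (assumed regularizer)."""
    m, n = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / m for j in range(n)]
    cov = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            s = sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / m
            cov[a][b] = (1.0 - shrink) * s + (shrink if a == b else 0.0)
    return cov


def invert(mat):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(mat)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def contact_scores(msa, alphabet, shrink=0.5):
    """Score each column pair by the Frobenius norm of its block in the precision matrix.

    In a Gaussian model the couplings are minus the off-diagonal precision
    blocks, so the norm of the block is the norm of the coupling.
    """
    q = len(alphabet)
    precision = invert(covariance(one_hot(msa, alphabet), shrink))
    scores = {}
    for i, j in combinations(range(len(msa[0])), 2):
        block = [precision[i * q + a][j * q + b]
                 for a in range(q) for b in range(q)]
        scores[(i, j)] = sum(x * x for x in block) ** 0.5
    return scores
```

On a toy alignment such as `["AAA", "AAB", "BBA", "BBB"]` with alphabet `"AB"`, columns 0 and 1 covary perfectly while column 2 is independent of both, so the pair (0, 1) should receive the highest score. Real alignments would also need sequence reweighting and gap handling, which are omitted here.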
Wang, Yong-Qiang; Yang, Yong; Li, Li
2013-01-01
Chromoplasts are unique plastids that accumulate massive amounts of carotenoids. To gain a general and comparative characterization of chromoplast proteins, this study performed proteomic analysis of chromoplasts from six carotenoid-rich crops: watermelon, tomato, carrot, orange cauliflower, red papaya, and red bell pepper. Stromal and membrane proteins of chromoplasts were separated by 1D gel electrophoresis and analysed using nLC-MS/MS. A total of 953–2262 proteins from chromoplasts of different crop species were identified. Approximately 60% of the identified proteins were predicted to be plastid localized. Functional classification using MapMan bins revealed large numbers of proteins involved in protein metabolism, transport, amino acid metabolism, lipid metabolism, and redox in chromoplasts from all six species. Seventeen core carotenoid metabolic enzymes were identified. Phytoene synthase, phytoene desaturase, ζ-carotene desaturase, 9-cis-epoxycarotenoid dioxygenase, and carotenoid cleavage dioxygenase 1 were found in almost all crops, suggesting that they are relatively abundant among the carotenoid pathway enzymes. Chromoplasts from different crops contained abundant amounts of ATP synthase and adenine nucleotide translocator, which indicates an important role of ATP production and transport in chromoplast development. Distinctive abundant proteins were observed in chromoplasts from different crops, including capsanthin/capsorubin synthase and fibrillins in pepper, superoxide dismutase in watermelon, carrot, and cauliflower, and glutathione-S-transferase in papaya. The comparative analysis of chromoplast proteins among six crop species offers new insights into the general metabolism and function of chromoplasts as well as the uniqueness of chromoplasts in specific crop species. This work provides reference datasets for future experimental study of chromoplast biogenesis, development, and regulation in plants. PMID:23314817
The Functional Impact of Alternative Splicing in Cancer.
Climente-González, Héctor; Porta-Pardo, Eduard; Godzik, Adam; Eyras, Eduardo
2017-08-29
Alternative splicing changes are frequently observed in cancer and are starting to be recognized as important signatures for tumor progression and therapy. However, their functional impact and relevance to tumorigenesis remain mostly unknown. We carried out a systematic analysis to characterize the potential functional consequences of alternative splicing changes in thousands of tumor samples. This analysis revealed that a subset of alternative splicing changes affect protein domain families that are frequently mutated in tumors and potentially disrupt protein-protein interactions in cancer-related pathways. Moreover, there was a negative correlation between the number of these alternative splicing changes in a sample and the number of somatic mutations in drivers. We propose that a subset of the alternative splicing changes observed in tumors may represent independent oncogenic processes that could be relevant to explain the functional transformations in cancer, and some of them could potentially be considered alternative splicing drivers (AS drivers). Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Yang, Xue; Xiong, Qian; Wu, Ying; Li, Siting; Ge, Feng
2017-10-06
Circular RNAs (circRNAs), a class of widespread endogenous RNAs, play crucial roles in diverse biological processes and are potential biomarkers in diverse human diseases and cancers. Cerebellar-degeneration-related protein 1 antisense RNA (CDR1as), an oncogenic circRNA, is involved in human tumorigenesis and is dysregulated in hepatocellular carcinoma (HCC). However, the molecular mechanisms underlying CDR1as functions in HCC remain unclear. Here we explored the functions of CDR1as and searched for CDR1as-regulated proteins in HCC cells. A quantitative proteomics strategy was employed to globally identify CDR1as-regulated proteins in HCC cells. In total, we identified 330 differentially expressed proteins (DEPs) upon enhanced CDR1as expression in HepG2 cells, indicating that they could be proteins regulated by CDR1as. Bioinformatic analysis revealed that many DEPs were involved in cell proliferation and the cell cycle. Further functional studies of epidermal growth factor receptor (EGFR) found that CDR1as exerts its effects on cell proliferation at least in part through the regulation of EGFR expression. We further confirmed that CDR1as could inhibit the expression of microRNA-7 (miR-7). EGFR is a validated target of miR-7; therefore, CDR1as may exert its function by regulating EGFR expression via targeting miR-7 in HCC cells. Taken together, we revealed novel functions and underlying mechanisms of CDR1as in HCC cells. This study serves as the first proteome-wide analysis of a circRNA-regulated protein in cells and provides a reliable and highly efficient method for globally identifying circRNA-regulated proteins.
Kotzsch, Alexander; Pawolski, Damian; Milentyev, Alexander; Shevchenko, Anna; Scheffel, André; Poulsen, Nicole; Shevchenko, Andrej; Kröger, Nils
2016-01-01
The nano- and micropatterned biosilica cell walls of diatoms are remarkable examples of biological morphogenesis and possess highly interesting material properties. Only recently has it been demonstrated that biosilica-associated organic structures with specific nanopatterns (termed insoluble organic matrices) are general components of diatom biosilica. The model diatom Thalassiosira pseudonana contains three types of insoluble organic matrices: chitin meshworks, organic microrings, and organic microplates, the latter being described in the present study for the first time. To date, little is known about the molecular composition, intracellular assembly, and biological functions of organic matrices. Here we have performed structural and functional analyses of the organic microrings and organic microplates from T. pseudonana. Proteomics analysis yielded seven proteins of unknown function (termed SiMat proteins) together with five known silica biomineralization proteins (four cingulins and one silaffin). The location of SiMat1-GFP in the insoluble organic microrings and the similarity of tyrosine- and lysine-rich functional domains identifies this protein as a new member of the cingulin protein family. Mass spectrometric analysis indicates that most of the lysine residues of cingulins and the other insoluble organic matrix proteins are post-translationally modified by short polyamine groups, which are known to enhance the silica formation activity of proteins. Studies with recombinant cingulins (rCinY2 and rCinW2) demonstrate that acidic conditions (pH 5.5) trigger the assembly of mixed cingulin aggregates that have silica formation activity. Our results suggest an important role for cingulins in the biogenesis of organic microrings and support the hypothesis that this type of insoluble organic matrix functions in biosilica morphogenesis. PMID:26710847
Proteobionics: biomimetics in proteomics.
Sommer, Andrei P; Gheorghiu, Eleonora
2006-03-01
Proteomics was established 10 years ago by the analysis of microbial genomes via their protein complement or proteome. Bionics is an ancient art, which converts structures optimized by nature into advanced technical products. Previously, we analyzed survival modalities in nanobacteria and converted the interplay between survival-oriented protein functions and nanoscale mineral shells into models for advanced drug delivery. Exploiting protein functions observed in nature to design biomedical products and therapies could be named proteobionics. Here, we present examples for this new branch of nanoproteomics.
Meeting Report: Structural Determination of Environmentally Responsive Proteins
Reinlib, Leslie
2005-01-01
The three-dimensional structure of gene products continues to be a missing lynchpin between linear genome sequences and our understanding of the normal and abnormal function of proteins and pathways. Enhanced activity in this area is likely to lead to better understanding of how discrete changes in molecular patterns and conformation underlie functional changes in protein complexes and, with it, sensitivity of an individual to an exposure. The National Institute of Environmental Health Sciences convened a workshop of experts in structural determination and environmental health to solicit advice for future research in structural resolution relative to environmentally responsive proteins and pathways. The highest priorities recommended by the workshop were to support studies of structure, analysis, control, and design of conformational and functional states at molecular resolution for environmentally responsive molecules and complexes; promote understanding of dynamics, kinetics, and ligand responses; investigate the mechanisms and steps in posttranslational modifications, protein partnering, impact of genetic polymorphisms on structure/function, and ligand interactions; and encourage integrated experimental and computational approaches. The workshop participants also saw value in improving the throughput and purity of protein samples and macromolecular assemblies; developing optimal processes for design, production, and assembly of macromolecular complexes; encouraging studies on protein–protein and macromolecular interactions; and examining assemblies of individual proteins and their functions in pathways of interest for environmental health. PMID:16263521
Zhou, Mi; Tang, Min; Li, Shuiming; Peng, Li; Huang, Haojun; Fang, Qihua; Liu, Zhao; Xie, Peng; Li, Gao; Zhou, Jian
2018-06-21
For specific applications, gold nanoparticles (GNPs) are commonly functionalized with various biological ligands, including amino-free ligands such as amino acids, peptides, proteins, and nucleic acids. Upon entering a biological fluid, the protein corona that forms around GNPs can conceal the targeting ligands and sterically hinder the functional properties. The protein corona is routinely prepared by standard centrifugation or sucrose cushion centrifugation. However, such methodologies are not applicable to the exclusive analysis of a ligand-binding protein corona. In this study, we first proposed a lock-in strategy based on a combination of rapid crosslinking and stringent washing. Cysteine was used as a model of amino-free ligands and attached to GNPs. After corona formation in human plasma, GNP cysteine and corona proteins were quickly fixed by 5 s of crosslinking with 7.5% formaldehyde. After stringent washing using SDS buffer with sonication, the cysteine-bound proteins were effectively separated from unbound proteins. Qualitative and quantitative analyses using a mass spectrometry-based proteomics approach indicated that the protein composition of the cysteine-binding corona from the new method was significantly different from the composition of the whole corona from the two conventional methods. Furthermore, network and formaldehyde-linked site analyses of cysteine-binding proteins provided useful information toward a better knowledge of the behavior of protein-ligand and protein-protein interactions. Collectively, our new strategy can specifically characterize the protein composition of a cysteine-binding corona. The presented methodology in principle provides a generic way to analyze a nanoparticle corona bound to amino-free ligands and has the potential to decipher corona-masked ligand functions.
iTRAQ-Based Proteomics Analysis and Network Integration for Kernel Tissue Development in Maize
Dong, Yongbin; Wang, Qilei; Du, Chunguang; Xiong, Wenwei; Li, Xinyu; Zhu, Sailan; Li, Yuling
2017-01-01
Grain weight is one of the most important yield components in maize (Zea mays L.), and the grain is a developmentally complex structure comprising two major compartments (endosperm and pericarp); however, very little is known concerning the coordinated accumulation of the numerous proteins involved. Herein, we used an isobaric tags for relative and absolute quantitation (iTRAQ)-based comparative proteomic method to analyze the dynamic proteomes of endosperm and pericarp during grain development. In total, 9539 proteins were identified for both components at four development stages, among which 1401 proteins were non-redundant, 232 proteins were specific to pericarp and 153 proteins were specific to endosperm. A functional annotation of the identified proteins revealed the importance of metabolic and cellular processes, and binding and catalytic activities for the tissue development. Three and 76 proteins involved in 49 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were integrated for the specific endosperm and pericarp proteins, respectively, reflecting their complex metabolic interactions. In addition, four proteins with important functions and different expression levels were chosen for gene cloning and expression analysis. As in previous research, the concordance between mRNA level and protein abundance differed across proteins, stages, and tissues. These results could provide useful information for understanding the developmental mechanisms of grain development in maize. PMID:28837076
Predicting Amyloidogenic Proteins in the Proteomes of Plants.
Antonets, Kirill S; Nizhnikov, Anton A
2017-10-16
Amyloids are protein fibrils with characteristic spatial structure. Though amyloids were long perceived to be pathogens that cause dozens of incurable pathologies in humans and mammals, it is currently clear that amyloids also represent a functionally important form of protein structure implicated in a variety of biological processes in organisms ranging from archaea and bacteria to fungi and animals. Despite their social significance, plants remain the most poorly studied group of organisms in the field of amyloid biology. To date, amyloid properties have only been demonstrated in vitro or in heterologous systems for a small number of plant proteins. Here, for the first time, we performed a comprehensive analysis of the distribution of potentially amyloidogenic proteins in the proteomes of approximately 70 species of land plants using the Waltz and SARP (Sequence Analysis based on the Ranking of Probabilities) bioinformatic algorithms. We analyzed more than 2.9 million protein sequences and found that potentially amyloidogenic proteins are abundant in plant proteomes. We found that such proteins are overrepresented among membrane as well as DNA- and RNA-binding proteins of plants. Moreover, seed storage and defense proteins of most plant species are rich in amyloidogenic regions. Taken together, our data demonstrate the diversity of potentially amyloidogenic proteins in plant proteomes and suggest biological processes where formation of amyloids might be functionally important.
Discovering Conformational Sub-States Relevant to Protein Function
Ramanathan, Arvind; Savol, Andrej J.; Langmead, Christopher J.; Agarwal, Pratul K.; Chennubhotla, Chakra S.
2011-01-01
Background: Internal motions enable proteins to explore a range of conformations, even in the vicinity of native state. The role of conformational fluctuations in the designated function of a protein is widely debated. Emerging evidence suggests that sub-groups within the range of conformations (or sub-states) contain properties that may be functionally relevant. However, low populations in these sub-states and the transient nature of conformational transitions between these sub-states present significant challenges for their identification and characterization. Methods and Findings: To overcome these challenges we have developed a new computational technique, quasi-anharmonic analysis (QAA). QAA utilizes higher-order statistics of protein motions to identify sub-states in the conformational landscape. Further, the focus on anharmonicity allows identification of conformational fluctuations that enable transitions between sub-states. QAA applied to equilibrium simulations of human ubiquitin and T4 lysozyme reveals functionally relevant sub-states and protein motions involved in molecular recognition. In combination with a reaction pathway sampling method, QAA characterizes conformational sub-states associated with cis/trans peptidyl-prolyl isomerization catalyzed by the enzyme cyclophilin A. In these three proteins, QAA allows identification of conformational sub-states, with critical structural and dynamical features relevant to protein function. Conclusions: Overall, QAA provides a novel framework to intuitively understand the biophysical basis of conformational diversity and its relevance to protein function. PMID:21297978
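To give a feel for what "higher-order statistics" means in the abstract above, the following sketch computes the excess kurtosis (a fourth-order statistic) of individual coordinates and ranks them by how far they deviate from Gaussian, i.e. harmonic, behaviour. This is only an illustration of the underlying statistic, not the QAA algorithm itself, which is considerably more elaborate; the function names and the idea of ranking coordinates one at a time are assumptions made for the example.

```python
def excess_kurtosis(values):
    """Fourth standardized moment minus 3; zero for a Gaussian (harmonic) coordinate."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0.0:
        raise ValueError("coordinate has zero variance")
    return m4 / (m2 * m2) - 3.0


def rank_anharmonic(coords):
    """Return coordinate names sorted by |excess kurtosis|, most anharmonic first.

    coords maps a (hypothetical) coordinate name to its time series.
    """
    return sorted(coords,
                  key=lambda name: abs(excess_kurtosis(coords[name])),
                  reverse=True)
```

A coordinate that hops between two well-separated values, as might happen when a protein switches between two sub-states, has a strongly negative excess kurtosis, whereas a coordinate fluctuating harmonically around a single minimum stays near zero.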
Luo, Di; Niu, Xiangli; Yu, Jinde; Yan, Jun; Gou, Xiaojun; Lu, Bao-Rong; Liu, Yongsheng
2012-09-01
Glycine betaine (GB) is a compatible quaternary amine that enables plants to tolerate abiotic stresses, including salt, drought and cold. In plants, GB is synthesized from choline through two successive oxidation steps, catalyzed by choline monooxygenase (CMO) and betaine aldehyde dehydrogenase (BADH), respectively. Rice is considered a typical non-GB-accumulating species, although whole-genome sequencing revealed that rice contains orthologs of both CMO and BADH. Several studies have shown that rice has a functional BADH gene, but whether the rice CMO gene (OsCMO) is functional or a pseudogene remains to be elucidated. In the present study, we report the functional characterization of the rice CMO gene. The OsCMO gene was isolated from rice cv. Nipponbare (Oryza sativa L. ssp. japonica) using RT-PCR. Northern blotting demonstrated that the transcription of OsCMO is enhanced by salt stress. Overexpression of OsCMO in transgenic tobacco plants results in increased GB content and elevated tolerance to salt stress. Immunoblotting analysis demonstrates that a functional OsCMO protein of the correct size was present in transgenic tobacco but rarely accumulated in wild-type rice plants. Surprisingly, a large amount of truncated proteins derived from OsCMO was induced in rice seedlings in response to salt stress. This suggests that it is the lack of a functional OsCMO protein that presumably results in non-GB accumulation in the tested rice plant. Expression and transgenic studies demonstrate that OsCMO is transcriptionally induced in response to salt stress and functions in increasing GB accumulation and enhancing tolerance to salt stress. Immunoblotting analysis suggests that the absence of GB accumulation in the Japonica rice plant presumably results from lack of a functional OsCMO protein.
Ladunga, I
1992-04-01
The markedly nonuniform, even systematic distribution of sequences in the protein "universe" has been analyzed by methods of protein taxonomy. Mapping of the natural hierarchical system of proteins has revealed some dense cores, i.e., well-defined clusterings of proteins that seem to be natural structural groupings, possibly seeds for a future protein taxonomy. The aim was not to force proteins into more or less man-made categories by discriminant analysis, but to find structurally similar groups, possibly of common evolutionary origin. Single-valued distance measures between pairs of superfamilies from the Protein Identification Resource were defined by two χ²-like methods on tripeptide frequencies and the variable-length subsequence identity method derived from dot-matrix comparisons. Distance matrices were processed by several methods of cluster analysis to detect phylogenetic continuum between highly divergent proteins. Only well-defined clusters characterized by relatively unique structural, intracellular environmental, organismal, and functional attribute states were selected as major protein groups, including subsets of viral and Escherichia coli proteins, hormones, inhibitors, plant, ribosomal, serum and structural proteins, amino acid synthases, and clusters dominated by certain oxidoreductases and apolar and DNA-associated enzymes. The limited repertoire of functional patterns due to small genome size, the high rate of recombination, specific features of the bacterial membranes, or of the virus cycle canalize certain proteins of viruses and Gram-negative bacteria, respectively, to organismal groups.
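A distance built on tripeptide frequencies, as mentioned in the abstract above, can be sketched as follows. The abstract does not give the exact formulas of its two χ²-like measures, so the symmetric form used here, the sum over tripeptides of (f_a - f_b)² / (f_a + f_b), is an assumption chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter


def tripeptide_freqs(seq):
    """Relative frequencies of overlapping tripeptides in a sequence."""
    if len(seq) < 3:
        raise ValueError("sequence must contain at least three residues")
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()}


def chi2_like_distance(seq_a, seq_b):
    """Symmetric chi-square-like distance between two tripeptide profiles.

    Assumed form: sum of (fa - fb)^2 / (fa + fb) over every tripeptide seen
    in either sequence. It is 0 for identical profiles and 2 for profiles
    that share no tripeptide.
    """
    fa, fb = tripeptide_freqs(seq_a), tripeptide_freqs(seq_b)
    total = 0.0
    for tri in set(fa) | set(fb):
        a, b = fa.get(tri, 0.0), fb.get(tri, 0.0)
        total += (a - b) ** 2 / (a + b)
    return total
```

Computing this value for every pair of sequences (or pooled superfamilies) yields a distance matrix of the kind that a clustering method could then process.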
Li, Shijun; Ehrhardt, David W.; Rhee, Seung Y.
2006-01-01
Cells are organized into a complex network of subcellular compartments that are specialized for various biological functions. Subcellular location is an important attribute of protein function. To facilitate systematic elucidation of protein subcellular location, we analyzed experimentally verified protein localization data of 1,300 Arabidopsis (Arabidopsis thaliana) proteins. The 1,300 experimentally verified proteins are distributed among 40 different compartments, with most of the proteins localized to four compartments: mitochondria (36%), nucleus (28%), plastid (17%), and cytosol (13.3%). About 19% of the proteins are found in multiple compartments, in which a high proportion (36.4%) is localized to both cytosol and nucleus. Characterization of the overrepresented Gene Ontology molecular functions and biological processes suggests that the Golgi apparatus and peroxisome may play more diverse functions but are involved in more specialized processes than other compartments. To support systematic empirical determination of protein subcellular localization using a technology called fluorescent tagging of full-length proteins, we developed a database and Web application to provide preselected green fluorescent protein insertion position and primer sequences for all Arabidopsis proteins to study their subcellular localization and to store experimentally verified protein localization images, videos, and their annotations of proteins generated using the fluorescent tagging of full-length proteins technology. The database can be searched, browsed, and downloaded using a Web browser at http://aztec.stanford.edu/gfp/. The software can also be downloaded from the same Web site for local installation. PMID:16617091
Bradford, Emily M; Vairamani, Kanimozhi; Shull, Gary E
2016-02-15
To investigate the intestinal functions of the NKCC1 Na+-K+-2Cl− cotransporter (SLC12a2 gene), differential mRNA expression changes in NKCC1-null intestine were analyzed. Microarray analysis of mRNA from intestines of adult wild-type mice and gene-targeted NKCC1-null mice (n = 6 of each genotype) was performed to identify patterns of differential gene expression changes. Differential expression patterns were further examined by Gene Ontology analysis using the online GOrilla program, and expression changes of selected genes were verified using northern blot analysis and quantitative real-time polymerase chain reaction. Histological staining and immunofluorescence were performed to identify cell types in which upregulated pancreatic digestive enzymes were expressed. Genes typically associated with pancreatic function were upregulated. These included lipase, amylase, elastase, and serine proteases indicative of pancreatic exocrine function, as well as insulin and regenerating islet genes, representative of endocrine function. Northern blot analysis and immunohistochemistry showed that differential expression of exocrine pancreas mRNAs was specific to the duodenum and localized to a subset of goblet cells. In addition, a major pattern of changes involving differential expression of olfactory receptors that function in chemical sensing, as well as other chemosensing G-protein coupled receptors, was observed. These changes in chemosensory receptor expression may be related to the failure of intestinal function and dependency on parenteral nutrition observed in humans with SLC12a2 mutations. The results suggest that loss of NKCC1 affects not only secretion, but also goblet cell function and chemosensing of intestinal contents via G-protein coupled chemosensory receptors.
Lymphocyte signaling: beyond knockouts
Saveliev, Alexander; Tybulewicz, Victor L. J.
2016-01-01
The analysis of lymphocyte signaling was greatly enhanced by the advent of gene targeting, which allows the selective inactivation of a single gene. Whereas this gene ‘knockout’ approach is often informative, in many cases the phenotype resulting from gene ablation might not provide a complete picture of the function of the corresponding protein. If a protein has multiple functions within a single or several signaling pathways, or stabilizes other proteins in a complex, the phenotypic consequences of a gene knockout may manifest as a combination of several different perturbations. In these cases, gene targeting to ‘knockin’ subtle point mutations might provide more accurate insight into protein function. However, to be informative, such mutations must be carefully designed based on structural and biophysical data. PMID:19295633
FACETS: multi-faceted functional decomposition of protein interaction networks.
Seah, Boon-Siew; Bhowmick, Sourav S; Dewey, C Forbes
2012-10-15
The availability of large-scale curated protein interaction datasets has given rise to the opportunity to investigate higher level organization and modularity within the protein-protein interaction (PPI) network using graph theoretic analysis. Despite the recent progress, systems level analysis of high-throughput PPIs remains a daunting task because of the amount of data they present. In this article, we propose a novel PPI network decomposition algorithm called FACETS in order to make sense of the deluge of interaction data using Gene Ontology (GO) annotations. FACETS finds not just a single functional decomposition of the PPI network, but a multi-faceted atlas of functional decompositions that portray alternative perspectives of the functional landscape of the underlying PPI network. Each facet in the atlas represents a distinct interpretation of how the network can be functionally decomposed and organized. Our algorithm maximizes interpretative value of the atlas by optimizing inter-facet orthogonality and intra-facet cluster modularity. We tested our algorithm on the global networks from IntAct, and compared it with gold standard datasets from MIPS and KEGG. We demonstrated the performance of FACETS. We also performed a case study that illustrates the utility of our approach. Supplementary data are available at Bioinformatics online. Our software is available freely for non-commercial purposes from: http://www.cais.ntu.edu.sg/~assourav/Facets/
USDA-ARS's Scientific Manuscript database
Potato cyst nematodes (PCNs), including Globodera rostochiensis (Woll.), are important pests of potato. Plant parasitic nematodes produce multiple effector proteins, secreted from their stylets, to successfully infect their hosts. These include proteins that are delivered to the apoplast, as well as...
The interactome of CCT complex - A computational analysis.
Narayanan, Aswathy; Pullepu, Dileep; Kabir, M Anaul
2016-10-01
The eukaryotic chaperonin, CCT (Chaperonin Containing TCP1 or TriC-TCP-1 Ring Complex) has been subjected to physical and genetic analyses in S. cerevisiae which can be extrapolated to human CCT (hCCT), owing to its structural and functional similarities with yeast CCT (yCCT). Studies on hCCT and its interactome acquire an additional dimension, as it has been implicated in several disease conditions like neurodegeneration and cancer. We attempt to study its stress response role in general, which will be reflected in the aspects of human diseases and yeast physiology, through computational analysis of the interactome. Towards consolidating and analysing the interactome data, we prepared and compared the unique CCT-interacting protein lists for S. cerevisiae and H. sapiens, and performed GO term classification and enrichment studies, which provide information on the diversity of the CCT interactome in terms of the protein classes in the data set. Enrichment with disease-associated proteins and pathways highlights the medical importance of CCT. Different analyses converge, suggesting the significance of WD-repeat proteins, protein kinases and cytoskeletal proteins in the interactome. The prevalence of proteasomal subunits and ribosomal proteins suggests a possible cross-talk between protein-synthesis, folding and degradation machinery. A network of chaperones and chaperonins that function in combination can also be envisaged from the CCT interactome-Hsp70 interactome analysis. Copyright © 2016 Elsevier Ltd. All rights reserved.
Retracing Evolution of Red Fluorescence in GFP-Like Proteins from Faviina Corals
Field, Steven F.; Matz, Mikhail V.
2010-01-01
Proteins of the green fluorescent protein family represent a convenient experimental model to study evolution of novelty at the molecular level. Here, we focus on the origin of Kaede-like red fluorescent proteins characteristic of the corals of the Faviina suborder. We demonstrate, using an original approach involving resurrection and analysis of the library of possible evolutionary intermediates, that it takes on the order of 12 mutations, some of which strongly interact epistatically, to fully recapitulate the evolution of a red fluorescent phenotype from the ancestral green. Five of the identified mutations would not have been found without the help of ancestral reconstruction, because the corresponding site states are shared between extant red and green proteins due to their recent descent from a dual-function common ancestor. Seven of the 12 mutations affect residues that are not in close contact with the chromophore and thus must exert their effect indirectly through adjustments of the overall protein fold; the relevance of these mutations could not have been anticipated from the purely theoretical analysis of the protein's structure. Our results introduce a powerful experimental approach for comparative analysis of functional specificity in protein families even in the cases of pronounced epistasis, provide foundation for the detailed studies of evolutionary trajectories leading to novelty and complexity, and will help rational modification of existing fluorescent labels. PMID:19793832
He, Jia-Hui; Sun, Jie-Li; Yan, Wen-Juan; Wang, Fang
2017-05-20
To identify the functions of the proteins containing the GGDEF or EAL domain in Lactobacillus acidophilus for investigation of the regulatory mechanism of c-di-GMP in this strain. The DNA fragments of NH13_07045-GGDEF, NH13_07050 and NH13_07055 from Lactobacillus acidophilus ATCC4356 were amplified by PCR and cloned into the expression vector pMAL-His-c2. After sequencing, the recombinant plasmids were transformed into competent Escherichia coli cells, which were induced by IPTG to express the recombinant proteins fused with maltose binding protein (MBP). The fusion proteins were purified using amylose resin column for diguanylate cyclase (DGC) or phosphodiesterase (PDE) activity assays in vitro followed by analysis with high-performance liquid chromatography (HPLC). The target DNA fragments were obtained by PCR, and their sequences were all identical to that in GenBank. The purified and concentrated fusion proteins, which were identified by SDS-PAGE and Western blotting, had relative molecular masses of 59 kD, 67 kD and 72 kD. HPLC analysis showed no DGC activity in NH13_07045-GGDEF, while PDE activity was found in NH13_07050 but not in NH13_07055. We obtained the protein encoded by NH13_07050 that possesses PDE activity in vitro. This protein may facilitate the evaluation of the regulatory function of c-di-GMP in Lactobacillus acidophilus.
Chen, Yanyu; Xie, Yong; Xu, Lai; Zhan, Shaohua; Xiao, Yi; Gao, Yanpan; Wu, Bin; Ge, Wei
2017-02-15
Tumor cells of colorectal cancer (CRC) release exosomes into the circulation. These exosomes can mediate communication between cells and affect various tumor-related processes in their target cells. We present a quantitative proteomics analysis of the exosomes purified from serum of patients with CRC and normal volunteers; data are available via ProteomeXchange with identifier PXD003875. We identified 918 proteins with an overlap of 725 Gene IDs in the Exocarta proteins list. Compared with the serum-purified exosomes (SPEs) of normal volunteers, we found 36 proteins upregulated and 22 proteins downregulated in the SPEs of CRC patients. Bioinformatics analysis revealed that upregulated proteins are involved in processes that modulate the pretumorigenic microenvironment for metastasis. In contrast, differentially expressed proteins (DEPs) that play critical roles in tumor growth and cell survival were principally downregulated. Our study demonstrates that SPEs of CRC patients play a pivotal role in promoting the tumor invasiveness, but have minimal influence on putative alterations in tumor survival or proliferation. According to bioinformatics analysis, we speculate that the protein contents of exosomes might be associated with whether they are involved in premetastatic niche establishment or growth and survival of metastatic tumor cells. This information will be helpful in elucidating the pathophysiological functions of tumor-derived exosomes, and aid in the development of CRC diagnostics and therapeutics. © 2016 UICC.
Dhanyalakshmi, K H; Naika, Mahantesha B N; Sajeevan, R S; Mathew, Oommen K; Shafi, K Mohamed; Sowdhamini, Ramanathan; N Nataraja, Karaba
2016-01-01
Modern sequencing technologies are generating large volumes of information at the transcriptome and genome level. Translation of this information into biological meaning lags far behind, and as a result a significant portion of the proteins discovered remain proteins of unknown function (PUFs). Attempts to uncover the functional significance of PUFs are limited by the lack of easy, high-throughput functional annotation tools. Here, we report an approach to assign putative functions to PUFs identified in the transcriptome of mulberry, a perennial tree commonly cultivated as a host of the silkworm. We utilized the mulberry PUFs generated from leaf tissues exposed to drought stress at the whole-plant level. A sequence- and structure-based computational analysis predicted the probable function of the PUFs. For rapid and easy annotation of PUFs, we developed an automated pipeline by integrating diverse bioinformatics tools, designated as PUFs Annotation Server (PUFAS), which also provides a web service API (Application Programming Interface) for large-scale analysis up to a genome. The expression analysis of three selected PUFs annotated by the pipeline revealed abiotic stress responsiveness of the genes, and hence their potential role in stress acclimation pathways. The automated pipeline developed here could be extended to assign functions to PUFs from any organism in general. PUFAS web server is available at http://caps.ncbs.res.in/pufas/ and the web service is accessible at http://capservices.ncbs.res.in/help/pufas.
Evolution of sparsity and modularity in a model of protein allostery
NASA Astrophysics Data System (ADS)
Hemery, Mathieu; Rivoire, Olivier
2015-04-01
The sequence of a protein is not only constrained by its physical and biochemical properties under current selection, but also by features of its past evolutionary history. Understanding the extent and the form that these evolutionary constraints may take is important to interpret the information in protein sequences. To study this problem, we introduce a simple but physical model of protein evolution where selection targets allostery, the functional coupling of distal sites on protein surfaces. This model shows how the geometrical organization of couplings between amino acids within a protein structure can depend crucially on its evolutionary history. In particular, two scenarios are found to generate a spatial concentration of functional constraints: high mutation rates and fluctuating selective pressures. This second scenario offers a plausible explanation for the high tolerance of natural proteins to mutations and for the spatial organization of their least tolerant amino acids, as revealed by sequence analysis and mutagenesis experiments. It also implies a faculty to adapt to new selective pressures that is consistent with observations. The model illustrates how several independent functional modules may emerge within the same protein structure, depending on the nature of past environmental fluctuations. Our model thus relates the evolutionary history of proteins to the geometry of their functional constraints, with implications for decoding and engineering protein sequences.
Gold, Nicola D; Jackson, Richard M
2006-02-03
The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that given a known protein-ligand binding site allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamiles is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.
Kaur, Inderjeet; Zeeshan, Mohammad; Saini, Ekta; Kaushik, Abhinav; Mohmmed, Asif; Gupta, Dinesh; Malhotra, Pawan
2016-10-20
Post-transcriptional and post-translational modifications play a major role in Plasmodium life cycle regulation. Lysine methylation of histone proteins is well documented in several organisms; however, in recent years lysine methylation of proteins outside the histone code has been emerging as an important post-translational modification (PTM). In the present study we performed a global analysis of lysine methylation of proteins in the asexual blood stages of Plasmodium falciparum development. We immunoprecipitated stage-specific Plasmodium lysates using anti-methyl lysine specific antibodies that immunostained the asexual blood stage parasites. Using liquid chromatography and tandem mass spectrometry analysis, 570 lysine methylated proteins at three different blood stages were identified. Analysis of the peptide sequences identified 605 methylated sites within 422 proteins. Functional classification of the methylated proteins revealed that the proteins are mainly involved in nucleotide metabolic processes, chromatin organization, transport, homeostatic processes and protein folding. The motif analysis of the methylated lysine peptides reveals novel motifs. Many of the identified lysine methylated proteins are also interacting partners/substrates of PfSET domain proteins as revealed by STRING database analysis. Our findings suggest that protein methylation at lysine residues is widespread in Plasmodium and plays an important regulatory role in a diverse set of parasite pathways.
Normal mode analysis and applications in biological physics.
Dykeman, Eric C; Sankey, Otto F
2010-10-27
Normal mode analysis has become a popular and often used theoretical tool in the study of functional motions in enzymes, viruses, and large protein assemblies. The use of normal modes in the study of these motions is often extremely fruitful since many of the functional motions of large proteins can be described using just a few normal modes which are intimately related to the overall structure of the protein. In this review, we present a broad overview of several popular methods used in the study of normal modes in biological physics including continuum elastic theory, the elastic network model, and a new all-atom method, recently developed, which is capable of computing a subset of the low frequency vibrational modes exactly. After a review of the various methods, we present several examples of applications of normal modes in the study of functional motions, with an emphasis on viral capsids.
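As an illustration of the elastic network idea mentioned in the abstract above, here is a minimal sketch of a Gaussian network model, one simple member of the elastic network family. It is not the method of the review: the 7.0 Å cutoff, the function name `gnm_modes` and the toy coordinates are all assumptions chosen for illustration. Nodes closer than the cutoff are joined by identical springs, and the low-lying eigenvectors of the resulting connectivity (Kirchhoff) matrix stand in for the slow collective modes.

```python
import numpy as np

def gnm_modes(coords, cutoff=7.0):
    """Return eigenvalues and eigenvectors of the Kirchhoff (connectivity) matrix.

    coords: (N, 3) node positions, e.g. C-alpha atoms (assumed input).
    cutoff: contact distance in the same units (hypothetical default).
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise distances between all nodes.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Off-diagonal entries are -1 for node pairs within the cutoff.
    kirchhoff = -(dist < cutoff).astype(float)
    np.fill_diagonal(kirchhoff, 0.0)
    # Diagonal holds each node's contact count, so every row sums to zero.
    np.fill_diagonal(kirchhoff, -kirchhoff.sum(axis=1))
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    evals, evecs = np.linalg.eigh(kirchhoff)
    return evals, evecs

# Toy "structure": five nodes on a line, 3.8 units apart.
coords = [[3.8 * i, 0.0, 0.0] for i in range(5)]
evals, evecs = gnm_modes(coords)
slowest_mode = evecs[:, 1]  # first non-trivial mode; index 0 is the zero mode
```

For a connected network the first eigenvalue is zero and corresponds to the trivial uniform mode, so the functionally interesting slow motions start at index 1.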
Spectra-first feature analysis in clinical proteomics - A case study in renal cancer.
Goh, Wilson Wen Bin; Wong, Limsoon
2016-10-01
In proteomics, useful signal may be unobserved or lost due to the lack of confident peptide-spectral matches. Selection of differential spectra, followed by associative peptide/protein mapping may be a complementary strategy for improving sensitivity and comprehensiveness of analysis (spectra-first paradigm). This approach is complementary to the standard approach where functional analysis is performed only on the finalized protein list assembled from identified peptides from the spectra (protein-first paradigm). Based on a case study of renal cancer, we introduce a simple spectra-binning approach, MZ-bin. We demonstrate that differential spectra feature selection using MZ-bin is class-discriminative and can trace relevant proteins via spectra associative mapping. Moreover, proteins identified in this manner are more biologically coherent than those selected directly from the finalized protein list. Analysis of constituent peptides per protein reveals high expression inconsistency, suggesting that the measured protein expressions are in fact, poor approximations of true protein levels. Moreover, analysis at the level of constituent peptides may provide higher resolution insight into the underlying biology: Via MZ-bin, we identified for the first time differential splice forms for the known renal cancer marker MAPT. We conclude that the spectra-first analysis paradigm is a complementary strategy to the traditional protein-first paradigm and can provide deeper level insight.
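The abstract above does not spell out how MZ-bin works, so the following is only a hedged sketch of the general spectra-first idea: group spectra into fixed-width m/z bins and flag bins whose counts differ between classes before any peptide identification. The bin width, the count-difference threshold, the function names and the toy m/z values are all hypothetical and are not taken from the paper.

```python
from collections import Counter

def bin_spectra(mz_values, bin_width=1.0):
    """Count spectra per m/z bin; the bin index is floor(mz / bin_width)."""
    return Counter(int(mz // bin_width) for mz in mz_values)

def differential_bins(group_a, group_b, bin_width=1.0, min_diff=2):
    """Return sorted bins whose pooled counts differ by at least min_diff."""
    counts_a = Counter()
    for sample in group_a:
        counts_a.update(bin_spectra(sample, bin_width))
    counts_b = Counter()
    for sample in group_b:
        counts_b.update(bin_spectra(sample, bin_width))
    bins = set(counts_a) | set(counts_b)
    # Counter returns 0 for bins that are missing in one group.
    return sorted(b for b in bins if abs(counts_a[b] - counts_b[b]) >= min_diff)

# Hypothetical precursor m/z values, two samples per class.
tumour = [[500.2, 500.7, 612.3], [500.4, 733.9]]
normal = [[612.8, 733.1], [612.1, 612.5]]
hits = differential_bins(tumour, normal)
```

In a real analysis the selected bins would then be mapped back to peptides and proteins, and a proper statistical test would replace the simple count difference used here.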
Wang, Ting; Tan, Siow Ying; Mutilangi, William; Aykas, Didem P; Rodriguez-Saona, Luis E
2015-10-01
The objective of this study was to develop a simple and rapid method to differentiate whey protein types (WPC, WPI, and WPH) used for beverage manufacturing by combining the spectral signature collected from portable mid-infrared spectrometers and pattern recognition analysis. Whey protein powders from different suppliers are produced using a large number of processing and compositional variables, resulting in variation in composition, concentration, protein structure, and thus functionality. Whey protein powders including whey protein isolates, whey protein concentrates and whey protein hydrolysates were obtained from different suppliers and their spectra collected using portable mid-infrared spectrometers (single and triple reflection) by pressing the powder onto an Attenuated Total Reflectance (ATR) diamond crystal with a pressure clamp. Spectra were analyzed by soft independent modeling of class analogy (SIMCA) generating a classification model showing the ability to differentiate whey protein types by forming tight clusters with interclass distance values of >3, considered to be significantly different from each other. The major bands centered at 1640 and 1580 cm(-1) were responsible for separation and were associated with differences in amide I and amide II vibrations of proteins, respectively. Another important band in whey protein clustering was associated with carboxylate vibrations of acidic amino acids (∼1570 cm(-1)). The use of a portable mid-IR spectrometer combined with pattern recognition analysis showed potential for discriminating whey protein ingredients that can help to streamline the analytical procedure so that it is more applicable for field-based screening of ingredients. A rapid, simple and accurate method was developed to authenticate commercial whey protein products by using portable mid-infrared spectrometers combined with chemometrics, which could help ensure the functionality of whey protein ingredients in food applications. 
© 2015 Institute of Food Technologists®
Wang, Honglin; Sun, Yue; Chang, Jianhong; Zheng, Fangfang; Pei, Haixia; Yi, Yanjun; Chang, Caren; Dong, Chun-Hai
2016-07-01
Ethylene as a gaseous plant hormone is directly involved in various processes during plant growth and development. Much is known regarding the ethylene receptors and regulatory factors in the ethylene signal transduction pathway. In Arabidopsis thaliana, REVERSION-TO-ETHYLENE SENSITIVITY1 (RTE1) can interact with and positively regulates the ethylene receptor ETHYLENE RESPONSE1 (ETR1). In this study we report the identification and characterization of an RTE1-interacting protein, a putative Arabidopsis lipid transfer protein 1 (LTP1) of unknown function. Through bimolecular fluorescence complementation, a direct molecular interaction between LTP1 and RTE1 was verified in planta. Analysis of an LTP1-GFP fusion in transgenic plants and plasmolysis experiments revealed that LTP1 is localized to the cytoplasm. Analysis of ethylene responses showed that the ltp1 knockout is hypersensitive to 1-aminocyclopropanecarboxylic acid (ACC), while LTP1 overexpression confers insensitivity. Analysis of double mutants etr1-2 ltp1 and rte1-3 ltp1 demonstrates a regulatory function of LTP1 in ethylene receptor signaling through the molecular association with RTE1. This study uncovers a novel function of Arabidopsis LTP1 in the regulation of ethylene response and signaling.
Pietrosemoli, Natalia; García-Martín, Juan A; Solano, Roberto; Pazos, Florencio
2013-01-01
Intrinsically disordered proteins/regions (IDPs/IDRs) are currently recognized as a widespread phenomenon having key cellular functions. Still, many aspects of the function of these proteins need to be unveiled. The conformational flexibility of IDPs allows them to recognize and interact with multiple partners, and gives them larger interaction surfaces that may increase interaction speed. For this reason, molecular interactions mediated by IDPs/IDRs are particularly abundant in certain types of protein interactions, such as those of signaling and cell cycle control. We present the first large-scale study of IDPs in Arabidopsis thaliana, the most widely used model organism in plant biology, in order to gain insight into the biological roles of these proteins in plants. The work includes a comparative analysis with the human proteome to highlight the differential use of disorder in both species. Results show that while human proteins are in general more disordered, certain functional classes, mainly related to environmental response, are significantly more enriched in disorder in Arabidopsis. We propose that because plants cannot escape from environmental conditions as animals do, they use disorder as a simple and fast mechanism, independent of transcriptional control, for introducing versatility in the interaction networks underlying these biological processes so that they can quickly adapt and respond to challenging environmental conditions.
Bacterial membrane proteomics.
Poetsch, Ansgar; Wolters, Dirk
2008-10-01
About one quarter to one third of all bacterial genes encode proteins of the inner or outer bacterial membrane. These proteins perform essential physiological functions, such as the import or export of metabolites, the homeostasis of metal ions, the extrusion of toxic substances or antibiotics, and the generation or conversion of energy. The last years have witnessed completion of a plethora of whole-genome sequences of bacteria important for biotechnology or medicine, which is the foundation for proteome and other functional genome analyses. In this review, we discuss the challenges in membrane proteome analysis, starting from sample preparation and leading to MS-data analysis and quantification. The current state of available proteomics technologies as well as their advantages and disadvantages will be described with a focus on shotgun proteomics. Then, we will briefly introduce the most abundant proteins and protein families present in bacterial membranes before bacterial membrane proteomics studies of the last years will be presented. It will be shown how these works enlarged our knowledge about the physiological adaptations that take place in bacteria during fine chemical production, bioremediation, protein overexpression, and during infections. Furthermore, several examples from literature demonstrate the suitability of membrane proteomics for the identification of antigens and different pathogenic strains, as well as the elucidation of membrane protein structure and function.
Comparative proteomic analysis of outer membrane protein 43 (omp43)-deficient Bartonella henselae.
Kang, Jun-Gu; Lee, Hee-Woo; Ko, Sungjin; Chae, Joon-Seok
2018-01-31
Outer membrane proteins (OMPs) of Gram-negative bacteria constitute the first line of defense protecting cells against environmental stresses including chemical, biophysical, and biological attacks. Although the 43-kDa OMP (OMP43) is a major porin protein among Bartonella henselae-derived OMPs, its function remains unreported. In this study, OMP43-deficient mutant B. henselae (Δomp43) was generated to investigate OMP43 function. Interestingly, Δomp43 exhibited weaker proliferative ability than that of wild-type (WT) B. henselae. To study the differences in proteomic expression between WT and Δomp43, two-dimensional gel electrophoresis-based proteomic analysis was performed. Based on Clusters of Orthologous Groups functional assignments, 12 proteins were associated with metabolism, 7 proteins associated with information storage and processing, and 3 proteins associated with cellular processing and signaling. By semi-quantitative reverse transcriptase polymerase chain reaction, increases in tldD, efp, ntrX, pdhA, purB, and ATPA mRNA expression and decreases in Rho and yfeA mRNA expression were confirmed in Δomp43. In conclusion, this is the first report showing that a loss of OMP43 expression in B. henselae leads to retarded proliferation. Furthermore, our proteomic data provide useful information for the further investigation of mechanisms related to the growth of B. henselae.
MutationAligner: a resource of recurrent mutation hotspots in protein domains in cancer.
Gauthier, Nicholas Paul; Reznik, Ed; Gao, Jianjiong; Sumer, Selcuk Onur; Schultz, Nikolaus; Sander, Chris; Miller, Martin L
2016-01-04
The MutationAligner web resource, available at http://www.mutationaligner.org, enables discovery and exploration of somatic mutation hotspots identified in protein domains in currently (mid-2015) more than 5000 cancer patient samples across 22 different tumor types. Using multiple sequence alignments of protein domains in the human genome, we extend the principle of recurrence analysis by aggregating mutations in homologous positions across sets of paralogous genes. Protein domain analysis enhances the statistical power to detect cancer-relevant mutations and links mutations to the specific biological functions encoded in domains. We illustrate how the MutationAligner database and interactive web tool can be used to explore, visualize and analyze mutation hotspots in protein domains across genes and tumor types. We believe that MutationAligner will be an important resource for the cancer research community by providing detailed clues for the functional importance of particular mutations, as well as for the design of functional genomics experiments and for decision support in precision medicine. MutationAligner is slated to be periodically updated to incorporate additional analyses and new data from cancer genomics projects. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
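The aggregation principle described in the abstract above, pooling mutations at homologous positions across paralogous genes via a domain alignment, can be sketched as follows. This is an illustrative toy, not the MutationAligner implementation: the function names, gene names, alignment and mutation positions are all hypothetical.

```python
from collections import Counter

def column_map(aligned_seq, domain_start):
    """Map protein residue numbers to alignment columns ('-' marks a gap)."""
    mapping = {}
    residue = domain_start
    for col, char in enumerate(aligned_seq):
        if char != "-":
            mapping[residue] = col
            residue += 1
    return mapping

def aggregate_hotspots(alignment, starts, mutations):
    """Count mutations per alignment column across all genes sharing a domain."""
    counts = Counter()
    for gene, positions in mutations.items():
        mapping = column_map(alignment[gene], starts[gene])
        for pos in positions:
            if pos in mapping:  # skip mutations that fall outside the domain
                counts[mapping[pos]] += 1
    return counts

# Hypothetical alignment of one domain in three paralogs, with the residue
# number at which the domain starts in each protein.
alignment = {"GENE_A": "MK-LV", "GENE_B": "MKTLV", "GENE_C": "-KTLI"}
starts = {"GENE_A": 10, "GENE_B": 50, "GENE_C": 7}
mutations = {"GENE_A": [12, 12], "GENE_B": [53], "GENE_C": [9, 30]}
counts = aggregate_hotspots(alignment, starts, mutations)
```

In this toy data no single gene has more than two mutations at one residue, yet four mutations land in the same alignment column, which is the kind of signal that pooling across paralogs is meant to expose.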
Functional genomics in zebrafish permits rapid characterization of novel platelet membrane proteins.
O'Connor, Marie N; Salles, Isabelle I; Cvejic, Ana; Watkins, Nicholas A; Walker, Adam; Garner, Stephen F; Jones, Chris I; Macaulay, Iain C; Steward, Michael; Zwaginga, Jaap-Jan; Bray, Sarah L; Dudbridge, Frank; de Bono, Bernard; Goodall, Alison H; Deckmyn, Hans; Stemple, Derek L; Ouwehand, Willem H
2009-05-07
In this study, we demonstrate the suitability of the vertebrate Danio rerio (zebrafish) for functional screening of novel platelet genes in vivo by reverse genetics. Comparative transcript analysis of platelets and their precursor cell, the megakaryocyte, together with nucleated blood cell elements, endothelial cells, and erythroblasts, identified novel platelet membrane proteins with hitherto unknown roles in thrombus formation. We determined the phenotype induced by antisense morpholino oligonucleotide (MO)-based knockdown of 5 of these genes in a laser-induced arterial thrombosis model. To validate the model, the genes for platelet glycoprotein (GP) IIb and the coagulation protein factor VIII were targeted. MO-injected fish showed normal thrombus initiation but severely impaired thrombus growth, consistent with the mouse knockout phenotypes, and concomitant knockdown of both resulted in spontaneous bleeding. Knockdown of 4 of the 5 novel platelet proteins altered arterial thrombosis, as demonstrated by modified kinetics of thrombus initiation and/or development. We identified a putative role for BAMBI and LRRC32 in promotion and DCBLD2 and ESAM in inhibition of thrombus formation. We conclude that phenotypic analysis of MO-injected zebrafish is a fast and powerful method for initial screening of novel platelet proteins for function in thrombosis. PMID:19109564
Compound Heterozygosity for Y Box Proteins Causes Sterility Due to Loss of Translational Repression
Sharma, Manju; Dearth, Andrea; Smith, Benjamin; Braun, Robert E.
2015-01-01
The Y-box proteins YBX2 and YBX3 bind RNA and DNA and are required for metazoan development and fertility. However, possible functional redundancy between YBX2 and YBX3 has prevented elucidation of their molecular function as RNA masking proteins and identification of their target RNAs. To investigate possible functional redundancy between YBX2 and YBX3, we attempted to construct Ybx2-/-;Ybx3-/- double mutants using a previously reported Ybx2-/- model and a newly generated global Ybx3-/- model. Loss of YBX3 resulted in reduced male fertility and defects in spermatid differentiation. However, homozygous double mutants could not be generated as haploinsufficiency of both Ybx2 and Ybx3 caused sterility characterized by extensive defects in spermatid differentiation. RNA sequence analysis of mRNP and polysome occupancy in single and compound Ybx2/3 heterozygotes revealed loss of translational repression almost exclusively in the compound Ybx2/3 heterozygotes. RNAseq analysis also demonstrated that Y-box protein dose-dependent loss of translational regulation was inversely correlated with the presence of a Y box recognition target sequence, suggesting that Y box proteins bind RNA hierarchically to modulate translation in a range of targets. PMID:26646932
Brenner, Wolfram G; Leuendorf, Jan Erik; Cortleven, Anne; Martin, Laetitia B B; Schaller, Hubert; Schmülling, Thomas
2017-05-17
Protein degradation by the ubiquitin-26S proteasome pathway is important for the regulation of cellular processes, but the function of most F-box proteins relevant to substrate recognition is unknown. We describe the analysis of the gene Cytokinin-induced F-box encoding (CFB, AT3G44326), identified in a meta-analysis of cytokinin-related transcriptome studies as one of the most robust cytokinin response genes. F-box domain-dependent interaction with the E3 ubiquitin ligase complex component ASK1 classifies CFB as a functional F-box protein. Apart from F-box and transmembrane domains, CFB contains no known functional domains. CFB is expressed in all plant tissues, predominantly in root tissue. A ProCFB:GFP-GUS fusion gene showed strongest expression in the lateral root cap and during lateral root formation. CFB-GFP fusion proteins were mainly localized in the nucleus and the cytosol but also at the plasma membrane. cfb mutants had no discernible phenotype, but CFB overexpressing plants showed several defects, such as a white upper inflorescence stem, similar to the hypomorphic cycloartenol synthase mutant cas1-1. Both CFB overexpressing plants and cas1-1 mutants accumulated the CAS1 substrate 2,3-oxidosqualene in the white stem tissue, the latter even more after cytokinin treatment, indicating impairment of CAS1 function. This suggests that CFB may link cytokinin and the sterol biosynthesis pathway. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology.
HaloTag Technology: A Versatile Platform for Biomedical Applications
2015-01-01
Exploration of protein function and interaction is critical for discovering links among genomics, proteomics, and disease state; yet, the immense complexity of proteomics found in biological systems currently limits our investigational capacity. Although affinity and autofluorescent tags are widely employed for protein analysis, these methods have been met with limited success because they lack specificity and require multiple fusion tags and genetic constructs. As an alternative approach, the innovative HaloTag protein fusion platform allows protein function and interaction to be comprehensively analyzed using a single genetic construct with multiple capabilities. This is accomplished using a simplified process, in which a variable HaloTag ligand binds rapidly to the HaloTag protein (usually linked to the protein of interest) with high affinity and specificity. In this review, we examine all current applications of the HaloTag technology platform for biomedical applications, such as the study of protein isolation and purification, protein function, protein–protein and protein–DNA interactions, biological assays, in vitro cellular imaging, and in vivo molecular imaging. In addition, novel uses of the HaloTag platform are briefly discussed along with potential future applications. PMID:25974629
Partial cooperative unfolding in proteins as observed by hydrogen exchange mass spectrometry
Engen, John R.; Wales, Thomas E.; Chen, Shugui; Marzluff, Elaine M.; Hassell, Kerry M.; Weis, David D.; Smithgall, Thomas E.
2013-01-01
Many proteins do not exist in a single rigid conformation. Protein motions, or dynamics, exist and in many cases are important for protein function. The analysis of protein dynamics relies on biophysical techniques that can distinguish simultaneously existing populations of molecules and their rates of interconversion. Hydrogen exchange (HX) detected by mass spectrometry (MS) is contributing to our understanding of protein motions by revealing unfolding and dynamics on a wide timescale, ranging from seconds to hours to days. In this review we discuss HX MS-based analyses of protein dynamics, using our studies of multi-domain kinases as examples. Using HX MS, we have successfully probed protein dynamics and unfolding in the isolated SH3, SH2 and kinase domains of the c-Src and Abl kinase families, as well as the role of inter- and intra-molecular interactions in the global control of kinase function. Coupled with high-resolution structural information, HX MS has proved to be a powerful and versatile tool for the analysis of the conformational dynamics in these kinase systems, and has provided fresh insight regarding the regulatory control of these important signaling proteins. HX MS studies of dynamics are applicable not only to the proteins we illustrate here, but to a very wide range of proteins and protein systems, and should play a role in both classification of and greater understanding of the prevalence of protein motion. PMID:23682200
NASA Astrophysics Data System (ADS)
Xu, Jie; Wang, Hongqi; Kong, Dekang
2018-01-01
Although the degradation pathways of polycyclic aromatic hydrocarbons (PAHs) have been extensively studied in many bacteria, variations in the expression levels of key functional proteins during catabolism are still not quantitatively understood. In this study, we compared two proteomic methods, two-dimensional gel electrophoresis (2-DE), a traditional and widely used technique, and isobaric tags for relative and absolute quantitation (iTRAQ), a newer approach, to analyze functional regulation at the protein level in the highly effective fluoranthene-degrading bacterium Rhodococcus sp. BAP-1. The number of differentially expressed proteins identified using iTRAQ was much larger than that identified using 2-DE. In response to fluoranthene, the key overexpressed proteins in BAP-1 included NADPH-dependent FMN reductase, 30S ribosomal protein S2, and S-ribosylhomocysteinase; the significantly down-regulated proteins included cytochrome ubiquinol oxidase subunit, NAD(P) transhydrogenase subunit alpha, and 5-methyltetrahydropteroyltriglutamate-homocysteine methyltransferase.
Necci, Marco; Piovesan, Damiano; Tosatto, Silvio C E
2016-12-01
Intrinsic disorder (ID) in proteins has been extensively described for the last decade; a large-scale classification of ID in proteins is mostly missing. Here, we provide an extensive analysis of ID in the protein universe on the UniProt database derived from sequence-based predictions in MobiDB. Almost half the sequences contain an ID region of at least five residues. About 9% of proteins have a long ID region of over 20 residues which are more abundant in Eukaryotic organisms and most frequently cover less than 20% of the sequence. A small subset of about 67,000 (out of over 80 million) proteins is fully disordered and mostly found in Viruses. Most proteins have only one ID, with short ID evenly distributed along the sequence and long ID overrepresented in the center. The charged residue composition of Das and Pappu was used to classify ID proteins by structural propensities and corresponding functional enrichment. Swollen Coils seem to be used mainly as structural components and in biosynthesis in both Prokaryotes and Eukaryotes. In Bacteria, they are confined in the nucleoid and in Viruses provide DNA binding function. Coils & Hairpins seem to be specialized in ribosome binding and methylation activities. Globules & Tadpoles bind antigens in Eukaryotes but are involved in killing other organisms and cytolysis in Bacteria. The Undefined class is used by Bacteria to bind toxic substances and mediate transport and movement between and within organisms in Viruses. Fully disordered proteins behave similarly, but are enriched for glycine residues and extracellular structures. © 2016 The Protein Society.
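The charged-residue composition mentioned in the abstract above can be illustrated with a minimal sketch. It computes only the fraction of positively and negatively charged residues, the fraction of charged residues (FCR), and the net charge per residue (NCPR). The residue sets used here (K and R as positive, D and E as negative, histidine treated as neutral) and the function name are assumptions for illustration; the class boundaries used by the authors are not given in the abstract and are deliberately left out.

```python
def charge_fractions(seq):
    """Return (f_plus, f_minus, FCR, NCPR) for a one-letter amino-acid sequence.

    Assumption: K/R count as positive, D/E as negative, H as neutral.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    n = len(seq)
    f_plus = sum(seq.count(aa) for aa in "KR") / n
    f_minus = sum(seq.count(aa) for aa in "DE") / n
    fcr = f_plus + f_minus    # fraction of charged residues
    ncpr = f_plus - f_minus   # net charge per residue
    return f_plus, f_minus, fcr, ncpr


# Hypothetical 10-residue peptide, used only to show the call.
f_plus, f_minus, fcr, ncpr = charge_fractions("KKDEAAGGSS")
print(f_plus, f_minus, fcr, ncpr)
```

A classifier would then map the (f_plus, f_minus) pair onto named regions; the thresholds for that step would have to be taken from the original Das and Pappu work.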
Proteomic analysis of human aqueous humor using multidimensional protein identification technology
Richardson, Matthew R.; Price, Marianne O.; Price, Francis W.; Pardo, Jennifer C.; Grandin, Juan C.; You, Jinsam; Wang, Mu
2009-01-01
Aqueous humor (AH) supports avascular tissues in the anterior segment of the eye, maintains intraocular pressure, and potentially influences the pathogenesis of ocular diseases. Nevertheless, the AH proteome is still poorly defined despite several previous efforts, which were hindered by interfering high abundance proteins, inadequate animal models, and limited proteomic technologies. To facilitate future investigations into AH function, the AH proteome was extensively characterized using an advanced proteomic approach. Samples from patients undergoing cataract surgery were pooled and depleted of interfering abundant proteins and thereby divided into two fractions: albumin-bound and albumin-depleted. Multidimensional Protein Identification Technology (MudPIT) was utilized for each fraction; this incorporates strong cation exchange chromatography to reduce sample complexity before reversed-phase liquid chromatography and tandem mass spectrometric analysis. Twelve proteins had multi-peptide, high confidence identifications in the albumin-bound fraction and 50 proteins had multi-peptide, high confidence identifications in the albumin-depleted fraction. Gene ontological analyses were performed to determine which cellular components and functions were enriched. Many proteins were previously identified in the AH and for several their potential role in the AH has been investigated; however, the majority of identified proteins were novel and only speculative roles can be suggested. The AH was abundant in anti-oxidant and immunoregulatory proteins as well as anti-angiogenic proteins, which may be involved in maintaining the avascular tissues. This is the first known report to extensively characterize and describe the human AH proteome and lays the foundation for future work regarding its function in homeostatic and pathologic states. PMID:20019884
Anderson, Jonathan P.; Hane, James K.; Stoll, Thomas; Pain, Nicholas; Hastie, Marcus L.; Kaur, Parwinder; Hoogland, Christine; Gorman, Jeffrey J.; Singh, Karam B.
2016-01-01
Rhizoctonia solani is an important root infecting pathogen of a range of food staples worldwide including wheat, rice, maize, soybean, potato and others. Conventional resistance breeding strategies are hindered by the absence of tractable genetic resistance in any crop host. Understanding the biology and pathogenicity mechanisms of this fungus is important for addressing these disease issues, however, little is known about how R. solani causes disease. This study capitalizes on recent genomic studies by applying mass spectrometry based proteomics to identify soluble, membrane-bound and culture filtrate proteins produced under wheat infection and vegetative growth conditions. Many of the proteins found in the culture filtrate had predicted functions relating to modification of the plant cell wall, a major activity required for pathogenesis on the plant host, including a number found only under infection conditions. Other infection related proteins included a high proportion of proteins with redox associated functions and many novel proteins without functional classification. The majority of infection only proteins tested were confirmed to show transcript up-regulation during infection including a thaumatin which increased susceptibility to R. solani when expressed in Nicotiana benthamiana. In addition, analysis of expression during infection of different plant hosts highlighted how the infection strategy of this broad host range pathogen can be adapted to the particular host being encountered. Data are available via ProteomeXchange with identifier PXD002806. PMID:26811357
Margaryan, Hasmik; Dorosh, Andriy; Capkova, Jana; Manaskova-Postlerova, Pavla; Philimonenko, Anatoly; Hozak, Pavel; Peknicova, Jana
2015-03-08
Sperm proteins are important for the sperm cell function in fertilization. Some of them are involved in the binding of sperm to the egg. We characterized the acrosomal sperm protein detected by a monoclonal antibody (MoAb) (Hs-8) that was prepared in our laboratory by immunization of BALB/c mice with human ejaculated sperm and we tested the possible role of this protein in the binding assay. Indirect immunofluorescence and immunogold labelling, gel electrophoresis, Western blotting and protein sequencing were used for Hs-8 antigen characterization. Functional analysis of GAPDHS from the sperm acrosome was performed in the boar model using sperm/zona pellucida binding assay. Monoclonal antibody Hs-8 is an anti-human sperm antibody that cross-reacts with the Hs-8-related protein in spermatozoa of other mammalian species (boar, mouse). In the immunofluorescence test, Hs-8 antibody recognized the protein localized in the acrosomal part of the sperm head and in the principal piece of the sperm flagellum. In the immunoblotting test, MoAb Hs-8 labelled a protein of 45 kDa in the extract of human sperm. Sequence analysis identified protein Hs-8 as GAPDHS (glyceraldehyde 3-phosphate dehydrogenase-spermatogenic). For this reason, commercial mouse anti-GAPDHS MoAb was applied in control tests. Both antibodies showed similar staining patterns in immunofluorescence tests, in electron microscopy and in immunoblot analysis. Moreover, both Hs-8 and anti-GAPDHS antibodies blocked sperm/zona pellucida binding. GAPDHS is a sperm-specific glycolytic enzyme involved in energy production during spermatogenesis and sperm motility; its role in the sperm head is unknown. In this study, we identified the antigen with Hs-8 antibody and confirmed its localization in the apical part of the sperm head in addition to the principal piece of the flagellum. In an indirect binding assay, we confirmed the potential role of GAPDHS as a binding protein that is involved in the secondary sperm/oocyte binding.
Haga, Ayako; Ogawara, Yoko; Kubota, Daisuke; Kitabayashi, Issay; Murakami, Yasufumi; Kondo, Tadashi
2013-06-01
Nucleophosmin (NPM) is a novel prognostic biomarker for Ewing's sarcoma. To evaluate the prognostic utility of NPM, we conducted an interactomic approach to characterize the NPM protein complex in Ewing's sarcoma cells. A gene suppression assay revealed that NPM promoted cell proliferation and the invasive properties of Ewing's sarcoma cells. FLAG-tag-based affinity purification coupled with liquid chromatography-tandem mass spectrometry identified 106 proteins in the NPM protein complex. The functional classification suggested that the NPM complex participates in critical biological events, including ribosome biogenesis, regulation of transcription and translation, and protein folding, that are mediated by these proteins. In addition to JAK1, a candidate prognostic biomarker for Ewing's sarcoma, the NPM complex, includes 11 proteins known as prognostic biomarkers for other malignancies. Meta-analysis of gene expression profiles of 32 patients with Ewing's sarcoma revealed that 6 of 106 were significantly and independently associated with survival period. These observations suggest a functional role as well as prognostic value of these NPM complex proteins in Ewing's sarcoma. Further, our study suggests the potential applications of interactomics in conjunction with meta-analysis for biomarker discovery. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, Roslyn N.; Sanford, James A.; Park, Jea H.
Towards developing a systems-level pathobiological understanding of Salmonella enterica, we performed a subcellular proteomic analysis of this pathogen grown under standard laboratory and infection-mimicking conditions in vitro. Analysis of proteins from cytoplasmic, inner membrane, periplasmic, and outer membrane fractions yielded coverage of over 30% of the theoretical proteome. Confident subcellular location could be assigned to over 1000 proteins, with good agreement between experimentally observed location and predicted/known protein properties. Comparison of protein location under the different environmental conditions provided insight into dynamic protein localization and possible moonlighting (multiple function) activities. Notable examples of dynamic localization were the response regulators of two-component regulatory systems (e.g., ArcB, PhoQ). The DNA-binding protein Dps that is generally regarded as cytoplasmic was significantly enriched in the outer membrane for all growth conditions examined, suggestive of moonlighting activities. These observations imply the existence of unknown transport mechanisms and novel functions for a subset of Salmonella proteins. Overall, this work provides a catalog of experimentally verified subcellular protein location for Salmonella and a framework for further investigations using computational modeling.
Proteomic analysis of symbiosome membranes in Cnidaria-dinoflagellate endosymbiosis.
Peng, Shao-En; Wang, Yu-Bao; Wang, Li-Hsueh; Chen, Wan-Nan Uang; Lu, Chi-Yu; Fang, Lee-Shing; Chen, Chii-Shiarng
2010-03-01
Symbiosomes are specific intracellular membrane-bound vacuoles containing microalgae in a mutualistic Cnidaria (host)-dinoflagellate (symbiont) association. The symbiosome membrane is originally derived from host plasma membranes during phagocytosis of the symbiont; however, its molecular components and functions are not clear. In order to investigate the protein components of the symbiosome membranes, homogenous symbiosomes were isolated from the sea anemone Aiptasia pulchella, and their purity and membrane intactness were examined by Western blot analysis for host contaminants and by microscopic analysis using various fluorescent probes, respectively. Pure and intact symbiosomes were then subjected to biotinylation by a cell impermeant agent (Biotin-XX sulfosuccinimidyl ester) to label membrane surface proteins. The biotinylated proteins, both Triton X-100 soluble and insoluble fractions, were subjected to 2-D SDS-PAGE and identified by MS using an LC-nano-ESI-MS/MS. A total of 17 proteins were identified. The different subcellular origins and functional categories of these proteins indicate that symbiosome membranes serve as the interface for interaction between host and symbiont by fulfilling several crucial cellular functions such as those of membrane receptors/cell recognition, cytoskeletal remodeling, ATP synthesis/proton homeostasis, transporters, stress responses/chaperones, and anti-apoptosis. The results of proteomic analysis not only indicate the molecular identity of the symbiosome membrane, but also provide insight into the possible role of symbiosome membranes during the endosymbiotic association.
Handfield, Louis-François; Chong, Yolanda T.; Simmons, Jibril; Andrews, Brenda J.; Moses, Alan M.
2013-01-01
Protein subcellular localization has been systematically characterized in budding yeast using fluorescently tagged proteins. Based on the fluorescence microscopy images, subcellular localization of many proteins can be classified automatically using supervised machine learning approaches that have been trained to recognize predefined image classes based on statistical features. Here, we present an unsupervised analysis of protein expression patterns in a set of high-resolution, high-throughput microscope images. Our analysis is based on 7 biologically interpretable features which are evaluated on automatically identified cells, and whose cell-stage dependency is captured by a continuous model for cell growth. We show that it is possible to identify most previously identified localization patterns in a cluster analysis based on these features and that similarities between the inferred expression patterns contain more information about protein function than can be explained by a previous manual categorization of subcellular localization. Furthermore, the inferred cell-stage associated to each fluorescence measurement allows us to visualize large groups of proteins entering the bud at specific stages of bud growth. These correspond to proteins localized to organelles, revealing that the organelles must be entering the bud in a stereotypical order. We also identify and organize a smaller group of proteins that show subtle differences in the way they move around the bud during growth. Our results suggest that biologically interpretable features based on explicit models of cell morphology will yield unprecedented power for pattern discovery in high-resolution, high-throughput microscopy images. PMID:23785265
Frigolet, Maria E; Torres, Nimbe; Uribe-Figueroa, Laura; Rangel, Claudia; Jimenez-Sanchez, Gerardo; Tovar, Armando R
2011-02-01
Obesity is associated with an increase in adipose tissue mass due to an imbalance between high dietary energy intake and low physical activity; however, the type of dietary protein may contribute to its development. The aim of the present work was to study the effect of soy protein versus casein on white adipose tissue genome profiling, and the metabolic functions of adipocytes in rats with diet-induced obesity. The results showed that rats fed a Soy Protein High-Fat (Soy HF) diet gained less weight and had lower serum leptin concentration than rats fed a Casein High-Fat (Cas HF) diet, despite similar energy intake. Histological studies indicated that rats fed the Soy HF diet had significantly smaller adipocytes than those fed the Cas HF diet, and this was associated with a lower triglyceride/DNA content. Fatty acid synthesis in isolated adipocytes was reduced by the amount of fat consumed but not by the type of protein ingested. Expression of genes of fatty acid oxidation increased in adipose tissue of rats fed Soy diets; microarray analysis revealed that Soy protein consumption modified the expression of 90 genes involved in metabolic functions and inflammatory response in adipose tissue. Network analysis showed that the expression of leptin was regulated by the type of dietary protein and it was identified as a central regulator of the expression of lipid metabolism genes in adipose tissue. Thus, soy maintains the size and metabolic functions of adipose tissue through biochemical adaptations, adipokine secretion, and global changes in gene expression. Copyright © 2011 Elsevier Inc. All rights reserved.
The Cytoplasmic Zinc Finger Protein ZPR1 Accumulates in the Nucleolus of Proliferating Cells
Galcheva-Gargova, Zoya; Gangwani, Laxman; Konstantinov, Konstantin N.; Mikrut, Monique; Theroux, Steven J.; Enoch, Tamar; Davis, Roger J.
1998-01-01
The zinc finger protein ZPR1 translocates from the cytoplasm to the nucleus after treatment of cells with mitogens. The function of nuclear ZPR1 has not been defined. Here we demonstrate that ZPR1 accumulates in the nucleolus of proliferating cells. The role of ZPR1 was examined using a gene disruption strategy. Cells lacking ZPR1 are not viable. Biochemical analysis demonstrated that the loss of ZPR1 caused disruption of nucleolar function, including preribosomal RNA expression. These data establish ZPR1 as an essential protein that is required for normal nucleolar function in proliferating cells. PMID:9763455
Purification of Plant Receptor Kinases from Plant Plasma Membranes.
Lee, Jin Suk
2017-01-01
Receptor kinases play a central role in various biological processes, but due to their low abundance and highly hydrophobic and dynamic nature, only a few of them have been functionally characterized, and their partners and ligands remain unidentified. Receptor protein extraction and purification from plant tissues is one of the most challenging steps for the success of various biochemical analyses to characterize their function. Immunoprecipitation is a widely used and selective method for enriching or purifying a specific protein. Here we describe two different optimized protein purification protocols, batch and on-chip immunoprecipitation, which efficiently isolate plant membrane receptor kinases for functional analysis.
Protein domains of unknown function are essential in bacteria.
Goodacre, Norman F; Gerloff, Dietlind L; Uetz, Peter
2013-12-31
More than 20% of all protein domains are currently annotated as "domains of unknown function" (DUFs). About 2,700 DUFs are found in bacteria compared with just over 1,500 in eukaryotes. Over 800 DUFs are shared between bacteria and eukaryotes, and about 300 of these are also present in archaea. A total of 2,786 bacterial Pfam domains even occur in animals, including 320 DUFs. Evolutionary conservation suggests that many of these DUFs are important. Here we show that 355 essential proteins in 16 model bacterial species contain 238 DUFs, most of which represent single-domain proteins, clearly establishing the biological essentiality of DUFs. We suggest that experimental research should focus on conserved and essential DUFs (eDUFs) for functional analysis given their important function and wide taxonomic distribution, including bacterial pathogens. The functional units of proteins are domains. Typically, each domain has a distinct structure and function. Genomes encode thousands of domains, and many of the domains have no known function (domains of unknown function [DUFs]). They are often ignored as of little relevance, given that many of them are found in only a few genomes. Here we show that many DUFs are essential DUFs (eDUFs) based on their presence in essential proteins. We also show that eDUFs are often essential even if they are found in relatively few genomes. However, in general, more common DUFs are more often essential than rare DUFs.
An improved method for functional similarity analysis of genes based on Gene Ontology.
Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia
2016-12-23
Measures of gene functional similarity are essential tools for gene clustering, gene function prediction, evaluation of protein-protein interaction, disease gene prioritization and other applications. In recent years, many gene functional similarity methods have been proposed based on the semantic similarity of GO terms. However, these leading approaches may make error-prone judgments, especially when they measure the specificity of GO terms as well as the information content (IC) of a term set. Therefore, how to estimate gene functional similarity reliably is still a challenging problem. We propose WIS, an effective method to measure gene functional similarity. First, WIS computes the IC of a term by employing its depth, the number of its ancestors, and the topology of its descendants in the GO graph. Second, WIS calculates the IC of a term set by considering the weighted inherited semantics of terms. Finally, WIS estimates gene functional similarity based on the IC overlap ratio of term sets. WIS is superior to some other representative measures in experiments on functional classification of genes in a biological pathway, collaborative evaluation of GO-based semantic similarity measures, protein-protein interaction prediction and correlation with gene expression. Further analysis suggests that WIS takes fully into account the specificity of terms and the weighted inherited semantics of terms between GO terms. The proposed WIS method is an effective and reliable way to compare gene function. The web service of WIS is freely available at http://nclab.hit.edu.cn/WIS/ .
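To make the idea of an "IC overlap ratio of term sets" concrete, here is a generic sketch. It is not the WIS method: the abstract does not give the WIS formulas, so this example swaps in a plain annotation-frequency IC (negative log of a term's frequency) and a ratio of summed IC over shared terms to summed IC over all terms of both genes. The term labels, counts, and function names are hypothetical.

```python
import math


def information_content(term_counts, total_annotations):
    """Frequency-based IC: rarer terms carry more information.

    Assumption: this simple -log(frequency) form stands in for the
    depth/ancestor/descendant-based IC described in the abstract.
    """
    return {t: -math.log(c / total_annotations) for t, c in term_counts.items()}


def ic_overlap_ratio(terms_a, terms_b, ic):
    """Summed IC of shared terms divided by summed IC of the union."""
    shared = terms_a & terms_b
    union = terms_a | terms_b
    denom = sum(ic[t] for t in union)
    return sum(ic[t] for t in shared) / denom if denom else 0.0


# Hypothetical term labels and annotation counts, for illustration only.
ic = information_content({"T1": 50, "T2": 10, "T3": 5}, total_annotations=100)
gene_a = {"T1", "T2"}
gene_b = {"T1", "T3"}
print(round(ic_overlap_ratio(gene_a, gene_b, ic), 3))
```

Because the two hypothetical genes share only the common, low-IC term, the ratio is small; two genes sharing a rare term would score higher, which is the behaviour an IC-weighted measure is meant to capture.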
Tandem-repeat protein domains across the tree of life.
Jernigan, Kristin K; Bordenstein, Seth R
2015-01-01
Tandem-repeat protein domains, composed of repeated units of conserved stretches of 20-40 amino acids, are required for a wide array of biological functions. Despite their diverse and fundamental functions, there has been no comprehensive assessment of their taxonomic distribution, incidence, and associations with organismal lifestyle and phylogeny. In this study, we assess for the first time the abundance of armadillo (ARM) and tetratricopeptide (TPR) repeat domains across all three domains in the tree of life and compare the results to our previous analysis on ankyrin (ANK) repeat domains in this journal. All eukaryotes and a majority of the bacterial and archaeal genomes analyzed have a minimum of one TPR and ARM repeat. In eukaryotes, the fraction of ARM-containing proteins is approximately double that of TPR and ANK-containing proteins, whereas bacteria and archaea are enriched in TPR-containing proteins relative to ARM- and ANK-containing proteins. We show in bacteria that phylogenetic history, rather than lifestyle or pathogenicity, is a predictor of TPR repeat domain abundance, while neither phylogenetic history nor lifestyle predicts ARM repeat domain abundance. Surprisingly, pathogenic bacteria were not enriched in TPR-containing proteins, which have been associated with virulence factors in certain species. Taken together, this comparative analysis provides a newly appreciated view of the prevalence and diversity of multiple types of tandem-repeat protein domains across the tree of life. A central finding of this analysis is that tandem repeat domain-containing proteins are prevalent not just in eukaryotes, but also in bacterial and archaeal species.
Bi, Rui; Pan, Yiou; Shang, Qingli; Peng, Tianfei; Yang, Shuang; Wang, Shang; Xin, Xuecheng; Liu, Yan; Xi, Jinghui
2016-09-01
Lambda-cyhalothrin is now widely used in China to control the soybean aphid Aphis glycines. To dissect the resistance mechanism, a laboratory-selected resistant soybean aphid strain (CRR) was established; adult CRR aphids showed a 43.42-fold resistance ratio to λ-cyhalothrin relative to the susceptible strain (CSS). In this study, a comparative proteomic analysis revealed important differences between the CRR and CSS strains. Approximately 493 protein spots were detected in two-dimensional polyacrylamide gel electrophoresis (2-DE). Thirty-six protein spots displayed differential expression of >2-fold in the CRR strain compared to the CSS strain. Out of these 36 protein spots, 21 had elevated and 15 had decreased expression. Twenty-four differentially expressed proteins were identified by MALDI TOF MS/MS and categorized into the functional groups cytoskeleton-related protein, carbohydrate and energy metabolism, protein folding, antioxidant system, and nucleotide and amino acid metabolism. Functional analysis showed that cytoskeleton-related proteins and energy metabolism proteins are associated with the λ-cyhalothrin resistance of A. glycines. The differential expression of λ-cyhalothrin responsive proteins reflected the overall change in cellular structure and metabolism after insecticide treatment in aphids. In summary, our studies improve understanding of the molecular mechanism of resistance of the soybean aphid to lambda-cyhalothrin, which will facilitate the development of rational approaches to improve the management of this pest and to improve the yield of soybean. Copyright © 2016. Published by Elsevier Inc.
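The >2-fold criterion for differential expression described above can be sketched as a simple ratio test. This is an illustrative assumption about how such a cutoff is commonly applied, not the authors' actual spot-analysis pipeline; the function name and the intensity values are hypothetical.

```python
def classify_spot(resistant, susceptible, threshold=2.0):
    """Label a protein spot by its resistant/susceptible intensity ratio.

    Assumption: 'up' means ratio > threshold, 'down' means ratio < 1/threshold.
    """
    if resistant < 0 or susceptible <= 0:
        raise ValueError("intensities must be non-negative, reference positive")
    ratio = resistant / susceptible
    if ratio > threshold:
        return "up"
    if ratio < 1.0 / threshold:
        return "down"
    return "unchanged"


# Hypothetical normalized spot intensities (CRR, CSS).
for crr, css in [(3.0, 1.0), (1.0, 3.0), (1.5, 1.0)]:
    print(crr, css, classify_spot(crr, css))
```

Applied to every matched spot on the gels, a rule of this kind would yield the counts of elevated and decreased spots reported in the abstract.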
Proteomic analysis of Toxocara canis excretory and secretory (TES) proteins.
Sperotto, Rita Leal; Kremer, Frederico Schmitt; Aires Berne, Maria Elisabeth; Costa de Avila, Luciana F; da Silva Pinto, Luciano; Monteiro, Karina Mariante; Caumo, Karin Silva; Ferreira, Henrique Bunselmeyer; Berne, Natália; Borsuk, Sibele
2017-01-01
Toxocariasis is a neglected disease, and its main etiological agent is the nematode Toxocara canis. Serological diagnosis is performed by an enzyme-linked immunosorbent assay using T. canis excretory and secretory (TES) antigens produced by in vitro cultivation of larvae. Identification of TES proteins can be useful for the development of new diagnostic strategies since few TES components have been described so far. Herein, we report the results obtained by proteomic analysis of TES proteins using a liquid chromatography-tandem mass spectrometry (LC-MS/MS) approach. TES fractions were separated by one-dimensional SDS-PAGE and analyzed by LC-MS/MS. The MS/MS spectra were compared with a database of protein sequences deduced from the genome sequence of T. canis, and a total of 19 proteins were identified. Classification according to the signal peptide prediction using the SignalP server showed that seven of the identified proteins were extracellular, 10 had cytoplasmic or nuclear localization, while the subcellular localization of two proteins was unknown. Analysis of molecular functions by BLAST2GO showed that the majority of the gene ontology (GO) terms associated with the proteins present in the TES sample were associated with binding functions, including but not limited to protein binding (GO:0005515), inorganic ion binding (GO:0043167), and organic cyclic compound binding (GO:0097159). This study provides additional information about the exoproteome of T. canis, which can lead to the development of new strategies for diagnostics or vaccination. Copyright © 2016 Elsevier B.V. All rights reserved.
Analysis of molecular assemblies by flow cytometry: determinants of Gαi1 and βγ binding
NASA Astrophysics Data System (ADS)
Sarvazyan, Noune A.; Neubig, Richard R.
1998-05-01
We report here a novel application of flow cytometry for the quantitative analysis of the high affinity interaction between membrane proteins both in detergent solutions and when reconstituted into lipid vesicles. The approach is further advanced to permit the analysis of binding to expressed protein complexes in native cell membranes. The G protein heterotrimer signal transduction function links the extracellularly activated transmembrane receptors and intracellular effectors. Upon activation, α and βγ subunits of G protein undergo a dissociation/association cycle on the cell membrane interface. The binding parameters of solubilized G protein α and βγ subunits have been defined but little is known quantitatively about their interactions in the membrane. Using a novel flow cytometry approach, the binding of low nanomolar concentrations of fluorescein-labeled Gαi1 (F-α) to βγ both in detergent solution and in a lipid environment was quantitatively compared. Unlabeled βγ reconstituted in biotinylated phospholipid vesicles bound F-α tightly (Kd 6 - 12 nM) while the affinity for biotinylated-βγ in Lubrol was even higher (Kd of 2.9 nM). The application of this approach to proteins expressed in native cell membranes will advance our understanding of G protein function in context of receptor and effector interaction. More generally, this approach can be applied to study the interaction of any fluorescently labeled protein with a membrane protein which can be expressed in Sf9 plasma membranes.
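The Kd values quoted above can be put in context with a minimal sketch of a single-site equilibrium binding isotherm, where fractional occupancy equals [L] / (Kd + [L]). This is a textbook model chosen for illustration; it assumes one binding site and negligible ligand depletion, and the abstract does not state which fitting model the authors used. The function name and the ligand concentration are hypothetical.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Fractional occupancy for single-site equilibrium binding.

    Assumption: free ligand is approximately equal to total ligand.
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_nM / (kd_nM + ligand_nM)


# Kd values from the abstract: 2.9 nM in Lubrol, 6-12 nM in lipid vesicles.
ligand = 5.0  # nM of labeled alpha subunit, hypothetical
for label, kd in [("Lubrol", 2.9), ("vesicles, low", 6.0), ("vesicles, high", 12.0)]:
    print(label, round(fraction_bound(ligand, kd), 2))
```

At a ligand concentration equal to Kd the occupancy is one half, so a lower Kd (as reported in detergent) means half-saturation is reached at a lower concentration of the labeled subunit.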
Li, Hong; Wang, Yi; Zhang, Lei; Lu, Haojie; Zhou, Zhongjun; Wei, Liming; Yang, Pengyuan
2015-12-07
Novel magnetic silica nanoparticles functionalized with layer-by-layer detonation nanodiamonds (dNDs) were prepared by coating single submicron-size magnetite particles with silica and subsequently modifying them with dNDs. The resulting layer-by-layer dND functionalized magnetic silica microspheres (Fe3O4@SiO2@[dND]n) exhibit a well-defined magnetite-core-silica-shell structure and possess a high content of magnetite, which endows them with high dispersibility and excellent magnetic responsiveness. Meanwhile, dNDs are known for their high affinity and biocompatibility towards peptides or proteins. Thus, a novel, convenient, fast and efficient pretreatment approach for low-abundance peptides or proteins was successfully established with Fe3O4@SiO2@[dND]n microspheres. The signal intensity of low-abundance peptides was improved by at least two to three orders of magnitude in mass spectrometry analysis. The novel microsphere also showed good tolerance to salt. Even with a high concentration of salt, peptides or proteins could be isolated effectively from samples. Therefore, the convenient and efficient enrichment process of this novel layer-by-layer dND-functionalized microsphere makes it a promising candidate for isolation of protein in a large volume of culture supernatant for secretome analysis. In the application of Fe3O4@SiO2@[dND]n in the secretome of hepatoma cells, 1473 proteins were identified and covered a broad range of pI and molecular weight, including 377 low molecular weight proteins.
van Herwijnen, Martijn J C; Zonneveld, Marijke I; Goerdayal, Soenita; Nolte-'t Hoen, Esther N M; Garssen, Johan; Stahl, Bernd; Maarten Altelaar, A F; Redegeld, Frank A; Wauben, Marca H M
2016-11-01
Breast milk contains several macromolecular components with distinctive functions, whereby milk fat globules and casein micelles mainly provide nutrition to the newborn, and whey contains molecules that can stimulate the newborn's developing immune system and gastrointestinal tract. Although extracellular vesicles (EV) have been identified in breast milk, their physiological function and composition has not been addressed in detail. EV are submicron sized vehicles released by cells for intercellular communication via selectively incorporated lipids, nucleic acids, and proteins. Because of the difficulty in separating EV from other milk components, an in-depth analysis of the proteome of human milk-derived EV is lacking. In this study, an extensive LC-MS/MS proteomic analysis was performed of EV that had been purified from breast milk of seven individual donors using a recently established, optimized density-gradient-based EV isolation protocol. A total of 1963 proteins were identified in milk-derived EV, including EV-associated proteins like CD9, Annexin A5, and Flotillin-1, with a remarkable overlap between the different donors. Interestingly, 198 of the identified proteins are not present in the human EV database Vesiclepedia, indicating that milk-derived EV harbor proteins not yet identified in EV of different origin. Similarly, the proteome of milk-derived EV was compared with that of other milk components. For this, data from 38 published milk proteomic studies were combined in order to construct the total milk proteome, which consists of 2698 unique proteins. Remarkably, 633 proteins identified in milk-derived EV have not yet been identified in human milk to date. Interestingly, these novel proteins include proteins involved in regulation of cell growth and controlling inflammatory signaling pathways, suggesting that milk-derived EVs could support the newborn's developing gastrointestinal tract and immune system. 
Overall, this study provides an expansion of the whole milk proteome and illustrates that milk-derived EV are macromolecular components with a unique functional proteome. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
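The comparisons described in this abstract (pooling several published protein lists into one reference proteome, then asking which EV proteins are missing from a reference) reduce to set union and set difference. The following is a minimal sketch under stated assumptions: the function names and the toy identifiers are hypothetical and are not taken from the study, and real analyses would first have to harmonise protein identifiers across data sets.

```python
def combine_studies(studies):
    """Pool protein identifiers from several studies into one set of unique entries."""
    combined = set()
    for proteins in studies:
        combined.update(proteins)
    return combined


def novel_entries(identified, reference):
    """Return identifiers present in `identified` but absent from `reference`."""
    return set(identified) - set(reference)


# Hypothetical toy data; these are NOT the protein lists used in the study.
ev_proteins = {"CD9", "ANXA5", "FLOT1", "PROT_X", "PROT_Y"}
ev_database = {"CD9", "ANXA5", "FLOT1", "PROT_X"}
published_milk_studies = [{"CD9", "ANXA5"}, {"ANXA5", "PROT_Z"}]

total_milk_proteome = combine_studies(published_milk_studies)
not_in_ev_database = novel_entries(ev_proteins, ev_database)
not_in_milk_proteome = novel_entries(ev_proteins, total_milk_proteome)

print(sorted(not_in_ev_database))
print(sorted(not_in_milk_proteome))
```

Using sets makes the "unique proteins" count of a pooled proteome simply `len(total_milk_proteome)`, regardless of how often a protein is reported across studies.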
Selective Loss of Cysteine Residues and Disulphide Bonds in a Potato Proteinase Inhibitor II Family
Li, Xiu-Qing; Zhang, Tieling; Donnelly, Danielle
2011-01-01
Disulphide bonds between cysteine residues in proteins play a key role in protein folding, stability, and function. Loss of a disulphide bond is often associated with functional differentiation of the protein. The evolution of disulphide bonds is still actively debated; analysis of naturally occurring variants can promote understanding of the protein evolutionary process. One of the disulphide bond-containing protein families is the potato proteinase inhibitor II (PI-II, or Pin2, for short) superfamily, which is found in most solanaceous plants and participates in plant development, stress response, and defence. Each PI-II domain contains eight cysteine residues (8C), and two similar PI-II domains form a functional protein that has eight disulphide bonds and two non-identical reaction centres. It is still unclear which patterns and processes affect cysteine residue loss in PI-II. Through cDNA sequencing and data mining, we found six natural variants missing cysteine residues involved in one or two disulphide bonds at the first reaction centre. We named these variants Pi7C and Pi6C for the proteins missing one or two pairs of cysteine residues, respectively. This PI-II-7C/6C family was found exclusively in potato. The missing cysteine residues were in bonding pairs but distant from one another at the nucleotide/protein sequence level. The non-synonymous/synonymous substitution (Ka/Ks) ratio analysis suggested a positive evolutionary gene selection for Pi6C and various Pi7C. The selective deletion of the first reaction centre cysteine residues that are structure-level-paired but sequence-level-distant in PI-II illustrates the flexibility of PI-II domains and suggests the functionality of their transient gene versions during evolution. PMID:21494600
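The Ka/Ks analysis mentioned above rests on a standard interpretation: a ratio above 1 points to positive selection, a ratio near 1 to neutral evolution, and a ratio below 1 to purifying selection. The sketch below illustrates only that interpretation. The function names, the tolerance band around 1, and the example rates are assumptions for illustration; the abstract does not report these values, and published analyses normally rely on statistical tests rather than a fixed cutoff.

```python
def ka_ks_ratio(ka, ks):
    """Ratio of non-synonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    return ka / ks


def classify_selection(ratio, tolerance=0.05):
    """Label a Ka/Ks ratio; `tolerance` is an assumed band around 1 treated as neutral."""
    if ratio > 1 + tolerance:
        return "positive"
    if ratio < 1 - tolerance:
        return "purifying"
    return "neutral"


# Hypothetical substitution rates, not values from the study.
print(classify_selection(ka_ks_ratio(0.12, 0.04)))
print(classify_selection(ka_ks_ratio(0.01, 0.05)))
```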
Draeger, Christian; Ndinyanka Fabrice, Tohnyui; Gineau, Emilie; Mouille, Grégory; Kuhn, Benjamin M; Moller, Isabel; Abdou, Marie-Therese; Frey, Beat; Pauly, Markus; Bacic, Antony; Ringli, Christoph
2015-06-24
Leucine-rich repeat extensins (LRXs) are extracellular proteins consisting of an N-terminal leucine-rich repeat (LRR) domain and a C-terminal extensin domain containing the typical features of this class of structural hydroxyproline-rich glycoproteins (HRGPs). The LRR domain is likely to bind an interaction partner, whereas the extensin domain has an anchoring function to insolubilize the protein in the cell wall. Based on the analysis of the root hair-expressed LRX1 and LRX2 of Arabidopsis thaliana, LRX proteins are important for cell wall development. The importance of LRX proteins in non-root hair cells and the structural changes induced by mutations in LRX genes remain elusive. The LRX gene family of Arabidopsis consists of eleven members, of which LRX3, LRX4, and LRX5 are expressed in aerial organs, such as leaves and stem. The importance of these LRX genes for plant development and particularly cell wall formation was investigated. Synergistic effects of mutations with gradually more severe growth retardation phenotypes in double and triple mutants suggest a similar function of the three genes. Analysis of cell wall composition revealed a number of changes to cell wall polysaccharides in the mutants. LRX3, LRX4, and LRX5, and most likely LRX proteins in general, are important for cell wall development. Due to the complexity of changes in cell wall structures in the lrx mutants, the exact function of LRX proteins remains to be determined. The increasingly strong growth-defect phenotypes in double and triple mutants suggest that the LRX proteins have similar functions and that they are important for proper plant development.
Wong, Sienna; Jin, J-P
2017-01-01
Study of the folded structure of proteins provides insights into their biological functions, conformational dynamics and molecular evolution. Current methods of elucidating the folded structure of proteins are laborious, low-throughput, and constrained by various limitations. Arising from these methods is the need for a sensitive, quantitative, rapid and high-throughput method not only to analyse the folded structure of proteins, but also to monitor dynamic changes under physiological or experimental conditions. In this focused review, we outline the foundation and limitations of current protein structure-determination methods prior to discussing the advantages of an emerging antibody epitope analysis for applications in structural, conformational and evolutionary studies of proteins. We discuss the application of this method using representative examples in monitoring allosteric conformation of regulatory proteins and the determination of the evolutionary lineage of related proteins and protein isoforms. The versatility of the method described herein is validated by the ability to modulate a variety of assay parameters to meet the needs of the user in order to monitor protein conformation. Furthermore, the assay has been used to clarify the lineage of troponin isoforms beyond what has been depicted by sequence homology alone, demonstrating the nonlinear evolutionary relationship between primary structure and tertiary structure of proteins. The antibody epitope analysis method is a highly adaptable technique of protein conformation elucidation, which can be easily applied without the need for specialized equipment or technical expertise. When applied in a systematic and strategic manner, this method has the potential to reveal novel and biomedically meaningful information for structure-function relationship and evolutionary lineage of proteins. Copyright© Bentham Science Publishers.
Ye, Chaohui; Ilghari, Dariush; Niu, Jianlou; Xie, Yaoyao; Wang, Yan; Wang, Chao; Li, Xiaokun; Liu, Bailin; Huang, Zhifeng
2012-08-31
An in-depth understanding of the molecular basis by which smart polymers assist protein refolding can lead us to develop a more effective polymer for protein refolding. In this report, to investigate the structure-function relationship of pH-sensitive smart polymers, a series of poly(methylacrylic acid (MAc)-acrylic acid (AA))s with different MAc/AA ratios and molecular weights were synthesized, and their abilities to refold denatured lysozyme were then compared by measuring the lytic activity of the refolded lysozyme. Based on our analysis, there was an optimal MAc/AA ratio (44% MAc), M(w) (1700 Da), and copolymer concentration (0.1%, w/v) at which the highest yield of protein refolding was achieved. Fluorescence, circular dichroism, and RP-HPLC analysis reported in this study demonstrated that the presence of P(MAc-AA)s in the refolding buffer significantly improved the refolding yield of denatured lysozyme without affecting the overall structure of the enzyme. Importantly, our bioseparation analysis, together with the analysis of zeta potential and particle size of the copolymer in refolding buffers with different copolymer concentrations, suggested that the polymer provided a negatively charged surface for an electrostatic interaction with the denatured lysozyme molecules and thereby minimized the hydrophobic-prone aggregation of unfolded proteins during the process of refolding. Copyright © 2012 Elsevier B.V. All rights reserved.
2013-01-01
Background Nucleoside phosphorylases (NPs) have been extensively investigated in human and bacterial systems for their role in metabolic nucleotide salvaging and links to oncogenesis. In plants, NP-like proteins have not been comprehensively studied, likely because there is no evidence of a metabolic function in nucleoside salvage. However, in the forest tree genus Populus, a family of NP-like proteins functions as an important ecophysiological adaptation for inter- and intra-seasonal nitrogen storage and cycling. Results We conducted phylogenetic analyses to determine the distribution and evolution of NP-like proteins in plants. These analyses revealed two major clusters of NP-like proteins in plants. Group I proteins were encoded by genes across a wide range of plant taxa while proteins encoded by Group II genes were dominated by species belonging to the order Malpighiales and included the Populus Bark Storage Protein (BSP) and WIN4-like proteins. Additionally, we evaluated the NP-like genes in Populus by examining the transcript abundance of the 13 NP-like genes found in the Populus genome in various tissues of plants exposed to long-day (LD) and short-day (SD) photoperiods. We found that all 13 of the Populus NP-like genes belonging to either Group I or II are expressed in various tissues in both LD and SD conditions. Tests of natural selection and expression evolution analysis of the Populus genes suggest that divergence in gene expression may have occurred recently during the evolution of Populus, which supports the adaptive maintenance models. Lastly, in silico analysis of cis-regulatory elements in the promoters of the 13 NP-like genes in Populus revealed common regulatory elements known to be involved in light regulation, stress/pathogenesis and phytohormone responses. Conclusion In Populus, the evolution of the NP-like protein and gene family has been shaped by duplication events and natural selection.
Expression data suggest that previously uncharacterized NP-like proteins may function in nutrient sensing and/or signaling. These proteins are members of Group I NP-like proteins, which are widely distributed in many plant taxa. We conclude that NP-like proteins may function in plants, although this function is undefined. PMID:23957885
Zhan, Yiling; Guo, Shuyuan
2015-01-01
Bacillus thuringiensis (Bt) is capable of producing a chitin-binding protein believed to be functionally important to the bacterium during the stationary phase of its growth cycle. In this paper, the chitin-binding domain 3 protein HD73_3189 from B. thuringiensis has been analyzed by computer technology. Primary and secondary structural analyses demonstrated that HD73_3189 is negatively charged and contains several α-helices, aperiodical coils and β-strands. Domain and motif analyses revealed that HD73_3189 contains a signal peptide, an N-terminal chitin-binding 3 domain, two copies of a fibronectin-like domain 3 and a C-terminal carbohydrate binding domain classified as CBM_5_12. Moreover, analysis predicted the protein's associated localization site to be the cell wall. Ligand site prediction determined that amino acid residues GLU-312, TRP-334, ILE-341 and VAL-382 exposed on the surface of the target protein exhibit polar interactions with the substrate.
Gagliardi, Assunta; Lamboglia, Egidio; Bianchi, Laura; Landi, Claudia; Armini, Alessandro; Ciolfi, Silvia; Bini, Luca; Marri, Laura
2016-03-01
The aim of this work was the functional and proteomic analysis of a mutant, W3110 Bgl(+) /10, isolated from a batch culture of an Escherichia coli K-12 strain maintained at room temperature without addition of nutrients for 10 years. When the mutant was evaluated in competition experiments in co-culture with the wild-type, it exhibited the growth advantage in stationary phase (GASP) phenotype. Proteomes of the GASP mutant and its parental strain were compared by using a 2DE coupled with MS approach. Several differentially expressed proteins were detected and many of them were successfully identified by mass spectrometry. Identified expression-changing proteins were grouped into three functional categories: metabolism, protein synthesis, chaperone and stress responsive proteins. Among them, the prevalence was ascribable to the "metabolism" group (72%) for the GASP mutant, and to "chaperones and stress responsive proteins" group for the parental strain (48%). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Transcriptomic analysis of Arabidopsis developing stems: a close-up on cell wall genes
Minic, Zoran; Jamet, Elisabeth; San-Clemente, Hélène; Pelletier, Sandra; Renou, Jean-Pierre; Rihouey, Christophe; Okinyo, Denis PO; Proux, Caroline; Lerouge, Patrice; Jouanin, Lise
2009-01-01
Background Different strategies (genetics, biochemistry, and proteomics) can be used to study proteins involved in cell biogenesis. The availability of the complete sequences of several plant genomes allowed the development of transcriptomic studies. Although the expression patterns of some Arabidopsis thaliana genes involved in cell wall biogenesis were identified at different physiological stages, detailed microarray analysis of plant cell wall genes has not been performed on any plant tissues. Using transcriptomic and bioinformatic tools, we studied the regulation of cell wall genes in Arabidopsis stems, i.e. genes encoding proteins involved in cell wall biogenesis and genes encoding secreted proteins. Results Transcriptomic analyses of stems were performed at three different developmental stages, i.e., young stems, intermediate stage, and mature stems. Many genes involved in the synthesis of cell wall components such as polysaccharides and monolignols were identified. A total of 345 genes encoding predicted secreted proteins with moderate or high level of transcripts were analyzed in detail. The encoded proteins were distributed into 8 classes, based on the presence of predicted functional domains. Proteins acting on carbohydrates and proteins of unknown function constituted the two most abundant classes. Other proteins were proteases, oxido-reductases, proteins with interacting domains, proteins involved in signalling, and structural proteins. Particularly high levels of expression were established for genes encoding pectin methylesterases, germin-like proteins, arabinogalactan proteins, fasciclin-like arabinogalactan proteins, and structural proteins. Finally, the results of these transcriptomic analyses were compared with those obtained through a cell wall proteomic analysis from the same material. Only a small proportion of genes identified by previous proteomic analyses were identified by transcriptomics.
Conversely, only a few proteins encoded by genes having moderate or high level of transcripts were identified by proteomics. Conclusion Analysis of the genes predicted to encode cell wall proteins revealed that about 345 genes had moderate or high levels of transcripts. Among them, we identified many new genes possibly involved in cell wall biogenesis. The discrepancies observed between results of this transcriptomic study and a previous proteomic study on the same material revealed post-transcriptional mechanisms of regulation of expression of genes encoding cell wall proteins. PMID:19149885
Zhao, Linjie; Sun, Tanlin; Pei, Jianfeng; Ouyang, Qi
2015-01-01
It has been a consensus in cancer research that cancer is a disease caused primarily by genomic alterations, especially somatic mutations. However, the mechanism of mutation-induced oncogenesis is not fully understood. Here, we used the mitochondrial apoptotic pathway as a case study and performed a systematic analysis of integrating pathway dynamics with protein interaction kinetics to quantitatively investigate the causal molecular mechanism of mutation-induced oncogenesis. A mathematical model of the regulatory network was constructed to establish the functional role of dynamic bifurcation in the apoptotic process. The oncogenic mutation enrichment of each of the protein functional domains involved was found to be strongly correlated with the parameter sensitivity of the bifurcation point. We further dissected the causal mechanism underlying this correlation by evaluating the mutational influence on protein interaction kinetics using molecular dynamics simulation. We analyzed 29 matched mutant–wild-type and 16 matched SNP–wild-type protein systems. We found that the changes in binding kinetics, reflected by the free energy changes induced by protein interaction mutations, which induce variations in the sensitive parameters of the bifurcation point, were a major cause of apoptosis pathway dysfunction, and mutations involved in sensitive interaction domains show high oncogenic potential. Our analysis provided a molecular basis for connecting protein mutations, protein interaction kinetics, network dynamics properties, and physiological function of a regulatory network. These insights provide a framework for coupling mutation genotype to tumorigenesis phenotype and help elucidate the logic of cancer initiation. PMID:26170328
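To make the link between a mutation-induced free energy change and a binding parameter concrete, the sketch below uses the standard equilibrium relation ΔΔG = RT ln(Kd_mut / Kd_wt). This is a general thermodynamic identity and is not claimed to be the model used in the study above, which reports kinetics derived from molecular dynamics simulation. The function name, the temperature of 298.15 K, and the example ΔΔG values are assumptions for illustration.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # kcal/(mol*K)


def kd_fold_change(ddg_kcal_per_mol, temperature_k=298.15):
    """Fold change in dissociation constant (Kd_mut / Kd_wt) implied by a binding
    free energy change, using ddG = RT * ln(Kd_mut / Kd_wt)."""
    rt = GAS_CONSTANT_KCAL * temperature_k
    return math.exp(ddg_kcal_per_mol / rt)


# Hypothetical ddG values in kcal/mol; positive values mean weaker binding.
for ddg in (0.0, 1.364, -1.364):
    print(f"ddG = {ddg:+.3f} kcal/mol -> Kd fold change = {kd_fold_change(ddg):.3f}")
```

At room temperature a change of roughly 1.36 kcal/mol corresponds to about a tenfold change in the dissociation constant, which shows how a modest free energy shift can move an interaction parameter by an order of magnitude.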