Sample records for protein levels predict

  1. Annotation of Alternatively Spliced Proteins and Transcripts with Protein-Folding Algorithms and Isoform-Level Functional Networks.

    PubMed

    Li, Hongdong; Zhang, Yang; Guan, Yuanfang; Menon, Rajasree; Omenn, Gilbert S

    2017-01-01

    Tens of thousands of splice isoforms of proteins have been catalogued as predicted sequences from transcripts in humans and other species. Relatively few have been characterized biochemically or structurally. With the extensive development of protein bioinformatics, the characterization and modeling of isoform features, isoform functions, and isoform-level networks have advanced notably. Here we present applications of the I-TASSER family of algorithms for folding and functional predictions and the IsoFunc, MIsoMine, and Hisonet data resources for isoform-level analyses of network and pathway-based functional predictions and protein-protein interactions. Hopefully, predictions and insights from protein bioinformatics will stimulate many experimental validation studies.

  2. Multi-level machine learning prediction of protein-protein interactions in Saccharomyces cerevisiae.

    PubMed

    Zubek, Julian; Tatjewski, Marcin; Boniecki, Adam; Mnich, Maciej; Basu, Subhadip; Plewczynski, Dariusz

    2015-01-01

    Accurate identification of protein-protein interactions (PPI) is the key step in understanding proteins' biological functions, which are typically context-dependent. Many existing PPI predictors rely on aggregated features from protein sequences; however, only a few methods exploit local information about specific residue contacts. In this work we present a two-stage machine learning approach for prediction of protein-protein interactions. We start with the carefully filtered data on protein complexes available for Saccharomyces cerevisiae in the Protein Data Bank (PDB) database. First, we build linear descriptions of interacting and non-interacting sequence segment pairs based on their inter-residue distances. Secondly, we train machine learning classifiers to predict binary segment interactions for any two short sequence fragments. The final prediction of the protein-protein interaction is done using the 2D matrix representation of all-against-all possible interacting sequence segments of both analysed proteins. The level-I predictor achieves 0.88 AUC for micro-scale, i.e., residue-level prediction. The level-II predictor improves the results further by a more complex learning paradigm. We perform a 30-fold macro-scale, i.e., protein-level, cross-validation experiment. The level-II predictor using PSIPRED-predicted secondary structure reaches 0.70 precision, 0.68 recall, and 0.70 AUC, whereas other popular methods provide results below the 0.6 threshold (recall, precision, AUC). Our results demonstrate that a multi-scale sequence feature aggregation procedure is able to improve the machine learning results by more than 10% as compared to other sequence representations. Prepared datasets and source code for our experimental pipeline are freely available for download from: http://zubekj.github.io/mlppi/ (open source Python implementation, OS independent).

  3. NRfamPred: a proteome-scale two level method for prediction of nuclear receptor proteins and their sub-families.

    PubMed

    Kumar, Ravindra; Kumari, Bandana; Srivastava, Abhishikha; Kumar, Manish

    2014-10-29

    Nuclear receptor proteins (NRPs) are transcription factors that regulate many vital cellular processes in animal cells. NRPs form a super-family of phylogenetically related proteins and are divided into different sub-families on the basis of ligand characteristics and their functions. In the post-genomic era, when new proteins are being added to the database in a high-throughput mode, it becomes imperative to identify new NRPs using information from amino acid sequence alone. In this study we report an SVM-based two-level prediction system, NRfamPred, using dipeptide composition of proteins as input. At the 1st level, NRfamPred screens whether the query protein is an NRP or a non-NRP; if the query protein belongs to the NRP class, prediction moves to the 2nd level and predicts the sub-family. Using leave-one-out cross-validation, we were able to achieve an overall accuracy of 97.88% at the 1st level and an overall accuracy of 98.11% at the 2nd level with dipeptide composition. Benchmarking on independent datasets showed that NRfamPred had comparable accuracy to other existing methods developed on the same dataset. Our method predicted the existence of 76 NRPs in the human proteome, out of which 14 are novel NRPs. NRfamPred also predicted the sub-families of these 14 NRPs.
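
    The dipeptide composition used as classifier input in the record above can be sketched as follows. This is a minimal illustration, not NRfamPred's code: the abstract does not state the normalisation, so the choice of fractions (rather than percentages), the 20-letter alphabet, and the function name `dipeptide_composition` are all assumptions.

```python
from itertools import product

# The 20 standard amino acids; the resulting 400 ordered pairs define the feature order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide fractions for a protein sequence.

    Assumption: each feature is the count of one ordered residue pair divided
    by the number of valid overlapping pairs; pairs containing non-standard
    letters (e.g. X) are skipped.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]
```

    A vector of this kind could then be passed to any SVM implementation; the two-level scheme would apply one classifier for NRP versus non-NRP and a second one for the sub-family.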

  4. Predicting the Dynamics of Protein Abundance

    PubMed Central

    Mehdi, Ahmed M.; Patrick, Ralph; Bailey, Timothy L.; Bodén, Mikael

    2014-01-01

    Protein synthesis is finely regulated across all organisms, from bacteria to humans, and its integrity underpins many important processes. Emerging evidence suggests that the dynamic range of protein abundance is greater than that observed at the transcript level. Technological breakthroughs now mean that sequencing-based measurement of mRNA levels is routine, but protocols for measuring protein abundance remain both complex and expensive. This paper introduces a Bayesian network that integrates transcriptomic and proteomic data to predict protein abundance and to model the effects of its determinants. We aim to use this model to follow a molecular response over time, from condition-specific data, in order to understand adaptation during processes such as the cell cycle. With microarray data now available for many conditions, the general utility of a protein abundance predictor is broad. Whereas most quantitative proteomics studies have focused on higher organisms, we developed a predictive model of protein abundance for both Saccharomyces cerevisiae and Schizosaccharomyces pombe to explore the latitude at the protein level. Our predictor primarily relies on mRNA level, mRNA–protein interaction, mRNA folding energy and half-life, and tRNA adaptation. The combination of key features, allowing for the low certainty and uneven coverage of experimental observations, gives comparatively minor but robust prediction accuracy. The model substantially improved the analysis of protein regulation during the cell cycle: predicted protein abundance identified twice as many cell-cycle-associated proteins as experimental mRNA levels. Predicted protein abundance was more dynamic than observed mRNA expression, agreeing with experimental protein abundance from a human cell line. 
    We illustrate how the same model can be used to predict the folding energy of mRNA when protein abundance is available, lending credence to the emerging view that mRNA folding affects translation efficiency. The software and data used in this research are available at http://bioinf.scmb.uq.edu.au/proteinabundance/. PMID:24532840

  5. Predicting the dynamics of protein abundance.

    PubMed

    Mehdi, Ahmed M; Patrick, Ralph; Bailey, Timothy L; Bodén, Mikael

    2014-05-01

    Protein synthesis is finely regulated across all organisms, from bacteria to humans, and its integrity underpins many important processes. Emerging evidence suggests that the dynamic range of protein abundance is greater than that observed at the transcript level. Technological breakthroughs now mean that sequencing-based measurement of mRNA levels is routine, but protocols for measuring protein abundance remain both complex and expensive. This paper introduces a Bayesian network that integrates transcriptomic and proteomic data to predict protein abundance and to model the effects of its determinants. We aim to use this model to follow a molecular response over time, from condition-specific data, in order to understand adaptation during processes such as the cell cycle. With microarray data now available for many conditions, the general utility of a protein abundance predictor is broad. Whereas most quantitative proteomics studies have focused on higher organisms, we developed a predictive model of protein abundance for both Saccharomyces cerevisiae and Schizosaccharomyces pombe to explore the latitude at the protein level. Our predictor primarily relies on mRNA level, mRNA-protein interaction, mRNA folding energy and half-life, and tRNA adaptation. The combination of key features, allowing for the low certainty and uneven coverage of experimental observations, gives comparatively minor but robust prediction accuracy. The model substantially improved the analysis of protein regulation during the cell cycle: predicted protein abundance identified twice as many cell-cycle-associated proteins as experimental mRNA levels. Predicted protein abundance was more dynamic than observed mRNA expression, agreeing with experimental protein abundance from a human cell line. 
    We illustrate how the same model can be used to predict the folding energy of mRNA when protein abundance is available, lending credence to the emerging view that mRNA folding affects translation efficiency. The software and data used in this research are available at http://bioinf.scmb.uq.edu.au/proteinabundance/.

  6. Predicting helix–helix interactions from residue contacts in membrane proteins

    PubMed Central

    Lo, Allan; Chiu, Yi-Yuan; Rødland, Einar Andreas; Lyu, Ping-Chiang; Sung, Ting-Yi; Hsu, Wen-Lian

    2009-01-01

    Motivation: Helix–helix interactions play a critical role in the structure assembly, stability and function of membrane proteins. On the molecular level, the interactions are mediated by one or more residue contacts. Although previous studies focused on helix-packing patterns and sequence motifs, few of them developed methods specifically for contact prediction. Results: We present a new hierarchical framework for contact prediction, with an application in membrane proteins. The hierarchical scheme consists of two levels: in the first level, contact residues are predicted from the sequence and their pairing relationships are further predicted in the second level. Statistical analyses on contact propensities are combined with other sequence and structural information for training the support vector machine classifiers. Evaluated on 52 protein chains using leave-one-out cross validation (LOOCV) and an independent test set of 14 protein chains, the two-level approach consistently improves on the conventional direct approach in prediction accuracy, with an 80% reduction of input for prediction. Furthermore, the predicted contacts are then used to infer interactions between pairs of helices. When at least three predicted contacts are required for an inferred interaction, the accuracy, sensitivity and specificity are 56%, 40% and 89%, respectively. Our results demonstrate that a hierarchical framework can be applied to eliminate false positives (FP) while reducing computational complexity in predicting contacts. Together with the estimated contact propensities, this method can be used to gain insights into helix-packing in membrane proteins. Availability: http://bio-cluster.iis.sinica.edu.tw/TMhit/ Contact: tsung@iis.sinica.edu.tw Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19244388
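
    The rule of requiring at least three predicted contacts before inferring a helix–helix interaction can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the data layout (residue-index pairs plus a residue-to-helix mapping) and the name `infer_helix_interactions` are hypothetical.

```python
from collections import Counter


def infer_helix_interactions(contacts, residue_to_helix, min_contacts=3):
    """Infer interacting helix pairs from predicted residue contacts.

    contacts: iterable of (residue_i, residue_j) index pairs.
    residue_to_helix: dict mapping a residue index to a helix label
        (hypothetical layout; residues outside helices are simply absent).
    min_contacts: contacts required before a helix pair is reported.
    """
    counts = Counter()
    # Treat (i, j) and (j, i) as the same contact so it is counted once.
    for pair in {frozenset(c) for c in contacts}:
        if len(pair) != 2:
            continue  # skip degenerate self-contacts
        res_i, res_j = pair
        helix_i = residue_to_helix.get(res_i)
        helix_j = residue_to_helix.get(res_j)
        if helix_i is None or helix_j is None or helix_i == helix_j:
            continue
        counts[frozenset((helix_i, helix_j))] += 1
    return {helices for helices, n in counts.items() if n >= min_contacts}
```

    Raising `min_contacts` trades sensitivity for specificity, which is consistent with the high specificity and lower sensitivity reported at a threshold of three.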

  7. PRODUCTION AND ECONOMIC OPTIMIZATION OF DIETARY PROTEIN AND CARBOHYDRATE IN THE CULTURE OF JUVENILE SEA URCHIN Lytechinus variegatus

    PubMed Central

    Heflin, Laura E.; Makowsky, Robert; Taylor, J. Christopher; Williams, Michael B.; Lawrence, Addison L.; Watts, Stephen A.

    2016-01-01

    Juvenile Lytechinus variegatus (ca. 3.95 ± 0.54 g) were fed one of 10 formulated diets with different protein (ranging from 11-43%) and carbohydrate (12 or 18%; brackets determined from previous studies) levels. Urchins (n = 16 per treatment) were fed a daily sub-satiation ration equivalent to 2.0% of average body weight for 10 weeks. Our objectives were (1) to create predictive models of growth, production and efficiency outcomes and (2) to generate economic analysis models in relation to these dietary outcomes for juvenile L. variegatus held in culture. At dietary protein levels below ca. 30%, models for most growth and production outcomes predicted increased rates of growth and production among urchins fed diets containing 18% dietary carbohydrate levels as compared to urchins fed diets containing 12% dietary carbohydrate. For most outcomes, growth and production were predicted to increase with increasing level of dietary protein up to ca. 30%, after which no further increase in growth and production was predicted. Likewise, dry matter production efficiency was predicted to increase with increasing protein level up to ca. 30%, with urchins fed diets with 18% carbohydrate exhibiting greater efficiency than those fed diets with 12% carbohydrate. The energetic cost of dry matter production was optimal at protein levels less than those required for maximal weight gain and gonad production, suggesting an increased energetic cost (decreased energy efficiency) is required to increase gonad production relative to somatic growth. Economic analysis models predict that when the cost of feed ingredients is low, the lowest cost per gram of wet weight gain will occur at 18% dietary carbohydrate and ca. 25-30% dietary protein. In contrast, the lowest cost per gram of wet weight gain will occur at 12% dietary carbohydrate and ca. 35-40% dietary protein when feed ingredient costs are high or average.
    For both 18 and 12% levels of dietary carbohydrate, cost per gram of wet weight gain is predicted to be maximized at low dietary protein levels, regardless of feed ingredient costs. These models will compare dietary requirements and growth outcomes in relation to economic costs and provide insight for future commercialization of sea urchin aquaculture. PMID:28082753

  8. PRODUCTION AND ECONOMIC OPTIMIZATION OF DIETARY PROTEIN AND CARBOHYDRATE IN THE CULTURE OF JUVENILE SEA URCHIN Lytechinus variegatus.

    PubMed

    Heflin, Laura E; Makowsky, Robert; Taylor, J Christopher; Williams, Michael B; Lawrence, Addison L; Watts, Stephen A

    2016-10-01

    Juvenile Lytechinus variegatus (ca. 3.95 ± 0.54 g) were fed one of 10 formulated diets with different protein (ranging from 11-43%) and carbohydrate (12 or 18%; brackets determined from previous studies) levels. Urchins (n = 16 per treatment) were fed a daily sub-satiation ration equivalent to 2.0% of average body weight for 10 weeks. Our objectives were (1) to create predictive models of growth, production and efficiency outcomes and (2) to generate economic analysis models in relation to these dietary outcomes for juvenile L. variegatus held in culture. At dietary protein levels below ca. 30%, models for most growth and production outcomes predicted increased rates of growth and production among urchins fed diets containing 18% dietary carbohydrate levels as compared to urchins fed diets containing 12% dietary carbohydrate. For most outcomes, growth and production were predicted to increase with increasing level of dietary protein up to ca. 30%, after which no further increase in growth and production was predicted. Likewise, dry matter production efficiency was predicted to increase with increasing protein level up to ca. 30%, with urchins fed diets with 18% carbohydrate exhibiting greater efficiency than those fed diets with 12% carbohydrate. The energetic cost of dry matter production was optimal at protein levels less than those required for maximal weight gain and gonad production, suggesting an increased energetic cost (decreased energy efficiency) is required to increase gonad production relative to somatic growth. Economic analysis models predict that when the cost of feed ingredients is low, the lowest cost per gram of wet weight gain will occur at 18% dietary carbohydrate and ca. 25-30% dietary protein. In contrast, the lowest cost per gram of wet weight gain will occur at 12% dietary carbohydrate and ca. 35-40% dietary protein when feed ingredient costs are high or average.
    For both 18 and 12% levels of dietary carbohydrate, cost per gram of wet weight gain is predicted to be maximized at low dietary protein levels, regardless of feed ingredient costs. These models will compare dietary requirements and growth outcomes in relation to economic costs and provide insight for future commercialization of sea urchin aquaculture.

  9. Enhancing interacting residue prediction with integrated contact matrix prediction in protein-protein interaction.

    PubMed

    Du, Tianchuan; Liao, Li; Wu, Cathy H

    2016-12-01

    Identifying the residues in a protein that are involved in protein-protein interaction and identifying the contact matrix for a pair of interacting proteins are two computational tasks at different levels of an in-depth analysis of protein-protein interaction. Various methods for solving these two problems have been reported in the literature. However, the interacting residue prediction and contact matrix prediction were handled by and large independently in those existing methods, though intuitively good prediction of interacting residues will help with predicting the contact matrix. In this work, we developed a novel protein interacting residue prediction system, contact matrix-interaction profile hidden Markov model (CM-ipHMM), with the integration of contact matrix prediction and the ipHMM interaction residue prediction. We propose to leverage what is learned from the contact matrix prediction and utilize the predicted contact matrix as "feedback" to enhance the interaction residue prediction. The CM-ipHMM model showed significant improvement over the previous method that uses the ipHMM for predicting interaction residues only. It indicates that the downstream contact matrix prediction could help the interaction site prediction.

  10. Protein disorder is positively correlated with gene expression in E. coli

    PubMed Central

    Paliy, Oleg; Gargac, Shawn M.; Cheng, Yugong; Uversky, Vladimir N.; Dunker, A. Keith

    2009-01-01

    We considered on a global scale the relationship between the predicted fraction of protein disorder and RNA and protein expression in E. coli. Fraction of protein disorder correlated positively with both measured RNA expression levels of E. coli genes in three different growth media and with predicted abundance levels of E. coli proteins. Though weak, the correlation was highly significant. Correlation of protein disorder with RNA expression did not depend on the growth rate of E. coli cultures and was not caused by a small subset of genes showing exceptionally high concordance in their disorder and expression levels. Global analysis was complemented by detailed consideration of several groups of proteins. PMID:18465893

  11. Concomitant prediction of function and fold at the domain level with GO-based profiles.

    PubMed

    Lopez, Daniel; Pazos, Florencio

    2013-01-01

    Predicting the function of newly sequenced proteins is crucial due to the pace at which these raw sequences are being obtained. Almost all resources for predicting protein function assign functional terms to whole chains, and do not distinguish which particular domain is responsible for the allocated function. This is not a limitation of the methodologies themselves; rather, it arises because the databases of functional annotations that these methods use for transferring functional terms to new proteins are annotated on a whole-chain basis. Nevertheless, domains are the basic evolutionary and often functional units of proteins. In many cases, the domains of a protein chain have distinct molecular functions, independent from each other. For that reason, resources with functional annotations at the domain level, as well as methodologies for predicting function for individual domains adapted to these resources, are required. We present a methodology for predicting the molecular function of individual domains, based on a previously developed database of functional annotations at the domain level. The approach, which we show outperforms a standard method based on sequence searches in assigning function, concomitantly predicts the structural fold of the domains and can give hints on the functionally important residues associated with the predicted function.

  12. Blind predictions of protein interfaces by docking calculations in CAPRI.

    PubMed

    Lensink, Marc F; Wodak, Shoshana J

    2010-11-15

    Reliable prediction of the amino acid residues involved in protein-protein interfaces can provide valuable insight into protein function, and inform mutagenesis studies, and drug design applications. A fast-growing number of methods are being proposed for predicting protein interfaces, using structural information, energetic criteria, or sequence conservation or by integrating multiple criteria and approaches. Overall however, their performance remains limited, especially when applied to nonobligate protein complexes, where the individual components are also stable on their own. Here, we evaluate interface predictions derived from protein-protein docking calculations. To this end we measure the overlap between the interfaces in models of protein complexes submitted by 76 participants in CAPRI (Critical Assessment of Predicted Interactions) and those of 46 observed interfaces in 20 CAPRI targets corresponding to nonobligate complexes. Our evaluation considers multiple models for each target interface, submitted by different participants, using a variety of docking methods. Although this results in a substantial variability in the prediction performance across participants and targets, clear trends emerge. Docking methods that perform best in our evaluation predict interfaces with average recall and precision levels of about 60%, for a small majority (60%) of the analyzed interfaces. These levels are significantly higher than those obtained for nonobligate complexes by most extant interface prediction methods. We find furthermore that a sizable fraction (24%) of the interfaces in models ranked as incorrect in the CAPRI assessment are actually correctly predicted (recall and precision ≥50%), and that these models contribute to 70% of the correct docking-based interface predictions overall. 
    Our analysis proves that docking methods are much more successful in identifying interfaces than in predicting complexes, and suggests that these methods have an excellent potential of addressing the interface prediction challenge. © 2010 Wiley-Liss, Inc.
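
    The interface recall and precision used in the record above, together with the "recall and precision ≥50%" criterion for a correctly predicted interface, can be sketched as follows. The function names and the representation of an interface as a set of residue identifiers are assumptions made for illustration; the CAPRI evaluation itself may define interface residues differently.

```python
def interface_recall_precision(predicted, observed):
    """Return (recall, precision) for predicted versus observed interface residues.

    Both arguments are iterables of residue identifiers (hypothetical layout).
    """
    predicted, observed = set(predicted), set(observed)
    overlap = len(predicted & observed)
    recall = overlap / len(observed) if observed else 0.0
    precision = overlap / len(predicted) if predicted else 0.0
    return recall, precision


def is_correct_interface(predicted, observed, threshold=0.5):
    """Apply the abstract's criterion: recall and precision both >= 50%."""
    recall, precision = interface_recall_precision(predicted, observed)
    return recall >= threshold and precision >= threshold
```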

  13. TANGLE: Two-Level Support Vector Regression Approach for Protein Backbone Torsion Angle Prediction from Primary Sequences

    PubMed Central

    Song, Jiangning; Tan, Hao; Wang, Mingjun; Webb, Geoffrey I.; Akutsu, Tatsuya

    2012-01-01

    Protein backbone torsion angles (Phi) and (Psi) involve two rotation angles rotating around the Cα-N bond (Phi) and the Cα-C bond (Psi). Due to the planarity of the linked rigid peptide bonds, these two angles can essentially determine the backbone geometry of proteins. Accordingly, the accurate prediction of protein backbone torsion angle from sequence information can assist the prediction of protein structures. In this study, we develop a new approach called TANGLE (Torsion ANGLE predictor) to predict the protein backbone torsion angles from amino acid sequences. TANGLE uses a two-level support vector regression approach to perform real-value torsion angle prediction using a variety of features derived from amino acid sequences, including the evolutionary profiles in the form of position-specific scoring matrices, predicted secondary structure, solvent accessibility and natively disordered region as well as other global sequence features. When evaluated based on a large benchmark dataset of 1,526 non-homologous proteins, the mean absolute errors (MAEs) of the Phi and Psi angle prediction are 27.8° and 44.6°, respectively, which are 1% and 3% respectively lower than that using one of the state-of-the-art prediction tools ANGLOR. Moreover, the prediction of TANGLE is significantly better than a random predictor that was built on the amino acid-specific basis, with the p-value<1.46e-147 and 7.97e-150, respectively by the Wilcoxon signed rank test. As a complementary approach to the current torsion angle prediction algorithms, TANGLE should prove useful in predicting protein structural properties and assisting protein fold recognition by applying the predicted torsion angles as useful restraints. TANGLE is freely accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/TANGLE/. PMID:22319565
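
    The mean absolute error reported for torsion angles can be sketched as below. Because angles are periodic, this sketch assumes the error between two angles is taken as the shorter way around the circle (at most 180°); the abstract does not say whether TANGLE's MAE is computed this way, so the wrapping and the function names are assumptions.

```python
def angular_error(predicted_deg, true_deg):
    """Absolute difference between two angles in degrees, wrapped to [0, 180]."""
    diff = abs(predicted_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)


def mean_absolute_angular_error(predicted, true):
    """Mean of the wrapped absolute errors over paired angle lists."""
    if len(predicted) != len(true) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [angular_error(p, t) for p, t in zip(predicted, true)]
    return sum(errors) / len(errors)
```

    Without the wrap, a prediction of 170° against a true value of -170° would count as a 340° error rather than 20°, which would inflate the MAE for angles near the ±180° boundary.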

  14. Prediction of recombinant protein overexpression in Escherichia coli using a machine learning based model (RPOLP).

    PubMed

    Habibi, Narjeskhatoon; Norouzi, Alireza; Mohd Hashim, Siti Z; Shamsir, Mohd Shahir; Samian, Razip

    2015-11-01

    Recombinant protein overexpression, an important biotechnological process, is governed by complex biological rules that are mostly unknown, and it needs an intelligent algorithm to avoid resource-intensive lab-based trial-and-error experiments for determining the expression level of the recombinant protein. The purpose of this study is to propose a predictive model to estimate the level of recombinant protein overexpression for the first time in the literature using a machine learning approach based on the sequence, expression vector, and expression host. The expression host was confined to Escherichia coli, which is the most popular bacterial host to overexpress recombinant proteins. To provide a handle to the problem, the overexpression level was categorized as low, medium and high. A set of features which were likely to affect the overexpression level was generated based on the known facts (e.g. gene length) and knowledge gathered from related literature. Then, a representative sub-set of the features generated in the previous step was determined using feature selection techniques. Finally, a predictive model was developed using a random forest classifier, which was able to adequately classify the multi-class imbalanced small dataset constructed. The result showed that the predictive model provided a promising accuracy of 80% on average in estimating the overexpression level of a recombinant protein. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Sequence Alignment to Predict Across Species Susceptibility ...

    EPA Pesticide Factsheets

    Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitatively assess protein sequence/structural similarity across taxonomic groups as a means to predict relative intrinsic susceptibility. The intent of the tool is to allow for evaluation of any potential protein target, so it is amenable to variable degrees of protein characterization, depending on available information about the chemical/protein interaction and the molecular target itself. To allow for flexibility in the analysis, a layered strategy was adopted for the tool. The first level of the SeqAPASS analysis compares primary amino acid sequences to a query sequence, calculating a metric for sequence similarity (including detection of candidate orthologs), the second level evaluates sequence similarity within selected domains (e.g., ligand-binding domain, DNA binding domain), and the third level of analysis compares individual amino acid residue positions identified as being of importance for protein conformation and/or ligand binding upon chemical perturbation. Each level of the SeqAPASS analysis provides increasing evidence to apply toward rapid, screening-level assessments of probable cross species susceptibility. Such analyses can support prioritization of chemicals for further ev

  16. Improving membrane protein expression by optimizing integration efficiency

    PubMed Central

    2017-01-01

    The heterologous overexpression of integral membrane proteins in Escherichia coli often yields insufficient quantities of purifiable protein for applications of interest. The current study leverages a recently demonstrated link between co-translational membrane integration efficiency and protein expression levels to predict protein sequence modifications that improve expression. Membrane integration efficiencies, obtained using a coarse-grained simulation approach, robustly predicted effects on expression of the integral membrane protein TatC for a set of 140 sequence modifications, including loop-swap chimeras and single-residue mutations distributed throughout the protein sequence. Mutations that improve simulated integration efficiency were 4-fold enriched with respect to improved experimentally observed expression levels. Furthermore, the effects of double mutations on both simulated integration efficiency and experimentally observed expression levels were cumulative and largely independent, suggesting that multiple mutations can be introduced to yield higher levels of purifiable protein. This work provides a foundation for a general method for the rational overexpression of integral membrane proteins based on computationally simulated membrane integration efficiencies. PMID:28918393

  17. Predicting protein-protein interactions on a proteome scale by matching evolutionary and structural similarities at interfaces using PRISM.

    PubMed

    Tuncbag, Nurcan; Gursoy, Attila; Nussinov, Ruth; Keskin, Ozlem

    2011-08-11

    Prediction of protein-protein interactions at the structural level on the proteome scale is important because it allows prediction of protein function, helps drug discovery and takes steps toward genome-wide structural systems biology. We provide a protocol (termed PRISM, protein interactions by structural matching) for large-scale prediction of protein-protein interactions and assembly of protein complex structures. The method consists of two components: rigid-body structural comparisons of target proteins to known template protein-protein interfaces and flexible refinement using a docking energy function. The PRISM rationale follows our observation that globally different protein structures can interact via similar architectural motifs. PRISM predicts binding residues by using structural similarity and evolutionary conservation of putative binding residue 'hot spots'. Ultimately, PRISM could help to construct cellular pathways and functional, proteome-scale annotation. PRISM is implemented in Python and runs in a UNIX environment. The program accepts Protein Data Bank-formatted protein structures and is available at http://prism.ccbb.ku.edu.tr/prism_protocol/.

  18. Estimation of daily protein intake based on spot urine urea nitrogen concentration in chronic kidney disease patients.

    PubMed

    Kanno, Hiroko; Kanda, Eiichiro; Sato, Asako; Sakamoto, Kaori; Kanno, Yoshihiko

    2016-04-01

    Determination of daily protein intake in the management of chronic kidney disease (CKD) requires precision. Inaccuracies in recording dietary intake occur, and estimation from total urea excretion presents hurdles owing to the difficulty of collecting whole urine for 24 h. Spot urine has been used for measuring daily sodium intake and urinary protein excretion. In this cross-sectional study, we investigated whether urea nitrogen (UN) concentration in spot urine can be used to predict daily protein intake instead of the 24-h urine collection in 193 Japanese CKD patients (Stages G1-G5). After patient randomization into 2 datasets for the development and validation of models, bootstrapping was used to develop protein intake estimation models. The parameters for the candidate multivariate regression models were male gender, age, body mass index (BMI), diabetes mellitus, dyslipidemia, proteinuria, estimated glomerular filtration rate, serum albumin level, spot urinary UN and creatinine level, and spot urinary UN/creatinine levels. The final model contained BMI and spot urinary UN level. The final model was selected because of the higher correlation between the predicted and measured protein intakes, r = 0.558 (95% confidence interval 0.400, 0.683), and the smaller distribution of the difference between the measured and predicted protein intakes than those of the other models. The results suggest that UN concentration in spot urine may be used to estimate daily protein intake and that a prediction formula would be useful for nutritional control in CKD patients.

  19. Analysis of deep learning methods for blind protein contact prediction in CASP12.

    PubMed

    Wang, Sheng; Sun, Siqi; Xu, Jinbo

    2018-03-01

    Here we present the results of protein contact prediction achieved in CASP12 by our RaptorX-Contact server, which is an early implementation of our deep learning method for contact prediction. On a set of 38 free-modeling target domains with a median family size of around 58 effective sequences, our server obtained an average top L/5 long- and medium-range contact accuracy of 47% and 44%, respectively (L = length). A complete implementation has an average accuracy of 59% and 57%, respectively. Our deep learning method formulates contact prediction as a pixel-level image labeling problem and simultaneously predicts all residue pairs of a protein using a combination of two deep residual neural networks, taking as input the residue conservation information, predicted secondary structure and solvent accessibility, contact potential, and coevolution information. Our approach differs from existing methods mainly in (1) formulating contact prediction as a pixel-level image labeling problem instead of an image-level classification problem; (2) simultaneously predicting all contacts of an individual protein to make effective use of contact occurrence patterns; and (3) integrating both one-dimensional and two-dimensional deep convolutional neural networks to effectively learn complex sequence-structure relationships, including high-order residue correlation. This paper discusses the RaptorX-Contact pipeline, both contact prediction and contact-based folding results, and finally the strengths and weaknesses of our method. © 2017 Wiley Periodicals, Inc.
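
    The "top L/5" accuracy quoted above can be illustrated with a short sketch. It assumes the commonly used CASP convention that long-range contacts have a sequence separation of at least 24 residues and medium-range contacts a separation of 12 to 23; the function name, parameters and toy data are hypothetical and are not taken from RaptorX-Contact.

```python
def top_fraction_precision(pred, native, divisor=5, min_sep=24, max_sep=None):
    """Precision of the top L/divisor predicted contacts in a separation range.

    pred   -- L x L nested list of predicted contact scores
    native -- L x L nested list of booleans (True = native contact)
    """
    L = len(pred)
    # Collect each residue pair once (i < j) within the requested separation.
    pairs = [
        (pred[i][j], i, j)
        for i in range(L)
        for j in range(i + 1, L)
        if j - i >= min_sep and (max_sep is None or j - i <= max_sep)
    ]
    if not pairs:
        return float("nan")
    # Highest-scoring pairs first; keep the top L/divisor of them.
    pairs.sort(key=lambda t: -t[0])
    top = pairs[: max(1, L // divisor)]
    return sum(1 for _, i, j in top if native[i][j]) / len(top)


# Toy example: a 30-residue chain whose only native contacts have separation 24.
L = 30
native = [[abs(i - j) == 24 for j in range(L)] for i in range(L)]
pred = [[1.0 if native[i][j] else 0.1 for j in range(L)] for i in range(L)]
print(top_fraction_precision(pred, native))  # 1.0
```

    Medium-range accuracy would follow from the same function with min_sep=12 and max_sep=23, under the same assumed convention.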

  20. Identification of residue pairing in interacting β-strands from a predicted residue contact map.

    PubMed

    Mao, Wenzhi; Wang, Tong; Zhang, Wenxuan; Gong, Haipeng

    2018-04-19

    Despite the rapid progress of protein residue contact prediction, predicted residue contact maps frequently contain many errors. However, information of residue pairing in β strands could be extracted from a noisy contact map, due to the presence of characteristic contact patterns in β-β interactions. This information may benefit the tertiary structure prediction of mainly β proteins. In this work, we propose a novel ridge-detection-based β-β contact predictor to identify residue pairing in β strands from any predicted residue contact map. Our algorithm RDb2C adopts ridge detection, a well-developed technique in computer image processing, to capture consecutive residue contacts, and then utilizes a novel multi-stage random forest framework to integrate the ridge information and additional features for prediction. Starting from the predicted contact map of CCMpred, RDb2C remarkably outperforms all state-of-the-art methods on two conventional test sets of β proteins (BetaSheet916 and BetaSheet1452), and achieves F1-scores of ~62% and ~76% at the residue level and strand level, respectively. Taking the prediction of the more advanced RaptorX-Contact as input, RDb2C achieves impressively higher performance, with F1-scores reaching ~76% and ~86% at the residue level and strand level, respectively. In a test of structural modeling using the top 1L predicted contacts as constraints, for 61 mainly β proteins, the average TM-score achieves 0.442 when using the raw RaptorX-Contact prediction, but increases to 0.506 when using the improved prediction by RDb2C. Our method can significantly improve the prediction of β-β contacts from any predicted residue contact maps. Prediction results of our algorithm could be directly applied to effectively facilitate the practical structure prediction of mainly β proteins. All source data and codes are available at http://166.111.152.91/Downloads.html or the GitHub address of https://github.com/wzmao/RDb2C .

  1. ComplexContact: a web server for inter-protein contact prediction using deep learning.

    PubMed

    Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo

    2018-05-22

    ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complexes and interact at the residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSA), then it applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.

  2. A predictive biophysical model of translational coupling to coordinate and control protein expression in bacterial operons

    PubMed Central

    Tian, Tian; Salis, Howard M.

    2015-01-01

    Natural and engineered genetic systems require the coordinated expression of proteins. In bacteria, translational coupling provides a genetically encoded mechanism to control expression level ratios within multi-cistronic operons. We have developed a sequence-to-function biophysical model of translational coupling to predict expression level ratios in natural operons and to design synthetic operons with desired expression level ratios. To quantitatively measure ribosome re-initiation rates, we designed and characterized 22 bi-cistronic operon variants with systematically modified intergenic distances and upstream translation rates. We then derived a thermodynamic free energy model to calculate de novo initiation rates as a result of ribosome-assisted unfolding of intergenic RNA structures. The complete biophysical model has only five free parameters, but was able to accurately predict downstream translation rates for 120 synthetic bi-cistronic and tri-cistronic operons with rationally designed intergenic regions and systematically increased upstream translation rates. The biophysical model also accurately predicted the translation rates of the nine protein atp operon, compared to ribosome profiling measurements. Altogether, the biophysical model quantitatively predicts how translational coupling controls protein expression levels in synthetic and natural bacterial operons, providing a deeper understanding of an important post-transcriptional regulatory mechanism and offering the ability to rationally engineer operons with desired behaviors. PMID:26117546

  3. Ferritin levels predict severe dengue.

    PubMed

    Soundravally, R; Agieshkumar, B; Daisy, M; Sherin, J; Cleetus, C C

    2015-02-01

    Currently, no tests are available to monitor and predict severity and outcome of dengue. To find potential markers that predict dengue severity, the present study validated the serum level of three acute-phase proteins α-1 antitrypsin, ceruloplasmin and ferritin in a pool of severe dengue cases compared to non-severe forms and other febrile illness controls. A total of 96 patients were divided into two equal groups with group 'A' comprising dengue-infected cases and group 'B' with other febrile illness cases negative for dengue. Out of 48 dengue-infected cases, 13 had severe dengue and the remaining 35 were classified as non-severe dengue. Immunoassays were performed to evaluate the serum levels of acute-phase proteins both on the day of admission and on the day of defervescence. The efficiency of individual proteins in predicting the disease severity was assessed using receiver operating characteristic curve. The study did not find any significant difference in the levels of α-1 antitrypsin between the clinical groups. A significant increase in the levels of ceruloplasmin around defervescence in severe cases compared to non-severe and other febrile controls was observed and this is the first report describing the potential association of ceruloplasmin and dengue severity. Interestingly, a steady increase in the level of serum ferritin was recorded throughout the course of illness. Among all the three proteins, the elevated ferritin level could predict the disease severity with highest sensitivity and specificity of 76.9 and 83.3 %, respectively, on the day of admission and the same was found to be 90 and 91.6 % around defervescence. On the basis of this diagnostic efficiency, we propose that ferritin may serve as a potential biomarker for an early prediction of disease severity.
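
    As a reminder of how the sensitivity and specificity figures above are defined, here is a minimal sketch. The counts are assumptions chosen for illustration: 10 of 13 is one split of the 13 severe cases that reproduces 76.9%, and the specificity counts are purely hypothetical because the abstract does not give the underlying confusion matrix.

```python
def sensitivity(tp, fn):
    """Fraction of true cases (here: severe dengue) flagged by the marker."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-cases that the marker correctly leaves unflagged."""
    return tn / (tn + fp)


# Assumed split of the 13 severe cases: 10 above a hypothetical ferritin cutoff.
print(f"{sensitivity(tp=10, fn=3):.1%}")  # 76.9%
# Purely illustrative counts for the comparison group.
print(f"{specificity(tn=5, fp=1):.1%}")   # 83.3%
```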

  4. Quantitative Protein Topography Analysis and High-Resolution Structure Prediction Using Hydroxyl Radical Labeling and Tandem-Ion Mass Spectrometry (MS)

    PubMed Central

    Kaur, Parminder; Kiselar, Janna; Yang, Sichun; Chance, Mark R.

    2015-01-01

    Hydroxyl radical footprinting based MS for protein structure assessment has the goal of understanding ligand induced conformational changes and macromolecular interactions, for example, protein tertiary and quaternary structure, but the structural resolution provided by typical peptide-level quantification is limiting. In this work, we present experimental strategies using tandem-MS fragmentation to increase the spatial resolution of the technique to the single residue level to provide a high precision tool for molecular biophysics research. Overall, in this study we demonstrated an eightfold increase in structural resolution compared with peptide level assessments. In addition, to provide a quantitative analysis of residue based solvent accessibility and protein topography as a basis for high-resolution structure prediction, we illustrate strategies of data transformation using the relative reactivity of side chains as a normalization strategy and predict side-chain surface area from the footprinting data. We tested the methods by examination of Ca2+-calmodulin, showing highly significant correlations between surface area and side-chain contact predictions for individual side chains and the crystal structure. Tandem ion based hydroxyl radical footprinting-MS provides quantitative high-resolution protein topology information in solution that can fill existing gaps in structure determination for large proteins and macromolecular complexes. PMID:25687570

  5. fRMSDPred: Predicting Local RMSD Between Structural Fragments Using Sequence Information

    DTIC Science & Technology

    2007-04-04

    machine learning approaches for estimating the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values can be used to construct the alignment, assess the quality of an alignment, and identify high-quality alignment segments. We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein profiles, predicted secondary structure, effective information encoding schemes, and novel second-order pairwise exponential kernel

  6. Improving protein-protein interaction prediction using evolutionary information from low-quality MSAs.

    PubMed

    Várnai, Csilla; Burkoff, Nikolas S; Wild, David L

    2017-01-01

    Evolutionary information stored in multiple sequence alignments (MSAs) has been used to identify the interaction interface of protein complexes, by measuring either co-conservation or co-mutation of amino acid residues across the interface. Recently, maximum entropy related correlated mutation measures (CMMs) such as direct information, decoupling direct from indirect interactions, have been developed to identify residue pairs interacting across the protein complex interface. These studies have focussed on carefully selected protein complexes with large, good-quality MSAs. In this work, we study protein complexes with a more typical MSA consisting of fewer than 400 sequences, using a set of 79 intramolecular protein complexes. Using a maximum entropy based CMM at the residue level, we develop an interface level CMM score to be used in re-ranking docking decoys. We demonstrate that our interface level CMM score compares favourably to the complementarity trace score, an evolutionary information-based score measuring co-conservation, when combined with the number of interface residues, a knowledge-based potential and the variability score of individual amino acid sites. We also demonstrate that, since co-mutation and co-complementarity in the MSA contain orthogonal information, the best prediction performance using evolutionary information can be achieved by combining the co-mutation information of the CMM with co-conservation information of a complementarity trace score, predicting a near-native structure as the top prediction for 41% of the dataset. The method presented is not restricted to small MSAs, and will likely also improve interface prediction for complexes with large and good-quality MSAs.

  7. Improving protein fold recognition by extracting fold-specific features from predicted residue-residue contacts.

    PubMed

    Zhu, Jianwei; Zhang, Haicang; Li, Shuai Cheng; Wang, Chao; Kong, Lupeng; Sun, Shiwei; Zheng, Wei-Mou; Bu, Dongbo

    2017-12-01

    Accurate recognition of protein fold types is a key step for template-based prediction of protein structures. The existing approaches to fold recognition mainly exploit the features derived from alignments of query protein against templates. These approaches have been shown to be successful for fold recognition at family level, but usually failed at superfamily/fold levels. To overcome this limitation, one of the key points is to explore more structurally informative features of proteins. Although residue-residue contacts carry abundant structural information, how to thoroughly exploit this information for fold recognition still remains a challenge. In this study, we present an approach (called DeepFR) to improve fold recognition at superfamily/fold levels. The basic idea of our approach is to extract fold-specific features from predicted residue-residue contacts of proteins using deep convolutional neural network (DCNN) technique. Based on these fold-specific features, we calculated similarity between query protein and templates, and then assigned query protein with fold type of the most similar template. DCNN has shown excellent performance in image feature extraction and image recognition; the rationale underlying the application of DCNN for fold recognition is that contact likelihood maps are essentially analogous to images, as they both display compositional hierarchy. Experimental results on the LINDAHL dataset suggest that even using the extracted fold-specific features alone, our approach achieved success rate comparable to the state-of-the-art approaches. When further combining these features with traditional alignment-related features, the success rate of our approach increased to 92.3%, 82.5% and 78.8% at family, superfamily and fold levels, respectively, which is about 18% higher than the state-of-the-art approach at fold level, 6% higher at superfamily level and 1% higher at family level. An independent assessment on SCOP_TEST dataset showed consistent performance improvement, indicating robustness of our approach. Furthermore, bi-clustering results of the extracted features are compatible with fold hierarchy of proteins, implying that these features are fold-specific. Together, these results suggest that the features extracted from predicted contacts are orthogonal to alignment-related features, and the combination of them could greatly facilitate fold recognition at superfamily/fold levels and template-based prediction of protein structures. Source code of DeepFR is freely available through https://github.com/zhujianwei31415/deepfr, and a web server is available through http://protein.ict.ac.cn/deepfr. Contact: zheng@itp.ac.cn or dbu@ict.ac.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.

  8. Prediction of Carbohydrate Binding Sites on Protein Surfaces with 3-Dimensional Probability Density Distributions of Interacting Atoms

    PubMed Central

    Tsai, Keng-Chang; Jian, Jhih-Wei; Yang, Ei-Wen; Hsu, Po-Chiang; Peng, Hung-Pin; Chen, Ching-Tai; Chen, Jun-Bo; Chang, Jeng-Yih; Hsu, Wen-Lian; Yang, An-Suei

    2012-01-01

    Non-covalent protein-carbohydrate interactions mediate molecular targeting in many biological processes. Prediction of non-covalent carbohydrate binding sites on protein surfaces not only provides insights into the functions of the query proteins; information on key carbohydrate-binding residues could suggest site-directed mutagenesis experiments, design therapeutics targeting carbohydrate-binding proteins, and provide guidance in engineering protein-carbohydrate interactions. In this work, we show that non-covalent carbohydrate binding sites on protein surfaces can be predicted with relatively high accuracy when the query protein structures are known. The prediction capabilities were based on a novel encoding scheme of the three-dimensional probability density maps describing the distributions of 36 non-covalent interacting atom types around protein surfaces. One machine learning model was trained for each of the 30 protein atom types. The machine learning algorithms predicted tentative carbohydrate binding sites on query proteins by recognizing the characteristic interacting atom distribution patterns specific for carbohydrate binding sites from known protein structures. The prediction results for all protein atom types were integrated into surface patches as tentative carbohydrate binding sites based on normalized prediction confidence level. The prediction capabilities of the predictors were benchmarked by a 10-fold cross validation on 497 non-redundant proteins with known carbohydrate binding sites. The predictors were further tested on an independent test set with 108 proteins. The residue-based Matthews correlation coefficient (MCC) for the independent test was 0.45, with prediction precision and sensitivity (or recall) of 0.45 and 0.49 respectively. In addition, 111 unbound carbohydrate-binding protein structures for which the structures were determined in the absence of the carbohydrate ligands were predicted with the trained predictors. The overall prediction MCC was 0.49. Independent tests on anti-carbohydrate antibodies showed that the carbohydrate antigen binding sites were predicted with comparable accuracy. These results demonstrate that the predictors are among the best in carbohydrate binding site predictions to date. PMID:22848404
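
    The residue-based metrics reported above (MCC, precision, recall) are all derived from a per-residue confusion matrix. The sketch below shows the standard definitions; the counts in the usage lines are hypothetical and do not correspond to the study's data.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def precision(tp, fp):
    """Fraction of predicted binding residues that are true binding residues."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of true binding residues that were predicted (sensitivity)."""
    return tp / (tp + fn)


# Hypothetical residue counts: binding-site residues are the positive class.
tp, fp, fn, tn = 40, 45, 42, 873
print(round(mcc(tp, fp, fn, tn), 2), round(precision(tp, fp), 2), round(recall(tp, fn), 2))
```

    Because binding-site residues are a small minority of surface residues, MCC is a more informative summary than raw accuracy for this kind of task.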

  9. DNCON2: improved protein contact prediction using two-level deep convolutional neural networks.

    PubMed

    Adhikari, Badri; Hou, Jie; Cheng, Jianlin

    2018-05-01

    Significant improvements in the prediction of protein residue-residue contacts have been observed in recent years. These contacts, predicted using a variety of coevolution-based and machine learning methods, are the key contributors to the recent progress in ab initio protein structure prediction, as demonstrated in the recent CASP experiments. Continuing the development of new methods to reliably predict contact maps is essential to further improve ab initio structure prediction. In this paper we discuss DNCON2, an improved protein contact map predictor based on two-level deep convolutional neural networks. It consists of six convolutional neural networks: the first five predict contacts at 6, 7.5, 8, 8.5 and 10 Å distance thresholds, and the last one uses these five predictions as additional features to predict final contact maps. On the free-modeling datasets in CASP10, 11 and 12 experiments, DNCON2 achieves mean precisions of 35, 50 and 53.4%, respectively, higher than 30.6% by MetaPSICOV on CASP10 dataset, 34% by MetaPSICOV on CASP11 dataset and 46.3% by Raptor-X on CASP12 dataset, when top L/5 long-range contacts are evaluated. We attribute the improved performance of DNCON2 to the inclusion of short- and medium-range contacts into training, two-level approach to prediction, use of the state-of-the-art optimization and activation functions, and a novel deep learning architecture that allows each filter in a convolutional layer to access all the input features of a protein of arbitrary length. The web server of DNCON2 is at http://sysbio.rnet.missouri.edu/dncon2/ where training and testing datasets as well as the predictions for CASP10, 11 and 12 free-modeling datasets can also be downloaded. Its source code is available at https://github.com/multicom-toolbox/DNCON2/. Contact: chengji@missouri.edu. Supplementary data are available at Bioinformatics online.
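
    To make the five distance thresholds concrete, the sketch below derives one binary contact map per threshold from a set of 3D coordinates, which is the kind of target a first-level network would be trained to predict. This is an illustration under stated assumptions, not DNCON2 code: the function name, the use of one representative atom per residue, and the toy coordinates are all hypothetical.

```python
import math

# Distance thresholds in angstroms, as listed in the abstract.
THRESHOLDS = (6.0, 7.5, 8.0, 8.5, 10.0)


def contact_maps(coords, thresholds=THRESHOLDS):
    """Return {threshold: L x L boolean map} for one coordinate per residue."""
    n = len(coords)
    dist = [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]
    # A pair is in contact at threshold t when its distance is below t.
    # Diagonal entries (distance 0) are trivially True at every threshold.
    return {
        t: [[dist[i][j] < t for j in range(n)] for i in range(n)]
        for t in thresholds
    }


# Three toy residues on a line, 7 and 9 angstroms from the first one.
coords = [(0.0, 0.0, 0.0), (7.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
maps = contact_maps(coords)
print([t for t in THRESHOLDS if maps[t][0][1]])  # [7.5, 8.0, 8.5, 10.0]
print([t for t in THRESHOLDS if maps[t][0][2]])  # [10.0]
```

    A pair that is a contact at a tight threshold is necessarily a contact at every looser one, so the five maps are nested.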

  10. Systems analysis of apoptosis protein expression allows the case-specific prediction of cell death responsiveness of melanoma cells

    PubMed Central

    Passante, E; Würstle, M L; Hellwig, C T; Leverkus, M; Rehm, M

    2013-01-01

    Many cancer entities and their associated cell line models are highly heterogeneous in their responsiveness to apoptosis inducers and, despite a detailed understanding of the underlying signaling networks, cell death susceptibility currently cannot be predicted reliably from protein expression profiles. Here, we demonstrate that an integration of quantitative apoptosis protein expression data with pathway knowledge can predict the cell death responsiveness of melanoma cell lines. By a total of 612 measurements, we determined the absolute expression (nM) of 17 core apoptosis regulators in a panel of 11 melanoma cell lines, and enriched these data with systems-level information on apoptosis pathway topology. By applying multivariate statistical analysis and multi-dimensional pattern recognition algorithms, the responsiveness of individual cell lines to tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) or dacarbazine (DTIC) could be predicted with very high accuracy (91 and 82% correct predictions), and the most effective treatment option for individual cell lines could be pre-determined in silico. In contrast, cell death responsiveness was poorly predicted when not taking knowledge on protein–protein interactions into account (55 and 36% correct predictions). We also generated mathematical predictions on whether anti-apoptotic Bcl-2 family members or x-linked inhibitor of apoptosis protein (XIAP) can be targeted to enhance TRAIL responsiveness in individual cell lines. Subsequent experiments, making use of pharmacological Bcl-2/Bcl-xL inhibition or siRNA-based XIAP depletion, confirmed the accuracy of these predictions. We therefore demonstrate that cell death responsiveness to TRAIL or DTIC can be predicted reliably in a large number of melanoma cell lines when investigating expression patterns of apoptosis regulators in the context of their network-level interplay. The capacity to predict responsiveness at the cellular level may contribute to personalizing anti-cancer treatments in the future. PMID:23933815

  11. Lessons from (co-)evolution in the docking of proteins and peptides for CAPRI Rounds 28-35.

    PubMed

    Yu, Jinchao; Andreani, Jessica; Ochsenbein, Françoise; Guerois, Raphaël

    2017-03-01

    Computational protein-protein docking is of great importance for understanding protein interactions at the structural level. Critical assessment of prediction of interactions (CAPRI) experiments provide the protein docking community with a unique opportunity to blindly test methods based on real-life cases and help accelerate methodology development. For CAPRI Rounds 28-35, we used an automatic docking pipeline integrating the coarse-grained co-evolution-based potential InterEvScore. This score was developed to exploit the information contained in the multiple sequence alignments of binding partners and selectively recognize co-evolved interfaces. Together with Zdock/Frodock for rigid-body docking, SOAP-PP for atomic potential and Rosetta applications for structural refinement, this pipeline reached high performance on a majority of targets. For protein-peptide docking and interfacial water position predictions, we also explored different means of taking evolutionary information into account. Overall, our group ranked 1st by correctly predicting 10 targets, composed of 1 High, 7 Medium and 2 Acceptable predictions. Excellent and Outstanding levels of accuracy were reached for each of the two water prediction targets, respectively. Altogether, in 15 out of 18 targets in total, evolutionary information, either through co-evolution or conservation analyses, could provide key constraints to guide modeling towards the most likely assemblies. These results open promising perspectives regarding the way evolutionary information can be valuable to improve docking prediction accuracy. Proteins 2017; 85:378-390. © 2016 Wiley Periodicals, Inc.

  12. A Practical Predictive Index for Intra-abdominal Septic Complications After Primary Anastomosis for Crohn's Disease: Change in C-Reactive Protein Level Before Surgery.

    PubMed

    Zuo, Lugen; Li, Yi; Wang, Honggang; Zhu, Weiming; Zhang, Wei; Gong, Jianfeng; Li, Ning; Li, Jieshou

    2015-08-01

    Postoperative intra-abdominal septic complications are difficult to manage in Crohn's disease, which makes prevention especially important. The purpose of this study was to examine the risk factors for intra-abdominal septic complications after primary anastomosis for Crohn's disease and to seek a practical predictive index for intra-abdominal septic complications. This was a retrospective study. The study was conducted in a tertiary referral hospital. Based on a computerized database of 344 patients with Crohn's disease who underwent primary anastomosis between 2004 and 2013, the patients were placed into an intra-abdominal septic complications group and a group without intra-abdominal septic complications. Univariate and multivariate analyses were performed to identify risk factors, and the predictive accuracy of possible predictors was assessed using receiver operating characteristic curves. Overall, 39 patients (11.34%) developed intra-abdominal septic complications. Preoperative C-reactive protein level >10 mg/L was found to be an independent risk factor (p < 0.01) for intra-abdominal septic complications. For prediction of intra-abdominal septic complications, receiver operating characteristic curve analysis showed that a C-reactive protein cutoff of 14.50 mg/L provided negative and positive predictive values of 96.84% and 34.07%. In addition, the change in C-reactive protein levels over the 2 weeks before surgery was greater in the intra-abdominal septic complications group than the group with no intra-abdominal septic complications (p < 0.01), and the directions of change were opposite, upward in the former and downward in the latter. Apart from being a risk factor for intra-abdominal septic complications (p < 0.01), receiver operating characteristic curve analysis showed that the change in C-reactive protein levels before surgery had a negative predictive value for intra-abdominal septic complications of 98.66% and a positive predictive value of 76.09%. This was a retrospective study. Changes in C-reactive protein before surgical treatment of Crohn's disease could serve as a practical predictive index for postoperative intra-abdominal septic complications.
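
    The predictive values quoted for the 14.50 mg/L cutoff can be checked with a small worked example. The confusion matrix below is an inference, not something the abstract reports: it is one set of counts that is consistent with 344 patients, 39 complications, and the stated negative and positive predictive values.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from confusion-matrix counts."""
    ppv = tp / (tp + fp)  # probability of a complication given a positive test
    npv = tn / (tn + fn)  # probability of no complication given a negative test
    return ppv, npv


# Assumed counts: 31 + 8 = 39 complications, 31 + 60 + 8 + 245 = 344 patients.
tp, fp, fn, tn = 31, 60, 8, 245
ppv, npv = predictive_values(tp, fp, fn, tn)
print(f"PPV = {ppv:.2%}, NPV = {npv:.2%}")  # PPV = 34.07%, NPV = 96.84%
print(f"prevalence = {(tp + fn) / (tp + fp + fn + tn):.2%}")  # prevalence = 11.34%
```

    With a complication rate of about 11%, a high negative predictive value is expected even for a modest marker, which is why the much higher positive predictive value of the preoperative change in C-reactive protein is the more notable result.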

  13. Serum levels of S100B from jugular bulb as a biomarker of poor prognosis in patients with severe acute brain injury.

    PubMed

    Ballesteros, María A; Rubio-Lopez, María I; San Martín, María; Padilla, Ana; López-Hoyos, Marcos; Llorca, Javier; Miñambres, Eduardo

    2018-02-15

    To evaluate the correlation between protein S100B concentrations measured in the jugular bulb as well as at peripheral level and the prognostic usefulness of this marker. A prospective study of all patients admitted to the intensive care unit with acute brain damage was carried out. Peripheral and jugular bulb blood samples were collected upon admission and every 24 h for three days. The endpoints were brain death diagnosis and the Glasgow Outcome Scale score after 6 months. A total of 83 patients were included. Jugular protein S100B levels were greater than systemic levels upon admission and also after 24 and 72 h (mean difference > 0). Jugular protein S100B levels showed acceptable precision in predicting brain death both upon admission [AUC 0.67 (95% CI 0.53-0.80)] and after 48 h [AUC 0.73 (95% CI 0.57-0.89)]. Similar results were obtained regarding the capacity of jugular protein S100B levels upon admission to predict an unfavourable outcome [AUC 0.69 (95% CI 0.56-0.79)]. The gradient upon admission (jugular-peripheral levels) showed its capacity to predict the development of brain death [AUC 0.74 (95% CI 0.62-0.86)] and together with the Glasgow Coma Scale constituted the independent factors associated with the development of brain death. Regional protein S100B determinations are higher than systemic determinations, thus confirming the cerebral origin of protein S100B. The transcranial protein S100B gradient is correlated to the development of brain death. Copyright © 2017. Published by Elsevier B.V.
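
    The AUC values above can be read as the probability that a randomly chosen patient with the outcome has a higher marker level than a randomly chosen patient without it. A minimal sketch of that pairwise (Mann-Whitney) formulation follows; the marker values are hypothetical and are not taken from the study.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win.
    """
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical marker levels for patients with and without the outcome.
with_outcome = [0.9, 1.4, 2.1]
without_outcome = [0.3, 0.8, 1.0, 1.6]
print(auc(with_outcome, without_outcome))  # 0.75
```

    An AUC of 0.5 corresponds to a marker with no discriminative value, so reported values of 0.67 to 0.74 indicate moderate discrimination.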

  14. Co-evolutionary Analysis of Domains in Interacting Proteins Reveals Insights into Domain–Domain Interactions Mediating Protein–Protein Interactions

    PubMed Central

    Jothi, Raja; Cherukuri, Praveen F.; Tasneem, Asba; Przytycka, Teresa M.

    2006-01-01

    Recent advances in functional genomics have helped generate large-scale high-throughput protein interaction data. Such networks, though extremely valuable towards molecular level understanding of cells, do not provide any direct information about the regions (domains) in the proteins that mediate the interaction. Here, we performed co-evolutionary analysis of domains in interacting proteins in order to understand the degree of co-evolution of interacting and non-interacting domains. Using a combination of sequence and structural analysis, we analyzed protein–protein interactions in F1-ATPase, Sec23p/Sec24p, DNA-directed RNA polymerase and nuclear pore complexes, and found that interacting domain pair(s) for a given interaction exhibits higher level of co-evolution than the noninteracting domain pairs. Motivated by this finding, we developed a computational method to test the generality of the observed trend, and to predict large-scale domain–domain interactions. Given a protein–protein interaction, the proposed method predicts the domain pair(s) that is most likely to mediate the protein interaction. We applied this method on the yeast interactome to predict domain–domain interactions, and used known domain–domain interactions found in PDB crystal structures to validate our predictions. Our results show that the prediction accuracy of the proposed method is statistically significant. Comparison of our prediction results with those from two other methods reveals that only a fraction of predictions are shared by all the three methods, indicating that the proposed method can detect known interactions missed by other methods. We believe that the proposed method can be used with other methods to help identify previously unrecognized domain–domain interactions on a genome scale, and could potentially help reduce the search space for identifying interaction sites. PMID:16949097

  15. Reduction of lymphocyte G protein-coupled receptor kinase-2 (GRK2) after exercise training predicts survival in patients with heart failure.

    PubMed

    Rengo, Giuseppe; Galasso, Gennaro; Femminella, Grazia D; Parisi, Valentina; Zincarelli, Carmela; Pagano, Gennaro; De Lucia, Claudio; Cannavo, Alessandro; Liccardo, Daniela; Marciano, Caterina; Vigorito, Carlo; Giallauria, Francesco; Ferrara, Nicola; Furgi, Giuseppe; Filardi, Pasquale Perrone; Koch, Walter J; Leosco, Dario

    2014-01-01

    Increased cardiac G protein-coupled receptor kinase-2 (GRK2) expression plays a pivotal role in inducing heart failure (HF)-related β-adrenergic receptor (βAR) dysfunction. Importantly, abnormalities of βAR signalling in the failing heart, including GRK2 overexpression, are mirrored in circulating lymphocytes and correlate with HF severity. Exercise training has been shown to exert several beneficial effects on the failing heart, including normalization of cardiac βAR function and GRK2 protein levels. In the present study, we evaluated whether lymphocyte GRK2 levels and short-term changes of this kinase after an exercise training programme can predict long-term survival in HF patients. For this purpose, we prospectively studied 193 HF patients who underwent a 3-month exercise training programme. Lymphocyte GRK2 protein levels, plasma N-terminal pro-brain natriuretic peptide, and norepinephrine were measured at baseline and after training along with clinical and functional parameters (left ventricular ejection fraction, NYHA class, and peak-VO2). Cardiac-related mortality was evaluated during a mean follow-up period of 37 ± 20 months. Exercise was associated with a significant reduction of lymphocyte GRK2 protein levels (from 1.29 ± 0.52 to 1.16 ± 0.65 densitometric units, p < 0.0001). Importantly, exercise-related changes of GRK2 (delta values) robustly predicted survival in our study population. Interestingly, HF patients who did not show reduced lymphocyte GRK2 protein levels after training presented the poorest outcome. Our data offer the first demonstration that changes of lymphocyte GRK2 after exercise training can strongly predict outcome in advanced HF.

  16. The DynaMine webserver: predicting protein dynamics from sequence.

    PubMed

    Cilia, Elisa; Pancsa, Rita; Tompa, Peter; Lenaerts, Tom; Vranken, Wim F

    2014-07-01

    Protein dynamics are important for understanding protein function. Unfortunately, accurate protein dynamics information is difficult to obtain: here we present the DynaMine webserver, which provides predictions for the fast backbone movements of proteins directly from their amino-acid sequence. DynaMine rapidly produces a profile describing the statistical potential for such movements at residue-level resolution. The predicted values have meaning on an absolute scale and go beyond the traditional binary classification of residues as ordered or disordered, thus allowing for direct dynamics comparisons between protein regions. Through this webserver, we provide molecular biologists with an efficient and easy to use tool for predicting the dynamical characteristics of any protein of interest, even in the absence of experimental observations. The prediction results are visualized and can be directly downloaded. The DynaMine webserver, including instructive examples describing the meaning of the profiles, is available at http://dynamine.ibsquare.be. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Identification and Validation of Human Missing Proteins and Peptides in Public Proteome Databases: Data Mining Strategy.

    PubMed

    Elguoshy, Amr; Hirao, Yoshitoshi; Xu, Bo; Saito, Suguru; Quadery, Ali F; Yamamoto, Keiko; Mitsui, Toshiaki; Yamamoto, Tadashi

    2017-12-01

    In an attempt to complete the Human Proteome Project (HPP), the Chromosome-Centric Human Proteome Project (C-HPP) launched the investigation of missing proteins (MPs) in 2012. However, 2579 and 572 protein entries in neXtProt (2017-1) are still considered missing and uncertain proteins, respectively. Thus, in this study, we proposed a pipeline to analyze, identify, and validate human missing and uncertain proteins in open-access transcriptomics and proteomics databases. Analysis of RNA expression patterns for missing proteins in the Human Protein Atlas showed that 28% of them, such as Olfactory receptor 1I1 (O60431), had no RNA expression, suggesting the necessity to consider uncommon tissues for transcriptomic and proteomic studies. Interestingly, 21% had an elevated expression level in a particular tissue (tissue-enriched proteins), indicating the importance of targeting such proteins in the tissues where they are elevated. Additionally, the analysis of RNA expression levels for missing proteins showed that 95% had no or low expression (0-10 transcripts per million), indicating that low abundance is one of the major obstacles facing the detection of missing proteins. Moreover, missing proteins are predicted to generate fewer unique tryptic peptides than the identified proteins. Searching for these predicted unique tryptic peptides that correspond to missing and uncertain proteins in the experimental peptide list of open-access MS-based databases (PA, GPM) resulted in the detection of 402 missing and 19 uncertain proteins with at least two unique peptides (≥9 aa) at <(5 × 10⁻⁴)% FDR. Finally, matching the native spectra for the experimentally detected peptides with their SRMAtlas synthetic counterparts at three transition sources (QQQ, QTOF, QTRAP) gave us an opportunity to validate 41 missing proteins by ≥2 proteotypic peptides.
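
    The step of predicting unique tryptic peptides can be sketched as an in-silico digest. This is a minimal illustration, not the authors' pipeline: it assumes the common trypsin rule (cleave after K or R unless the next residue is P), allows no missed cleavages, and uses the ≥9 aa length filter quoted in the abstract. The function names are hypothetical.

```python
def tryptic_peptides(sequence, min_length=9):
    """Digest a protein in silico: cleave after K/R unless followed by P.

    Assumption: no missed cleavages are considered.
    """
    seq = sequence.upper()
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        # Trailing C-terminal peptide that does not end in K or R.
        peptides.append(seq[start:])
    return [p for p in peptides if len(p) >= min_length]


def unique_peptides(target, background, min_length=9):
    """Peptides of `target` that no protein in `background` also yields."""
    seen = set()
    for protein in background:
        seen.update(tryptic_peptides(protein, min_length))
    return [p for p in tryptic_peptides(target, min_length) if p not in seen]
```

    Under these assumptions, a protein that yields few peptides of at least 9 residues, or whose peptides are shared with other proteins, offers fewer chances of detection by mass spectrometry.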

  18. Predicting protein folding rate change upon point mutation using residue-level coevolutionary information.

    PubMed

    Mallik, Saurav; Das, Smita; Kundu, Sudip

    2016-01-01

    Change in folding kinetics of globular proteins upon point mutation is crucial to a wide spectrum of biological research, such as protein misfolding, toxicity, and aggregations. Here we seek to address whether residue-level coevolutionary information of globular proteins can be informative to folding rate changes upon point mutations. Generating residue-level coevolutionary networks of globular proteins, we analyze three parameters: relative coevolution order (rCEO), network density (ND), and characteristic path length (CPL). A point mutation is considered to be equivalent to a node deletion of this network and respective percentage changes in rCEO, ND, CPL are found linearly correlated (0.84, 0.73, and -0.61, respectively) with experimental folding rate changes. The three parameters predict the folding rate change upon a point mutation with 0.031, 0.045, and 0.059 standard errors, respectively. © 2015 Wiley Periodicals, Inc.
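
    Two of the three network parameters named above, network density (ND) and characteristic path length (CPL), have standard graph definitions, so the node-deletion idea can be sketched as follows. This is an illustration under assumptions: the toy graph is invented, rCEO is omitted because the abstract does not define it, and the helper names are hypothetical.

```python
from collections import deque


def density(adj):
    """Network density: observed edges / possible edges (undirected graph)."""
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * edges / (n * (n - 1)) if n > 1 else 0.0


def characteristic_path_length(adj):
    """Mean shortest-path length over all reachable node pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # unreachable nodes are simply not counted
    return total / pairs if pairs else 0.0


def delete_node(adj, node):
    """Model a point mutation as removal of one residue node and its edges."""
    return {u: {v for v in nbrs if v != node}
            for u, nbrs in adj.items() if u != node}


def percent_change(before, after):
    return 100.0 * (after - before) / before


# Toy coevolution network (hypothetical): a triangle A-B-C with D attached to C.
network = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
mutant = delete_node(network, "D")
nd_change = percent_change(density(network), density(mutant))
cpl_change = percent_change(characteristic_path_length(network),
                            characteristic_path_length(mutant))
```

    In the study, percentage changes of this kind are what correlate linearly with the experimental folding rate changes; the regression itself is not shown here.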

  19. Protein Structure and Function Prediction Using I-TASSER

    PubMed Central

    Yang, Jianyi; Zhang, Yang

    2016-01-01

    I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets. PMID:26678386

  20. Blind test of physics-based prediction of protein structures.

    PubMed

    Shell, M Scott; Ozkan, S Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A

    2009-02-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences.
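
    For readers unfamiliar with replica exchange, the core of the method is a Metropolis test for swapping configurations between two replicas. The sketch below shows the standard temperature-exchange criterion; that the study used exactly this form, and the energies and temperatures in the usage note, are assumptions for illustration only.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis acceptance probability for exchanging two replicas.

    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1/(k_B * T).
    Energies are in kcal/mol and temperatures in kelvin.
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)
```

    For example, `swap_probability(-100.0, 300.0, -105.0, 310.0)` returns 1.0 because the colder replica currently holds the higher-energy configuration, so the exchange is always accepted; the reverse assignment is accepted only some of the time.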

  1. Blind Test of Physics-Based Prediction of Protein Structures

    PubMed Central

    Shell, M. Scott; Ozkan, S. Banu; Voelz, Vincent; Wu, Guohong Albert; Dill, Ken A.

    2009-01-01

    We report here a multiprotein blind test of a computer method to predict native protein structures based solely on an all-atom physics-based force field. We use the AMBER 96 potential function with an implicit (GB/SA) model of solvation, combined with replica-exchange molecular-dynamics simulations. Coarse conformational sampling is performed using the zipping and assembly method (ZAM), an approach that is designed to mimic the putative physical routes of protein folding. ZAM was applied to the folding of six proteins, from 76 to 112 monomers in length, in CASP7, a community-wide blind test of protein structure prediction. Because these predictions have about the same level of accuracy as typical bioinformatics methods, and do not utilize information from databases of known native structures, this work opens up the possibility of predicting the structures of membrane proteins, synthetic peptides, or other foldable polymers, for which there is little prior knowledge of native structures. This approach may also be useful for predicting physical protein folding routes, non-native conformations, and other physical properties from amino acid sequences. PMID:19186130

  2. Exploiting Amino Acid Composition for Predicting Protein-Protein Interactions

    PubMed Central

    Roy, Sushmita; Martinez, Diego; Platero, Harriett; Lane, Terran; Werner-Washburne, Margaret

    2009-01-01

    Background: Computational prediction of protein interactions typically uses protein domains as classifier features because they capture conserved information of interaction surfaces. However, approaches relying on domains as features cannot be applied to proteins without any domain information. In this paper, we explore the contribution of pure amino acid composition (AAC) for protein interaction prediction. This simple feature, which is based on normalized counts of single or pairs of amino acids, is applicable to proteins from any sequenced organism and can be used to compensate for the lack of domain information. Results: AAC performed on par with protein interaction prediction based on domains on three yeast protein interaction datasets. Similar behavior was obtained using different classifiers, indicating that our results are a function of features and not of classifiers. In addition to yeast datasets, AAC performed comparably on worm and fly datasets. Prediction of interactions for the entire yeast proteome identified a large number of novel interactions, the majority of which co-localized or participated in the same processes. Our high confidence interaction network included both well-studied and uncharacterized proteins. Proteins with known function were involved in actin assembly and cell budding. Uncharacterized proteins interacted with proteins involved in reproduction and cell budding, thus providing putative biological roles for the uncharacterized proteins. Conclusion: AAC is a simple, yet powerful feature for predicting protein interactions, and can be used alone or in conjunction with protein domains to predict new and validate existing interactions. More importantly, AAC alone performs on par with existing, but more complex, features indicating the presence of sequence-level information that is predictive of interaction, but which is not necessarily restricted to domains. PMID:19936254
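
    The AAC feature described above can be sketched in a few lines. This is an illustration, not the authors' code: it assumes that "pairs of amino acids" means adjacent residues (dipeptides) and that a protein pair is represented by concatenating the two single-residue vectors. Both choices, and all function names, are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_composition(sequence):
    """20-dimensional vector of normalized single amino acid counts."""
    seq = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    total = len(seq)
    return [seq.count(aa) / total if total else 0.0 for aa in AMINO_ACIDS]


def pair_composition(sequence):
    """400-dimensional vector of normalized adjacent-pair (dipeptide) counts."""
    seq = "".join(aa for aa in sequence.upper() if aa in AMINO_ACIDS)
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(len(seq) - 1):
        counts[seq[i:i + 2]] += 1
    total = max(len(seq) - 1, 0)
    return [counts[key] / total if total else 0.0 for key in sorted(counts)]


def interaction_features(seq_a, seq_b):
    """Feature vector for a candidate protein pair (assumed: concatenation)."""
    return single_composition(seq_a) + single_composition(seq_b)
```

    Because the vectors depend only on the sequence, they can be computed for any protein, including those with no annotated domains, which is the point the abstract makes.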

  3. Protein domain analysis of genomic sequence data reveals regulation of LRR related domains in plant transpiration in Ficus.

    PubMed

    Lang, Tiange; Yin, Kangquan; Liu, Jinyu; Cao, Kunfang; Cannon, Charles H; Du, Fang K

    2014-01-01

    Predicting protein domains is essential for understanding a protein's function at the molecular level. However, up till now, there has been no direct and straightforward method for predicting protein domains in species without a reference genome sequence. In this study, we developed a functionality with a set of programs that can predict protein domains directly from genomic sequence data without a reference genome. Using whole genome sequence data, the programming functionality mainly comprised DNA assembly in combination with next-generation sequencing (NGS) assembly methods and traditional methods, peptide prediction and protein domain prediction. The proposed new functionality avoids problems associated with de novo assembly due to micro reads and small single repeats. Furthermore, we applied our functionality for the prediction of leucine rich repeat (LRR) domains in four species of Ficus with no reference genome, based on NGS genomic data. We found that the LRRNT_2 and LRR_8 domains are related to plant transpiration efficiency, as indicated by the stomata index, in the four species of Ficus. The programming functionality established in this study provides new insights for protein domain prediction, which is particularly timely in the current age of NGS data expansion.

  4. COPRED: prediction of fold, GO molecular function and functional residues at the domain level.

    PubMed

    López, Daniel; Pazos, Florencio

    2013-07-15

    Only recently have the first resources devoted to the functional annotation of proteins at the domain level started to appear. The next step is to develop specific methodologies for predicting function at the domain level based on these resources, and to implement them in web servers to be used by the community. In this work, we present COPRED, a web server for the concomitant prediction of fold, molecular function and functional sites at the domain level, based on a methodology for domain molecular function prediction and a resource of domain functional annotations previously developed and benchmarked. COPRED can be freely accessed at http://csbg.cnb.csic.es/copred. The interface works in all standard web browsers. WebGL (natively supported by most browsers) is required for the in-line preview and manipulation of protein 3D structures. The website includes a detailed help section and usage examples. Contact: pazos@cnb.csic.es.

  5. Level of C - reactive protein as an indicator for prognosis of premature uterine contractions.

    PubMed

    Najat Nakishbandy, Bayar M; Barawi, Sabat A M

    2014-01-01

    High concentrations of maternal C-reactive protein have been associated with adverse pregnancy outcomes, and premature uterine contractions may be predicted by elevated levels of C-reactive protein. This may ultimately be simple and cost-effective enough to introduce as a low-risk screening program. An observational case-control study was performed from May 1st, 2010 to December 1st, 2010 at the Maternity Teaching Hospital, Erbil, Kurdistan Region, Iraq. The sample comprised 200 women. One hundred of them presented with premature uterine contractions at 24+0 to 36+6 weeks of gestation; the other hundred formed a control group at the same gestational ages. The level of C-reactive protein was determined in both groups, and both groups were followed until delivery. Of the 100 women with premature uterine contractions, 93 had an elevated level of C-reactive protein and 91% delivered prematurely, while in the control group only 9 of 100 women had an elevated level of C-reactive protein and only 8% of them delivered preterm. The differences were statistically highly significant. C-reactive protein can be used as a biomarker for predicting premature delivery when it is associated with premature uterine contractions. It can also be used as a screening test to detect cases that are at risk of premature delivery.
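
    The reported counts can be arranged as a 2×2 table to make the size of the difference concrete. The sketch below is arithmetic on the figures quoted in the abstract only; treating elevated C-reactive protein as a "test" for membership in the contraction group is an assumption made for illustration, and the function name is hypothetical.

```python
def two_by_two(exposed_cases, total_cases, exposed_controls, total_controls):
    """Summary measures for a case-control 2x2 table.

    'Exposed' here means elevated C-reactive protein.
    """
    a = exposed_cases                       # cases with elevated CRP
    c = total_cases - exposed_cases         # cases without elevated CRP
    b = exposed_controls                    # controls with elevated CRP
    d = total_controls - exposed_controls   # controls without elevated CRP
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "odds_ratio": (a * d) / (b * c),
    }


# Counts from the abstract: 93/100 cases and 9/100 controls had elevated CRP.
stats = two_by_two(93, 100, 9, 100)
```

    With these counts the sensitivity is 0.93, the specificity is 0.91, and the odds ratio is about 134; the abstract itself reports only that the differences were highly significant.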

  6. Prognostic Value of Lymphocyte G Protein-Coupled Receptor Kinase-2 Protein Levels in Patients With Heart Failure

    PubMed Central

    Rengo, Giuseppe; Pagano, Gennaro; Filardi, Pasquale Perrone; Femminella, Grazia Daniela; Parisi, Valentina; Cannavo, Alessandro; Liccardo, Daniela; Komici, Klara; Gambino, Giuseppina; D’Amico, Maria Loreta; de Lucia, Claudio; Paolillo, Stefania; Trimarco, Bruno; Vitale, Dino Franco; Ferrara, Nicola; Koch, Walter J; Leosco, Dario

    2016-01-01

    Rationale: Sympathetic nervous system (SNS) hyperactivity is associated with poor prognosis in patients with HF, yet routine assessment of SNS activation is not recommended for clinical practice. Myocardial G protein-coupled receptor kinase 2 (GRK2) is up-regulated in heart failure (HF) patients, causing dysfunctional β-adrenergic receptor signaling. Importantly, myocardial GRK2 levels correlate with levels found in peripheral lymphocytes of HF patients. Objective: The independent prognostic value of blood GRK2 measurements in HF patients has never been investigated, thus, the purpose of the present study was to evaluate whether lymphocyte GRK2 levels predict clinical outcome in HF patients. Methods and Results: We prospectively studied 257 HF patients with mean left ventricular ejection fraction (LVEF) of 31.4±8.5%. At the time of enrollment, plasma norepinephrine, serum NT-proBNP and lymphocyte GRK2 levels, as well as clinical and instrumental variables were measured. The prognostic value of GRK2 to predict cardiovascular (CV) death and all-cause mortality was assessed using the Cox proportional hazard model including demographic, clinical, instrumental and laboratory data. Over a mean follow-up period of 37.5±20.2 months (range: 3–60 months) there were 102 CV deaths. Age, LVEF, NYHA class, Chronic Obstructive Pulmonary Disease, Chronic Kidney Disease, N-terminal-pro Brain Natriuretic Peptide, and lymphocyte GRK2 protein levels were independent predictors of CV mortality in HF patients. GRK2 levels showed an additional prognostic and clinical value over demographic and clinical variables. The independent prognostic value of lymphocyte GRK2 levels was also confirmed for all-cause mortality. Conclusion: Lymphocyte GRK2 protein levels can independently predict prognosis in patients with HF. PMID:26884616

  7. Transcriptomic analysis of Arabidopsis developing stems: a close-up on cell wall genes

    PubMed Central

    Minic, Zoran; Jamet, Elisabeth; San-Clemente, Hélène; Pelletier, Sandra; Renou, Jean-Pierre; Rihouey, Christophe; Okinyo, Denis PO; Proux, Caroline; Lerouge, Patrice; Jouanin, Lise

    2009-01-01

    Background: Different strategies (genetics, biochemistry, and proteomics) can be used to study proteins involved in cell biogenesis. The availability of the complete sequences of several plant genomes allowed the development of transcriptomic studies. Although the expression patterns of some Arabidopsis thaliana genes involved in cell wall biogenesis were identified at different physiological stages, detailed microarray analysis of plant cell wall genes has not been performed on any plant tissues. Using transcriptomic and bioinformatic tools, we studied the regulation of cell wall genes in Arabidopsis stems, i.e. genes encoding proteins involved in cell wall biogenesis and genes encoding secreted proteins. Results: Transcriptomic analyses of stems were performed at three different developmental stages, i.e., young stems, intermediate stage, and mature stems. Many genes involved in the synthesis of cell wall components such as polysaccharides and monolignols were identified. A total of 345 genes encoding predicted secreted proteins with moderate or high levels of transcripts were analyzed in detail. The encoded proteins were distributed into 8 classes, based on the presence of predicted functional domains. Proteins acting on carbohydrates and proteins of unknown function constituted the two most abundant classes. Other proteins were proteases, oxido-reductases, proteins with interacting domains, proteins involved in signalling, and structural proteins. Particularly high levels of expression were established for genes encoding pectin methylesterases, germin-like proteins, arabinogalactan proteins, fasciclin-like arabinogalactan proteins, and structural proteins. Finally, the results of these transcriptomic analyses were compared with those obtained through a cell wall proteomic analysis from the same material. Only a small proportion of genes identified by previous proteomic analyses were identified by transcriptomics. Conversely, only a few proteins encoded by genes having moderate or high levels of transcripts were identified by proteomics. Conclusion: Analysis of the genes predicted to encode cell wall proteins revealed that about 345 genes had moderate or high levels of transcripts. Among them, we identified many new genes possibly involved in cell wall biogenesis. The discrepancies observed between results of this transcriptomic study and a previous proteomic study on the same material revealed post-transcriptional mechanisms of regulation of expression of genes encoding cell wall proteins. PMID:19149885

  8. Efficient search, mapping, and optimization of multi-protein genetic systems in diverse bacteria

    PubMed Central

    Farasat, Iman; Kushwaha, Manish; Collens, Jason; Easterbrook, Michael; Guido, Matthew; Salis, Howard M

    2014-01-01

    Developing predictive models of multi-protein genetic systems to understand and optimize their behavior remains a combinatorial challenge, particularly when measurement throughput is limited. We developed a computational approach to build predictive models and identify optimal sequences and expression levels, while circumventing combinatorial explosion. Maximally informative genetic system variants were first designed by the RBS Library Calculator, an algorithm to design sequences for efficiently searching a multi-protein expression space across a > 10,000-fold range with tailored search parameters and well-predicted translation rates. We validated the algorithm's predictions by characterizing 646 genetic system variants, encoded in plasmids and genomes, expressed in six gram-positive and gram-negative bacterial hosts. We then combined the search algorithm with system-level kinetic modeling, requiring the construction and characterization of 73 variants to build a sequence-expression-activity map (SEAMAP) for a biosynthesis pathway. Using model predictions, we designed and characterized 47 additional pathway variants to navigate its activity space, find optimal expression regions with desired activity response curves, and relieve rate-limiting steps in metabolism. Creating sequence-expression-activity maps accelerates the optimization of many protein systems and allows previous measurements to quantitatively inform future designs. PMID:24952589

  9. Physics-based protein-structure prediction using a hierarchical protocol based on the UNRES force field: assessment in two blind tests.

    PubMed

    Ołdziej, S; Czaplewski, C; Liwo, A; Chinchio, M; Nanias, M; Vila, J A; Khalili, M; Arnautova, Y A; Jagielska, A; Makowski, M; Schafroth, H D; Kaźmierkiewicz, R; Ripoll, D R; Pillardy, J; Saunders, J A; Kang, Y K; Gibson, K D; Scheraga, H A

    2005-05-24

    Recent improvements in the protein-structure prediction method developed in our laboratory, based on the thermodynamic hypothesis, are described. The conformational space is searched extensively at the united-residue level by using our physics-based UNRES energy function and the conformational space annealing method of global optimization. The lowest-energy coarse-grained structures are then converted to an all-atom representation and energy-minimized with the ECEPP/3 force field. The procedure was assessed in two recent blind tests of protein-structure prediction. During the first blind test, we predicted large fragments of α and α+β proteins [60-70 residues with Cα rms deviation (rmsd) <6 Å]. However, for α+β proteins, significant topological errors occurred despite low rmsd values. In the second exercise, we predicted whole structures of five proteins (two α and three α+β, with sizes of 53-235 residues) with remarkably good accuracy. In particular, for the genomic target TM0487 (a 102-residue α+β protein from Thermotoga maritima), we predicted the complete, topologically correct structure with 7.3-Å Cα rmsd. So far this protein is the largest α+β protein predicted based solely on the amino acid sequence and a physics-based potential-energy function and search procedure. For target T0198, a phosphate transport system regulator PhoU from T. maritima (a 235-residue mainly α-helical protein), we predicted the topology of the whole six-helix bundle correctly within 8 Å rmsd, except the 32 C-terminal residues, most of which form a β-hairpin. These and other examples described in this work demonstrate significant progress in physics-based protein-structure prediction.

  10. Protein Sorting Prediction.

    PubMed

    Nielsen, Henrik

    2017-01-01

    Many computational methods are available for predicting protein sorting in bacteria. When comparing them, it is important to know that they can be grouped into three fundamentally different approaches: signal-based, global-property-based and homology-based prediction. In this chapter, the strengths and drawbacks of each of these approaches are described through many examples of methods that predict secretion, integration into membranes, or subcellular locations in general. The aim of this chapter is to provide a user-level introduction to the field with a minimum of computational theory.

  11. Energy Fluctuations Shape Free Energy of Nonspecific Biomolecular Interactions

    NASA Astrophysics Data System (ADS)

    Elkin, Michael; Andre, Ingemar; Lukatsky, David B.

    2012-01-01

    Understanding design principles of biomolecular recognition is a key question of molecular biology. Yet the enormous complexity and diversity of biological molecules hamper the efforts to gain a predictive ability for the free energy of protein-protein, protein-DNA, and protein-RNA binding. Here, using a variant of the Derrida model, we predict that for a large class of biomolecular interactions, it is possible to accurately estimate the relative free energy of binding based on the fluctuation properties of their energy spectra, even if a finite number of the energy levels is known. We show that the free energy of the system possessing a wider binding energy spectrum is almost surely lower compared with the system possessing a narrower energy spectrum. Our predictions imply that low-affinity binding scores, usually wasted in protein-protein and protein-DNA docking algorithms, can be efficiently utilized to compute the free energy. Using the results of Rosetta docking simulations of protein-protein interactions from Andre et al. (Proc. Natl. Acad. Sci. USA 105:16148, 2008), we demonstrate the power of our predictions.
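
    The claim that a wider binding energy spectrum gives a lower free energy can be checked numerically with the textbook definition F = -kT ln Σ exp(-E_i/kT). The sketch below is not the Derrida-model analysis of the paper: the evenly spaced, zero-mean spectrum and the reduced units (kT = 1) are assumptions chosen so that only the width changes between the compared systems.

```python
import math


def free_energy(levels, kT=1.0):
    """F = -kT * ln(sum_i exp(-E_i / kT)), computed stably via log-sum-exp."""
    lowest = min(levels)
    z = sum(math.exp(-(e - lowest) / kT) for e in levels)
    return lowest - kT * math.log(z)


def spectrum(width, n=101):
    """Hypothetical spectrum: n evenly spaced levels in [-width, +width]."""
    return [width * (2 * i / (n - 1) - 1) for i in range(n)]


narrow = free_energy(spectrum(1.0))
wide = free_energy(spectrum(2.0))
```

    Both spectra have the same number of levels and the same mean energy, yet `wide` comes out lower than `narrow`, because the low-energy tail dominates the Boltzmann sum.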

  12. Predicted secondary structure similarity in the absence of primary amino acid sequence homology: hepatitis B virus open reading frames.

    PubMed Central

    Schaeffer, E; Sninsky, J J

    1984-01-01

    Proteins that are related evolutionarily may have diverged at the level of primary amino acid sequence while maintaining similar secondary structures. Computer analysis has been used to compare the open reading frames of the hepatitis B virus to those of the woodchuck hepatitis virus at the level of amino acid sequence, and to predict the relative hydrophilic character and the secondary structure of putative polypeptides. Similarity is seen at the levels of relative hydrophilicity and secondary structure, in the absence of sequence homology. These data reinforce the proposal that these open reading frames encode viral proteins. Computer analysis of this type can be more generally used to establish structural similarities between proteins that do not share obvious sequence homology as well as to assess whether an open reading frame is fortuitous or codes for a protein. PMID:6585835
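
    A relative hydrophilicity comparison of this kind is usually done with a sliding-window profile over a per-residue scale. The abstract does not say which scale was used, so the sketch below substitutes the Kyte-Doolittle hydropathy scale as an assumption; on that scale positive values are hydrophobic, so hydrophilic stretches appear as negative windows. The window length of 7 is also an assumption.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of each window along the sequence.

    Only the 20 standard residues are supported; others raise KeyError.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]
```

    Two proteins with little sequence identity can then be compared by overlaying their profiles, which is the sense in which similarity can persist without sequence homology.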

  13. Improving Protein Fold Recognition by Deep Learning Networks.

    PubMed

    Jo, Taeho; Hou, Jie; Eickholt, Jesse; Cheng, Jianlin

    2015-12-04

    For accurate recognition of protein folds, a deep learning network method (DN-Fold) was developed to predict if a given query-template protein pair belongs to the same structural fold. The input used stemmed from the protein sequence and structural features extracted from the protein pair. We evaluated the performance of DN-Fold along with 18 different methods on Lindahl's benchmark dataset and on a large benchmark set extracted from SCOP 1.75 consisting of about one million protein pairs, at three different levels of fold recognition (i.e., protein family, superfamily, and fold) depending on the evolutionary distance between protein sequences. The correct recognition rate of ensembled DN-Fold for Top 1 predictions is 84.5%, 61.5%, and 33.6% and for Top 5 is 91.2%, 76.5%, and 60.7% at family, superfamily, and fold levels, respectively. We also evaluated the performance of single DN-Fold (DN-FoldS), which showed comparable results at the family and superfamily levels compared to ensemble DN-Fold. Finally, we extended the binary classification problem of fold recognition to a real-valued regression task, which also showed promising performance. DN-Fold is freely available through a web server at http://iris.rnet.missouri.edu/dnfold.
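
    The Top 1 and Top 5 figures above are top-k recognition rates. A minimal sketch of that metric follows, assuming each query comes with a list of template labels already sorted from most to least confident; the data layout and names are hypothetical and not taken from DN-Fold.

```python
def top_k_rate(ranked_template_labels, true_labels, k):
    """Percentage of queries whose true label appears among the top k templates.

    ranked_template_labels: one list per query, best-scoring template first.
    true_labels: the correct family/superfamily/fold label of each query.
    """
    hits = sum(1 for ranked, truth in zip(ranked_template_labels, true_labels)
               if truth in ranked[:k])
    return 100.0 * hits / len(true_labels)


# Toy example with three queries whose correct label is always "a".
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
truth = ["a", "a", "a"]
top1 = top_k_rate(ranked, truth, k=1)
top2 = top_k_rate(ranked, truth, k=2)
```

    The rate can only stay the same or rise as k grows, which is why the Top 5 figures in the abstract exceed the Top 1 figures at every level.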

  14. PredPPCrys: accurate prediction of sequence cloning, protein production, purification and crystallization propensity from protein sequences using multi-step heterogeneous feature fusion and selection.

    PubMed

    Wang, Huilin; Wang, Mingjun; Tan, Hao; Li, Yuan; Zhang, Ziding; Song, Jiangning

    2014-01-01

    X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures to yield diffraction-quality crystals, including sequence cloning, protein material production, purification, crystallization and ultimately, structural determination. Accordingly, prediction of the propensity of a protein to successfully undergo these experimental procedures based on the protein sequence may help narrow down laborious experimental efforts and facilitate target selection. A number of bioinformatics methods based on protein sequence information have been developed for this purpose. However, our knowledge on the important determinants of propensity for a protein sequence to produce high diffraction-quality crystals remains largely incomplete. In practice, most of the existing methods display poorer performance when evaluated on larger and updated datasets. To address this problem, we constructed an up-to-date dataset as the benchmark, and subsequently developed a new approach termed 'PredPPCrys' using the support vector machine (SVM). Using a comprehensive set of multifaceted sequence-derived features in combination with a novel multi-step feature selection strategy, we identified and characterized the relative importance and contribution of each feature type to the prediction performance of five individual experimental steps required for successful crystallization. The resulting optimal candidate features were used as inputs to build the first-level SVM predictor (PredPPCrys I). Next, prediction outputs of PredPPCrys I were used as the input to build second-level SVM classifiers (PredPPCrys II), which led to significantly enhanced prediction performance. Benchmarking experiments indicated that our PredPPCrys method outperforms most existing procedures on both up-to-date and previous datasets. In addition, the predicted crystallization targets of currently non-crystallizable proteins were provided as compendium data, which are anticipated to facilitate target selection and design for the worldwide structural genomics consortium. PredPPCrys is freely available at http://www.structbioinfor.org/PredPPCrys.

  15. deepNF: Deep network fusion for protein function prediction.

    PubMed

    Gligorijevic, Vladimir; Barot, Meet; Bonneau, Richard

    2018-06-01

    The prevalence of high-throughput experimental methods has resulted in an abundance of large-scale molecular and functional interaction networks. The connectivity of these networks provides a rich source of information for inferring functional annotations for genes and proteins. An important challenge has been to develop methods for combining these heterogeneous networks to extract useful protein feature representations for function prediction. Most of the existing approaches for network integration use shallow models that encounter difficulty in capturing complex and highly nonlinear network structures. Thus, we propose deepNF, a network fusion method based on Multimodal Deep Autoencoders to extract high-level features of proteins from multiple heterogeneous interaction networks. We apply this method to combine STRING networks to construct a common low-dimensional representation containing high-level protein features. We use separate layers for different network types in the early stages of the multimodal autoencoder, later connecting all the layers into a single bottleneck layer from which we extract features to predict protein function. We compare the cross-validation and temporal holdout predictive performance of our method with state-of-the-art methods, including the recently proposed method Mashup. Our results show that our method outperforms previous methods for both human and yeast STRING networks. We also show substantial improvement in the performance of our method in predicting GO terms of varying type and specificity. deepNF is freely available at: https://github.com/VGligorijevic/deepNF. Contact: vgligorijevic@flatironinstitute.org, rb133@nyu.edu. Supplementary data are available at Bioinformatics online.

  16. Prediction of Cortical Defect Using C-Reactive Protein and Urine Sodium to Potassium Ratio in Infants with Febrile Urinary Tract Infection

    PubMed Central

    Jung, Su Jin

    2016-01-01

    Purpose: We investigated whether C-reactive protein (CRP) levels, urine protein-creatinine ratio (uProt/Cr), and urine electrolytes can be useful for discriminating acute pyelonephritis (APN) from other febrile illnesses or the presence of a cortical defect on 99mTc dimercaptosuccinic acid (DMSA) scanning (true APN) from its absence in infants with febrile urinary tract infection (UTI). Materials and Methods: We examined 150 infants experiencing their first febrile UTI and 100 controls with other febrile illnesses consecutively admitted to our hospital from January 2010 to December 2012. Blood (CRP, electrolytes, Cr) and urine tests [uProt/Cr, electrolytes, and sodium-potassium ratio (uNa/K)] were performed upon admission. All infants with UTI underwent DMSA scans during admission. All data were compared between infants with UTI and controls and between infants with or without a cortical defect on DMSA scans. Using multiple logistic regression analysis, the ability of the parameters to predict true APN was analyzed. Results: CRP levels and uProt/Cr were significantly higher in infants with true APN than in controls. uNa levels and uNa/K were significantly lower in infants with true APN than in controls. CRP levels and uNa/K were relevant factors for predicting true APN. The method using CRP levels, uProt/Cr, uNa levels, and uNa/K had a sensitivity of 94%, specificity of 65%, positive predictive value of 60%, and negative predictive value of 95% for predicting true APN. Conclusion: We conclude that these parameters are useful for discriminating APN from other febrile illnesses or discriminating true APN in infants with febrile UTI. PMID:26632389
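
    The four diagnostic measures quoted in the abstract above (sensitivity, specificity, positive and negative predictive value) all derive from a 2x2 table of test results against the reference standard. The following is a minimal sketch of that arithmetic; the function name and the counts are hypothetical illustrations and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic measures from 2x2 confusion-matrix counts.

    tp: test positive, condition present    fp: test positive, condition absent
    fn: test negative, condition present    tn: test negative, condition absent
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true cases detected
        "specificity": tn / (tn + fp),  # fraction of non-cases correctly cleared
        "ppv": tp / (tp + fp),          # probability of the condition given a positive test
        "npv": tn / (tn + fn),          # probability of no condition given a negative test
    }


# Hypothetical counts chosen only for illustration (NOT the study's data).
metrics = diagnostic_metrics(tp=45, fp=30, fn=5, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Unlike sensitivity and specificity, the two predictive values depend on how common the condition is in the sample, so they do not transfer directly to populations with a different prevalence.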

  17. Structure Prediction of Protein Complexes

    NASA Astrophysics Data System (ADS)

    Pierce, Brian; Weng, Zhiping

    Protein-protein interactions are critical for biological function. They directly and indirectly influence the biological systems of which they are a part. Antibodies bind with antigens to detect and stop viruses and other infectious agents. Cell signaling is performed in many cases through the interactions between proteins. Many diseases involve protein-protein interactions on some level, including cancer and prion diseases.

  18. MITOPRED: a web server for the prediction of mitochondrial proteins

    PubMed Central

    Guda, Chittibabu; Guda, Purnima; Fahy, Eoin; Subramaniam, Shankar

    2004-01-01

    MITOPRED web server enables prediction of nucleus-encoded mitochondrial proteins in all eukaryotic species. Predictions are made using a new algorithm based primarily on Pfam domain occurrence patterns in mitochondrial and non-mitochondrial locations. Pre-calculated predictions are instantly accessible for proteomes of Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila, Homo sapiens, Mus musculus and Arabidopsis species as well as all the eukaryotic sequences in the Swiss-Prot and TrEMBL databases. Queries, at different confidence levels, can be made through four distinct options: (i) entering Swiss-Prot/TrEMBL accession numbers; (ii) uploading a local file with such accession numbers; (iii) entering protein sequences; (iv) uploading a local file containing protein sequences in FASTA format. Automated updates are scheduled for the pre-calculated prediction database so as to provide access to the most current data. The server, its documentation and the data are available from http://mitopred.sdsc.edu. PMID:15215413

  19. Predicting protein β-sheet contacts using a maximum entropy-based correlated mutation measure.

    PubMed

    Burkoff, Nikolas S; Várnai, Csilla; Wild, David L

    2013-03-01

    The problem of ab initio protein folding is one of the most difficult in modern computational biology. The prediction of residue contacts within a protein provides a more tractable immediate step. Recently introduced maximum entropy-based correlated mutation measures (CMMs), such as direct information, have been successful in predicting residue contacts. However, most correlated mutation studies focus on proteins that have large good-quality multiple sequence alignments (MSA) because the power of correlated mutation analysis falls as the size of the MSA decreases. However, even with small autogenerated MSAs, maximum entropy-based CMMs contain information. To make use of this information, in this article, we focus not on general residue contacts but contacts between residues in β-sheets. The strong constraints and prior knowledge associated with β-contacts are ideally suited for prediction using a method that incorporates an often noisy CMM. Using contrastive divergence, a statistical machine learning technique, we have calculated a maximum entropy-based CMM. We have integrated this measure with a new probabilistic model for β-contact prediction, which is used to predict both residue- and strand-level contacts. Using our model on a standard non-redundant dataset, we significantly outperform a 2D recurrent neural network architecture, achieving a 5% improvement in true positives at the 5% false-positive rate at the residue level. At the strand level, our approach is competitive with the state-of-the-art single methods achieving precision of 61.0% and recall of 55.4%, while not requiring residue solvent accessibility as an input. http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/

  20. Maternal c-reactive protein and oxidative stress markers as predictors of delivery latency in patients experiencing preterm premature rupture of membranes.

    PubMed

    Ryu, Hyun Kyung; Moon, Jong Ho; Heo, Hyun Ji; Kim, Jong Woon; Kim, Yoon Ha

    2017-02-01

    To evaluate the usefulness of maternal serum c-reactive protein (CRP), lipid peroxide, and oxygen radical absorbance capacity (ORAC), to predict the interval between membrane rupture and delivery in patients with preterm premature rupture of membranes (PPROM). The present prospective study included patients with singleton pregnancies experiencing PPROM at earlier than 34 weeks of pregnancy who underwent spontaneous vaginal delivery between August 1, 2010 and July 31, 2013 at Chonnam National University Hospital, Republic of Korea. Patients were categorized based on whether delivery occurred within 3 days of PPROM or after. CRP levels, lipid peroxide (using malondialdehyde levels), ORAC, protein carbonyl, and other potential risk factors were compared between the groups. There were 72 patients included. Maternal serum CRP levels, malondialdehyde levels, and Bishop Score were higher in patients who underwent delivery within 3 days (all P<0.05); ORAC levels were lower among these patients (P=0.002). A receiver operating characteristic curve analysis showed that CRP, malondialdehyde, and ORAC levels were predictive of delivery within 3 days after PPROM. Maternal serum CRP, malondialdehyde, and ORAC levels at admission were useful in predicting the latent period in patients with PPROM. © 2016 International Federation of Gynecology and Obstetrics.

  1. Towards Inferring Protein Interactions: Challenges and Solutions

    NASA Astrophysics Data System (ADS)

    Zhang, Ya; Zha, Hongyuan; Chu, Chao-Hsien; Ji, Xiang

    2006-12-01

    Discovering interacting proteins has been an essential part of functional genomics. However, existing experimental techniques only uncover a small portion of any interactome. Furthermore, these data often have a very high false rate. By conceptualizing the interactions at domain level, we provide a more abstract representation of the interactome, which also facilitates the discovery of unobserved protein-protein interactions. Although several domain-based approaches have been proposed to predict protein-protein interactions, they usually assume that domain interactions are independent of each other for the convenience of computational modeling. A new framework to predict protein interactions is proposed in this paper, where no assumption is made about domain interactions. Protein interactions may be the result of multiple domain interactions which are dependent on each other. A conjunctive normal form representation is used to capture the relationships between protein interactions and domain interactions. The problem of interaction inference is then modeled as a constraint satisfiability problem and solved via linear programming. Experimental results on a combined yeast data set have demonstrated the robustness and the accuracy of the proposed algorithm. Moreover, we also map some predicted interacting domains to three-dimensional structures of protein complexes to show the validity of our predictions.

  2. Prediction of individual milk proteins including free amino acids in bovine milk using mid-infrared spectroscopy and their correlations with milk processing characteristics.

    PubMed

    McDermott, A; Visentin, G; De Marchi, M; Berry, D P; Fenelon, M A; O'Connor, P M; Kenny, O A; McParland, S

    2016-04-01

    The aim of this study was to evaluate the effectiveness of mid-infrared spectroscopy in predicting milk protein and free amino acid (FAA) composition in bovine milk. Milk samples were collected from 7 Irish research herds and represented cows from a range of breeds, parities, and stages of lactation. Mid-infrared spectral data in the range of 900 to 5,000 cm⁻¹ were available for 730 milk samples; gold standard methods were used to quantify individual protein fractions and FAA of these samples with a view to predicting these gold standard protein fractions and FAA levels with available mid-infrared spectroscopy data. Separate prediction equations were developed for each trait using partial least squares regression; accuracy of prediction was assessed using both cross validation on a calibration data set (n=400 to 591 samples) and external validation on an independent data set (n=143 to 294 samples). The accuracy of prediction in external validation was the same irrespective of whether undertaken on the entire external validation data set or just within the Holstein-Friesian breed. The strongest coefficient of correlation obtained for protein fractions in external validation was 0.74, 0.69, and 0.67 for total casein, total β-lactoglobulin, and β-casein, respectively. Total proteins (i.e., total casein, total whey, and total lactoglobulin) were predicted with greater accuracy than their respective component traits; prediction accuracy using the infrared spectrum was superior to prediction using just milk protein concentration. Weak to moderate prediction accuracies were observed for FAA. The greatest coefficient of correlation in both cross validation and external validation was for Gly (0.75), indicating a moderate accuracy of prediction. Overall, the FAA prediction models overpredicted the gold standard values. Near-unity correlations existed between total casein and β-casein irrespective of whether the traits were based on the gold standard (0.92) or mid-infrared spectroscopy predictions (0.95). Weaker correlations among FAA were observed than the correlations among the protein fractions. Pearson correlations between gold standard protein fractions and the milk processing characteristics of rennet coagulation time, curd firming time, curd firmness, heat coagulating time, pH, and casein micelle size were weak to moderate and ranged from -0.48 (protein and pH) to 0.50 (total casein and a30). Pearson correlations between gold standard FAA and these milk processing characteristics were also weak to moderate and ranged from -0.60 (Val and pH) to 0.49 (Val and K20). Results from this study indicate that mid-infrared spectroscopy has the potential to predict protein fractions and some FAA in milk at a population level. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
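
    The prediction accuracies in the abstract above are reported as coefficients of correlation between gold standard and predicted values. The following is a minimal sketch of the Pearson coefficient in plain Python; the function name and the sample values are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical gold standard vs. predicted values, for illustration only.
gold_standard = [2.1, 2.5, 3.0, 3.4]
predicted = [2.0, 2.7, 2.9, 3.6]
print(f"r = {pearson_r(gold_standard, predicted):.3f}")
```

    A high correlation does not rule out systematic bias: a model that overpredicts every value by a constant amount still yields r = 1, which is why the abstract reports overprediction separately from the correlations.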

  3. Crysalis: an integrated server for computational analysis and design of protein crystallization.

    PubMed

    Wang, Huilin; Feng, Liubin; Zhang, Ziding; Webb, Geoffrey I; Lin, Donghai; Song, Jiangning

    2016-02-24

    The failure of multi-step experimental procedures to yield diffraction-quality crystals is a major bottleneck in protein structure determination. Accordingly, several bioinformatics methods have been successfully developed and employed to select crystallizable proteins. Unfortunately, the majority of existing in silico methods only allow the prediction of crystallization propensity, seldom enabling computational design of protein mutants that can be targeted for enhancing protein crystallizability. Here, we present Crysalis, an integrated crystallization analysis tool that builds on support-vector regression (SVR) models to facilitate computational protein crystallization prediction, analysis, and design. More specifically, the functionality of this new tool includes: (1) rapid selection of target crystallizable proteins at the proteome level, (2) identification of site non-optimality for protein crystallization and systematic analysis of all potential single-point mutations that might enhance protein crystallization propensity, and (3) annotation of target protein based on predicted structural properties. We applied the design mode of Crysalis to identify site non-optimality for protein crystallization on a proteome-scale, focusing on proteins currently classified as non-crystallizable. Our results revealed that site non-optimality is based on biases related to residues, predicted structures, physicochemical properties, and sequence loci, which provides in-depth understanding of the features influencing protein crystallization. Crysalis is freely available at http://nmrcen.xmu.edu.cn/crysalis/.

  4. Crysalis: an integrated server for computational analysis and design of protein crystallization

    PubMed Central

    Wang, Huilin; Feng, Liubin; Zhang, Ziding; Webb, Geoffrey I.; Lin, Donghai; Song, Jiangning

    2016-01-01

    The failure of multi-step experimental procedures to yield diffraction-quality crystals is a major bottleneck in protein structure determination. Accordingly, several bioinformatics methods have been successfully developed and employed to select crystallizable proteins. Unfortunately, the majority of existing in silico methods only allow the prediction of crystallization propensity, seldom enabling computational design of protein mutants that can be targeted for enhancing protein crystallizability. Here, we present Crysalis, an integrated crystallization analysis tool that builds on support-vector regression (SVR) models to facilitate computational protein crystallization prediction, analysis, and design. More specifically, the functionality of this new tool includes: (1) rapid selection of target crystallizable proteins at the proteome level, (2) identification of site non-optimality for protein crystallization and systematic analysis of all potential single-point mutations that might enhance protein crystallization propensity, and (3) annotation of target protein based on predicted structural properties. We applied the design mode of Crysalis to identify site non-optimality for protein crystallization on a proteome-scale, focusing on proteins currently classified as non-crystallizable. Our results revealed that site non-optimality is based on biases related to residues, predicted structures, physicochemical properties, and sequence loci, which provides in-depth understanding of the features influencing protein crystallization. Crysalis is freely available at http://nmrcen.xmu.edu.cn/crysalis/. PMID:26906024

  5. The Strength of Family Ties: Perceptions of Network Relationship Quality and Levels of C-Reactive Proteins in the North Texas Heart Study.

    PubMed

    Uchino, Bert N; Ruiz, John M; Smith, Timothy W; Smyth, Joshua M; Taylor, Daniel J; Allison, Matthew; Ahn, Chul

    2015-10-01

    Although the quality of one's social relationships has been linked to important physical health outcomes, less work has been conducted examining family and friends that differ in their underlying positivity and negativity. The main aim of this study was to examine the association between supportive, aversive, and ambivalent family/friends with levels of C-reactive proteins. Three hundred participants from the North Texas Heart Study completed the social relationships index and a blood draw to assess high-sensitivity C-reactive proteins (hs-CRPs). After standard controls, the number of supportive family members predicted lower hs-CRP levels, whereas the number of ambivalent family members predicted higher hs-CRP levels. These links were independent of depressive symptoms and perceived stress. These data highlight the importance of considering specific types of relationships and their underlying positive and negative aspects in research on social ties and physical health.

  6. Three-dimensional (3D) structure prediction of the American and African oil-palms β-ketoacyl-[ACP] synthase-II protein by comparative modelling

    PubMed Central

    Wang, Edina; Chinni, Suresh; Bhore, Subhash Janardhan

    2014-01-01

    Background: The fatty-acid profile of a vegetable oil determines its properties and nutritional value. Palm-oil obtained from the African oil-palm [Elaeis guineensis Jacq. (Tenera)] contains 44% palmitic acid (C16:0), but palm-oil obtained from the American oil-palm [Elaeis oleifera] contains only 25% C16:0. In part, the β-ketoacyl-[ACP] synthase II (KASII) [EC: 2.3.1.179] protein is responsible for the high level of C16:0 in palm-oil derived from the African oil-palm. To understand more about E. guineensis KASII (EgKASII) and E. oleifera KASII (EoKASII) proteins, it is essential to know their structures. Hence, this study was undertaken. Objective: The objective of this study was to predict the three-dimensional (3D) structure of EgKASII and EoKASII proteins using molecular modelling tools. Materials and Methods: The amino-acid sequences for KASII proteins were retrieved from the protein database of the National Center for Biotechnology Information (NCBI), USA. The 3D structures were predicted for both proteins using homology modelling and the ab-initio approach to protein structure prediction. Molecular dynamics (MD) simulation was performed to refine the predicted structures. The predicted structure models were evaluated and root mean square deviation (RMSD) and root mean square fluctuation (RMSF) values were calculated. Results: The homology modelling showed that EgKASII and EoKASII proteins are 78% and 74% similar to Streptococcus pneumoniae KASII and Brucella melitensis KASII, respectively. The EgKASII and EoKASII structures predicted using the ab-initio approach show 6% and 9% deviation from the structures predicted by homology modelling, respectively. The structure refinement and validation confirmed that the predicted structures are accurate. Conclusion: The 3D structures for EgKASII and EoKASII proteins were predicted. However, further research is essential to understand the interaction of EgKASII and EoKASII proteins with their substrates. PMID:24748752

  7. Three-dimensional (3D) structure prediction of the American and African oil-palms β-ketoacyl-[ACP] synthase-II protein by comparative modelling.

    PubMed

    Wang, Edina; Chinni, Suresh; Bhore, Subhash Janardhan

    2014-01-01

    The fatty-acid profile of a vegetable oil determines its properties and nutritional value. Palm-oil obtained from the African oil-palm [Elaeis guineensis Jacq. (Tenera)] contains 44% palmitic acid (C16:0), but palm-oil obtained from the American oil-palm [Elaeis oleifera] contains only 25% C16:0. In part, the β-ketoacyl-[ACP] synthase II (KASII) [EC: 2.3.1.179] protein is responsible for the high level of C16:0 in palm-oil derived from the African oil-palm. To understand more about E. guineensis KASII (EgKASII) and E. oleifera KASII (EoKASII) proteins, it is essential to know their structures. Hence, this study was undertaken. The objective of this study was to predict the three-dimensional (3D) structure of EgKASII and EoKASII proteins using molecular modelling tools. The amino-acid sequences for KASII proteins were retrieved from the protein database of the National Center for Biotechnology Information (NCBI), USA. The 3D structures were predicted for both proteins using homology modelling and the ab-initio approach to protein structure prediction. Molecular dynamics (MD) simulation was performed to refine the predicted structures. The predicted structure models were evaluated and root mean square deviation (RMSD) and root mean square fluctuation (RMSF) values were calculated. The homology modelling showed that EgKASII and EoKASII proteins are 78% and 74% similar to Streptococcus pneumoniae KASII and Brucella melitensis KASII, respectively. The EgKASII and EoKASII structures predicted using the ab-initio approach show 6% and 9% deviation from the structures predicted by homology modelling, respectively. The structure refinement and validation confirmed that the predicted structures are accurate. The 3D structures for EgKASII and EoKASII proteins were predicted. However, further research is essential to understand the interaction of EgKASII and EoKASII proteins with their substrates.

  8. HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species.

    PubMed

    López, Yosvany; Nakai, Kenta; Patil, Ashwini

    2015-01-01

    HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of physical, genetic and predicted interactions. Automated integration of interactions is further complicated by varying levels of accuracy of database content and lack of adherence to standard formats. To address these issues, the latest version of HitPredict provides a manually curated dataset of 398 696 physical associations between 70 808 proteins from 105 species. Manual confirmation was used to resolve all issues encountered during data integration. For improved reliability assessment, this version combines a new score derived from the experimental information of the interactions with the original score based on the features of the interacting proteins. The combined interaction score performs better than either of the individual scores in HitPredict as well as the reliability score of another similar database. HitPredict provides a web interface to search proteins and visualize their interactions, and the data can be downloaded for offline analysis. Data usability has been enhanced by mapping protein identifiers across multiple reference databases. Thus, the latest version of HitPredict provides a significantly larger, more reliable and usable dataset of protein-protein interactions from several species for the study of gene groups. Database URL: http://hintdb.hgc.jp/htp. © The Author(s) 2015. Published by Oxford University Press.

  9. Predicting Human Protein Subcellular Locations by the Ensemble of Multiple Predictors via Protein-Protein Interaction Network with Edge Clustering Coefficients

    PubMed Central

    Du, Pufeng; Wang, Lusheng

    2014-01-01

    One of the fundamental tasks in biology is to identify the functions of all proteins to reveal the primary machinery of a cell. Knowledge of the subcellular locations of proteins will provide key hints to reveal their functions and to understand the intricate pathways that regulate biological processes at the cellular level. Protein subcellular location prediction has been extensively studied in the past two decades. A lot of methods have been developed based on protein primary sequences as well as protein-protein interaction network. In this paper, we propose to use the protein-protein interaction network as an infrastructure to integrate existing sequence based predictors. When predicting the subcellular locations of a given protein, not only the protein itself, but also all its interacting partners were considered. Unlike existing methods, our method requires neither the comprehensive knowledge of the protein-protein interaction network nor the experimentally annotated subcellular locations of most proteins in the protein-protein interaction network. Besides, our method can be used as a framework to integrate multiple predictors. Our method achieved 56% on human proteome in absolute-true rate, which is higher than the state-of-the-art methods. PMID:24466278
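
    The idea of considering a protein together with its interacting partners, and the absolute-true rate used to score the result, can be sketched as follows. This is a simplified illustration under stated assumptions: votes are unweighted, whereas the paper's title indicates that edge clustering coefficients are used; the absolute-true rate is assumed here to be the fraction of proteins whose predicted location set exactly matches the annotated set; and all names and data are hypothetical.

```python
from collections import Counter


def network_vote(protein, base_predictions, partners, top_k=1):
    """Combine a protein's own predicted locations with those of its partners.

    base_predictions: dict mapping protein -> set of locations from a
                      sequence-based predictor (hypothetical input).
    partners:         dict mapping protein -> list of interacting proteins.
    Assumption: every vote has weight 1 (no edge clustering coefficients).
    """
    votes = Counter(base_predictions.get(protein, ()))
    for partner in partners.get(protein, ()):
        votes.update(base_predictions.get(partner, ()))
    # Rank by descending vote count, breaking ties alphabetically.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return {location for location, _ in ranked[:top_k]}


def absolute_true_rate(predicted, actual):
    """Fraction of proteins whose predicted location set matches exactly."""
    hits = sum(1 for p in actual if predicted.get(p, set()) == actual[p])
    return hits / len(actual)


# Hypothetical toy data, for illustration only.
base = {"P1": {"nucleus"}, "P2": {"cytoplasm"}, "P3": {"cytoplasm"}}
ppi = {"P1": ["P2", "P3"]}
print(network_vote("P1", base, ppi))
```

    In this toy case the two partners outvote the protein's own sequence-based prediction, which shows both the appeal of the approach and why weighting the edges matters.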

  10. MQAPRank: improved global protein model quality assessment by learning-to-rank.

    PubMed

    Jing, Xiaoyang; Dong, Qiwen

    2017-05-25

    Protein structure prediction has made considerable progress over the last few decades, and a large number of models can now be predicted for a given sequence. Consequently, assessing the quality of predicted protein models is one of the key components of successful protein structure prediction. Over the past years, a number of methods have been developed to address this issue, which can be roughly divided into three categories: single methods, quasi-single methods and clustering (or consensus) methods. Although these methods achieve much success at different levels, accurate protein model quality assessment is still an open problem. Here, we present MQAPRank, a global protein model quality assessment program based on learning-to-rank. MQAPRank first sorts the decoy models using a single method based on a learning-to-rank algorithm to indicate their relative qualities for the target protein. It then takes the first five models as references to predict the qualities of other models using average GDT_TS scores between reference models and other models. Benchmarked on CASP11 and 3DRobot datasets, MQAPRank achieved better performance than other leading protein model quality assessment methods. Recently, MQAPRank participated in CASP12 under the group name FDUBio and achieved state-of-the-art performance. MQAPRank provides a convenient and powerful tool for protein model quality assessment and is useful for protein structure prediction.
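
    The second stage described above, scoring each model by its average similarity to the top-ranked reference models, can be sketched as follows. This is an illustrative outline, not the MQAPRank implementation: the initial ranking is taken as given, the similarity function stands in for a real GDT_TS calculation, and scoring a reference model against the remaining references is an assumption of this sketch.

```python
def consensus_scores(ranked_models, similarity, n_refs=5):
    """Score each model by its mean similarity to the top-ranked references.

    ranked_models: model identifiers, best first (e.g. from a learning-to-rank
                   step, which is not implemented here).
    similarity:    callable (model_a, model_b) -> float, a placeholder for a
                   pairwise GDT_TS computation.
    """
    references = ranked_models[:n_refs]
    scores = {}
    for model in ranked_models:
        # Assumption: a reference is never compared with itself.
        others = [ref for ref in references if ref != model]
        if others:
            scores[model] = sum(similarity(model, ref) for ref in others) / len(others)
        else:
            scores[model] = 0.0
    return scores


# Hypothetical pairwise similarities, for illustration only.
toy_table = {
    frozenset(("A", "B")): 0.8,
    frozenset(("A", "C")): 0.6,
    frozenset(("B", "C")): 0.5,
}


def toy_similarity(a, b):
    return toy_table[frozenset((a, b))]


print(consensus_scores(["A", "B", "C"], toy_similarity, n_refs=2))
```

    Restricting the comparison to a few well-ranked references keeps the cost linear in the number of decoys, whereas a full clustering approach compares every pair of models.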

  11. Two-level QSAR network (2L-QSAR) for peptide inhibitor design based on amino acid properties and sequence positions.

    PubMed

    Du, Q S; Ma, Y; Xie, N Z; Huang, R B

    2014-01-01

    In the design of peptide inhibitors the huge possible variety of the peptide sequences is of high concern. In collaboration with the fast accumulation of the peptide experimental data and database, a statistical method is suggested for peptide inhibitor design. In the two-level peptide prediction network (2L-QSAR) one level is the physicochemical properties of amino acids and the other level is the peptide sequence position. The activity contributions of amino acids are the functions of physicochemical properties and the sequence positions. In the prediction equation two weight coefficient sets {ak} and {bl} are assigned to the physicochemical properties and to the sequence positions, respectively. After the two coefficient sets are optimized based on the experimental data of known peptide inhibitors using the iterative double least square (IDLS) procedure, the coefficients are used to evaluate the bioactivities of new designed peptide inhibitors. The two-level prediction network can be applied to the peptide inhibitor design that may aim for different target proteins, or different positions of a protein. A notable advantage of the two-level statistical algorithm is that there is no need for host protein structural information. It may also provide useful insight into the amino acid properties and the roles of sequence positions.

  12. The prediction of biogenic magnetic nanoparticles biomineralization in human tissues and organs

    NASA Astrophysics Data System (ADS)

    Medviediev, O.; Gorobets, O. Yu; Gorobets, S. V.; Yadrykhins'ky, V. S.

    2017-10-01

    In this study, human homologs of magnetosome island proteins were identified on the basis of pairwise and multiple alignments of amino acid sequences. The expression levels of genes that encode magnetosome island proteins of M. gryphiswaldense MSR-1, cultured under oxygen-deficient and microaerobic conditions, were compared with the expression levels of genes that encode the corresponding homologs in humans. The possibility of biogenic magnetic nanoparticle (BMN) biomineralization was predicted for human tissues and organs in which BMN had not previously been found experimentally.

  13. Ultrasonography and C-reactive protein can predict the outcomes of voiding cystography after the first urinary tract infection.

    PubMed

    Kido, Jun; Yoshida, Fuminori; Sakaguchi, Katsuya; Ueno, Yasushi; Yanai, Masaaki

    2015-05-01

    This study evaluated whether sex, clinical variables, laboratory variables or ultrasonography predicted the presence of vesicoureteric reflux during the first episode of urinary tract infection in paediatric patients. We also aimed to define the criteria that indicated the need for voiding cystography testing. We used voiding cystography to investigate 200 patients who experienced their first urinary tract infection at our institution between 2004 and 2013 and retrospectively analysed the data by reviewing their medical records. Sex (p = 0.001), peak blood C-reactive protein levels (p < 0.001), the duration of fever after antibiotic administration (p = 0.007) and the ultrasonography findings grade (p < 0.001) were significantly different between patients with and without vesicoureteric reflux. Grade IV-V ultrasonography findings and C-reactive protein levels of ≥80 mg/L predicted vesicoureteric reflux with a sensitivity, specificity and odds ratio of 47.8%, 87.8% and 6.59 (95% confidence interval = 3.26-13.33), respectively (p < 0.001). Voiding cystography should be performed for patients with C-reactive protein levels of ≥80 mg/L and grade IV-V ultrasonography findings, but is not necessary in patients with C-reactive protein levels of <80 mg/L and grade I-III ultrasonography findings. ©2015 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.
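
    An odds ratio with a 95% confidence interval, as reported above, can be computed from a 2x2 table. The following is a minimal sketch using the Wald (log-scale) interval; the abstract does not state which interval method the authors used, so that choice is an assumption, and the counts are hypothetical rather than the study's data.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: criterion positive, outcome present    b: criterion positive, outcome absent
    c: criterion negative, outcome present    d: criterion negative, outcome absent
    z: normal quantile (1.96 gives an approximate 95% interval).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (NOT the study's data).
estimate, low, high = odds_ratio_ci(a=20, b=10, c=10, d=20)
print(f"OR = {estimate:.2f} (95% CI {low:.2f}-{high:.2f})")
```

    Because the interval is built on the log scale, it is asymmetric around the point estimate, which matches the shape of the interval quoted in the abstract.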

  14. A review of machine learning methods to predict the solubility of overexpressed recombinant proteins in Escherichia coli.

    PubMed

    Habibi, Narjeskhatoon; Mohd Hashim, Siti Z; Norouzi, Alireza; Samian, Mohammed Razip

    2014-05-08

    Over the last 20 years in biotechnology, the production of recombinant proteins has been a crucial bioprocess in both the biopharmaceutical and research arenas in terms of human health, scientific impact and economic volume. Although logical strategies of genetic engineering have been established, protein overexpression is still an art. In particular, heterologous expression is often hindered by low production levels and frequent failure for obscure reasons. The problem is accentuated because there is no generic solution available to enhance heterologous overexpression. For a given protein, the extent of its solubility can indicate the quality of its function. Over 30% of synthesized proteins are not soluble. Under given experimental conditions, including temperature and expression host, protein solubility is ultimately determined by its sequence. To date, numerous methods based on machine learning have been proposed to predict the solubility of a protein from its amino acid sequence alone. In spite of the 20 years of research on the matter, no comprehensive review is available on the published methods. This paper presents an extensive review of the existing models to predict protein solubility in the Escherichia coli recombinant protein overexpression system. The models are investigated and compared regarding the datasets used, features, feature selection methods, machine learning techniques and accuracy of prediction. A discussion on the models is provided at the end. This study aims to investigate extensively the machine learning based methods to predict recombinant protein solubility, so as to offer a general as well as a detailed understanding for researchers in the field. Some of the models present acceptable prediction performances and convenient user interfaces. These models can be considered valuable tools to predict recombinant protein overexpression results before performing real laboratory experiments, thus saving labour, time and cost.

  15. In silico platform for predicting and initiating β-turns in a protein at desired locations.

    PubMed

    Singh, Harinder; Singh, Sandeep; Raghava, Gajendra P S

    2015-05-01

    Numerous studies have been performed for analysis and prediction of β-turns in a protein. This study focuses on analyzing, predicting, and designing of β-turns to understand the preference of amino acids in β-turn formation. We analyzed around 20,000 PDB chains to understand the preference of residues or pair of residues at different positions in β-turns. Based on the results, a propensity-based method has been developed for predicting β-turns with an accuracy of 82%. We introduced a new approach entitled "Turn level prediction method," which predicts the complete β-turn rather than focusing on the residues in a β-turn. Finally, we developed BetaTPred3, a Random forest based method for predicting β-turns by utilizing various features of four residues present in β-turns. The BetaTPred3 achieved an accuracy of 79% with 0.51 MCC that is comparable or better than existing methods on BT426 dataset. Additionally, models were developed to predict β-turn types with better performance than other methods available in the literature. In order to improve the quality of prediction of turns, we developed prediction models on a large and latest dataset of 6376 nonredundant protein chains. Based on this study, a web server has been developed for prediction of β-turns and their types in proteins. This web server also predicts minimum number of mutations required to initiate or break a β-turn in a protein at specified location of a protein. © 2015 Wiley Periodicals, Inc.
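
    The abstract reports both accuracy and the Matthews correlation coefficient (MCC). The sketch below shows how the two are computed from a confusion matrix and why they can diverge when one class (such as turn residues) is the minority. The counts are hypothetical and are not results from BetaTPred3.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is taken as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical, imbalanced example: 100 positives against 900 negatives.
acc, mcc = accuracy_and_mcc(tp=50, tn=800, fp=100, fn=50)
print(f"accuracy={acc:.2f} MCC={mcc:.2f}")
```

    In this made-up case the accuracy is high while the MCC is modest, which is why the two figures are usually reported together.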

  16. Improving Protein Fold Recognition by Deep Learning Networks

    NASA Astrophysics Data System (ADS)

    Jo, Taeho; Hou, Jie; Eickholt, Jesse; Cheng, Jianlin

    2015-12-01

    For accurate recognition of protein folds, a deep learning network method (DN-Fold) was developed to predict if a given query-template protein pair belongs to the same structural fold. The input used stemmed from the protein sequence and structural features extracted from the protein pair. We evaluated the performance of DN-Fold along with 18 different methods on Lindahl’s benchmark dataset and on a large benchmark set extracted from SCOP 1.75 consisting of about one million protein pairs, at three different levels of fold recognition (i.e., protein family, superfamily, and fold) depending on the evolutionary distance between protein sequences. The correct recognition rate of ensembled DN-Fold for Top 1 predictions is 84.5%, 61.5%, and 33.6% and for Top 5 is 91.2%, 76.5%, and 60.7% at family, superfamily, and fold levels, respectively. We also evaluated the performance of single DN-Fold (DN-FoldS), which showed the comparable results at the level of family and superfamily, compared to ensemble DN-Fold. Finally, we extended the binary classification problem of fold recognition to real-value regression task, which also show a promising performance. DN-Fold is freely available through a web server at http://iris.rnet.missouri.edu/dnfold.
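
    The Top 1 and Top 5 figures above are top-k recognition rates: a query counts as correct if its true label appears among the k highest-ranked candidates. A minimal sketch of that metric follows, with hypothetical fold labels and rankings; it illustrates the scoring only and is not the DN-Fold evaluation code.

```python
def top_k_rate(ranked_predictions, true_labels, k):
    """Fraction of queries whose true label is within the top-k ranked candidates."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical rankings for three queries (best candidate first).
ranked = [
    ["fold_a", "fold_b", "fold_c"],
    ["fold_b", "fold_a", "fold_c"],
    ["fold_c", "fold_b", "fold_a"],
]
truth = ["fold_a", "fold_a", "fold_a"]

for k in (1, 2, 3):
    print(k, top_k_rate(ranked, truth, k))
```

    The rate can only stay level or rise as k grows, which matches the pattern of the Top 5 values exceeding the Top 1 values at every level.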

  17. Fasting Lipoprotein Lipase Protein Levels Can Predict a Postmeal Increment of Triglyceride Levels in Fasting Normohypertriglyceridemic Subjects.

    PubMed

    Tsuzaki, Kokoro; Kotani, Kazuhiko; Yamada, Kazunori; Sakane, Naoki

    2016-09-01

    Although a postprandial increment in triglyceride (TG) levels is considered to be a risk factor for atherogenesis, tests (e.g., fat load) to assess postprandial changes in TG levels cannot be easily applied to clinical practice. Therefore, fasting markers that predict postprandial TG states need to be developed. One current candidate is lipoprotein lipase (LPL) protein, a molecule that hydrolyzes TGs. This study investigated whether fasting LPL levels could predict postprandial TG levels. A total of 17 subjects (11 men, 6 women, mean age 52 ± 11 years) with normotriglyceridemia during fasting underwent the meal test. Several fasting parameters, including LPL, were measured for the area under the curve of postprandial TGs (AUC-TG). The subjects' mean fasting TG level was 1.30 mmol/l, and their mean LPL level was 41.6 ng/ml. The subjects' TG levels increased after loading (they peaked after two postprandial hours). Stepwise multiple regression analysis demonstrated that fasting TG levels were a predictor of the AUC-TG. In addition, fasting LPL mass levels were found to be a predictor of the AUC-TG (β = 0.65, P < 0.01), and this relationship was independent of fasting TG levels. Fasting LPL levels may be useful to predict postprandial TG increment in this population. © 2015 Wiley Periodicals, Inc.

  18. Mining protein database using machine learning techniques.

    PubMed

    Camargo, Renata da Silva; Niranjan, Mahesan

    2008-08-25

    With a large amount of information relating to proteins accumulating in databases widely available online, it is of interest to apply machine learning techniques that, by extracting underlying statistical regularities in the data, make predictions about the functional and evolutionary characteristics of unseen proteins. Such predictions can help in achieving a reduction in the space over which experiment designers need to search in order to improve our understanding of the biochemical properties. Previously it has been suggested that an integration of features computable by comparing a pair of proteins can be achieved by an artificial neural network, hence predicting the degree to which they may be evolutionary related and homologous.
    We compiled two datasets of pairs of proteins, each pair being characterised by seven distinct features. We performed an exhaustive search through all possible combinations of features, for the problem of separating remote homologous from analogous pairs, we note that significant performance gain was obtained by the inclusion of sequence and structure information. We find that the use of a linear classifier was enough to discriminate a protein pair at the family level. However, at the superfamily level, to detect remote homologous pairs was a relatively harder problem. We find that the use of nonlinear classifiers achieve significantly higher accuracies.
    In this paper, we compare three different pattern classification methods on two problems formulated as detecting evolutionary and functional relationships between pairs of proteins, and from extensive cross validation and feature selection based studies quantify the average limits and uncertainties with which such predictions may be made. Feature selection points to a "knowledge gap" in currently available functional annotations. We demonstrate how the scheme may be employed in a framework to associate an individual protein with an existing family of evolutionarily related proteins.

  19. Transcriptional bursting explains the noise–versus–mean relationship in mRNA and protein levels

    DOE PAGES

    Dar, Roy; Shaffer, Sydney M.; Singh, Abhyudai; ...

    2016-07-28

    Recent analysis demonstrates that the HIV-1 Long Terminal Repeat (HIV LTR) promoter exhibits a range of possible transcriptional burst sizes and frequencies for any mean-expression level. However, these results have also been interpreted as demonstrating that cell-to-cell expression variability (noise) and mean are uncorrelated, a significant deviation from previous results. Here, we re-examine the available mRNA and protein abundance data for the HIV LTR and find that noise in mRNA and protein expression scales inversely with the mean along analytically predicted transcriptional burst-size manifolds. We then experimentally perturb transcriptional activity to test a prediction of the multiple burst-size model: that increasing burst frequency will cause mRNA noise to decrease along given burst-size lines as mRNA levels increase. In conclusion, the data show that mRNA and protein noise decrease as mean expression increases, supporting the canonical inverse correlation between noise and mean.
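
    The inverse scaling of noise with mean along a burst-size line can be illustrated with the standard bursty-transcription model, in which mRNA is made in geometrically distributed bursts and degraded at a constant rate. This is a textbook idealisation assumed here for illustration, not necessarily the exact model fitted in the paper; the function name and parameter values are hypothetical.

```python
def bursty_mrna_stats(burst_size, burst_frequency, decay_rate=1.0):
    """Steady-state mean and squared coefficient of variation (CV^2) of mRNA.

    Assumes geometric bursts of mean size `burst_size` arriving at rate
    `burst_frequency`, with first-order decay at `decay_rate`.
    """
    mean = burst_size * burst_frequency / decay_rate
    fano = 1.0 + burst_size  # variance / mean for this model
    cv2 = fano / mean        # noise falls as 1/mean at fixed burst size
    return mean, cv2


# Raising burst frequency at a fixed burst size moves along one burst-size line.
for freq in (1.0, 2.0, 4.0, 8.0):
    mean, cv2 = bursty_mrna_stats(burst_size=5.0, burst_frequency=freq)
    print(f"frequency={freq:.0f} mean={mean:.0f} CV^2={cv2:.3f}")
```

    Under these assumptions the product of CV^2 and the mean stays constant at 1 + burst size, so each burst size defines its own line in a noise-versus-mean plot.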

  20. High pentraxin 3 level predicts septic shock and bacteremia at the onset of febrile neutropenia after intensive chemotherapy of hematologic patients

    PubMed Central

    Vänskä, Matti; Koivula, Irma; Hämäläinen, Sari; Pulkki, Kari; Nousiainen, Tapio; Jantunen, Esa; Juutilainen, Auni

    2011-01-01

    We evaluated pentraxin 3 as a marker for complications of neutropenic fever in 100 hematologic patients receiving intensive chemotherapy. Pentraxin 3 and C-reactive protein were measured at fever onset and then daily to day 3. Bacteremia was observed in 19 patients and septic shock in 5 patients (three deaths). In comparison to C-reactive protein, pentraxin 3 achieved its maximum more rapidly. Pentraxin 3 correlated not only with the same day C-reactive protein but also with the next day C-reactive protein. High pentraxin 3 on day 0 was associated with the development of septic shock (P=0.009) and bacteremia (P=0.046). The non-survivors had constantly high pentraxin 3 levels. To conclude, pentraxin 3 is an early predictor of complications in hematologic patients with neutropenic fever. High level of pentraxin 3 predicts septic shock and bacteremia already at the onset of febrile neutropenia. (ClinicalTrials.gov Identifier: NCT00781040.) PMID:21880642

  1. A new computational strategy for identifying essential proteins based on network topological properties and biological information.

    PubMed

    Qin, Chao; Sun, Yongqi; Dong, Yadong

    2017-01-01

    Essential proteins are the proteins that are indispensable to the survival and development of an organism. Deleting a single essential protein will cause lethality or infertility. Identifying and analysing essential proteins are key to understanding the molecular mechanisms of living cells. There are two types of methods for predicting essential proteins: experimental methods, which require considerable time and resources, and computational methods, which overcome the shortcomings of experimental methods. However, the prediction accuracy of computational methods for essential proteins requires further improvement. In this paper, we propose a new computational strategy named CoTB for identifying essential proteins based on a combination of topological properties, subcellular localization information and orthologous protein information. First, we introduce several topological properties of the protein-protein interaction (PPI) network. Second, we propose new methods for measuring orthologous information and subcellular localization and a new computational strategy that uses a random forest prediction model to obtain a probability score for the proteins being essential. Finally, we conduct experiments on four different Saccharomyces cerevisiae datasets. The experimental results demonstrate that our strategy for identifying essential proteins outperforms traditional computational methods and the most recently developed method, SON. In particular, our strategy improves the prediction accuracy to 89, 78, 79, and 85 percent on the YDIP, YMIPS, YMBD and YHQ datasets at the top 100 level, respectively.

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Szymańska, Paulina; Martin, Katie R.; MacKeigan, Jeffrey P.

    We constructed a mechanistic, computational model for regulation of (macro)autophagy and protein synthesis (at the level of translation). The model was formulated to study the system-level consequences of interactions among the following proteins: two key components of MTOR complex 1 (MTORC1), namely the protein kinase MTOR (mechanistic target of rapamycin) and the scaffold protein RPTOR; the autophagy-initiating protein kinase ULK1; and the multimeric energy-sensing AMP-activated protein kinase (AMPK). Inputs of the model include intrinsic AMPK kinase activity, which is taken as an adjustable surrogate parameter for cellular energy level or AMP:ATP ratio, and rapamycin dose, which controls MTORC1 activity. Outputs of the model include the phosphorylation level of the translational repressor EIF4EBP1, a substrate of MTORC1, and the phosphorylation level of AMBRA1 (activating molecule in BECN1-regulated autophagy), a substrate of ULK1 critical for autophagosome formation. The model incorporates reciprocal regulation of mTORC1 and ULK1 by AMPK, mutual inhibition of MTORC1 and ULK1, and ULK1-mediated negative feedback regulation of AMPK. Through analysis of the model, we find that these processes may be responsible, depending on conditions, for graded responses to stress inputs, for bistable switching between autophagy and protein synthesis, or relaxation oscillations, comprising alternating periods of autophagy and protein synthesis. A sensitivity analysis indicates that the prediction of oscillatory behavior is robust to changes of the parameter values of the model. The model provides testable predictions about the behavior of the AMPK-MTORC1-ULK1 network, which plays a central role in maintaining cellular energy and nutrient homeostasis.

  3. Year 2 Report: Protein Function Prediction Platform

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, C E

    2012-04-27

    Upon completion of our second year of development in a 3-year development cycle, we have completed a prototype protein structure-function annotation and function prediction system: Protein Function Prediction (PFP) platform (v.0.5). We have met our milestones for Years 1 and 2 and are positioned to continue development in completion of our original statement of work, or a reasonable modification thereof, in service to DTRA Programs involved in diagnostics and medical countermeasures research and development. The PFP platform is a multi-scale computational modeling system for protein structure-function annotation and function prediction. As of this writing, PFP is the only existing fully automated, high-throughput, multi-scale modeling, whole-proteome annotation platform, and represents a significant advance in the field of genome annotation (Fig. 1). PFP modules perform protein functional annotations at the sequence, systems biology, protein structure, and atomistic levels of biological complexity (Fig. 2). Because these approaches provide orthogonal means of characterizing proteins and suggesting protein function, PFP processing maximizes the protein functional information that can currently be gained by computational means. Comprehensive annotation of pathogen genomes is essential for bio-defense applications in pathogen characterization, threat assessment, and medical countermeasure design and development in that it can short-cut the time and effort required to select and characterize protein biomarkers.

  4. LocTree2 predicts localization for all domains of life

    PubMed Central

    Goldberg, Tatyana; Hamp, Tobias; Rost, Burkhard

    2012-01-01

    Motivation: Subcellular localization is one aspect of protein function. Despite advances in high-throughput imaging, localization maps remain incomplete. Several methods accurately predict localization, but many challenges remain to be tackled. Results: In this study, we introduced a framework to predict localization in life's three domains, including globular and membrane proteins (3 classes for archaea; 6 for bacteria and 18 for eukaryota). The resulting method, LocTree2, works well even for protein fragments. It uses a hierarchical system of support vector machines that imitates the cascading mechanism of cellular sorting. The method reaches high levels of sustained performance (eukaryota: Q18=65%, bacteria: Q6=84%). LocTree2 also accurately distinguishes membrane and non-membrane proteins. In our hands, it compared favorably with top methods when tested on new data. Availability: Online through PredictProtein (predictprotein.org); as standalone version at http://www.rostlab.org/services/loctree2. Contact: localization@rostlab.org Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:22962467

  5. Chronic ethanol feeding causes depression of mitochondrial elongation factor Tu in the rat liver: implications for the mitochondrial ribosome.

    PubMed

    Weiser, Brian; Gonye, Gregory; Sykora, Peter; Crumm, Sara; Cahill, Alan

    2011-05-01

    Chronic ethanol feeding is known to negatively impact hepatic energy metabolism. Previous studies have indicated that the underlying lesion responsible for this may lie at the level of the mitoribosome. The aim of this study was to characterize the structure of the hepatic mitoribosome in alcoholic male rats and their isocalorically paired controls. Our experiments revealed that chronic ethanol feeding resulted in a significant depletion of both structural (death-associated protein 3) and functional [elongation factor thermo unstable (EF-Tu)] mitoribosomal proteins. In addition, significant increases were found in nucleotide elongation factor thermo stable (EF-Ts) and structural mitochondrial ribosomal protein L12 (MRPL12). The increase in MRPL12 was found to correlate with an increase in the levels of the 39S large mitoribosomal subunit. These changes were accompanied by decreased levels of nuclear- and mitochondrially encoded respiratory subunits, decreased amounts of intact respiratory complexes, decreased hepatic ATP levels, and depressed mitochondrial translation. Mathematical modeling of ethanol-mediated changes in EF-Tu and EF-Ts using prederived kinetic data predicted that the ethanol-mediated decrease in EF-Tu levels could completely account for the impaired mitochondrial protein synthesis. In conclusion, chronic ethanol feeding results in a depletion of mitochondrial EF-Tu levels within the liver that is mathematically predicted to be responsible for the impaired mitochondrial protein synthesis seen in alcoholic animals.

  6. PredPPCrys: Accurate Prediction of Sequence Cloning, Protein Production, Purification and Crystallization Propensity from Protein Sequences Using Multi-Step Heterogeneous Feature Fusion and Selection

    PubMed Central

    Wang, Huilin; Wang, Mingjun; Tan, Hao; Li, Yuan; Zhang, Ziding; Song, Jiangning

    2014-01-01

    X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures to yield diffraction-quality crystals, including sequence cloning, protein material production, purification, crystallization and ultimately, structural determination. Accordingly, prediction of the propensity of a protein to successfully undergo these experimental procedures based on the protein sequence may help narrow down laborious experimental efforts and facilitate target selection. A number of bioinformatics methods based on protein sequence information have been developed for this purpose. However, our knowledge on the important determinants of propensity for a protein sequence to produce high diffraction-quality crystals remains largely incomplete. In practice, most of the existing methods display poorer performance when evaluated on larger and updated datasets. To address this problem, we constructed an up-to-date dataset as the benchmark, and subsequently developed a new approach termed ‘PredPPCrys’ using the support vector machine (SVM). Using a comprehensive set of multifaceted sequence-derived features in combination with a novel multi-step feature selection strategy, we identified and characterized the relative importance and contribution of each feature type to the prediction performance of five individual experimental steps required for successful crystallization. The resulting optimal candidate features were used as inputs to build the first-level SVM predictor (PredPPCrys I). Next, prediction outputs of PredPPCrys I were used as the input to build second-level SVM classifiers (PredPPCrys II), which led to significantly enhanced prediction performance. Benchmarking experiments indicated that our PredPPCrys method outperforms most existing procedures on both up-to-date and previous datasets. 
    In addition, the predicted crystallization targets of currently non-crystallizable proteins were provided as compendium data, which are anticipated to facilitate target selection and design for the worldwide structural genomics consortium. PredPPCrys is freely available at http://www.structbioinfor.org/PredPPCrys. PMID:25148528

  7. Roles for text mining in protein function prediction.

    PubMed

    Verspoor, Karin M

    2014-01-01

    The Human Genome Project has provided science with a hugely valuable resource: the blueprints for life; the specification of all of the genes that make up a human. While the genes have all been identified and deciphered, it is proteins that are the workhorses of the human body: they are essential to virtually all cell functions and are the primary mechanism through which biological function is carried out. Hence in order to fully understand what happens at a molecular level in biological organisms, and eventually to enable development of treatments for diseases where some aspect of a biological system goes awry, we must understand the functions of proteins. However, experimental characterization of protein function cannot scale to the vast amount of DNA sequence data now available. Computational protein function prediction has therefore emerged as a problem at the forefront of modern biology (Radivojac et al., Nat Methods 10(13):221-227, 2013). Within the varied approaches to computational protein function prediction that have been explored, there are several that make use of biomedical literature mining. These methods take advantage of information in the published literature to associate specific proteins with specific protein functions. In this chapter, we introduce two main strategies for doing this: association of function terms, represented as Gene Ontology terms (Ashburner et al., Nat Genet 25(1):25-29, 2000), to proteins based on information in published articles, and a paradigm called LEAP-FS (Literature-Enhanced Automated Prediction of Functional Sites) in which literature mining is used to validate the predictions of an orthogonal computational protein function prediction method.

  8. Modeling Dynamics of Cell-to-Cell Variability in TRAIL-Induced Apoptosis Explains Fractional Killing and Predicts Reversible Resistance

    PubMed Central

    Bertaux, François; Stoma, Szymon; Drasdo, Dirk; Batt, Gregory

    2014-01-01

    Isogenic cells sensing identical external signals can take markedly different decisions. Such decisions often correlate with pre-existing cell-to-cell differences in protein levels. When not neglected in signal transduction models, these differences are accounted for in a static manner, by assuming randomly distributed initial protein levels. However, this approach ignores the a priori non-trivial interplay between signal transduction and the source of this cell-to-cell variability: temporal fluctuations of protein levels in individual cells, driven by noisy synthesis and degradation. Thus, modeling protein fluctuations, rather than their consequences on the initial population heterogeneity, would set the quantitative analysis of signal transduction on firmer grounds. Adopting this dynamical view on cell-to-cell differences amounts to recasting extrinsic variability as intrinsic noise. Here, we propose a generic approach to merge, in a systematic and principled manner, signal transduction models with stochastic protein turnover models. When applied to an established kinetic model of TRAIL-induced apoptosis, our approach markedly increased model prediction capabilities. One obtains a mechanistic explanation of yet-unexplained observations on fractional killing and non-trivial robust predictions of the temporal evolution of cell resistance to TRAIL in HeLa cells. Our results provide an alternative explanation to survival via induction of survival pathways, since no TRAIL-induced regulations are needed, and suggest that the short-lived anti-apoptotic protein Mcl1 exhibits large and rare fluctuations. More generally, our results highlight the importance of accounting for stochastic protein turnover to quantitatively understand signal transduction over extended durations, and imply that fluctuations of short-lived proteins deserve particular attention. PMID:25340343

  9. Periscope: quantitative prediction of soluble protein expression in the periplasm of Escherichia coli

    NASA Astrophysics Data System (ADS)

    Chang, Catherine Ching Han; Li, Chen; Webb, Geoffrey I.; Tey, Bengti; Song, Jiangning; Ramanan, Ramakrishnan Nagasundara

    2016-03-01

    Periplasmic expression of soluble proteins in Escherichia coli not only offers a much-simplified downstream purification process, but also enhances the probability of obtaining correctly folded and biologically active proteins. Different combinations of signal peptides and target proteins lead to different soluble protein expression levels, ranging from negligible to several grams per litre. Accurate algorithms for rational selection of promising candidates can serve as a powerful tool to complement current trial-and-error approaches. Accordingly, proteomics studies can be conducted with greater efficiency and cost-effectiveness. Here, we developed a predictor with a two-stage architecture to predict the real-valued expression level of a target protein in the periplasm. The output of the first-stage support vector machine (SVM) classifier determines which second-stage support vector regression (SVR) model is used. When tested on an independent test dataset, the predictor achieved an overall prediction accuracy of 78% and a Pearson’s correlation coefficient (PCC) of 0.77. We further illustrate the relative importance of various features with respect to different models. The results indicate that the occurrence of dipeptide glutamine and aspartic acid is the most important feature for the classification model. Finally, we provide access to the implemented predictor through the Periscope webserver, freely accessible at http://lightning.med.monash.edu/periscope/.

  10. Experimental annotation of post-translational features and translated coding regions in the pathogen Salmonella Typhimurium

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ansong, Charles; Tolic, Nikola; Purvine, Samuel O.

    Complete and accurate genome annotation is crucial for comprehensive and systematic studies of biological systems. For example, systems biology-oriented genome scale modeling efforts greatly benefit from accurate annotation of protein-coding genes to develop proper functioning models. However, determining protein-coding genes for most new genomes is almost completely performed by inference, using computational predictions with significant documented error rates (> 15%). Furthermore, gene prediction programs provide no information on biologically important post-translational processing events critical for protein function. With the ability to directly measure peptides arising from expressed proteins, mass spectrometry-based proteomics approaches can be used to augment and verify coding regions of a genomic sequence and importantly detect post-translational processing events. In this study we utilized “shotgun” proteomics to guide accurate primary genome annotation of the bacterial pathogen Salmonella Typhimurium 14028 to facilitate a systems-level understanding of Salmonella biology. The data provides protein-level experimental confirmation for 44% of predicted protein-coding genes, suggests revisions to 48 genes assigned incorrect translational start sites, and uncovers 13 non-annotated genes missed by gene prediction programs. We also present a comprehensive analysis of post-translational processing events in Salmonella, revealing a wide range of complex chemical modifications (70 distinct modifications) and confirming more than 130 signal peptide and N-terminal methionine cleavage events in Salmonella. This study highlights several ways in which proteomics data applied during the primary stages of annotation can improve the quality of genome annotations, especially with regards to the annotation of mature protein products.

  11. Quantitative Estimation of Plasma Free Drug Fraction in Patients With Varying Degrees of Hepatic Impairment: A Methodological Evaluation.

    PubMed

    Li, Guo-Fu; Yu, Guo; Li, Yanfei; Zheng, Yi; Zheng, Qing-Shan; Derendorf, Hartmut

    2018-07-01

    Quantitative prediction of unbound drug fraction (fu) is essential for scaling pharmacokinetics through physiologically based approaches. However, few attempts have been made to evaluate the projection of fu values under pathological conditions. The primary objective of this study was to predict fu values (n = 105) of 56 compounds with or without the information of predominant binding protein in patients with varying degrees of hepatic insufficiency by accounting for quantitative changes in molar concentrations of either the major binding protein or albumin plus alpha 1-acid glycoprotein associated with differing levels of hepatic dysfunction. For the purpose of scaling, data pertaining to albumin and α1-acid glycoprotein levels in response to differing degrees of hepatic impairment were systematically collected from 919 adult donors. The results of the present study demonstrate for the first time the feasibility of physiologically based scaling of fu in hepatic dysfunction after verifying with experimentally measured data of a wide variety of compounds from individuals with varying degrees of hepatic insufficiency. Furthermore, the high level of predictive accuracy indicates that the inter-relation between the severity of hepatic impairment and these plasma protein levels is physiologically accurate. The present study enhances the confidence in predicting fu in hepatic insufficiency, particularly for albumin-bound drugs. Copyright © 2018 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.
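
    Scaling the unbound fraction by the molar concentration of a binding protein is often done with a simple single-protein relation that assumes linear (non-saturating) binding and an unchanged binding affinity. The sketch below implements that commonly used relation as an assumption; the abstract does not state the authors' exact equation, and the function name and example concentrations are hypothetical.

```python
def scale_fu(fu_healthy, protein_healthy, protein_patient):
    """Scale the unbound fraction to a patient's binding-protein concentration.

    Assumes one dominant binding protein, linear binding and constant affinity:
        fu_patient = 1 / (1 + (P_patient / P_healthy) * (1 - fu_healthy) / fu_healthy)
    Both concentrations must be in the same molar units.
    """
    if not 0.0 < fu_healthy <= 1.0:
        raise ValueError("fu_healthy must be in (0, 1]")
    if protein_healthy <= 0 or protein_patient < 0:
        raise ValueError("protein concentrations must be positive")
    ratio = protein_patient / protein_healthy
    return 1.0 / (1.0 + ratio * (1.0 - fu_healthy) / fu_healthy)


# Hypothetical example: binding protein falls from 600 to 300 (same molar units).
print(round(scale_fu(fu_healthy=0.10, protein_healthy=600.0, protein_patient=300.0), 4))
```

    Under these assumptions, halving the binding-protein concentration raises fu from 0.10 to about 0.18 rather than doubling it, because the bound and unbound pools are rebalanced together.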

  12. Recombinant Expression Screening of P. aeruginosa Bacterial Inner Membrane Proteins

    PubMed Central

    2010-01-01

    Background Transmembrane proteins (TM proteins) make up 25% of all proteins and play key roles in many diseases and normal physiological processes. However, much less is known about their structures and molecular mechanisms than for soluble proteins. Problems in expression, solubilization, purification, and crystallization cause bottlenecks in the characterization of TM proteins. This project addressed the need for improved methods for obtaining sufficient amounts of TM proteins for determining their structures and molecular mechanisms. Results Plasmid clones were obtained that encode eighty-seven transmembrane proteins with varying physical characteristics, for example, the number of predicted transmembrane helices, molecular weight, and grand average hydrophobicity (GRAVY). All the target proteins were from P. aeruginosa, a gram negative bacterial opportunistic pathogen that causes serious lung infections in people with cystic fibrosis. The relative expression levels of the transmembrane proteins were measured under several culture growth conditions. The use of E. coli strains, a T7 promoter, and a 6-histidine C-terminal affinity tag resulted in the expression of 61 out of 87 test proteins (70%). In this study, proteins with a higher grand average hydrophobicity and more transmembrane helices were expressed less well than less hydrophobic proteins with fewer transmembrane helices. Conclusions In this study, factors related to overall hydrophobicity and the number of predicted transmembrane helices correlated with the relative expression levels of the target proteins. Identifying physical characteristics that correlate with protein expression might aid in selecting the "low hanging fruit", or proteins that can be expressed to sufficient levels using an E. coli expression system. The use of other expression strategies or host species might be needed for sufficient levels of expression of transmembrane proteins with other physical characteristics. Surveys like this one could aid in overcoming the technical bottlenecks in working with TM proteins and could potentially aid in increasing the rate of structure determination. PMID:21114855
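
    The GRAVY score mentioned in the abstract above is conventionally the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `gravy` and the example peptide are hypothetical, and the expression rate simply re-derives the 61 of 87 figure reported in the abstract.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the grand average of hydropathy for a protein sequence."""
    seq = sequence.upper()
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if not seq or unknown:
        raise ValueError(f"empty sequence or unsupported residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


# Hypothetical hydrophobic peptide, for illustration only.
example_score = gravy("LLIVAG")

# Expression success rate reported in the abstract: 61 of 87 proteins.
expression_rate = 61 / 87

print(f"GRAVY of example peptide: {example_score:.2f}")
print(f"Expression rate: {expression_rate:.0%}")
```

    A positive score indicates a more hydrophobic sequence, which in this study was the group that tended to express less well.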

  13. Deregulated HOXB7 expression predicts poor prognosis of patients with malignancies of digestive system.

    PubMed

    Liu, Fang-Teng; Chen, Han-Min; Xiong, Ying; Zhu, Zheng-Ming

    2017-07-26

    Numerous studies have investigated the relationship between deregulated HOXB7 expression and clinical outcome in patients with digestive system cancers; HOXB7 has shown negative impacts, but to varying degrees. We aimed to comprehensively evaluate the predictive and prognostic value of HOXB7 in digestive system cancers. Electronic databases updated to December 1, 2016 were searched to collect eligible studies and quantitatively explore the potential role of HOXB7 as a prognostic indicator in digestive system cancers. A total of 9 studies (n = 1298 patients) were included in this meta-analysis. The pooled hazard ratios suggested that high expression of HOXB7 protein was associated with poor prognosis of OS in patients with digestive system cancers (HR = 1.97, 95% CI: 1.65-2.28, p = 0.000), and HOXB7 protein could act as an independent prognostic factor for predicting OS of patients with digestive system cancers (HR: 2.02, 95% CI: 1.69-2.36, p = 0.000). Statistical significance was also observed in subgroup meta-analysis based on the cancer type, histology type, country, sample size and publication date. Furthermore, we examined the correlations between HOXB7 protein and clinicopathological features. Altered expression of HOXB7 protein was correlated with tumor invasion (p = 0.000), lymph node status (p = 0.000), distant metastasis (p = 0.001) and TNM stage (p = 0.000). However, the expression of HOXB7 protein was not associated with age (p = 0.64), gender (p = 0.40) or levels of differentiation (p = 0.19). High expression of HOXB7 protein was associated with poor prognosis of patients with digestive system cancers, as well as clinicopathologic characteristics, including tumor invasion, lymph node status, distant metastasis and TNM stage. The expression of HOXB7 protein was not associated with age, gender or levels of differentiation. HOXB7 protein expression level in tumor tissue might serve as a novel prognostic marker for digestive system cancers.

  14. The Protein Cost of Metabolic Fluxes: Prediction from Enzymatic Rate Laws and Cost Minimization.

    PubMed

    Noor, Elad; Flamholz, Avi; Bar-Even, Arren; Davidi, Dan; Milo, Ron; Liebermeister, Wolfram

    2016-11-01

    Bacterial growth depends crucially on metabolic fluxes, which are limited by the cell's capacity to maintain metabolic enzymes. The necessary enzyme amount per unit flux is a major determinant of metabolic strategies both in evolution and bioengineering. It depends on enzyme parameters (such as kcat and KM constants), but also on metabolite concentrations. Moreover, similar amounts of different enzymes might incur different costs for the cell, depending on enzyme-specific properties such as protein size and half-life. Here, we developed enzyme cost minimization (ECM), a scalable method for computing enzyme amounts that support a given metabolic flux at a minimal protein cost. The complex interplay of enzyme and metabolite concentrations, e.g. through thermodynamic driving forces and enzyme saturation, would make it hard to solve this optimization problem directly. By treating enzyme cost as a function of metabolite levels, we formulated ECM as a numerically tractable, convex optimization problem. Its tiered approach allows for building models at different levels of detail, depending on the amount of available data. Validating our method with measured metabolite and protein levels in E. coli central metabolism, we found typical prediction fold errors of 4.1 and 2.6, respectively, for the two kinds of data. This result from the cost-optimized metabolic state is significantly better than randomly sampled metabolite profiles, supporting the hypothesis that enzyme cost is important for the fitness of E. coli. ECM can be used to predict enzyme levels and protein cost in natural and engineered pathways, and could be a valuable computational tool to assist metabolic engineering projects. Furthermore, it establishes a direct connection between protein cost and thermodynamics, and provides a physically plausible and computationally tractable way to include enzyme kinetics into constraint-based metabolic models, where kinetics have usually been ignored or oversimplified.

  15. A Serum Protein Profile Predictive of the Resistance to Neoadjuvant Chemotherapy in Advanced Breast Cancers*

    PubMed Central

    Hyung, Seok-Won; Lee, Min Young; Yu, Jong-Han; Shin, Byunghee; Jung, Hee-Jung; Park, Jong-Moon; Han, Wonshik; Lee, Kyung-Min; Moon, Hyeong-Gon; Zhang, Hui; Aebersold, Ruedi; Hwang, Daehee; Lee, Sang-Won; Yu, Myeong-Hee; Noh, Dong-Young

    2011-01-01

    Prediction of the responses to neoadjuvant chemotherapy (NACT) can improve the treatment of patients with advanced breast cancer. Genes and proteins predictive of chemoresistance have been extensively studied in breast cancer tissues. However, noninvasive serum biomarkers capable of such prediction have been rarely exploited. Here, we performed profiling of N-glycosylated proteins in serum from fifteen advanced breast cancer patients (ten patients sensitive to and five patients resistant to NACT) to discover serum biomarkers of chemoresistance using a label-free liquid chromatography-tandem MS method. By performing a series of statistical analyses of the proteomic data, we selected thirteen biomarker candidates and tested their differential serum levels by Western blotting in 13 independent samples (eight patients sensitive to and five patients resistant to NACT). Among the candidates, we then selected the final set of six potential serum biomarkers (AHSG, APOB, C3, C9, CP, and ORM1) whose differential expression was confirmed in the independent samples. Finally, we demonstrated that a multivariate classification model using the six proteins could predict responses to NACT and further predict relapse-free survival of patients. In summary, global N-glycoproteome profile in serum revealed a protein pattern predictive of the responses to NACT, which can be further validated in large clinical studies. PMID:21799047

  16. Intestinal Fatty Acid Binding Protein as a Marker of Necrosis and Severity in Acute Pancreatitis.

    PubMed

    Kupčinskas, Juozas; Gedgaudas, Rolandas; Hartman, Hannes; Sippola, Tomi; Lindström, Outi; Johnson, Colin D; Regnér, Sara

    2018-07-01

    The aim of this study was to evaluate intestinal fatty acid binding protein (i-FABP) as a potential biomarker for predicting severity of acute pancreatitis (AP). In a prospective multicenter cohort study, plasma levels of i-FABP were measured in 402 patients with AP. Severity of AP was determined based on the 1992 Atlanta Classification. Admission levels of plasma i-FABP were significantly higher in patients with pancreatic necrosis, in patients having systemic complications, in patients treated invasively, in patients treated in the intensive care unit, in patients with severe AP, and in deceased patients. Plasma i-FABP levels on admission yielded an area under curve (AUC) of 0.732 in discriminating patients with or without pancreatic necrosis and an AUC of 0.669 in predicting severe AP. Combination of levels of i-FABP and venous lactate on the day of admission showed higher discriminative power in severe AP (AUC of 0.808). Higher i-FABP levels on admission were associated with pancreatic necrosis, systemic complications, and severe AP. Low levels of i-FABP had a high negative predictive value for pancreatic necrosis and severe AP. Combination of levels of i-FABP and venous lactate on admission was superior to either marker used alone in predicting severe AP.

  17. miRNAs in human subcutaneous adipose tissue: Effects of weight loss induced by hypocaloric diet and exercise.

    PubMed

    Kristensen, Malene M; Davidsen, Peter K; Vigelsø, Andreas; Hansen, Christina N; Jensen, Lars J; Jessen, Niels; Bruun, Jens M; Dela, Flemming; Helge, Jørn W

    2017-03-01

    Obesity is central in the development of insulin resistance. However, the underlying mechanisms still need elucidation. Dysregulated microRNAs (miRNAs; post-transcriptional regulators) in adipose tissue may present an important link. The miRNA expression in subcutaneous adipose tissue from 19 individuals with severe obesity (10 women and 9 men) before and after a 15-week weight loss intervention was studied using genome-wide microarray analysis. The microarray results were validated with RT-qPCR, and pathway enrichment analysis of in silico predicted targets was performed to elucidate the biological consequences of the miRNA dysregulation. Lastly, the messenger RNA (mRNA) and/or protein expression of multiple predicted targets as well as several proteins involved in lipolysis were investigated. The intervention led to upregulation of miR-29a-3p and miR-29a-5p and downregulation of miR-20b-5p. The mRNA and protein expression of predicted targets was not significantly affected by the intervention. However, negative correlations between miR-20b-5p and the protein levels of its predicted target, acyl-CoA synthetase long-chain family member 1, were observed. Several other miRNA-target relationships correlated negatively, indicating possible miRNA regulation, including miR-29a-3p and lipoprotein lipase mRNA levels. Proteins involved in lipolysis were not affected by the intervention. Weight loss influenced several miRNAs, some of which were negatively correlated with predicted targets. These dysregulated miRNAs may affect adipocytokine signaling and forkhead box protein O signaling. © 2017 The Obesity Society.

  18. Integration of element specific persistent homology and machine learning for protein-ligand binding affinity prediction.

    PubMed

    Cang, Zixuan; Wei, Guo-Wei

    2018-02-01

    Protein-ligand binding is a fundamental biological process that is paramount to many other biological processes, such as signal transduction, metabolic pathways, enzyme construction, cell secretion, and gene expression. Accurate prediction of protein-ligand binding affinities is vital to rational drug design and the understanding of protein-ligand binding and binding induced function. Existing binding affinity prediction methods are inundated with geometric detail and involve excessively high dimensions, which undermines their predictive power for massive binding data. Topology provides the ultimate level of abstraction and thus incurs too much reduction in geometric information. Persistent homology embeds geometric information into topological invariants and bridges the gap between complex geometry and abstract topology. However, it oversimplifies biological information. This work introduces element specific persistent homology (ESPH) or multicomponent persistent homology to retain crucial biological information during topological simplification. The combination of ESPH and machine learning gives rise to a powerful paradigm for macromolecular analysis. Tests on 2 large data sets indicate that the proposed topology-based machine-learning paradigm outperforms other existing methods in protein-ligand binding affinity predictions. ESPH reveals a protein-ligand binding mechanism that cannot be attained from other conventional techniques. The present approach reveals that protein-ligand hydrophobic interactions are extended to 40 Å away from the binding site, which has a significant ramification to drug and protein design. Copyright © 2017 John Wiley & Sons, Ltd.

  19. RECURSIVE PROTEIN MODELING: A DIVIDE AND CONQUER STRATEGY FOR PROTEIN STRUCTURE PREDICTION AND ITS CASE STUDY IN CASP9

    PubMed Central

    CHENG, JIANLIN; EICKHOLT, JESSE; WANG, ZHENG; DENG, XIN

    2013-01-01

    After decades of research, protein structure prediction remains a very challenging problem. In order to address the different levels of complexity of structural modeling, two types of modeling techniques (template-based modeling and template-free modeling) have been developed. Template-based modeling can often generate a moderate- to high-resolution model when a similar, homologous template structure is found for a query protein but fails if no template or only incorrect templates are found. Template-free modeling, such as fragment-based assembly, may generate models of moderate resolution for small proteins of low topological complexity. Seldom have the two techniques been integrated together to improve protein modeling. Here we develop a recursive protein modeling approach to selectively and collaboratively apply template-based and template-free modeling methods to model template-covered (i.e. certain) and template-free (i.e. uncertain) regions of a protein. A preliminary implementation of the approach was tested on a number of hard modeling cases during the 9th Critical Assessment of Techniques for Protein Structure Prediction (CASP9) and successfully improved the quality of modeling in most of these cases. Recursive modeling can significantly reduce the complexity of protein structure modeling and integrate template-based and template-free modeling to improve the quality and efficiency of protein structure prediction. PMID:22809379

  20. Levels of gemcitabine transport and metabolism proteins predict survival times of patients treated with gemcitabine for pancreatic adenocarcinoma.

    PubMed

    Maréchal, Raphaël; Bachet, Jean-Baptiste; Mackey, John R; Dalban, Cécile; Demetter, Pieter; Graham, Kathryn; Couvelard, Anne; Svrcek, Magali; Bardier-Dupas, Armelle; Hammel, Pascal; Sauvanet, Alain; Louvet, Christophe; Paye, François; Rougier, Philippe; Penna, Christophe; André, Thierry; Dumontet, Charles; Cass, Carol E; Jordheim, Lars Petter; Matera, Eva-Laure; Closset, Jean; Salmon, Isabelle; Devière, Jacques; Emile, Jean-François; Van Laethem, Jean-Luc

    2012-09-01

    Patients who undergo surgery for pancreatic ductal adenocarcinoma (PDAC) frequently receive adjuvant gemcitabine chemotherapy. Key determinants of gemcitabine cytotoxicity include the activities of the human equilibrative nucleoside transporter 1 (hENT1), deoxycytidine kinase (dCK), and ribonucleotide reductase subunit 1 (RRM1). We investigated whether tumor levels of these proteins were associated with efficacy of gemcitabine therapy following surgery. Sequential samples of resected PDACs were retrospectively collected from 434 patients at 5 centers; 142 patients did not receive adjuvant treatment (33%), 243 received adjuvant gemcitabine-based regimens (56%), and 49 received nongemcitabine regimens (11%). We measured protein levels of hENT1, dCK, and RRM1 by semiquantitative immunohistochemistry with tissue microarrays and investigated their relationship with patients' overall survival time. The median overall survival time of patients was 32.0 months. Among patients who did not receive adjuvant treatment, levels of hENT1, RRM1, and dCK were not associated with survival time. Among patients who received gemcitabine, high levels of hENT1 and dCK were significantly associated with longer survival time (hazard ratios of 0.34 [P < .0001] and 0.57 [P = .012], respectively). Interaction tests for gemcitabine administration and hENT1 and dCK status were statistically significant (P = .0007 and P = .016, respectively). On multivariate analysis of this population, hENT1 and dCK retained independent predictive values, and those patients with high levels of each protein had the longest survival times following adjuvant therapy with gemcitabine. High levels of hENT1 and dCK in PDAC predict longer survival times in patients treated with adjuvant gemcitabine. Copyright © 2012 AGA Institute. Published by Elsevier Inc. All rights reserved.

  1. Prealbumin, platelet factor 4 and S100A12 combination at baseline predicts good response to TNF alpha inhibitors in Rheumatoid Arthritis.

    PubMed

    Nguyen, Minh Vu Chuong; Baillet, Athan; Romand, Xavier; Trocmé, Candice; Courtier, Anaïs; Marotte, Hubert; Thomas, Thierry; Soubrier, Martin; Miossec, Pierre; Tébib, Jacques; Grange, Laurent; Toussaint, Bertrand; Lequerré, Thierry; Vittecoq, Olivier; Gaudin, Philippe

    2018-06-06

    Tumour necrosis factor-alpha inhibitors (TNFi) are effective treatments for Rheumatoid Arthritis (RA). Responses to treatment are barely predictable. As these treatments are costly and may induce a number of side effects, we aimed at identifying a panel of protein biomarkers that could be used to predict clinical response to TNFi for RA patients. Baseline blood levels of C-reactive protein, platelet factor 4, apolipoprotein A1, prealbumin, α1-antitrypsin, haptoglobin, S100A8/A9 and S100A12 proteins in bDMARD naive patients at the time of TNFi treatment initiation were assessed in a multicentric prospective French cohort. Patients fulfilling good EULAR response at 6 months were considered as responders. Logistic regression was used to determine best biomarker set that could predict good clinical response to TNFi. A combination of biomarkers (prealbumin, platelet factor 4 and S100A12) was identified and could predict response to TNFi in RA with sensitivity of 78%, specificity of 77%, positive predictive values (PPV) of 72%, negative predictive values (NPV) of 82%, positive likelihood ratio (LR+) of 3.35 and negative likelihood ratio (LR-) of 0.28. Lower levels of prealbumin and S100A12 and higher level of platelet factor 4 than the determined cutoff at baseline in RA patients are good predictors for response to TNFi treatment globally as well as to Infliximab, Etanercept and Adalimumab individually. A multivariate model combining 3 biomarkers (prealbumin, platelet factor 4 and S100A12) accurately predicted response of RA patients to TNFi and has potential in a daily practice personalized treatment. Copyright © 2018. Published by Elsevier Masson SAS.
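
    The likelihood ratios quoted in the abstract above follow from sensitivity and specificity by the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below applies those definitions to the rounded percentages given in the abstract; the function name is hypothetical, and because the inputs are rounded, the results differ slightly from the reported 3.35 and 0.28, which were presumably computed from unrounded values.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a binary test from sensitivity and specificity."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Rounded values taken from the abstract: sensitivity 78%, specificity 77%.
lr_pos, lr_neg = likelihood_ratios(0.78, 0.77)

print(f"LR+ = {lr_pos:.2f}")
print(f"LR- = {lr_neg:.2f}")
```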

  2. C-reactive protein as a prognostic indicator for rebleeding in patients with nonvariceal upper gastrointestinal bleeding.

    PubMed

    Lee, Han Hee; Park, Jae Myung; Lee, Soon-Wook; Kang, Seung Hun; Lim, Chul-Hyun; Cho, Yu Kyung; Lee, Bo-In; Lee, In Seok; Kim, Sang Woo; Choi, Myung-Gyu

    2015-05-01

    In patients with acute nonvariceal upper gastrointestinal bleeding, rebleeding after an initial treatment is observed in 10-20% and is associated with mortality. We investigated whether the initial serum C-reactive protein level could predict the risk of rebleeding in patients with acute nonvariceal upper gastrointestinal bleeding. This was a retrospective study using prospectively collected data for upper gastrointestinal bleeding. Initial clinical characteristics, endoscopic features, and C-reactive protein levels were compared between those with and without 30-day rebleeding. A total of 453 patients were included (mean age, 62 years; male, 70.9%). The incidence of 30-day rebleeding was 15.9%. The mean serum C-reactive protein level was significantly higher in these patients than in those without rebleeding (P<0.001). The area under the receiver operating characteristics curve with a cutoff value of 0.5 mg/dL was 0.689 (P<0.001). High serum C-reactive protein level (odds ratio, 2.98; confidence interval, 1.65-5.40) was independently associated with the 30-day rebleeding risk after adjustment for the main confounding risk factors, including age, blood pressure, and initial haemoglobin level. The serum C-reactive protein was an independent risk factor for 30-day rebleeding in patients with acute nonvariceal upper gastrointestinal bleeding, indicating a possible role as a useful screening indicator for predicting the risk of rebleeding. Copyright © 2015 Editrice Gastroenterologica Italiana S.r.l. Published by Elsevier Ltd. All rights reserved.

  3. ClubSub-P: Cluster-Based Subcellular Localization Prediction for Gram-Negative Bacteria and Archaea

    PubMed Central

    Paramasivam, Nagarajan; Linke, Dirk

    2011-01-01

    The subcellular localization (SCL) of proteins provides important clues to their function in a cell. In our efforts to predict useful vaccine targets against Gram-negative bacteria, we noticed that misannotated start codons frequently lead to wrongly assigned SCLs. This and other problems in SCL prediction, such as the relatively high false-positive and false-negative rates of some tools, can be avoided by applying multiple prediction tools to groups of homologous proteins. Here we present ClubSub-P, an online database that combines existing SCL prediction tools into a consensus pipeline from more than 600 proteomes of fully sequenced microorganisms. On top of the consensus prediction at the level of single sequences, the tool uses clusters of homologous proteins from Gram-negative bacteria and from Archaea to eliminate false-positive and false-negative predictions. ClubSub-P can assign the SCL of proteins from Gram-negative bacteria and Archaea with high precision. The database is searchable, and can easily be expanded using either new bacterial genomes or new prediction tools as they become available. This will further improve the performance of the SCL prediction, as well as the detection of misannotated start codons and other annotation errors. ClubSub-P is available online at http://toolkit.tuebingen.mpg.de/clubsubp/ PMID:22073040

  4. Exploiting three kinds of interface propensities to identify protein binding sites.

    PubMed

    Liu, Bin; Wang, Xiaolong; Lin, Lei; Dong, Qiwen; Wang, Xuan

    2009-08-01

    Predicting the binding sites between two interacting proteins provides important clues to the function of a protein. In this study, we present a building block of proteins called order profiles to use the evolutionary information of the protein sequence frequency profiles and apply this building block to produce a class of propensities called order profile interface propensities. For comparisons, we revisit the usage of residue interface propensities and binary profile interface propensities for protein binding site prediction. Each kind of propensities combined with sequence profiles and accessible surface areas are inputted into SVM. When tested on four types of complexes (hetero-permanent complexes, hetero-transient complexes, homo-permanent complexes and homo-transient complexes), experimental results show that the order profile interface propensities are better than residue interface propensities and binary profile interface propensities. Therefore, order profile is a suitable profile-level building block of the protein sequences and can be widely used in many tasks of computational biology, such as the sequence alignment, the prediction of domain boundary, the designation of knowledge-based potentials and the protein remote homology detection.

  5. Decreased levels of sRAGE in follicular fluid from patients with PCOS.

    PubMed

    Wang, BiJun; Li, Jing; Yang, QingLing; Zhang, FuLi; Hao, MengMeng; Guo, YiHong

    2017-03-01

    This study aimed to explore the association between soluble receptor for advanced glycation end products (sRAGE) levels in follicular fluid and the number of oocytes retrieved and to evaluate the effect of sRAGE on vascular endothelial growth factor (VEGF) in granulosa cells in patients with polycystic ovarian syndrome (PCOS). Two sets of experiments were performed in this study. In part one, sRAGE and VEGF protein levels in follicular fluid samples from 39 patients with PCOS and 35 non-PCOS patients were measured by ELISA. In part two, ovarian granulosa cells were isolated from an additional 10 patients with PCOS and cultured. VEGF and SP1 mRNA and protein levels, as well as pAKT levels, were detected by real-time PCR and Western blotting after cultured cells were treated with different concentrations of sRAGE. Compared with the non-PCOS patients, patients with PCOS had lower sRAGE levels in follicular fluid. Multi-adjusted regression analysis showed that high sRAGE levels in follicular fluid predicted a lower Gn dose, more oocytes retrieved, and a better IVF outcome in the non-PCOS group. Logistic regression analysis showed that higher sRAGE levels predicted favorable IVF outcomes in the non-PCOS group. Multi-adjusted regression analysis also showed that high sRAGE levels in follicular fluid predicted a lower Gn dose in the PCOS group. Treating granulosa cells isolated from patients with PCOS with recombinant sRAGE decreased VEGF and SP1 mRNA and protein expression and pAKT levels in a dose-dependent manner. © 2017 Society for Reproduction and Fertility.

  6. Graph pyramids for protein function prediction

    PubMed Central

    2015-01-01

    Background Uncovering the hidden organizational characteristics and regularities among biological sequences is the key issue for detailed understanding of an underlying biological phenomenon. Thus pattern recognition from nucleic acid sequences is an important affair for protein function prediction. As proteins from the same family exhibit similar characteristics, homology based approaches predict protein functions via protein classification. But conventional classification approaches mostly rely on the global features by considering only strong protein similarity matches. This leads to significant loss of prediction accuracy. Methods Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers the local as well as the global features, by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via the graph pyramid. Different underlying properties of the protein families are uncovered by operating the proposed graph based features at various pyramid levels. Results Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using graph pyramid helps to improve computational efficiency as well as the protein classification accuracy. Quantitatively, among 14,086 test sequences, on an average the proposed method misclassified only 21.1 sequences whereas baseline BLAST score based global feature matching method misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. Thus it has achieved more than 96% protein classification accuracy using only 20% per class training data. PMID:26044522

  7. Graph pyramids for protein function prediction.

    PubMed

    Sandhan, Tushar; Yoo, Youngjun; Choi, Jin; Kim, Sun

    2015-01-01

    Uncovering the hidden organizational characteristics and regularities among biological sequences is the key issue for detailed understanding of an underlying biological phenomenon. Thus pattern recognition from nucleic acid sequences is an important affair for protein function prediction. As proteins from the same family exhibit similar characteristics, homology based approaches predict protein functions via protein classification. But conventional classification approaches mostly rely on the global features by considering only strong protein similarity matches. This leads to significant loss of prediction accuracy. Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers the local as well as the global features, by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via the graph pyramid. Different underlying properties of the protein families are uncovered by operating the proposed graph based features at various pyramid levels. Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using graph pyramid helps to improve computational efficiency as well as the protein classification accuracy. Quantitatively, among 14,086 test sequences, on an average the proposed method misclassified only 21.1 sequences whereas baseline BLAST score based global feature matching method misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. Thus it has achieved more than 96% protein classification accuracy using only 20% per class training data.
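
    To put the misclassification counts in the abstract above on a common scale, they can be converted to error rates over the 14,086 test sequences. This is a small arithmetic sketch using only the numbers quoted in the abstract; the variable names are illustrative, and the result describes that specific comparison only, not the separate "more than 96%" figure obtained with 20% per class training data.

```python
TEST_SEQUENCES = 14_086

# Average misclassified sequences reported in the abstract.
misclassified = {
    "graph pyramid (proposed)": 21.1,
    "BLAST global matching (baseline)": 362.9,
}

# Error rate = misclassified / total test sequences.
error_rates = {name: count / TEST_SEQUENCES for name, count in misclassified.items()}

for name, rate in error_rates.items():
    print(f"{name}: error {rate:.2%}, accuracy {1 - rate:.2%}")

# How many times more errors the baseline makes than the proposed method.
error_ratio = (
    misclassified["BLAST global matching (baseline)"]
    / misclassified["graph pyramid (proposed)"]
)
print(f"baseline / proposed error ratio: {error_ratio:.1f}")
```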

  8. SynechoNET: integrated protein-protein interaction database of a model cyanobacterium Synechocystis sp. PCC 6803.

    PubMed

    Kim, Woo-Yeon; Kang, Sungsoo; Kim, Byoung-Chul; Oh, Jeehyun; Cho, Seongwoong; Bhak, Jong; Choi, Jong-Soon

    2008-01-01

    Cyanobacteria are model organisms for studying photosynthesis, carbon and nitrogen assimilation, evolution of plant plastids, and adaptability to environmental stresses. Despite many studies on cyanobacteria, there is no web-based database of their regulatory and signaling protein-protein interaction networks to date. We report a database and website SynechoNET that provides predicted protein-protein interactions. SynechoNET shows cyanobacterial domain-domain interactions as well as their protein-level interactions using the model cyanobacterium, Synechocystis sp. PCC 6803. It predicts the protein-protein interactions using public interaction databases that contain mutually complementary and redundant data. Furthermore, SynechoNET provides information on transmembrane topology, signal peptide, and domain structure in order to support the analysis of regulatory membrane proteins. Such biological information can be queried and visualized in user-friendly web interfaces that include the interactive network viewer and search pages by keyword and functional category. SynechoNET is an integrated protein-protein interaction database designed to analyze regulatory membrane proteins in cyanobacteria. It provides a platform for biologists to extend the genomic data of cyanobacteria by predicting interaction partners, membrane association, and membrane topology of Synechocystis proteins. SynechoNET is freely available at http://synechocystis.org/ or directly at http://bioportal.kobic.kr/SynechoNET/.

  9. Computational analysis of an autophagy/translation switch based on mutual inhibition of MTORC1 and ULK1

    DOE PAGES

    Szymańska, Paulina; Martin, Katie R.; MacKeigan, Jeffrey P.; ...

    2015-03-11

    We constructed a mechanistic, computational model for regulation of (macro)autophagy and protein synthesis (at the level of translation). The model was formulated to study the system-level consequences of interactions among the following proteins: two key components of MTOR complex 1 (MTORC1), namely the protein kinase MTOR (mechanistic target of rapamycin) and the scaffold protein RPTOR; the autophagy-initiating protein kinase ULK1; and the multimeric energy-sensing AMP-activated protein kinase (AMPK). Inputs of the model include intrinsic AMPK kinase activity, which is taken as an adjustable surrogate parameter for cellular energy level or AMP:ATP ratio, and rapamycin dose, which controls MTORC1 activity. Outputs of the model include the phosphorylation level of the translational repressor EIF4EBP1, a substrate of MTORC1, and the phosphorylation level of AMBRA1 (activating molecule in BECN1-regulated autophagy), a substrate of ULK1 critical for autophagosome formation. The model incorporates reciprocal regulation of MTORC1 and ULK1 by AMPK, mutual inhibition of MTORC1 and ULK1, and ULK1-mediated negative feedback regulation of AMPK. Through analysis of the model, we find that these processes may be responsible, depending on conditions, for graded responses to stress inputs, for bistable switching between autophagy and protein synthesis, or relaxation oscillations, comprising alternating periods of autophagy and protein synthesis. A sensitivity analysis indicates that the prediction of oscillatory behavior is robust to changes of the parameter values of the model. The model provides testable predictions about the behavior of the AMPK-MTORC1-ULK1 network, which plays a central role in maintaining cellular energy and nutrient homeostasis.
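
    As an illustration of how mutual inhibition alone can produce bistable switching, the sketch below integrates a generic two-component toggle motif. It is not the authors' model: the equations, the parameters alpha and hill, and the function name simulate_toggle are all assumptions chosen for illustration, and AMPK, rapamycin and the negative feedback loop are left out.

```python
import math

def simulate_toggle(x0, y0, alpha=4.0, hill=2, dt=0.01, steps=5000):
    """Euler-integrate a generic mutual-inhibition motif (hypothetical parameters).

    x and y stand in for two activities that repress each other:
        dx/dt = alpha / (1 + y**hill) - x
        dy/dt = alpha / (1 + x**hill) - y
    """
    x, y = x0, y0
    for _ in range(steps):
        dx = alpha / (1.0 + y ** hill) - x
        dy = alpha / (1.0 + x ** hill) - y
        x, y = x + dt * dx, y + dt * dy
    return x, y

# Same parameters, different starting points, different final states.
state_a = simulate_toggle(2.0, 0.1)  # starts with x dominant
state_b = simulate_toggle(0.1, 2.0)  # starts with y dominant
print(state_a, state_b)
```

    With these assumed parameters the two runs settle into opposite states (one component high, the other low), which is the qualitative signature of a bistable switch.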

  10. EFFECT OF DIETARY PROTEIN AND CARBOHYDRATE LEVELS ON WEIGHT GAIN AND GONAD PRODUCTION IN THE SEA URCHIN LYTECHINUS VARIEGATUS

    PubMed Central

    Heflin, Laura E.; Gibbs, Victoria K.; Powell, Mickie L; Makowsky, Robert; Lawrence, John M.; Lawrence, Addison L.; Watts, Stephen A.

    2014-01-01

    Adult Lytechinus variegatus were fed eight formulated diets with different protein (ranging from 12 to 36%) and carbohydrate (ranging from 21 to 39%) levels. Each sea urchin (n = 8 per treatment) was fed a daily sub-satiation ration of 1.5% of average body weight for 9 weeks. Akaike information criterion analysis was used to compare six different hypothesized dietary composition models across eight growth measurements. Dietary protein level and protein:energy ratio were the best models for prediction of total weight gain. Diets with the highest (> 68.6 mg P kcal⁻¹) protein:energy ratios produced the most wet weight gain after 9 weeks. Dietary carbohydrate level was a poor predictor for most growth parameters examined in this study. However, the model containing a protein × carbohydrate interaction effect was the best model for protein efficiency ratio (PER). PER decreased with increasing dietary protein level, more so at higher carbohydrate levels. Food conversion ratio (FCR) was best modeled by total dietary energy levels: higher energy diets produced lower FCRs. Dietary protein level was the best model of gonad wet weight gain. These data suggest that variations in dietary nutrients and energy differentially affect organismal growth and growth of body components. PMID:24994942
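
    The kind of Akaike information criterion comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's analysis: the data are invented, each candidate model is a simple one-predictor least-squares line, and AIC is computed with the common least-squares form n*ln(RSS/n) + 2k, where k = 3 counts slope, intercept and residual variance.

```python
import math

def residual_sum_of_squares(xs, ys):
    """Fit y = a + b*x by ordinary least squares and return the RSS."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit with k estimated parameters."""
    return n * math.log(rss / n) + 2 * k

# Invented example values (not from the study).
protein = [12, 12, 20, 20, 28, 28, 36, 36]   # % dietary protein
carbohydrate = [21, 39, 21, 39, 21, 39, 21, 39]  # % dietary carbohydrate
weight_gain = [6.3, 5.8, 10.4, 9.7, 14.1, 14.6, 18.2, 17.5]

n = len(weight_gain)
aic_protein = aic_least_squares(residual_sum_of_squares(protein, weight_gain), n, 3)
aic_carb = aic_least_squares(residual_sum_of_squares(carbohydrate, weight_gain), n, 3)
print(aic_protein, aic_carb)
```

    The model with the lower AIC is preferred; in this invented data set that is the protein model, because weight gain was constructed to track protein level.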

  11. Reduced changes in protein compared to mRNA levels across non-proliferating tissues.

    PubMed

    Perl, Kobi; Ushakov, Kathy; Pozniak, Yair; Yizhar-Barnea, Ofer; Bhonker, Yoni; Shivatzki, Shaked; Geiger, Tamar; Avraham, Karen B; Shamir, Ron

    2017-04-18

    The quantitative relations between RNA and protein are fundamental to biology and are still not fully understood. Across taxa, it was demonstrated that the protein-to-mRNA ratio in steady state varies in a direction that lessens the change in protein levels as a result of changes in the transcript abundance. Evidence for this behavior in tissues is sparse. We tested this phenomenon in new data that we produced for the mouse auditory system, and in previously published tissue datasets. A joint analysis of the transcriptome and proteome was performed across four datasets: inner-ear mouse tissues, mouse organ tissues, lymphoblastoid primate samples and human cancer cell lines. We show that the protein levels are more conserved than the mRNA levels in all datasets, and that changes in transcription are associated with translational changes that exert opposite effects on the final protein level, in all tissues except cancer. Finally, we observe that some functions are enriched in the inner ear on the mRNA level but not in protein. We suggest that partial buffering between transcription and translation ensures that proteins can be made rapidly in response to a stimulus. Accounting for the buffering can improve the prediction of protein levels from mRNA levels.

  12. Consistent prediction of GO protein localization.

    PubMed

    Spetale, Flavio E; Arce, Debora; Krsticevic, Flavia; Bulacio, Pilar; Tapia, Elizabeth

    2018-05-17

    The GO-Cellular Component (GO-CC) ontology provides a controlled vocabulary for the consistent description of the subcellular compartments or macromolecular complexes where proteins may act. Current machine learning-based methods used for the automated GO-CC annotation of proteins suffer from the inconsistency of individual GO-CC term predictions. Here, we present FGGA-CC+, a class of hierarchical graph-based classifiers for the consistent GO-CC annotation of protein coding genes at the subcellular compartment or macromolecular complex levels. To boost the accuracy of GO-CC predictions, we make use of the protein localization knowledge in the GO-Biological Process (GO-BP) annotations. As a result, FGGA-CC+ classifiers are built from annotation data in both the GO-CC and GO-BP ontologies. Due to their graph-based design, FGGA-CC+ classifiers are fully interpretable and their predictions amenable to expert analysis. Promising results on protein annotation data from five model organisms were obtained. Additionally, successful validation results in the annotation of a challenging subset of tandem duplicated genes in the tomato non-model organism were accomplished. Overall, these results suggest that FGGA-CC+ classifiers can indeed be useful for satisfying the huge demand for GO-CC annotation arising from ubiquitous high-throughput sequencing and proteomic projects.

  13. Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae

    PubMed Central

    Reguly, Teresa; Breitkreutz, Ashton; Boucher, Lorrie; Breitkreutz, Bobby-Joe; Hon, Gary C; Myers, Chad L; Parsons, Ainslie; Friesen, Helena; Oughtred, Rose; Tong, Amy; Stark, Chris; Ho, Yuen; Botstein, David; Andrews, Brenda; Boone, Charles; Troyanskaya, Olga G; Ideker, Trey; Dolinski, Kara; Batada, Nizar N; Tyers, Mike

    2006-01-01

    Background: The study of complex biological networks and prediction of gene function has been enabled by high-throughput (HTP) methods for detection of genetic and protein interactions. Sparse coverage in HTP datasets may, however, distort network properties and confound predictions. Although a vast number of well substantiated interactions are recorded in the scientific literature, these data have not yet been distilled into networks that enable system-level inference. Results: We describe here a comprehensive database of genetic and protein interactions, and associated experimental evidence, for the budding yeast Saccharomyces cerevisiae, as manually curated from over 31,793 abstracts and online publications. This literature-curated (LC) dataset contains 33,311 interactions, on the order of all extant HTP datasets combined. Surprisingly, HTP protein-interaction datasets currently achieve only around 14% coverage of the interactions in the literature. The LC network nevertheless shares attributes with HTP networks, including scale-free connectivity and correlations between interactions, abundance, localization, and expression. We find that essential genes or proteins are enriched for interactions with other essential genes or proteins, suggesting that the global network may be functionally unified. This interconnectivity is supported by a substantial overlap of protein and genetic interactions in the LC dataset. We show that the LC dataset considerably improves the predictive power of network-analysis approaches. The full LC dataset is available at the BioGRID and SGD databases. Conclusion: Comprehensive datasets of biological interactions derived from the primary literature provide critical benchmarks for HTP methods, augment functional prediction, and reveal system-level attributes of biological networks. PMID:16762047

  14. Survey of predictors of propensity for protein production and crystallization with application to predict resolution of crystal structures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gao, Jianzhao; Wu, Zhonghua; Hu, Gang

    Selection of proper targets for X-ray crystallography will benefit the biological research community immensely. Several computational models were proposed to predict the propensity of successful protein production and diffraction-quality crystallization from protein sequences. We reviewed a comprehensive collection of 22 such predictors that were developed in the last decade. We found that almost all of these models are easily accessible as webservers and/or standalone software and we demonstrated that some of them are widely used by the research community. We empirically evaluated and compared the predictive performance of seven representative methods. The analysis suggests that these methods produce quite accurate propensities for diffraction-quality crystallization. We also summarized results of the first study of the relation between these predictive propensities and the resolution of the crystallizable proteins. We found that the propensities predicted by several methods are significantly higher for proteins that have high resolution structures compared to those with low resolution structures. Moreover, we tested a new meta-predictor, MetaXXC, which averages the propensities generated by the three most accurate predictors of diffraction-quality crystallization. MetaXXC generates putative values of resolution that have modest levels of correlation with the experimental resolutions and it offers the lowest mean absolute error when compared to the seven considered methods. We conclude that protein sequences can be used to fairly accurately predict whether their corresponding protein structures can be solved using X-ray crystallography. Moreover, we also ascertain that sequences can be used to reasonably well predict the resolution of the resulting protein crystals.
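
    The two operations named in this abstract, averaging propensities from several predictors and scoring predictions by mean absolute error, can be sketched as below. This is a hedged illustration only: equal weighting, the function names and all numbers are assumptions, and the abstract does not say how MetaXXC maps averaged propensities to resolution values, so that step is not shown.

```python
def average_propensity(scores_by_predictor):
    """Average per-protein propensities across predictors (equal weights assumed).

    scores_by_predictor: list of lists, one inner list per predictor,
    each holding one score per protein in the same protein order.
    """
    n_predictors = len(scores_by_predictor)
    return [sum(per_protein) / n_predictors
            for per_protein in zip(*scores_by_predictor)]

def mean_absolute_error(predicted, observed):
    """Mean of |predicted - observed| over paired values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)

# Invented scores from three hypothetical predictors for three proteins.
scores = [
    [0.8, 0.4, 0.6],
    [0.6, 0.2, 0.9],
    [0.7, 0.3, 0.6],
]
meta_scores = average_propensity(scores)

# Invented predicted vs. experimental resolutions (in angstroms).
mae = mean_absolute_error([2.0, 3.0], [1.5, 3.5])
print(meta_scores, mae)
```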

  15. A web server for analysis, comparison and prediction of protein ligand binding sites.

    PubMed

    Singh, Harinder; Srivastava, Hemant Kumar; Raghava, Gajendra P S

    2016-03-25

    One of the major challenges in the field of systems biology is to understand the interaction between a wide range of proteins and ligands. In the past, methods have been developed for predicting binding sites in a protein for a limited number of ligands. In order to address this problem, we developed a web server named 'LPIcom' to facilitate users in understanding protein-ligand interaction. Analysis, comparison and prediction modules are available in the 'LPIcom' server to predict protein-ligand interacting residues for 824 ligands. Each ligand must have at least 30 protein binding sites in PDB. The analysis module of the server can identify residues preferred in interaction and the binding motif for a given ligand; for example, residues glycine, lysine and arginine are preferred in ATP binding sites. The comparison module of the server allows comparing protein-binding sites of multiple ligands to understand the similarity between ligands based on their binding site. This module indicates that ATP, ADP and GTP ligands are in the same cluster and thus their binding sites or interacting residues exhibit a high level of similarity. A propensity-based prediction module has been developed for predicting ligand-interacting residues in a protein for more than 800 ligands. In addition, a number of web-based tools have been integrated to facilitate users in creating web logo and two-sample between ligand interacting and non-interacting residues. In summary, this manuscript presents a web server for analysis of ligand-interacting residues. This server is available for public use from URL http://crdd.osdd.net/raghava/lpicom.

  16. Quantitative PET Imaging with Novel HER3-Targeted Peptides Selected by Phage Display to Predict Androgen-Independent Prostate Cancer Progression

    DTIC Science & Technology

    2017-12-01

    peptide in tumors that was linearly correlated with HER3 levels. Biodistribution analysis revealed low off-target accumulation and rapid clearance...Internal Lab 15-22 Dr. Larimer 5 Stock) Subtask 2: Correlate changes in peptide uptake with protein expression and cell signaling changes ex vivo...signal for each individual tumor was plotted against its corresponding HER3 protein level, the TBR correlated linearly with the amount of protein

  17. Impact of Upfront Cellular Enrichment by Laser Capture Microdissection on Protein and Phosphoprotein Drug Target Signaling Activation Measurements in Human Lung Cancer: Implications for Personalized Medicine

    PubMed Central

    Baldelli, Elisa; Haura, Eric B.; Crinò, Lucio; Cress, W. Douglas; Ludovini, Vienna; Schabath, Matthew B.; Liotta, Lance A.; Petricoin, Emanuel F.; Pierobon, Mariaelena

    2015-01-01

    Purpose: The aim of this study was to evaluate whether upfront cellular enrichment via laser capture microdissection is necessary for accurately quantifying predictive biomarkers in non-small cell lung cancer tumors. Experimental design: Fifteen snap frozen surgical biopsies were analyzed. Whole tissue lysate and matched highly enriched tumor epithelium via laser capture microdissection (LCM) were obtained for each patient. The expression and activation/phosphorylation levels of 26 proteins were measured by reverse phase protein microarray. Differences in signaling architecture of dissected and undissected matched pairs were visualized using unsupervised clustering analysis, bar graphs, and scatter plots. Results: Overall, patient-matched LCM and undissected material displayed very distinct and differing signaling architectures, with 93% of the matched pairs clustering separately. These differences were seen regardless of the amount of starting tumor epithelial content present in the specimen. Conclusions and clinical relevance: These results indicate that LCM-driven upfront cellular enrichment is necessary to accurately determine the expression/activation levels of predictive protein signaling markers, although results should be evaluated in larger clinical settings. Upfront cellular enrichment of the target cell appears to be an important part of the workflow needed for the accurate quantification of predictive protein signaling biomarkers. Larger independent studies are warranted. PMID:25676683

  18. Structure Prediction of the Second Extracellular Loop in G-Protein-Coupled Receptors

    PubMed Central

    Kmiecik, Sebastian; Jamroz, Michal; Kolinski, Michal

    2014-01-01

    G-protein-coupled receptors (GPCRs) play key roles in living organisms. Therefore, it is important to determine their functional structures. The second extracellular loop (ECL2) is a functionally important region of GPCRs, which poses a significant challenge for computational structure prediction methods. In this work, we evaluated CABS, a well-established protein modeling tool, for predicting ECL2 structure in 13 GPCRs. The ECL2s (with between 13 and 34 residues) are predicted in an environment of other extracellular loops being fully flexible and the transmembrane domain fixed in its x-ray conformation. The modeling procedure used theoretical predictions of ECL2 secondary structure and experimental constraints on disulfide bridges. Our approach yielded ensembles of low-energy conformers and the most populated conformers that contained models close to the available x-ray structures. The level of similarity between the predicted models and x-ray structures is comparable to that of other state-of-the-art computational methods. Our results extend other studies by including newly crystallized GPCRs. PMID:24896119

  19. Biomarker-Based Prediction Models for Response to Treatment in Systemic Sclerosis-Related Interstitial Lung Disease

    DTIC Science & Technology

    2017-10-01

    in the baseline samples of the Scleroderma Lung Study II (SLS II). We are currently analyzing whether these serum proteins have predictive...In this project, we use the valuable samples collected in the Scleroderma Lung Study II (SLSII) clinical trial and the observational cohort, GENISOS...determine key serum protein levels and transcript signatures in whole blood and skin samples collected in the SLSII study. The identified candidate

  20. Rationally designed synthetic protein hydrogels with predictable mechanical properties.

    PubMed

    Wu, Junhua; Li, Pengfei; Dong, Chenling; Jiang, Heting; Xue, Bin; Gao, Xiang; Qin, Meng; Wang, Wei; Chen, Bin; Cao, Yi

    2018-02-12

    Designing synthetic protein hydrogels with tailored mechanical properties similar to naturally occurring tissues is an eternal pursuit in tissue engineering and stem cell and cancer research. However, it remains challenging to correlate the mechanical properties of protein hydrogels with the nanomechanics of individual building blocks. Here we use single-molecule force spectroscopy, protein engineering and theoretical modeling to prove that the mechanical properties of protein hydrogels are predictable based on the mechanical hierarchy of the cross-linkers and the load-bearing modules at the molecular level. These findings provide a framework for rationally designing protein hydrogels with independently tunable elasticity, extensibility, toughness and self-healing. Using this principle, we demonstrate the engineering of self-healable muscle-mimicking hydrogels that can significantly dissipate energy through protein unfolding. We expect that this principle can be generalized for the construction of protein hydrogels with customized mechanical properties for biomedical applications.

  1. A Deep Learning Framework for Robust and Accurate Prediction of ncRNA-Protein Interactions Using Evolutionary Information.

    PubMed

    Yi, Hai-Cheng; You, Zhu-Hong; Huang, De-Shuang; Li, Xiao; Jiang, Tong-Hai; Li, Li-Ping

    2018-06-01

    The interactions between non-coding RNAs (ncRNAs) and proteins play an important role in many biological processes, and their biological functions are primarily achieved by binding with a variety of proteins. High-throughput biological techniques are used to identify protein molecules bound with specific ncRNA, but they are usually expensive and time consuming. Deep learning provides a powerful solution to computationally predict RNA-protein interactions. In this work, we propose the RPI-SAN model by using the deep-learning stacked auto-encoder network to mine the hidden high-level features from RNA and protein sequences and feed them into a random forest (RF) model to predict ncRNA binding proteins. Stacked assembling is further used to improve the accuracy of the proposed method. Four benchmark datasets, including RPI2241, RPI488, RPI1807, and NPInter v2.0, were employed for the unbiased evaluation of five established prediction tools: RPI-Pred, IPMiner, RPISeq-RF, lncPro, and RPI-SAN. The experimental results show that our RPI-SAN model achieves much better performance than other methods, with accuracies of 90.77%, 89.7%, 96.1%, and 99.33%, respectively. It is anticipated that RPI-SAN can be used as an effective computational tool for future biomedical research and can accurately predict potential ncRNA-protein interacting pairs, which provides reliable guidance for biological research. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.

  2. Urinary Liver-Type Fatty Acid-Binding Protein Level as a Predictive Biomarker of Acute Kidney Injury in Patients with Acute Decompensated Heart Failure.

    PubMed

    Hishikari, Keiichi; Hikita, Hiroyuki; Nakamura, Shun; Nakagama, Shun; Mizusawa, Masahumi; Yamamoto, Tasuku; Doi, Junichi; Hayashi, Yosuke; Utsugi, Yuya; Araki, Makoto; Sudo, Yuta; Kimura, Shigeki; Takahashi, Atsushi; Ashikaga, Takashi; Isobe, Mitsuaki

    2017-10-01

    There are no biological markers to predict the onset of acute kidney injury (AKI) in patients with acute decompensated heart failure (ADHF). Liver-type fatty acid-binding protein (L-FABP) levels are markedly upregulated in the proximal tubules after renal ischemia. We investigated whether urinary L-FABP is a suitable marker to predict AKI in ADHF patients. We examined 281 consecutive patients with ADHF. Serum creatinine (Cr) and L-FABP levels were measured at admission and 24 and 48 h after admission. AKI developed in 104 patients (37%). Urinary L-FABP levels at admission were significantly higher in patients with AKI than in those without (33.0 vs. 5.2 μg/g Cr; p < 0.001). Multivariate analysis showed that baseline urinary L-FABP level was an independent predictor of AKI in ADHF patients (odds ratio 1.08, 95% confidence interval 1.05-1.12; p < 0.001). Receiver operating characteristic analysis showed that baseline urinary L-FABP level exhibited 94.2% sensitivity and 87.0% specificity at a cutoff value of 12.5 μg/g Cr. Urinary L-FABP level is useful for predicting the onset of AKI in patients with ADHF. The results of our study could help clinicians diagnose AKI in ADHF patients earlier, leading to possible improvements in the treatment of this group of patients.
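
    The sensitivity and specificity reported at the 12.5 μg/g Cr cutoff follow the standard definitions, which the sketch below applies to a small invented data set. The values, the function name and the choice of treating a level at or above the cutoff as test-positive are assumptions; the abstract does not state whether the threshold is inclusive.

```python
def sensitivity_specificity(levels, has_event, cutoff):
    """Return (sensitivity, specificity) for a marker thresholded at cutoff.

    A level >= cutoff is treated as test-positive (an assumption).
    """
    true_pos = sum(1 for v, e in zip(levels, has_event) if e and v >= cutoff)
    false_neg = sum(1 for v, e in zip(levels, has_event) if e and v < cutoff)
    true_neg = sum(1 for v, e in zip(levels, has_event) if not e and v < cutoff)
    false_pos = sum(1 for v, e in zip(levels, has_event) if not e and v >= cutoff)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Invented urinary marker levels (μg/g Cr) and outcomes, not study data.
levels = [33.0, 40.1, 15.2, 9.8, 5.2, 3.1, 14.0, 2.4]
developed_aki = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(levels, developed_aki, cutoff=12.5)
print(sens, spec)
```

    In this invented sample, three of four patients with the event and three of four without it are classified correctly, so both values are 0.75.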

  3. Sequence heuristics to encode phase behaviour in intrinsically disordered protein polymers

    PubMed Central

    Quiroz, Felipe García; Chilkoti, Ashutosh

    2015-01-01

    Proteins and synthetic polymers that undergo aqueous phase transitions mediate self-assembly in nature and in man-made material systems. Yet little is known about how the phase behaviour of a protein is encoded in its amino acid sequence. Here, by synthesizing intrinsically disordered, repeat proteins to test motifs that we hypothesized would encode phase behaviour, we show that the proteins can be designed to exhibit tunable lower or upper critical solution temperature (LCST and UCST, respectively) transitions in physiological solutions. We also show that mutation of key residues at the repeat level abolishes phase behaviour or encodes an orthogonal transition. Furthermore, we provide heuristics to identify, at the proteome level, proteins that might exhibit phase behaviour and to design novel protein polymers consisting of biologically active peptide repeats that exhibit LCST or UCST transitions. These findings set the foundation for the prediction and encoding of phase behaviour at the sequence level. PMID:26390327

  4. Minimum curvilinearity to enhance topological prediction of protein interactions by network embedding

    PubMed Central

    Cannistraci, Carlo Vittorio; Alanis-Lobato, Gregorio; Ravasi, Timothy

    2013-01-01

    Motivation: Most functions within the cell emerge thanks to protein–protein interactions (PPIs), yet experimental determination of PPIs is both expensive and time-consuming. PPI networks present significant levels of noise and incompleteness. Predicting interactions using only PPI-network topology (topological prediction) is difficult but essential when prior biological knowledge is absent or unreliable. Methods: Network embedding emphasizes the relations between network proteins embedded in a low-dimensional space, in which protein pairs that are closer to each other represent good candidate interactions. To achieve network denoising, which boosts prediction performance, we first applied minimum curvilinear embedding (MCE), and then adopted shortest path (SP) in the reduced space to assign likelihood scores to candidate interactions. Furthermore, we introduce (i) a new valid variation of MCE, named non-centred MCE (ncMCE); (ii) two automatic strategies for selecting the appropriate embedding dimension; and (iii) two new randomized procedures for evaluating predictions. Results: We compared our method against several unsupervised and supervisedly tuned embedding approaches and node neighbourhood techniques. Despite its computational simplicity, ncMCE-SP was the overall leader, outperforming the current methods in topological link prediction. Conclusion: Minimum curvilinearity is a valuable non-linear framework that we successfully applied to the embedding of protein networks for the unsupervised prediction of novel PPIs. The rationale for our approach is that biological and evolutionary information is imprinted in the non-linear patterns hidden behind the protein network topology, and can be exploited for predicting new protein links. The predicted PPIs represent good candidates for testing in high-throughput experiments or for exploitation in systems biology tools such as those used for network-based inference and prediction of disease-related functional modules. Availability: https://sites.google.com/site/carlovittoriocannistraci/home Contact: kalokagathos.agon@gmail.com or timothy.ravasi@kaust.edu.sa Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23812985

  5. The Phyre2 web portal for protein modelling, prediction and analysis

    PubMed Central

    Kelley, Lawrence A; Mezulis, Stefans; Yates, Christopher M; Wass, Mark N; Sternberg, Michael JE

    2017-01-01

    Summary: Phyre2 is a suite of tools available on the web to predict and analyse protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a protocol. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites, and analyse the effect of amino-acid variants (e.g. nsSNPs) for a user’s protein sequence. Users are guided through results by a simple interface at a level of detail determined by them. This protocol will guide a user from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit a large number of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 minutes and 2 hours after submission. PMID:25950237

  6. Highly Reproducible Label Free Quantitative Proteomic Analysis of RNA Polymerase Complexes

    PubMed Central

    Mosley, Amber L.; Sardiu, Mihaela E.; Pattenden, Samantha G.; Workman, Jerry L.; Florens, Laurence; Washburn, Michael P.

    2011-01-01

    The use of quantitative proteomics methods to study protein complexes has the potential to provide in-depth information on the abundance of different protein components as well as their modification state in various cellular conditions. To interrogate protein complex quantitation using shotgun proteomic methods, we have focused on the analysis of protein complexes using label-free multidimensional protein identification technology and studied the reproducibility of biological replicates. For these studies, we focused on three highly related and essential multi-protein enzymes, RNA polymerase I, II, and III from Saccharomyces cerevisiae. We found that label-free quantitation using spectral counting is highly reproducible at the protein and peptide level when analyzing RNA polymerase I, II, and III. In addition, we show that peptide sampling does not follow a random sampling model, and we show the need for advanced computational models to predict peptide detection probabilities. In order to address these issues, we used the APEX protocol to model the expected peptide detectability based on whole cell lysate acquired using the same multidimensional protein identification technology analysis used for the protein complexes. Neither method was able to predict the peptide sampling levels that we observed using replicate multidimensional protein identification technology analyses. In addition to the analysis of the RNA polymerase complexes, our analysis provides quantitative information about several RNAP-associated proteins, including the RNAPII elongation factor complexes DSIF and TFIIF. Our data show that DSIF and TFIIF are the most highly enriched RNAP accessory factors in Rpb3-TAP purifications and demonstrate our ability to measure the abundance of low-level associated proteins across biological replicates. In addition, our quantitative data support a model in which DSIF and TFIIF interact with RNAPII in a dynamic fashion, in agreement with previously published reports. PMID:21048197

  7. Protein-Protein Interactions in a Crowded Environment: An Analysis via Cross-Docking Simulations and Evolutionary Information

    PubMed Central

    Lopes, Anne; Sacquin-Mora, Sophie; Dimitrova, Viktoriya; Laine, Elodie; Ponty, Yann; Carbone, Alessandra

    2013-01-01

    Large-scale analyses of protein-protein interactions based on coarse-grain molecular docking simulations and binding site predictions resulting from evolutionary sequence analysis are feasible on hundreds of proteins with varied structures and interfaces. We demonstrated this on the 168 proteins of the Mintseris Benchmark 2.0. On the one hand, we evaluated the quality of the interaction signal and the contribution of docking information compared to evolutionary information, showing that the combination of the two improves partner identification. On the other hand, since protein interactions usually occur in crowded environments with several competing partners, we carried out a thorough analysis of the interactions of proteins with true partners but also with non-partners to evaluate whether proteins in the environment, competing with the true partner, affect its identification. We found three populations of proteins: strongly competing, never competing, and interacting with different levels of strength. Populations and levels of strength are numerically characterized and provide a signature for the behavior of a protein in the crowded environment. We showed that partner identification, to some extent, does not depend on the competing partners present in the environment, that certain biochemical classes of proteins are intrinsically easier to analyze than others, and that small proteins are not more promiscuous than large ones. Our approach brings to light that knowledge of the binding site can be used to reduce the high computational cost of docking simulations with no consequence for the quality of the results, demonstrating the possibility of applying coarse-grain docking to datasets made of thousands of proteins. A comparison with all available large-scale analyses aimed at partner prediction is provided. We release the complete decoy set produced by coarse-grain docking simulations of both true and false interacting partners, and their evolutionary sequence analysis leading to binding site predictions. Download site: http://www.lgm.upmc.fr/CCDMintseris/ PMID:24339765

  8. Predicting Gene Expression Level from Relative Codon Usage Bias: An Application to Escherichia coli Genome

    PubMed Central

    Roymondal, Uttam; Das, Shibsankar; Sahoo, Satyabrata

    2009-01-01

    We present an expression measure of a gene, devised to predict the level of gene expression from relative codon bias (RCB). There are a number of measures currently in use that quantify codon usage in genes. Based on the hypothesis that gene expressivity and codon composition are strongly correlated, RCB has been defined to provide an intuitively meaningful measure of the extent of codon preference in a gene. We outline a simple approach to assess the strength of RCB (RCBS) in genes as a guide to their likely expression levels and illustrate this with an analysis of the Escherichia coli (E. coli) genome. Our efforts to quantitatively predict gene expression levels in E. coli met with a high level of success. Surprisingly, we observe a strong correlation between RCBS and protein length, indicating natural selection in favour of shorter genes being expressed at higher levels. The agreement of our result with high protein abundances, microarray data and radioactive data demonstrates that the genomic expression profile available in our method can be applied in a meaningful way to the study of cell physiology and also for more detailed studies of particular genes of interest. PMID:19131380
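
    The abstract above does not state the RCBS formula, so the following is only a minimal sketch under an assumed formulation: each codon's bias d is taken as (observed codon frequency - expected frequency) / expected frequency, where the expected frequency is the product of the base frequencies at the three codon positions within the same gene, and RCBS is the geometric mean of (1 + d) over all codons, minus 1. The function name `rcbs` and this exact definition are assumptions for illustration, not a confirmed reproduction of the authors' method.

```python
from collections import Counter
from math import exp, log


def rcbs(cds: str) -> float:
    """Relative codon bias strength of one coding sequence (assumed formulation)."""
    cds = cds.upper()
    # Split into complete codons; a trailing partial codon is ignored.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    if not codons:
        raise ValueError("sequence has no complete codon")
    n = len(codons)
    codon_count = Counter(codons)
    # Base counts at each of the three codon positions within this gene.
    position_count = [Counter(codon[p] for codon in codons) for p in range(3)]
    log_sum = 0.0
    for codon in codons:
        observed = codon_count[codon] / n
        expected = 1.0
        for p in range(3):
            expected *= position_count[p][codon[p]] / n
        # 1 + d equals observed / expected when d = (observed - expected) / expected.
        log_sum += log(observed / expected)
    # Geometric mean of (1 + d) over all codons, minus 1.
    return exp(log_sum / n) - 1.0
```

    Under this assumed definition, a gene whose codon usage is fully explained by its position-specific base composition scores 0, and larger values indicate stronger codon preference.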

  9. Plasma cytokines eotaxin, MIP-1α, MCP-4, and vascular endothelial growth factor in acute lower respiratory tract infection.

    PubMed

    Relster, Mette Marie; Holm, Anette; Pedersen, Court

    2017-02-01

    Major overlaps of clinical characteristics and the limitations of conventional diagnostic tests render the initial diagnosis and clinical management of pulmonary disorders difficult. In this pilot study, we analyzed the predictive value of eotaxin, macrophage inflammatory protein 1 alpha (MIP-1α), monocyte chemoattractant protein 4 (MCP-4), and vascular endothelial growth factor (VEGF) in 40 patients hospitalized with acute lower respiratory tract infections (LRTI). The cytokines contribute to the pathogenesis of several inflammatory respiratory diseases, indicating a potential as markers for LRTI. Patients were stratified according to etiology and severity of LRTI, based on baseline C-reactive protein and CURB-65 scores. Using a multiplex immunoassay of plasma, levels of eotaxin and MCP-4 were shown to increase from baseline until day 6 after admission to hospital. The four cytokines were unable to predict the etiology and severity. Eotaxin and MCP-4 were significantly lower in patients with C-reactive protein ≥100, and MIP-1α was significantly higher in the patients with CURB-65 > 3, but the predictive power was low. In conclusion, further evaluation, including more patients, is required to assess the full potential of eotaxin, MCP-4, MIP-1α, and VEGF as biomarkers for LRTI because of their low predictive power and a high interindividual variation of cytokine levels. © 2016 APMIS. Published by John Wiley & Sons Ltd.

  10. Computational prediction of hinge axes in proteins

    PubMed Central

    2014-01-01

    Background: A protein's function is determined by the wide range of motions exhibited by its 3D structure. However, current experimental techniques are not able to reliably provide the level of detail required for elucidating the exact mechanisms of protein motion essential for effective drug screening and design. Computational tools are instrumental in the study of the underlying structure-function relationship. We focus on a special type of proteins called "hinge proteins" which exhibit a motion that can be interpreted as a rotation of one domain relative to another. Results: This work proposes a computational approach that uses the geometric structure of a single conformation to predict the feasible motions of the protein and is founded in recent work from rigidity theory, an area of mathematics that studies flexibility properties of general structures. Given a single conformational state, our analysis predicts a relative axis of motion between two specified domains. We analyze a dataset of 19 structures known to exhibit this hinge-like behavior. For 15, the predicted axis is consistent with a motion to a second, known conformation. We present a detailed case study for three proteins whose dynamics have been well-studied in the literature: calmodulin, the LAO binding protein and the Bence-Jones protein. Conclusions: Our results show that incorporating rigidity-theoretic analyses can lead to effective computational methods for understanding hinge motions in macromolecules. This initial investigation is the first step towards a new tool for probing the structure-dynamics relationship in proteins. PMID:25080829

  11. Sequence fingerprints distinguish erroneous from correct predictions of intrinsically disordered protein regions.

    PubMed

    Saravanan, Konda Mani; Dunker, A Keith; Krishnaswamy, Sankaran

    2017-12-27

    More than 60 prediction methods for intrinsically disordered proteins (IDPs) have been developed over the years, many of which are accessible on the World Wide Web. Nearly all of these predictors give balanced accuracies in the ~65%-~80% range. Since predictors are not perfect, further studies are required to uncover the role of amino acid residues in native IDP as compared to predicted IDP regions. In the present work, we make use of sequences of 100% predicted IDP regions, false positive disorder predictions, and experimentally determined IDP regions to distinguish the characteristics of native versus predicted IDP regions. A higher occurrence of asparagine is observed in sequences of native IDP regions but not in sequences of false positive predictions of IDP regions. The occurrences of certain combinations of amino acids at the pentapeptide level provide a distinguishing feature in the IDPs with respect to globular proteins. The distinguishing features presented in this paper provide insights into the sequence fingerprints of amino acid residues in experimentally determined as compared to predicted IDP regions. These observations and additional work along these lines should enable improvements in the accuracy of disorder prediction algorithms.

  12. Selection on Network Dynamics Drives Differential Rates of Protein Domain Evolution

    PubMed Central

    Mannakee, Brian K.; Gutenkunst, Ryan N.

    2016-01-01

    The long-held principle that functionally important proteins evolve slowly has recently been challenged by studies in mice and yeast showing that the severity of a protein knockout only weakly predicts that protein’s rate of evolution. However, the relevance of these studies to evolutionary changes within proteins is unknown, because amino acid substitutions, unlike knockouts, often only slightly perturb protein activity. To quantify the phenotypic effect of small biochemical perturbations, we developed an approach to use computational systems biology models to measure the influence of individual reaction rate constants on network dynamics. We show that this dynamical influence is predictive of protein domain evolutionary rate within networks in vertebrates and yeast, even after controlling for expression level and breadth, network topology, and knockout effect. Thus, our results not only demonstrate the importance of protein domain function in determining evolutionary rate, but also the power of systems biology modeling to uncover unanticipated evolutionary forces. PMID:27380265

  13. Topology and weights in a protein domain interaction network--a novel way to predict protein interactions.

    PubMed

    Wuchty, Stefan

    2006-05-23

    While the analysis of unweighted biological webs as diverse as genetic, protein and metabolic networks allowed spectacular insights into the inner workings of a cell, biological networks are not only determined by their static grid of links. In fact, we expect that the heterogeneity in the utilization of connections has a major impact on the organization of cellular activities as well. We consider a web of interactions between protein domains of the Protein Family database (PFAM), which are weighted by a probability score. We apply metrics that combine the static layout and the weights of the underlying interactions. We observe that unweighted measures as well as their weighted counterparts largely share the same trends in the underlying domain interaction network. However, we only find weak signals that weights and the static grid of interactions are connected entities. Therefore, assuming that a protein interaction is governed by a single domain interaction, we observe strong and significant correlations of the highest scoring domain interaction and the confidence of protein interactions in the underlying interactions of yeast and fly. Modeling an interaction between proteins whenever we find a high-scoring protein domain interaction, we obtain 1,428 protein interactions among 361 proteins in the human malaria parasite Plasmodium falciparum. Assessing their quality by a logistic regression method, we observe that increasing confidence of predicted interactions is accompanied by high-scoring domain interactions and elevated levels of functional similarity and evolutionary conservation. Our results indicate that probability scores are randomly distributed, allowing us to treat the static grid and the weights of domain interactions as separate entities. In particular, this finding confirms earlier observations that a protein interaction is a matter of a single interaction event at the domain level.
As an immediate application, we show a simple way to predict potential protein interactions by utilizing expectation scores of single domain interactions.
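
    The prediction idea described above, scoring a protein pair by its single highest-scoring domain interaction, can be sketched as follows. This is a minimal illustration rather than the author's implementation: the function names, the dictionary layout, the threshold and the placeholder identifiers in the usage note are all assumptions.

```python
from itertools import combinations, product


def best_domain_score(domains_a, domains_b, domain_scores):
    """Highest score over all domain pairs between two proteins, or None."""
    scores = [
        domain_scores[frozenset((da, db))]
        for da, db in product(domains_a, domains_b)
        if frozenset((da, db)) in domain_scores
    ]
    return max(scores) if scores else None


def predict_interactions(protein_domains, domain_scores, threshold):
    """Predict a protein interaction when the best domain pair reaches the threshold.

    protein_domains: dict mapping protein name -> set of domain identifiers
    domain_scores:   dict mapping frozenset({domain, domain}) -> probability score
    """
    predictions = []
    for p, q in combinations(sorted(protein_domains), 2):
        score = best_domain_score(protein_domains[p], protein_domains[q], domain_scores)
        if score is not None and score >= threshold:
            predictions.append((p, q, score))
    return predictions
```

    For example, with hypothetical proteins `{"P1": {"A", "B"}, "P2": {"C"}}` and domain scores 0.4 for A-C and 0.9 for B-C, the pair P1-P2 receives the score 0.9 and is predicted to interact at any threshold up to 0.9.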

  14. The Protein Cost of Metabolic Fluxes: Prediction from Enzymatic Rate Laws and Cost Minimization

    PubMed Central

    Noor, Elad; Flamholz, Avi; Bar-Even, Arren; Davidi, Dan; Milo, Ron; Liebermeister, Wolfram

    2016-01-01

    Bacterial growth depends crucially on metabolic fluxes, which are limited by the cell’s capacity to maintain metabolic enzymes. The necessary enzyme amount per unit flux is a major determinant of metabolic strategies both in evolution and bioengineering. It depends on enzyme parameters (such as kcat and KM constants), but also on metabolite concentrations. Moreover, similar amounts of different enzymes might incur different costs for the cell, depending on enzyme-specific properties such as protein size and half-life. Here, we developed enzyme cost minimization (ECM), a scalable method for computing enzyme amounts that support a given metabolic flux at a minimal protein cost. The complex interplay of enzyme and metabolite concentrations, e.g. through thermodynamic driving forces and enzyme saturation, would make it hard to solve this optimization problem directly. By treating enzyme cost as a function of metabolite levels, we formulated ECM as a numerically tractable, convex optimization problem. Its tiered approach allows for building models at different levels of detail, depending on the amount of available data. Validating our method with measured metabolite and protein levels in E. coli central metabolism, we found typical prediction fold errors of 4.1 and 2.6, respectively, for the two kinds of data. This result from the cost-optimized metabolic state is significantly better than randomly sampled metabolite profiles, supporting the hypothesis that enzyme cost is important for the fitness of E. coli. ECM can be used to predict enzyme levels and protein cost in natural and engineered pathways, and could be a valuable computational tool to assist metabolic engineering projects. Furthermore, it establishes a direct connection between protein cost and thermodynamics, and provides a physically plausible and computationally tractable way to include enzyme kinetics into constraint-based metabolic models, where kinetics have usually been ignored or oversimplified. 
PMID:27812109

  15. Protein subcellular localization prediction using multiple kernel learning based support vector machine.

    PubMed

    Hasan, Md Al Mehedi; Ahmad, Shamim; Molla, Md Khademul Islam

    2017-03-28

    Predicting the subcellular locations of proteins can provide useful hints that reveal their functions, increase our understanding of the mechanisms of some diseases, and finally aid in the development of novel drugs. The number of newly discovered proteins has been growing exponentially, which in turn makes subcellular localization prediction by purely laboratory tests prohibitively laborious and expensive. In this context, to tackle the challenges, computational methods are being developed as an alternative choice to aid biologists in selecting target proteins and designing related experiments. However, the success of protein subcellular localization prediction is still a complicated and challenging issue, particularly when query proteins have multi-label characteristics, i.e., if they exist simultaneously in more than one subcellular location or if they move between two or more different subcellular locations. To date, to address this problem, several types of subcellular localization prediction methods with different levels of accuracy have been proposed. The support vector machine (SVM) has been employed to provide potential solutions to the protein subcellular localization prediction problem. However, the practicability of an SVM is affected by the challenges of selecting an appropriate kernel and selecting the parameters of the selected kernel. To address this difficulty, in this study, we aimed to develop an efficient multi-label protein subcellular localization prediction system, named MKLoc, by introducing multiple kernel learning (MKL) based SVM. We evaluated MKLoc using a combined dataset containing 5447 single-localized proteins (originally published as part of the Höglund dataset) and 3056 multi-localized proteins (originally published as part of the DBMLoc set). Note that this dataset was used by Briesemeister et al. in their extensive comparison of multi-localization prediction systems.
Finally, our experimental results indicate that MKLoc not only achieves higher accuracy than a single kernel based SVM system but also shows significantly better results than those obtained from other top systems (MDLoc, BNCs, YLoc+). Moreover, MKLoc requires less computation time to tune and train the system than that required for BNCs and single kernel based SVM.

  16. Serum Neutrophil Gelatinase-Associated Lipocalin Predicts Survival After Resuscitation From Cardiac Arrest.

    PubMed

    Elmer, Jonathan; Jeong, Kwonho; Abebe, Kaleab Z; Guyette, Francis X; Murugan, Raghavan; Callaway, Clifton W; Rittenberger, Jon C

    2016-01-01

    In the first days after cardiac arrest, accurate prognostication is challenging. Serum biomarkers are a potentially attractive adjunct for prognostication and risk stratification. Our primary objective in this exploratory study was to identify novel early serum biomarkers that predict survival after cardiac arrest earlier than currently possible. This was a prospective, observational study at a single academic medical center, enrolling adult subjects who sustained cardiac arrest with return of spontaneous circulation; there were no interventions. We obtained blood samples from each subject at enrollment, 6, 12, 24, 48, and 72 hours after return of spontaneous circulation. We measured the serum levels of novel biomarkers, including neutrophil gelatinase-associated lipocalin, high-mobility group protein B1, intracellular cell adhesion molecule-1, and leptin, as well as previously characterized biomarkers, including neuron-specific enolase and S100B protein. Our primary outcome of interest was survival-to-hospital discharge. We compared biomarker concentrations at each time point between survivors and nonsurvivors and used logistic regression to test the unadjusted associations of baseline clinical characteristics and enrollment biomarker levels with survival. Finally, we constructed a series of adjusted models to explore the independent association of each enrollment biomarker level with survival. A total of 86 subjects were enrolled. Enrollment levels of high-mobility group protein B1, neutrophil gelatinase-associated lipocalin, and S100B were higher in nonsurvivors than survivors. Enrollment leptin, neuron-specific enolase, and intracellular cell adhesion molecule-1 levels did not differ between nonsurvivors and survivors. The discriminatory power of enrollment neutrophil gelatinase-associated lipocalin level was the greatest (c-statistic, 0.78 [95% CI, 0.66-0.90]) and remained stable across all time points.
In our adjusted models, enrollment neutrophil gelatinase-associated lipocalin level was independently associated with survival even after controlling for the development of acute kidney injury, and its addition to clinical models improved overall predictive accuracy. Serum neutrophil gelatinase-associated lipocalin levels are strongly predictive of survival-to-hospital discharge after cardiac arrest.

  17. Integration of Structural Dynamics and Molecular Evolution via Protein Interaction Networks: A New Era in Genomic Medicine

    PubMed Central

    Kumar, Avishek; Butler, Brandon M.; Kumar, Sudhir; Ozkan, S. Banu

    2016-01-01

    Summary: Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict if they are benign or not. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nsSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light onto disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine. PMID:26684487

  18. Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone.

    PubMed

    Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

    2016-12-27

    Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.

  19. Pathway-Specific Aggregate Biomarker Risk Score Is Associated With Burden of Coronary Artery Disease and Predicts Near-Term Risk of Myocardial Infarction and Death

    PubMed Central

    Ghasemzadeh, Nima; Hayek, Salim S.; Ko, Yi-An; Eapen, Danny J.; Patel, Riyaz S.; Manocha, Pankaj; Kassem, Hatem Al; Khayata, Mohamed; Veledar, Emir; Kremastinos, Dimitrios; Thorball, Christian W.; Pielak, Tomasz; Sikora, Sergey; Zafari, A. Maziar; Lerakis, Stamatios; Sperling, Laurence; Vaccarino, Viola; Epstein, Stephen E.; Quyyumi, Arshed A.

    2018-01-01

    Background: Inflammation, coagulation, and cell stress contribute to atherosclerosis and its adverse events. A biomarker risk score (BRS) based on the circulating levels of biomarkers C-reactive protein, fibrin degradation products, and heat shock protein-70 representing these 3 pathways was a strong predictor of future outcomes. We investigated whether soluble urokinase plasminogen activator receptor (suPAR), a marker of immune activation, is predictive of outcomes independent of the aforementioned markers and whether its addition to a 3-BRS improves risk reclassification. Methods and Results: C-reactive protein, fibrin degradation product, heat shock protein-70, and suPAR were measured in 3278 patients undergoing coronary angiography. The BRS was calculated by counting the number of biomarkers above a cutoff determined using Youden’s index. Survival analyses were performed using models adjusted for traditional risk factors. A high suPAR level ≥3.5 ng/mL was associated with all-cause death and myocardial infarction (hazard ratio, 1.83; 95% confidence interval, 1.43–2.35) after adjustment for risk factors, C-reactive protein, fibrin degradation product, and heat shock protein-70. Addition of suPAR to the 3-BRS significantly improved the C statistic, integrated discrimination improvement, and net reclassification index for the primary outcome. A BRS of 1, 2, 3, or 4 was associated with a 1.81-, 2.59-, 6.17-, and 8.80-fold increase, respectively, in the risk of death and myocardial infarction. The 4-BRS was also associated with severity of coronary artery disease and composite end points. Conclusions: SuPAR is independently predictive of adverse outcomes, and its addition to a 3-BRS comprising C-reactive protein, fibrin degradation product, and heat shock protein-70 improved risk reclassification. The clinical utility of using a 4-BRS for risk prediction and management of patients with coronary artery disease warrants further study. PMID:28280039
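
    The scoring rule described above, counting the biomarkers that exceed a cutoff chosen by Youden's index, can be sketched as follows. This is an illustrative sketch only: the function names, the use of ">=" at the cutoff, and every example value other than the 3.5 ng/mL suPAR cutoff are assumptions and do not come from the study.

```python
def youden_cutoff(values, events):
    """Return the cutoff maximizing J = sensitivity + specificity - 1.

    A value >= cutoff is classified as positive; events are truthy for patients
    who had the outcome. Candidate cutoffs are the observed values.
    """
    positives = sum(1 for e in events if e)
    negatives = len(events) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both outcome classes")
    best_j, best_cut = -1.0, None
    for cut in sorted(set(values)):
        true_pos = sum(1 for v, e in zip(values, events) if e and v >= cut)
        true_neg = sum(1 for v, e in zip(values, events) if not e and v < cut)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut


def biomarker_risk_score(patient, cutoffs):
    """Count how many biomarkers are at or above their cutoff (0..len(cutoffs))."""
    return sum(1 for name, cut in cutoffs.items() if patient[name] >= cut)
```

    With hypothetical cutoffs `{"crp": 3.0, "fdp": 1.0, "hsp70": 1.0, "supar": 3.5}`, a patient with values `{"crp": 5.0, "fdp": 0.2, "hsp70": 1.0, "supar": 3.6}` would receive a score of 3 under this sketch.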

  20. Pathway-Specific Aggregate Biomarker Risk Score Is Associated With Burden of Coronary Artery Disease and Predicts Near-Term Risk of Myocardial Infarction and Death.

    PubMed

    Ghasemzadeh, Nima; Hayek, Salim S; Ko, Yi-An; Eapen, Danny J; Patel, Riyaz S; Manocha, Pankaj; Al Kassem, Hatem; Khayata, Mohamed; Veledar, Emir; Kremastinos, Dimitrios; Thorball, Christian W; Pielak, Tomasz; Sikora, Sergey; Zafari, A Maziar; Lerakis, Stamatios; Sperling, Laurence; Vaccarino, Viola; Epstein, Stephen E; Quyyumi, Arshed A

    2017-03-01

    Inflammation, coagulation, and cell stress contribute to atherosclerosis and its adverse events. A biomarker risk score (BRS) based on the circulating levels of biomarkers C-reactive protein, fibrin degradation products, and heat shock protein-70 representing these 3 pathways was a strong predictor of future outcomes. We investigated whether soluble urokinase plasminogen activator receptor (suPAR), a marker of immune activation, is predictive of outcomes independent of the aforementioned markers and whether its addition to a 3-BRS improves risk reclassification. C-reactive protein, fibrin degradation product, heat shock protein-70, and suPAR were measured in 3278 patients undergoing coronary angiography. The BRS was calculated by counting the number of biomarkers above a cutoff determined using Youden's index. Survival analyses were performed using models adjusted for traditional risk factors. A high suPAR level ≥3.5 ng/mL was associated with all-cause death and myocardial infarction (hazard ratio, 1.83; 95% confidence interval, 1.43-2.35) after adjustment for risk factors, C-reactive protein, fibrin degradation product, and heat shock protein-70. Addition of suPAR to the 3-BRS significantly improved the C statistic, integrated discrimination improvement, and net reclassification index for the primary outcome. A BRS of 1, 2, 3, or 4 was associated with a 1.81-, 2.59-, 6.17-, and 8.80-fold increase, respectively, in the risk of death and myocardial infarction. The 4-BRS was also associated with severity of coronary artery disease and composite end points. SuPAR is independently predictive of adverse outcomes, and its addition to a 3-BRS comprising C-reactive protein, fibrin degradation product, and heat shock protein-70 improved risk reclassification. The clinical utility of using a 4-BRS for risk prediction and management of patients with coronary artery disease warrants further study. © 2017 American Heart Association, Inc.

  1. Triangle network motifs predict complexes by complementing high-error interactomes with structural information.

    PubMed

    Andreopoulos, Bill; Winter, Christof; Labudde, Dirk; Schroeder, Michael

    2009-06-27

    A lot of high-throughput studies produce protein-protein interaction networks (PPINs) with many errors and missing information. Even for genome-wide approaches, there is often a low overlap between PPINs produced by different studies. Second-level neighbors separated by two protein-protein interactions (PPIs) were previously used for predicting protein function and finding complexes in high-error PPINs. We retrieve second level neighbors in PPINs, and complement these with structural domain-domain interactions (SDDIs) representing binding evidence on proteins, forming PPI-SDDI-PPI triangles. We find low overlap between PPINs, SDDIs and known complexes, all well below 10%. We evaluate the overlap of PPI-SDDI-PPI triangles with known complexes from Munich Information center for Protein Sequences (MIPS). PPI-SDDI-PPI triangles have ~20 times higher overlap with MIPS complexes than using second-level neighbors in PPINs without SDDIs. The biological interpretation for triangles is that a SDDI causes two proteins to be observed with common interaction partners in high-throughput experiments. The relatively few SDDIs overlapping with PPINs are part of highly connected SDDI components, and are more likely to be detected in experimental studies. We demonstrate the utility of PPI-SDDI-PPI triangles by reconstructing myosin-actin processes in the nucleus, cytoplasm, and cytoskeleton, which were not obvious in the original PPIN. Using other complementary datatypes in place of SDDIs to form triangles, such as PubMed co-occurrences or threading information, results in a similar ability to find protein complexes. Given high-error PPINs with missing information, triangles of mixed datatypes are a promising direction for finding protein complexes. Integrating PPINs with SDDIs improves finding complexes. Structural SDDIs partially explain the high functional similarity of second-level neighbors in PPINs. 
We estimate that relatively little structural information would be sufficient for finding complexes involving most of the proteins and interactions in a typical PPIN.
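
    One reading of the PPI-SDDI-PPI construction described above is: two proteins that share a common interaction partner in the PPIN (second-level neighbors) form a triangle when a structural domain-domain interaction also links them directly. The sketch below follows that reading; the function name, the edge-list inputs and the tuple layout of the result are assumptions for illustration, not the authors' code.

```python
from collections import defaultdict
from itertools import combinations


def find_triangles(ppi_edges, sddi_edges):
    """Return sorted (a, hub, b) triples where a-hub and hub-b are PPIs and a-b is an SDDI."""
    neighbors = defaultdict(set)
    for a, b in ppi_edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    sddi = {frozenset(edge) for edge in sddi_edges}
    triangles = set()
    for hub, partners in neighbors.items():
        # Any two partners of the same hub are second-level neighbors of each other.
        for a, b in combinations(sorted(partners), 2):
            if frozenset((a, b)) in sddi:
                triangles.add((a, hub, b))
    return sorted(triangles)
```

    Replacing `sddi_edges` with another pairwise evidence type, such as literature co-occurrence pairs, gives the mixed-datatype triangles mentioned in the abstract without changing the function.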

  2. Triangle network motifs predict complexes by complementing high-error interactomes with structural information

    PubMed Central

    Andreopoulos, Bill; Winter, Christof; Labudde, Dirk; Schroeder, Michael

    2009-01-01

    Background: A lot of high-throughput studies produce protein-protein interaction networks (PPINs) with many errors and missing information. Even for genome-wide approaches, there is often a low overlap between PPINs produced by different studies. Second-level neighbors separated by two protein-protein interactions (PPIs) were previously used for predicting protein function and finding complexes in high-error PPINs. We retrieve second level neighbors in PPINs, and complement these with structural domain-domain interactions (SDDIs) representing binding evidence on proteins, forming PPI-SDDI-PPI triangles. Results: We find low overlap between PPINs, SDDIs and known complexes, all well below 10%. We evaluate the overlap of PPI-SDDI-PPI triangles with known complexes from Munich Information center for Protein Sequences (MIPS). PPI-SDDI-PPI triangles have ~20 times higher overlap with MIPS complexes than using second-level neighbors in PPINs without SDDIs. The biological interpretation for triangles is that a SDDI causes two proteins to be observed with common interaction partners in high-throughput experiments. The relatively few SDDIs overlapping with PPINs are part of highly connected SDDI components, and are more likely to be detected in experimental studies. We demonstrate the utility of PPI-SDDI-PPI triangles by reconstructing myosin-actin processes in the nucleus, cytoplasm, and cytoskeleton, which were not obvious in the original PPIN. Using other complementary datatypes in place of SDDIs to form triangles, such as PubMed co-occurrences or threading information, results in a similar ability to find protein complexes. Conclusion: Given high-error PPINs with missing information, triangles of mixed datatypes are a promising direction for finding protein complexes. Integrating PPINs with SDDIs improves finding complexes. Structural SDDIs partially explain the high functional similarity of second-level neighbors in PPINs.
We estimate that relatively little structural information would be sufficient for finding complexes involving most of the proteins and interactions in a typical PPIN. PMID:19558694

  3. Efficient lowering of triglyceride levels in mice by human apoAV protein variants associated with hypertriglyceridemia.

    PubMed

    Vaessen, Stefan F C; Sierts, Jeroen A; Kuivenhoven, Jan Albert; Schaap, Frank G

    2009-02-06

    Variation in the apolipoprotein A5 (APOA5) gene has consistently been associated with increased plasma triglyceride (TG) levels in epidemiological studies. In vivo functionality of these variations, however, has thus far not been tested. Using adenoviral over-expression, we evaluated plasma expression levels and TG-lowering efficacies of wild-type human apoAV, two human apoAV variants associated with increased TG (S19W, G185C) and one variant (Q341H) that is predicted to have altered protein function. Injection of mice with adenovirus encoding wild-type or mutant apoAV resulted in an identical dose-dependent elevation of human apoAV levels in plasma. The increase in apoAV levels resulted in pronounced lowering of plasma TG levels at two viral dosages. Unexpectedly, the TG-lowering efficacy of all three apoAV variants was similar to wild-type apoAV. In addition, no effect on TG-hydrolysis-related plasma parameters (free fatty acids, glycerol and post-heparin lipoprotein lipase activity) was apparent upon expression of all apoAV variants. In conclusion, our data indicate that despite their association with hypertriglyceridemia and/or predicted protein dysfunction, the 19W, 185C and 341H apoAV variants are equally effective in reducing plasma TG levels in mice.

  4. News from the protein mutability landscape.

    PubMed

    Hecht, Maximilian; Bromberg, Yana; Rost, Burkhard

    2013-11-01

    Some mutations of protein residues matter more than others, and these are often conserved evolutionarily. The explosion of deep sequencing and genotyping increasingly requires the distinction between effect and neutral variants. The simplest approach predicts all mutations of conserved residues to have an effect; however, this works poorly, at best. Many computational tools that are optimized to predict the impact of point mutations provide more detail. Here, we expand the perspective from the view of single variants to the level of sketching the entire mutability landscape. This landscape is defined by the impact of substituting every residue at each position in a protein by each of the 19 non-native amino acids. We review some of the powerful conclusions about protein function, stability and their robustness to mutation that can be drawn from such an analysis. Large-scale experimental and computational mutagenesis experiments are increasingly furthering our understanding of protein function and of the genotype-phenotype associations. We also discuss how these can be used to improve predictions of protein function and pathogenicity of missense variants. Copyright © 2013 The Authors. Published by Elsevier Ltd. All rights reserved.

  5. A Web-Accessible Protein Structure Prediction Pipeline

    DTIC Science & Technology

    2009-06-01

    Abstract Proteins are the molecular basis of nearly all structural, catalytic, sensory, and regulatory functions in living organisms. The biological...sensory, and regulatory functions in living organisms. The structure of a protein is essential in understanding its function at the molecular level...Characterizing sequence-structure and structure-function relationships have been the goals of molecular biology for more than three decades

  6. GeneBuilder: interactive in silico prediction of gene structure.

    PubMed

    Milanesi, L; D'Angelo, D; Rogozin, I B

    1999-01-01

    Prediction of gene structure in newly sequenced DNA becomes very important in large genome sequencing projects. This problem is complicated due to the exon-intron structure of eukaryotic genes and because gene expression is regulated by many different short nucleotide domains. In order to be able to analyse the full gene structure in different organisms, it is necessary to combine information about potential functional signals (promoter region, splice sites, start and stop codons, 3' untranslated region) together with the statistical properties of coding sequences (coding potential), information about homologous proteins, ESTs and repeated elements. We have developed the GeneBuilder system which is based on prediction of functional signals and coding regions by different approaches in combination with similarity searches in proteins and EST databases. The potential gene structure models are obtained by using a dynamic programming method. The program permits the use of several parameters for gene structure prediction and refinement. During gene model construction, selecting different exon homology levels with a protein sequence selected from a list of homologous proteins can improve the accuracy of the gene structure prediction. In the case of low homology, GeneBuilder is still able to predict the gene structure. The GeneBuilder system has been tested by using the standard set (Burset and Guigo, Genomics, 34, 353-367, 1996) and the performances are: 0.89 sensitivity and 0.91 specificity at the nucleotide level. The total correlation coefficient is 0.88. The GeneBuilder system is implemented as a part of the WebGene at the URL: http://www.itba.mi.cnr.it/webgene and TRADAT (TRAncription Database and Analysis Tools) launcher URL: http://www.itba.mi.cnr.it/tradat.
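
    The abstract reports nucleotide-level sensitivity, specificity and a correlation coefficient without defining them. The sketch below assumes the convention usual in gene-prediction benchmarks, where sensitivity is TP/(TP+FN), "specificity" is TP/(TP+FP), and the correlation coefficient is computed from the full 2x2 table. The function name and the 'C'/'N' string encoding are illustrative assumptions, not part of GeneBuilder.

```python
import math


def nucleotide_level_metrics(actual, predicted):
    """Compare two equal-length annotations of 'C' (coding) / 'N' (non-coding).

    Returns (sensitivity, specificity, correlation_coefficient) under the
    assumed gene-prediction convention: specificity here is TP / (TP + FP).
    """
    if len(actual) != len(predicted):
        raise ValueError("annotations must cover the same nucleotides")
    pairs = list(zip(actual, predicted))
    tp = sum(a == "C" and p == "C" for a, p in pairs)
    tn = sum(a == "N" and p == "N" for a, p in pairs)
    fp = sum(a == "N" and p == "C" for a, p in pairs)
    fn = sum(a == "C" and p == "N" for a, p in pairs)
    if 0 in (tp + fn, tp + fp, tn + fp, tn + fn):
        raise ValueError("metrics are undefined when a margin of the 2x2 table is empty")
    sensitivity = tp / (tp + fn)
    specificity = tp / (tp + fp)
    denominator = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    correlation = (tp * tn - fn * fp) / denominator
    return sensitivity, specificity, correlation


if __name__ == "__main__":
    # Toy annotation, for illustration only (not data from the study).
    print(nucleotide_level_metrics("CCCCNNNN", "CCCNCNNN"))
```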

  7. Experimental Determination and Prediction of the Fitness Effects of Random Point Mutations in the Biosynthetic Enzyme HisA

    PubMed Central

    Lundin, Erik; Tang, Po-Cheng; Guy, Lionel; Näsvall, Joakim; Andersson, Dan I

    2018-01-01

    The distribution of fitness effects of mutations is a factor of fundamental importance in evolutionary biology. We determined the distribution of fitness effects of 510 mutants that each carried between 1 and 10 mutations (synonymous and nonsynonymous) in the hisA gene, encoding an essential enzyme in the l-histidine biosynthesis pathway of Salmonella enterica. For the full set of mutants, the distribution was bimodal with many apparently neutral mutations and many lethal mutations. For a subset of 81 single, nonsynonymous mutants most mutations appeared neutral at high expression levels, whereas at low expression levels only a few mutations were neutral. Furthermore, we examined how the magnitude of the observed fitness effects was correlated to several measures of biophysical properties and phylogenetic conservation. We conclude that for HisA: (i) The effect of mutations can be masked by high expression levels, such that mutations that are deleterious to the function of the protein can still be neutral with regard to organism fitness if the protein is expressed at a sufficiently high level; (ii) the shape of the fitness distribution is dependent on the extent to which the protein is rate-limiting for growth; (iii) negative epistatic interactions, on an average, amplified the combined effect of nonsynonymous mutations; and (iv) no single sequence-based predictor could confidently predict the fitness effects of mutations in HisA, but a combination of multiple predictors could predict the effect with a SD of 0.04 resulting in 80% of the mutations predicted within 12% of their observed selection coefficients. PMID:29294020

  8. High Precision Prediction of Functional Sites in Protein Structures

    PubMed Central

    Buturovic, Ljubomir; Wong, Mike; Tang, Grace W.; Altman, Russ B.; Petkovic, Dragutin

    2014-01-01

    We address the problem of assigning biological function to solved protein structures. Computational tools play a critical role in identifying potential active sites and informing screening decisions for further lab analysis. A critical parameter in the practical application of computational methods is the precision, or positive predictive value. Precision measures the level of confidence the user should have in a particular computed functional assignment. Low precision annotations lead to futile laboratory investigations and waste scarce research resources. In this paper we describe an advanced version of the protein function annotation system FEATURE, which achieved 99% precision and average recall of 95% across 20 representative functional sites. The system uses a Support Vector Machine classifier operating on the microenvironment of physicochemical features around an amino acid. We also compared performance of our method with state-of-the-art sequence-level annotator Pfam in terms of precision, recall and localization. To our knowledge, no other functional site annotator has been rigorously evaluated against these key criteria. The software and predictive models are incorporated into the WebFEATURE service at http://feature.stanford.edu/wf4.0-beta. PMID:24632601

  9. Structure prediction of the second extracellular loop in G-protein-coupled receptors.

    PubMed

    Kmiecik, Sebastian; Jamroz, Michal; Kolinski, Michal

    2014-06-03

    G-protein-coupled receptors (GPCRs) play key roles in living organisms. Therefore, it is important to determine their functional structures. The second extracellular loop (ECL2) is a functionally important region of GPCRs, which poses significant challenge for computational structure prediction methods. In this work, we evaluated CABS, a well-established protein modeling tool for predicting ECL2 structure in 13 GPCRs. The ECL2s (with between 13 and 34 residues) are predicted in an environment of other extracellular loops being fully flexible and the transmembrane domain fixed in its x-ray conformation. The modeling procedure used theoretical predictions of ECL2 secondary structure and experimental constraints on disulfide bridges. Our approach yielded ensembles of low-energy conformers and the most populated conformers that contained models close to the available x-ray structures. The level of similarity between the predicted models and x-ray structures is comparable to that of other state-of-the-art computational methods. Our results extend other studies by including newly crystallized GPCRs. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  10. [Prediction of the molecular response to perturbations from single cell measurements].

    PubMed

    Remacle, Françoise; Levine, Raphael D

    2014-12-01

    The response of protein signaling networks to perturbations is analysed from single cell measurements. This experimental approach allows characterizing the fluctuations in protein expression levels from cell to cell. The analysis is based on an information theoretic approach grounded in thermodynamics leading to a quantitative version of Le Chatelier's principle, which allows the molecular response to be predicted. Two systems are investigated: human macrophages subjected to lipopolysaccharide challenge, analogous to the immune response against Gram-negative bacteria, and the response of the proteins involved in the mTOR signaling network of GBM cancer cells to changes in partial oxygen pressure. © 2014 médecine/sciences – Inserm.

  11. Completing sparse and disconnected protein-protein network by deep learning.

    PubMed

    Huang, Lei; Liao, Li; Wu, Cathy H

    2018-03-22

    Protein-protein interaction (PPI) prediction remains a central task in systems biology to achieve a better and holistic understanding of cellular and intracellular processes. Recently, an increasing number of computational methods have shifted from pair-wise prediction to network level prediction. Many of the existing network level methods predict PPIs under the assumption that the training network should be connected. However, this assumption greatly affects the prediction power and limits the application area because the current gold standard PPI networks are usually very sparse and disconnected. Therefore, how to effectively predict PPIs based on a training network that is sparse and disconnected remains a challenge. In this work, we developed a novel PPI prediction method based on a deep learning neural network and a regularized Laplacian kernel. We use a neural network with an autoencoder-like architecture to implicitly simulate the evolutionary processes of a PPI network. Neurons of the output layer correspond to proteins and are labeled with values (1 for interaction and 0 otherwise) from the adjacency matrix of a sparse disconnected training PPI network. Unlike an autoencoder, neurons at the input layer are given all zero input, reflecting an assumption of no a priori knowledge about PPIs, and hidden layers of smaller sizes mimic ancient interactomes at different times during evolution. After the training step, an evolved PPI network whose rows are outputs of the neural network can be obtained. We then predict PPIs by applying the regularized Laplacian kernel to the transition matrix that is built upon the evolved PPI network. The results from cross-validation experiments show that the PPI prediction accuracies for yeast data and human data measured as AUC are increased by up to 8.4% and 14.9%, respectively, as compared to the baseline.
    Moreover, the evolved PPI network can also help us leverage complementary information from the disconnected training network and multiple heterogeneous data sources. Tested by the yeast data with six heterogeneous feature kernels, the results show our method can further improve the prediction performance by up to 2%, which is very close to an upper bound that is obtained by an Approximate Bayesian Computation based sampling method. The proposed evolution deep neural network, coupled with regularized Laplacian kernel, is an effective tool in completing sparse and disconnected PPI networks and in facilitating integration of heterogeneous data sources.
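
    The abstract does not write out the regularized Laplacian kernel. For orientation, a commonly used definition is sketched below; the symbols $A$ (symmetric adjacency matrix), $D$ (diagonal degree matrix), $L$ (graph Laplacian), $\alpha$ (regularization parameter) and $\rho(L)$ (spectral radius of $L$) are notation assumed here, and the study may parameterize the kernel differently.

```latex
\[
  L = D - A, \qquad
  K_{\mathrm{RL}} = (I + \alpha L)^{-1}, \quad \alpha > 0,
\]
\[
  K_{\mathrm{RL}} = \sum_{k=0}^{\infty} \alpha^{k} (-L)^{k}
  \quad \text{whenever } \alpha < 1/\rho(L).
\]
```

    Entry $(i, j)$ of $K_{\mathrm{RL}}$ can then be read as a similarity score between proteins $i$ and $j$ that accumulates contributions from paths of all lengths, with longer paths down-weighted by powers of $\alpha$.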

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dar, Roy; Shaffer, Sydney M.; Singh, Abhyudai

    Recent analysis demonstrates that the HIV-1 Long Terminal Repeat (HIV LTR) promoter exhibits a range of possible transcriptional burst sizes and frequencies for any mean-expression level. However, these results have also been interpreted as demonstrating that cell-to-cell expression variability (noise) and mean are uncorrelated, a significant deviation from previous results. Here, we re-examine the available mRNA and protein abundance data for the HIV LTR and find that noise in mRNA and protein expression scales inversely with the mean along analytically predicted transcriptional burst-size manifolds. We then experimentally perturb transcriptional activity to test a prediction of the multiple burst-size model: that increasing burst frequency will cause mRNA noise to decrease along given burst-size lines as mRNA levels increase. In conclusion, the data show that mRNA and protein noise decrease as mean expression increases, supporting the canonical inverse correlation between noise and mean.
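
    One standard way to see why noise falls along a line of fixed burst size is the textbook bursty-transcription model, sketched here as an assumption rather than as the exact model used in the study. With burst frequency $k$, mean burst size $b$ (geometrically distributed bursts) and mRNA decay rate $\gamma$, all symbols introduced here for illustration:

```latex
\[
  \langle m \rangle = \frac{k\,b}{\gamma}, \qquad
  \frac{\sigma_m^{2}}{\langle m \rangle} = 1 + b, \qquad
  \mathrm{CV}^{2} = \frac{\sigma_m^{2}}{\langle m \rangle^{2}}
                  = \frac{1 + b}{\langle m \rangle}.
\]
```

    At fixed $b$, raising $k$ increases $\langle m \rangle$ and lowers $\mathrm{CV}^{2}$ in inverse proportion, while different values of $b$ give different curves through the same mean.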

  13. Differential expression pattern of protein markers for predicting chemosensitivity of dexamethasone-based chemotherapy of B cell acute lymphoblastic leukemia.

    PubMed

    Dehghan-Nayeri, Nasrin; Eshghi, Peyman; Pour, Kourosh Goudarzi; Rezaei-Tavirani, Mostafa; Omrani, Mir Davood; Gharehbaghian, Ahmad

    2017-07-01

    Dexamethasone is considered a direct chemotherapeutic agent in the treatment of pediatric acute lymphoblastic leukemia (ALL). Besides the advantages of the drug, its dose-related side effects pose challenges during treatment. Accordingly, classifying patients into dexamethasone-sensitive and dexamethasone-resistant groups can help to select an optimal therapeutic dose with the lowest adverse effects, particularly in sensitive cases. For this purpose, we investigated inhibited proliferation and induced cytotoxicity in NALM-6 cells, as sensitive cells, after dexamethasone treatment. In addition, comparative protein expression analysis using the 2DE-MALDI-TOF MS technique was performed to identify the specific altered proteins. We also evaluated mRNA expression levels of the identified proteins in bone-marrow samples from pediatric ALL patients using the real-time q-PCR method. Proteomic analysis revealed a combination of biomarkers, including capping proteins (CAPZA1 and CAPZB), chloride channel (CLIC1), purine nucleoside phosphorylase (PNP), and proteasome activator (PSME1), in response to the dexamethasone treatment. In addition, our results indicated low expression of the identified proteins at both the mRNA and protein levels after drug treatment. Moreover, quantitative real-time PCR data analysis indicated that, independent of the molecular subtype of the leukemia, CAPZA1, CAPZB, CLIC1, and PNP expression levels were lower in ALL samples than in normal samples, whereas the PSME1 expression level was higher in ALL samples than in normal samples. Furthermore, the expression levels of all proteins (except PSME1) differed between high-risk and standard-risk patients, suggesting their prognostic value.
    In conclusion, our study suggests a panel of biomarkers comprising CAPZA1, CAPZB, CLIC1, PNP, and PSME1 as early diagnosis and treatment evaluation markers that may differentiate cancer cells that are presumed to benefit from dexamethasone-based chemotherapy and may facilitate the prediction of clinical outcome.

  14. DBAC: A simple prediction method for protein binding hot spots based on burial levels and deeply buried atomic contacts

    PubMed Central

    2011-01-01

    Background: A protein binding hot spot is a cluster of residues in the interface that are energetically important for the binding of the protein with its interaction partner. Identifying protein binding hot spots can give useful information to protein engineering and drug design, and can also deepen our understanding of protein-protein interaction. These residues are usually buried inside the interface with very low solvent accessible surface area (SASA). Thus SASA is widely used as an outstanding feature in hot spot prediction by many computational methods. However, SASA is not capable of distinguishing slightly buried residues, of which most are non hot spots, and deeply buried ones that are usually inside a hot spot. Results: We propose a new descriptor called “burial level” for characterizing residues, atoms and atomic contacts. Specifically, burial level captures the depth the residues are buried. We identify different kinds of deeply buried atomic contacts (DBAC) at different burial levels that are directly broken in alanine substitution. We use their numbers as input for SVM to classify between hot spot or non hot spot residues. We achieve F measure of 0.6237 under the leave-one-out cross-validation on a data set containing 258 mutations. This performance is better than other computational methods. Conclusions: Our results show that hot spot residues tend to be deeply buried in the interface, not just having a low SASA value. This indicates that a high burial level is not only a necessary but also a more sufficient condition than a low SASA for a residue to be a hot spot residue. We find that those deeply buried atoms become increasingly more important when their burial levels rise up. This work also confirms the contribution of deeply buried interfacial atomic contacts to the energy of protein binding hot spot. PMID:21689480
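
    The evaluation protocol named in the abstract, leave-one-out cross-validation scored by F measure, can be sketched as follows. The threshold classifier is a deliberately simple stand-in for the SVM used in the study, and the function names and toy data are hypothetical.

```python
def f_measure(actual, predicted):
    """F measure (harmonic mean of precision and recall) for boolean labels."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def leave_one_out_predictions(features, labels, fit, predict):
    """Hold out each sample in turn, train on the rest, predict the held-out one."""
    predictions = []
    for held_out in range(len(features)):
        train_x = features[:held_out] + features[held_out + 1:]
        train_y = labels[:held_out] + labels[held_out + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, features[held_out]))
    return predictions


def fit_threshold(xs, ys):
    """Stand-in classifier (not an SVM): midpoint between the two class means."""
    positives = [x for x, y in zip(xs, ys) if y]
    negatives = [x for x, y in zip(xs, ys) if not y]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2


def predict_threshold(threshold, x):
    return x > threshold


if __name__ == "__main__":
    # Toy data, for illustration only: one scalar feature per residue
    # (e.g. a contact count) and a hot spot / non hot spot label.
    toy_features = [1, 2, 3, 8, 9, 10]
    toy_labels = [False, False, False, True, True, True]
    predicted = leave_one_out_predictions(
        toy_features, toy_labels, fit_threshold, predict_threshold
    )
    print(f_measure(toy_labels, predicted))
```

    Because every sample is predicted by a model that never saw it, the resulting F measure estimates performance on unseen residues rather than fit to the training set.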

  15. Plasma inflammatory and immune proteins as predictors of intra-amniotic infection and spontaneous preterm delivery in women with preterm labor: a retrospective study.

    PubMed

    Park, Hyunsoo; Park, Kyo Hoon; Kim, Yu Mi; Kook, Song Yi; Jeon, Se Jeong; Yoo, Ha-Na

    2018-05-09

    We investigated whether various inflammatory and immune proteins in plasma predict intra-amniotic infection and imminent preterm delivery in women with preterm labor and compared their predictive ability with that of amniotic fluid (AF) interleukin (IL)-6 and serum C-reactive protein (CRP). This retrospective cohort study included 173 consecutive women with preterm labor who underwent amniocentesis for diagnosis of infection and/or inflammation in the AF. The AF was cultured, and assayed for IL-6. CRP levels and cervical length by transvaginal ultrasound were measured at the time of amniocentesis. The stored maternal plasma was assayed for IL-6, matrix metalloproteinase (MMP)-9, and complements C3a and C5a using ELISA kits. The primary and secondary outcome criteria were positive AF cultures and spontaneous preterm delivery (SPTD) within 48 h, respectively. Univariate, multivariate, and receiver operating characteristic analysis were used for the statistical analysis. In bivariate analyses, elevated plasma IL-6 level was significantly associated with intra-amniotic infection and imminent preterm delivery, whereas elevated plasma levels of MMP-9, C3a, and C5a were not associated with these two outcomes. On multivariate analyses, an elevated plasma IL-6 level was significantly associated with intra-amniotic infection and imminent preterm delivery after adjusting for confounders, including high serum CRP levels and short cervical length. In predicting intra-amniotic infection, the area under the curve (AUC) was significantly lower for plasma IL-6 than for AF IL-6 but was similar to that for serum CRP. Differences in the AUCs between plasma IL-6, AF IL-6, and serum CRP were not statistically significant in predicting imminent preterm delivery. Maternal plasma IL-6 independently predicts intra-amniotic infection in women with preterm labor; however, it has worse diagnostic performance than that of AF IL-6 and similar performance to that of serum CRP. 
    To predict imminent preterm delivery, plasma IL-6 had an overall diagnostic performance similar to that of AF IL-6 and serum CRP. Plasma MMP-9, C3a, and C5a levels could not predict intra-amniotic infection or imminent preterm delivery.

  16. Prediction of conversion from mild cognitive impairment to dementia with neuronally derived blood exosome protein profile.

    PubMed

    Winston, Charisse N; Goetzl, Edward J; Akers, Johnny C; Carter, Bob S; Rockenstein, Edward M; Galasko, Douglas; Masliah, Eliezer; Rissman, Robert A

    2016-01-01

    Levels of Alzheimer's disease (AD)-related proteins in plasma neuronal derived exosomes (NDEs) were quantified to identify biomarkers for prediction and staging of mild cognitive impairment (MCI) and AD. Plasma exosomes were extracted, precipitated, and enriched for neuronal source by anti-L1CAM antibody absorption. NDEs were characterized by size (Nanosight) and shape (TEM) and extracted NDE protein biomarkers were quantified by ELISAs. Plasma NDE cargo was injected into normal mice, and results were characterized by immunohistochemistry to determine pathogenic potential. Plasma NDE levels of P-T181-tau, P-S396-tau, and Aβ1-42 were significantly higher, whereas those of neurogranin (NRGN) and the repressor element 1-silencing transcription factor (REST) were significantly lower in AD and MCI converting to AD (ADC) patients compared to cognitively normal controls (CNC) subjects and stable MCI patients. Mice injected with plasma NDEs from ADC patients displayed increased P-tau (PHF-1 antibody)-positive cells in the CA1 region of the hippocampus compared to plasma NDEs from CNC and stable MCI patients. Abnormal plasma NDE levels of P-tau, Aβ1-42, NRGN, and REST accurately predict conversion of MCI to AD dementia. Plasma NDEs from demented patients seeded tau aggregation and induced AD-like neuropathology in normal mouse CNS.

  17. Predictive value of high sensitivity CRP in patients with diastolic heart failure.

    PubMed

    Michowitz, Yoav; Arbel, Yaron; Wexler, Dov; Sheps, David; Rogowski, Ori; Shapira, Itzhak; Berliner, Shlomo; Keren, Gad; George, Jacob; Roth, Arie

    2008-04-25

    C-reactive protein (CRP) has been tested in patients with systolic heart failure (HF) and mixed results have been obtained with regard to its potential predictive value. However, the role of CRP in patients with diastolic HF is not established. We studied the predictive role of high sensitivity CRP (hsCRP) in patients with diastolic HF. HsCRP levels were measured in a cohort of CHF outpatients, 77 patients with diastolic HF and 217 patients with systolic HF. Concentrations were compared to a large cohort of healthy subjects (n=7701) and associated with the HF admissions and mortality of the patients. Levels of hsCRP did not differ between patients with systolic and diastolic HF and were significantly elevated compared to the cohort of healthy subjects even after adjustment for various clinical parameters (p<0.0001). In patients with diastolic HF, hsCRP levels were associated with New York Heart Association functional class (NYHA-FC) (r=0.31, p=0.01). In a univariate Cox regression model, hsCRP levels independently predicted hospitalizations in patients with systolic but not diastolic HF (p=0.047). HsCRP concentrations are elevated in patients with diastolic HF and correlate with disease severity; their prognostic value in this patient population should be further investigated.

  18. Protein biogenesis machinery is a driver of replicative aging in yeast

    PubMed Central

    Janssens, Georges E; Meinema, Anne C; González, Javier; Wolters, Justina C; Schmidt, Alexander; Guryev, Victor; Bischoff, Rainer; Wit, Ernst C; Veenhoff, Liesbeth M; Heinemann, Matthias

    2015-01-01

    An integrated account of the molecular changes occurring during the process of cellular aging is crucial towards understanding the underlying mechanisms. Here, using novel culturing and computational methods as well as latest analytical techniques, we mapped the proteome and transcriptome during the replicative lifespan of budding yeast. With age, we found primarily proteins involved in protein biogenesis to increase relative to their transcript levels. Exploiting the dynamic nature of our data, we reconstructed high-level directional networks, where we found the same protein biogenesis-related genes to have the strongest ability to predict the behavior of other genes in the system. We identified metabolic shifts and the loss of stoichiometry in protein complexes as being consequences of aging. We propose a model whereby the uncoupling of protein levels of biogenesis-related genes from their transcript levels is causal for the changes occurring in aging yeast. Our model explains why targeting protein synthesis, or repairing the downstream consequences, can serve as interventions in aging. DOI: http://dx.doi.org/10.7554/eLife.08527.001 PMID:26422514

  19. Protein profiles associated with survival in lung adenocarcinoma

    PubMed Central

    Chen, Guoan; Gharib, Tarek G; Wang, Hong; Huang, Chiang-Ching; Kuick, Rork; Thomas, Dafydd G.; Shedden, Kerby A.; Misek, David E.; Taylor, Jeremy M. G.; Giordano, Thomas J.; Kardia, Sharon L. R.; Iannettoni, Mark D.; Yee, John; Hogg, Philip J.; Orringer, Mark B.; Hanash, Samir M.; Beer, David G.

    2003-01-01

    Morphologic assessment of lung tumors is informative but insufficient to adequately predict patient outcome. We previously identified transcriptional profiles that predict patient survival, and here we identify proteins associated with patient survival in lung adenocarcinoma. A total of 682 individual protein spots were quantified in 90 lung adenocarcinomas by using quantitative two-dimensional polyacrylamide gel electrophoresis analysis. A leave-one-out cross-validation procedure using the top 20 survival-associated proteins identified by Cox modeling indicated that protein profiles as a whole can predict survival in stage I tumor patients (P = 0.01). Thirty-three of 46 survival-associated proteins were identified by using mass spectrometry. Expression of 12 candidate proteins was confirmed as tumor-derived with immunohistochemical analysis and tissue microarrays. Oligonucleotide microarray results from both the same tumors and from an independent study showed mRNAs associated with survival for 11 of 27 encoded genes. Combined analysis of protein and mRNA data revealed 11 components of the glycolysis pathway as associated with poor survival. Among these candidates, phosphoglycerate kinase 1 was associated with survival in the protein study, in both mRNA studies and in an independent validation set of 117 adenocarcinomas and squamous lung tumors using tissue microarrays. Elevated levels of phosphoglycerate kinase 1 in the serum were also significantly correlated with poor outcome in a validation set of 107 patients with lung adenocarcinomas using ELISA analysis. These studies identify new prognostic biomarkers and indicate that protein expression profiles can predict the outcome of patients with early-stage lung cancer. PMID:14573703

  20. Morphine Regulated Synaptic Networks Revealed by Integrated Proteomics and Network Analysis

    PubMed Central

    Stockton, Steven D.; Gomes, Ivone; Liu, Tong; Moraje, Chandrakala; Hipólito, Lucia; Jones, Matthew R.; Ma'ayan, Avi; Morón, Jose A.; Li, Hong; Devi, Lakshmi A.

    2015-01-01

    Despite its efficacy, the use of morphine for the treatment of chronic pain remains limited because of the rapid development of tolerance, dependence and ultimately addiction. These undesired effects are thought to be because of alterations in synaptic transmission and neuroplasticity within the reward circuitry including the striatum. In this study we used subcellular fractionation and quantitative proteomics combined with computational approaches to investigate the morphine-induced protein profile changes at the striatal postsynaptic density. Over 2,600 proteins were identified by mass spectrometry analysis of subcellular fractions enriched in postsynaptic density associated proteins from saline or morphine-treated striata. Among these, the levels of 34 proteins were differentially altered in response to morphine. These include proteins involved in G-protein coupled receptor signaling, regulation of transcription and translation, chaperones, and protein degradation pathways. The altered expression levels of several of these proteins were validated by Western blotting analysis. Using the Genes2Fans software suite we connected the differentially expressed proteins with proteins identified within the known background protein-protein interaction network. This led to the generation of a network consisting of 116 proteins with 40 significant intermediates. To validate this, we confirmed the presence of three proteins predicted to be significant intermediates: caspase-3, receptor-interacting serine/threonine protein kinase 3 and NEDD4 (an E3-ubiquitin ligase identified as a neural precursor cell expressed developmentally down-regulated protein 4). Because this morphine-regulated network predicted alterations in proteasomal degradation, we examined the global ubiquitination state of postsynaptic density proteins and found it to be substantially altered.
    Together, these findings suggest a role for protein degradation and for the ubiquitin/proteasomal system in the etiology of opiate dependence and addiction. PMID:26149443

  1. Protein construct storage: Bayesian variable selection and prediction with mixtures.

    PubMed

    Clyde, M A; Parmigiani, G

    1998-07-01

    Determining optimal conditions for protein storage while maintaining a high level of protein activity is an important question in pharmaceutical research. A designed experiment based on a space-filling design was conducted to understand the effects of factors affecting protein storage and to establish optimal storage conditions. Different model-selection strategies to identify important factors may lead to very different answers about optimal conditions. Uncertainty about which factors are important, or model uncertainty, can be a critical issue in decision-making. We use Bayesian variable selection methods for linear models to identify important variables in the protein storage data, while accounting for model uncertainty. We also use the Bayesian framework to build predictions based on a large family of models, rather than an individual model, and to evaluate the probability that certain candidate storage conditions are optimal.

  2. Integration of structural dynamics and molecular evolution via protein interaction networks: a new era in genomic medicine.

    PubMed

    Kumar, Avishek; Butler, Brandon M; Kumar, Sudhir; Ozkan, S Banu

    2015-12-01

    Sequencing technologies are revealing many new non-synonymous single nucleotide variants (nsSNVs) in each personal exome. To assess their functional impacts, comparative genomics is frequently employed to predict if they are benign or not. However, evolutionary analysis alone is insufficient, because it misdiagnoses many disease-associated nsSNVs, such as those at positions involved in protein interfaces, and because evolutionary predictions do not provide mechanistic insights into functional change or loss. Structural analyses can aid in overcoming both of these problems by incorporating conformational dynamics and allostery in nsSNV diagnosis. Finally, protein-protein interaction networks using systems-level methodologies shed light onto disease etiology and pathogenesis. Bridging these network approaches with structurally resolved protein interactions and dynamics will advance genomic medicine. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. FINDSITE-metal: Integrating evolutionary information and machine learning for structure-based metal binding site prediction at the proteome level

    PubMed Central

    Brylinski, Michal; Skolnick, Jeffrey

    2010-01-01

    The rapid accumulation of gene sequences, many of which are hypothetical proteins with unknown function, has stimulated the development of accurate computational tools for protein function prediction with evolution/structure-based approaches showing considerable promise. In this paper, we present FINDSITE-metal, a new threading-based method designed specifically to detect metal binding sites in modeled protein structures. Comprehensive benchmarks using different quality protein structures show that weakly homologous protein models provide sufficient structural information for quite accurate annotation by FINDSITE-metal. Combining structure/evolutionary information with machine learning results in highly accurate metal binding annotations; for protein models constructed by TASSER, whose average Cα RMSD from the native structure is 8.9 Å, 59.5% (71.9%) of the best of top five predicted metal locations are within 4 Å (8 Å) from a bound metal in the crystal structure. For most of the targets, multiple metal binding sites are detected with the best predicted binding site at rank 1 and within the top 2 ranks in 65.6% and 83.1% of the cases, respectively. Furthermore, for iron, copper, zinc, calcium and magnesium ions, the binding metal can be predicted with high, typically 70-90%, accuracy. FINDSITE-metal also provides a set of confidence indexes that help assess the reliability of predictions. Finally, we describe the proteome-wide application of FINDSITE-metal that quantifies the metal binding complement of the human proteome. FINDSITE-metal is freely available to the academic community at http://cssb.biology.gatech.edu/findsite-metal/. PMID:21287609

  4. Relative Abundance of Integral Plasma Membrane Proteins in Arabidopsis Leaf and Root Tissue Determined by Metabolic Labeling and Mass Spectrometry

    PubMed Central

    Bernfur, Katja; Larsson, Olaf; Larsson, Christer; Gustavsson, Niklas

    2013-01-01

    Metabolic labeling of proteins with a stable isotope (15N) in intact Arabidopsis plants was used for accurate determination by mass spectrometry of differences in protein abundance between plasma membranes isolated from leaves and roots. In total, 703 proteins were identified, of which 188 were predicted to be integral membrane proteins. Major classes were transporters, receptors, proteins involved in membrane trafficking and cell wall-related proteins. Forty-one of the integral proteins, including nine of the 13 isoforms of the PIP (plasma membrane intrinsic protein) aquaporin subfamily, could be identified by peptides unique to these proteins, which made it possible to determine their relative abundance in leaf and root tissue. In addition, peptides shared between isoforms gave information on the proportions of these isoforms. A comparison between our data for protein levels and corresponding data for mRNA levels in the widely used database Genevestigator showed an agreement for only about two thirds of the proteins. By contrast, localization data available in the literature for 21 of the 41 proteins show a much better agreement with our data, in particular data based on immunostaining of proteins and GUS-staining of promoter activity. Thus, although mRNA levels may provide a useful approximation for protein levels, detection and quantification of isoform-specific peptides by proteomics should generate the most reliable data for the proteome. PMID:23990937

  5. TIM Barrel Protein Structure Classification Using Alignment Approach and Best Hit Strategy

    NASA Astrophysics Data System (ADS)

    Chu, Jia-Han; Lin, Chun Yuan; Chang, Cheng-Wen; Lee, Chihan; Yang, Yuh-Shyong; Tang, Chuan Yi

    2007-11-01

    The classification of protein structures is essential for their function determination in bioinformatics. It has been estimated that around 10% of all known enzymes have TIM barrel domains from the Structural Classification of Proteins (SCOP) database. With its high sequence variation and diverse functionalities, the TIM barrel protein has become an attractive target for protein engineering and for evolutionary studies. Hence, in this paper, an alignment approach with the best hit strategy is proposed to classify TIM barrel protein structures at the superfamily and family levels in SCOP. This approach is also used for classification at the class level in the Enzyme nomenclature (ENZYME) database. Two testing data sets, TIM40D and TIM95D, are both used to evaluate this approach. The resulting classification has an overall prediction accuracy rate of 90.3% for the superfamily level in SCOP, 89.5% for the family level in SCOP and 70.1% for the class level in ENZYME. These results demonstrate that the alignment approach with the best hit strategy is a simple and viable method for TIM barrel protein structure classification, even when only amino acid sequence information is available.
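
    A best hit strategy of the kind described above can be sketched as follows: score the query against every labelled reference sequence and assign the label of the top-scoring one. This is only an illustration under stated assumptions. The similarity function uses Python's difflib as a stand-in for a real alignment score, and the sequences and labels are invented; none of this reproduces the authors' alignment method or data sets.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    # Stand-in for an alignment score (assumption): difflib's ratio,
    # not a real pairwise sequence alignment.
    return SequenceMatcher(None, a, b).ratio()

def best_hit_label(query, references):
    """references: list of (sequence, label) pairs.
    Returns the label of the most similar reference and its score."""
    best_seq, best_label = max(references, key=lambda ref: similarity(query, ref[0]))
    return best_label, similarity(query, best_seq)

# Hypothetical labelled references and query
references = [
    ("MKVLAAGIVGLGG", "superfamily_A"),
    ("GSHMTTPEELLKR", "superfamily_B"),
]
label, score = best_hit_label("MKVLAAGIVALGG", references)
print(label, round(score, 3))
```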

  6. Statistical potential-based amino acid similarity matrices for aligning distantly related protein sequences.

    PubMed

    Tan, Yen Hock; Huang, He; Kihara, Daisuke

    2006-08-15

    Aligning distantly related protein sequences is a long-standing problem in bioinformatics, and a key for successful protein structure prediction. Its importance is increasing recently in the context of structural genomics projects because more and more experimentally solved structures are available as templates for protein structure modeling. Toward this end, recent structure prediction methods employ profile-profile alignments, and various ways of aligning two profiles have been developed. More fundamentally, a better amino acid similarity matrix can improve a profile itself; thereby resulting in more accurate profile-profile alignments. Here we have developed novel amino acid similarity matrices from knowledge-based amino acid contact potentials. Contact potentials are used because the contact propensity to the other amino acids would be one of the most conserved features of each position of a protein structure. The derived amino acid similarity matrices are tested on benchmark alignments at three different levels, namely, the family, the superfamily, and the fold level. Compared to BLOSUM45 and the other existing matrices, the contact potential-based matrices perform comparably in the family level alignments, but clearly outperform in the fold level alignments. The contact potential-based matrices perform even better when suboptimal alignments are considered. Comparing the matrices themselves with each other revealed that the contact potential-based matrices are very different from BLOSUM45 and the other matrices, indicating that they are located in a different basin in the amino acid similarity matrix space.

  7. Prediction of GCRV virus-host protein interactome based on structural motif-domain interactions.

    PubMed

    Zhang, Aidi; He, Libo; Wang, Yaping

    2017-03-02

    Grass carp hemorrhagic disease, caused by grass carp reovirus (GCRV), is the most fatal disease in grass carp aquaculture. Protein-protein interactions between virus and host are one avenue through which GCRV can trigger infection and induce disease. Experimental approaches for the detection of the host-virus interactome have many inherent limitations, and studies on protein-protein interactions between GCRV and its host remain rare. In this study, based on known motif-domain interaction information, we systematically predicted the GCRV virus-host protein interactome by using a motif-domain interaction pair searching strategy. These proteins were derived from different domain families and were predicted to interact with different motif patterns in GCRV. JAM-A protein was successfully predicted to interact with motifs of the GCRV Sigma1-like protein, and shared a similar binding mode with that of orthoreovirus. Differentially expressed genes during the GCRV infection process were extracted and mapped to our predicted interactome. The overlapping genes displayed different tissue expression distributions; overall, the expression level in intestine was higher than in the other three tissues, which may suggest that these genes are more active in intestine. Function annotation and pathway enrichment analysis revealed that the host targets were largely involved in signaling and immune pathways, such as the interferon-gamma signaling pathway, VEGF signaling pathway, EGF receptor signaling pathway, B cell activation, and T cell activation. Although the predicted PPIs may contain some false positives due to limited data resources and the sparse research background in non-model species, the computational method still provides a reasonable number of interactions, which can be further validated by high-throughput experiments. The findings of this work will contribute to the development of systems biology for GCRV infectious diseases, and help guide the identification of novel receptors of GCRV in its host.

  8. Molecular cloning and characterization of Aspergillus nidulans cyclophilin B.

    PubMed

    Joseph, J D; Heitman, J; Means, A R

    1999-06-01

    Cyclophilins are an evolutionarily conserved family of proteins which serve as the intracellular receptors for the immunosuppressive drug cyclosporin A. Here we report the characterization of the first cyclophilin cloned from the filamentous fungus Aspergillus nidulans (CYPB). Sequence analysis of the cypB gene predicts an encoded protein with highest homology to the murine cyclophilin B protein. The sequence similarity includes an N-terminal sequence predicted to target the protein to the endoplasmic reticulum (ER) as well as a C-terminal sequence predicted to retain the mature protein in the ER. The bacterially expressed hexa-histidine tagged protein displays peptidyl-prolyl isomerase activity which is inhibited by cyclosporin A. In the presence of cyclosporin A, the expressed protein also inhibits purified calcineurin. When the endogenous cypB gene was disrupted and placed under the control of the regulatable alcohol dehydrogenase promoter, the strain demonstrated no detectable growth phenotype under conditions which induce or repress cypB transcription. Induction or repression of the cypB gene also did not affect the sensitivity of A. nidulans to cyclosporin A. cypB mRNA levels were significantly elevated under severe heat shock conditions, indicating a possible role for the A. nidulans cyclophilin B protein during growth in high stress environments. Copyright 1999 Academic Press.

  9. Defining Aggressive Prostate Cancer Using a 12-Gene Model

    PubMed Central

    Riva, Alberto; Kim, Robert; Varambally, Sooryanarayana; He, Le; Kutok, Jeff; Aster, Jonathan C; Tang, Jeffery; Kuefer, Rainer; Hofer, Matthias D; Febbo, Phillip G; Chinnaiyan, Arul M; Rubin, Mark A

    2006-01-01

    The critical clinical question in prostate cancer research is: How do we develop means of distinguishing aggressive disease from indolent disease? Using a combination of proteomic and expression array data, we identified a set of 36 genes with concordant dysregulation of protein products that could be evaluated in situ by quantitative immunohistochemistry. Another five prostate cancer biomarkers were included. Using linear discriminant analysis, we determined that the optimal model used to predict prostate cancer progression consisted of 12 proteins. Using a separate patient population, transcriptional levels of the 12 genes encoding these proteins predicted prostate-specific antigen failure in 79 men following surgery for clinically localized prostate cancer (P = .0015). This study demonstrates that cross-platform models can lead to predictive models with the possible advantage of being more robust through this selection process. PMID:16533427

  10. RBscore&NBench: a high-level web server for nucleic acid binding residues prediction with a large-scale benchmarking database.

    PubMed

    Miao, Zhichao; Westhof, Eric

    2016-07-08

    RBscore&NBench combines a web server, RBscore and a database, NBench. RBscore predicts RNA-/DNA-binding residues in proteins and visualizes the prediction scores and features on protein structures. The scoring scheme of RBscore directly links feature values to nucleic acid binding probabilities and illustrates the nucleic acid binding energy funnel on the protein surface. To avoid dataset, binding site definition and assessment metric biases, we compared RBscore with 18 web servers and 3 stand-alone programs on 41 datasets, which demonstrated the high and stable accuracy of RBscore. A comprehensive comparison led us to develop a benchmark database named NBench. The web server is available on: http://ahsoka.u-strasbg.fr/rbscorenbench/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. Multiple SNPs Within and Surrounding the Apolipoprotein E Gene Influence Cerebrospinal Fluid Apolipoprotein E Protein Levels

    PubMed Central

    Bekris, Lynn M.; Millard, Steven P.; Galloway, Nichole M.; Vuletic, Simona; Albers, John J.; Li, Ge; Galasko, Douglas R.; DeCarli, Charles; Farlow, Martin R.; Clark, Chris M.; Quinn, Joseph F.; Kaye, Jeffrey A.; Schellenberg, Gerard D.; Tsuang, Debby; Peskind, Elaine R.; Yu, Chang-En

    2010-01-01

    The ε4 allele of the apolipoprotein E gene (APOE) is associated with increased risk and earlier age at onset in late onset Alzheimer’s disease (AD). Other factors, such as expression level of apolipoprotein E protein (apoE), have been postulated to modify the APOE related risk of developing AD. Multiple loci in and outside of APOE are associated with a high risk of AD. The aim of this exploratory hypothesis generating investigation was to determine if some of these loci predict cerebrospinal fluid (CSF) apoE levels in healthy non-demented subjects. CSF apoE levels were measured from healthy non-demented subjects 21–87 years of age (n = 134). Backward regression models were used to evaluate the influence of 21 SNPs, within and surrounding APOE, on CSF apoE levels while taking into account age, gender, APOE ε4 and correlation between SNPs (linkage disequilibrium). APOE ε4 genotype does not predict CSF apoE levels. Three SNPs within the TOMM40 gene, one APOE promoter SNP and two SNPs within distal APOE enhancer elements (ME1 and BCR) predict CSF apoE levels. Further investigation of the genetic influence of these loci on apoE expression levels in the central nervous system is likely to provide new insight into apoE regulation as well as AD pathogenesis. PMID:18430993

  12. Tolerance of a high-protein baked-egg product in egg-allergic children.

    PubMed

    Saifi, Maryam; Swamy, Nithya; Crain, Maria; Brown, L Steven; Bird, John Andrew

    2016-05-01

    Egg allergy is one of the most common immunoglobulin E (IgE)-mediated food allergies. Extensively heating egg has been found to decrease its allergenicity and 64% to 84% of children allergic to egg have been found to tolerate baked-egg products. Because there is no reliable method for predicting baked-egg tolerance, oral food challenges remain the gold standard. Prior studies have reported on baked-egg challenges using up to 2.2 g of egg white (EW) protein. The objective was to establish whether children with egg allergy would pass a baked-egg challenge to a larger amount of egg protein and to identify potential criteria for predicting the likelihood of baked-egg tolerance. A chart review was conducted of all patients 6 months to 18 years of age with egg allergy who underwent oral baked-egg challenges at Children's Medical Center Dallas over a 2-year period. Challenges were conducted in the clinic with a 3.8-g baked-egg product. Fifty-nine of 94 patients (63%) tolerated the 3.8-g baked-egg product. The presence of asthma (P < .01), EW skin prick test (SPT; P < .01) reactive wheal, and EW-specific IgE level (P = .02) correlated with baked-egg reactivity, whereas ovomucoid-specific IgE level did not. The positive predictive value approached 66% at an EW SPT reactive wheal of 10 mm and 60% for an EW-specific IgE level of 8 kUA/L. Most subjects with egg allergy tolerated baked egg. This study is the first to use 3.8 g of EW protein for the challenges. The EW SPT wheal diameter and EW-specific IgE levels were the best predictors of baked-egg tolerance. Copyright © 2016 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  13. Optimization of protein-protein docking for predicting Fc-protein interactions.

    PubMed

    Agostino, Mark; Mancera, Ricardo L; Ramsland, Paul A; Fernández-Recio, Juan

    2016-11-01

    The antibody crystallizable fragment (Fc) is recognized by effector proteins as part of the immune system. Pathogens produce proteins that bind Fc in order to subvert or evade the immune response. The structural characterization of the determinants of Fc-protein association is essential to improve our understanding of the immune system at the molecular level and to develop new therapeutic agents. Furthermore, Fc-binding peptides and proteins are frequently used to purify therapeutic antibodies. Although several structures of Fc-protein complexes are available, numerous others have not yet been determined. Protein-protein docking could be used to investigate Fc-protein complexes; however, improved approaches are necessary to efficiently model such cases. In this study, a docking-based structural bioinformatics approach is developed for predicting the structures of Fc-protein complexes. Based on the available set of X-ray structures of Fc-protein complexes, three regions of the Fc, loosely corresponding to three turns within the structure, were defined as containing the essential features for protein recognition and used as restraints to filter the initial docking search. Rescoring the filtered poses with an optimal scoring strategy provided a success rate of approximately 80% of the test cases examined within the top ranked 20 poses, compared to approximately 20% by the initial unrestrained docking. The developed docking protocol provides a significant improvement over the initial unrestrained docking and will be valuable for predicting the structures of currently undetermined Fc-protein complexes, as well as in the design of peptides and proteins that target Fc. Copyright © 2016 John Wiley & Sons, Ltd.

  14. Multiobjective evolutionary algorithm with many tables for purely ab initio protein structure prediction.

    PubMed

    Brasil, Christiane Regina Soares; Delbem, Alexandre Claudio Botazzo; da Silva, Fernando Luís Barroso

    2013-07-30

    This article focuses on the development of an approach for ab initio protein structure prediction (PSP) without using any earlier knowledge from similar protein structures, such as fragment-based statistics or inference of secondary structures. Such an approach is called purely ab initio prediction. The article shows that well-designed multiobjective evolutionary algorithms can predict relevant protein structures in a purely ab initio way. One challenge for purely ab initio PSP is the prediction of structures with β-sheets. To work with such proteins, this research has also developed procedures to efficiently estimate hydrogen bond and solvation contribution energies. Considering van der Waals, electrostatic, hydrogen bond, and solvation contribution energies, the PSP is a problem with four energetic terms to be minimized. Each interaction energy term can be considered an objective of an optimization method. Combinatorial problems with four objectives have been considered too complex for the available multiobjective optimization (MOO) methods. The proposed approach, called "Multiobjective evolutionary algorithms with many tables" (MEAMT), can efficiently deal with four objectives through the combination thereof, performing a more adequate sampling of the objective space. Therefore, this method can better map the promising regions in this space, predicting structures in a purely ab initio way. In other words, MEAMT is an efficient optimization method for MOO, which explores simultaneously the search space as well as the objective space. MEAMT can predict structures with one or two domains with RMSDs comparable to values obtained by recently developed ab initio methods (GAPFCG, I-PAES, and Quark) that use different levels of earlier knowledge. Copyright © 2013 Wiley Periodicals, Inc.
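
    To make the four-objective setting concrete, the sketch below shows plain Pareto dominance over tuples of four energy terms, all to be minimized. It is not the MEAMT algorithm, whose table-based sampling is not described in this abstract; it only illustrates what it means for one candidate structure to dominate another. The energy values are invented for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical energies: (van der Waals, electrostatic, hydrogen bond, solvation)
candidates = [
    (-10.0, -5.0, -3.0, -2.0),
    (-8.0, -6.0, -3.0, -2.0),
    (-7.0, -4.0, -2.0, -1.0),   # dominated by the first candidate
    (-10.0, -5.0, -3.0, -1.0),  # dominated by the first candidate
]
front = pareto_front(candidates)
print(front)
```

    With more objectives, fewer candidates dominate one another, so the non-dominated set tends to grow; this is one common reason many-objective problems are harder to search.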

  15. Computational analysis identifies putative prognostic biomarkers of pathological scarring in skin wounds.

    PubMed

    Nagaraja, Sridevi; Chen, Lin; DiPietro, Luisa A; Reifman, Jaques; Mitrophanov, Alexander Y

    2018-02-20

    Pathological scarring in wounds is a prevalent clinical outcome with limited prognostic options. The objective of this study was to investigate whether cellular signaling proteins could be used as prognostic biomarkers of pathological scarring in traumatic skin wounds. We used our previously developed and validated computational model of injury-initiated wound healing to simulate the time courses for platelets, 6 cell types, and 21 proteins involved in the inflammatory and proliferative phases of wound healing. Next, we analysed thousands of simulated wound-healing scenarios to identify those that resulted in pathological (i.e., excessive) scarring. Then, we identified candidate proteins that were elevated (or decreased) at the early stages of wound healing in those simulations and could therefore serve as predictive biomarkers of pathological scarring outcomes. Finally, we performed logistic regression analysis and calculated the area under the receiver operating characteristic curve to quantitatively assess the predictive accuracy of the model-identified putative biomarkers. We identified three proteins (interleukin-10, tissue inhibitor of matrix metalloproteinase-1, and fibronectin) whose levels were elevated in pathological scars as early as 2 weeks post-wounding and could predict a pathological scarring outcome occurring 40 days after wounding with 80% accuracy. Our method for predicting putative prognostic wound-outcome biomarkers may serve as an effective means to guide the identification of proteins predictive of pathological scarring.
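
    The area under the receiver operating characteristic curve mentioned above can be computed directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below assumes binary labels and arbitrary toy scores; it does not use the study's simulated data or its logistic regression model.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical): 1 = pathological scarring, 0 = normal healing
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```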

  16. Automated protein structure modeling in CASP9 by I-TASSER pipeline combined with QUARK-based ab initio folding and FG-MD-based structure refinement

    PubMed Central

    Xu, Dong; Zhang, Jian; Roy, Ambrish; Zhang, Yang

    2011-01-01

    I-TASSER is an automated pipeline for protein tertiary structure prediction using multiple threading alignments and iterative structure assembly simulations. In CASP9 experiments, two new algorithms, QUARK and FG-MD, were added to the I-TASSER pipeline for improving the structural modeling accuracy. QUARK is a de novo structure prediction algorithm used for structure modeling of proteins that lack detectable template structures. For distantly homologous targets, QUARK models are found useful as a reference structure for selecting good threading alignments and guiding the I-TASSER structure assembly simulations. FG-MD is an atomic-level structural refinement program that uses structural fragments collected from the PDB structures to guide molecular dynamics simulation and improve the local structure of predicted model, including hydrogen-bonding networks, torsion angles and steric clashes. Despite considerable progress in both the template-based and template-free structure modeling, significant improvements on protein target classification, domain parsing, model selection, and ab initio folding of beta-proteins are still needed to further improve the I-TASSER pipeline. PMID:22069036

  17. Prognostic relevance of the expressions of CAV1 and TES genes on 7q31 in melanoma.

    PubMed

    Vizkeleti, Laura; Ecsedi, Szilvia; Rakosy, Zsuzsa; Begany, Agnes; Emri, Gabriella; Toth, Reka; Orosz, Adrienn; Szollosi, Attila Gabor; Mehes, Gabor; Adany, Roza; Balazs, Margit

    2012-01-01

    The 7q31 locus contains several genes affected in cancer progression. Although evidence exists regarding its impact on tumorigenesis, the role of genetic alterations and the expression of locus-related genes are still controversial. Our study aimed to define the 7q31 copy number alterations in primary melanomas, primary-metastatic tumor pairs and cell lines. Data were correlated with clinical-pathological parameters. Genetic data show that 7q31 copy number distribution was heterogeneous in both primary and metastatic tumors. Extra copies were frequently accompanied by chromosome 7 polysomy, and were significantly increased in primary lesions with poor prognosis. Additionally, we determined the mRNA and protein levels of the locus-related CAV1 and TES genes. TES mRNA level was associated with metastatic location. CAV1 mRNA and protein levels were significantly higher in thicker tumors; however, lack of protein was also observed in a subpopulation of thin lesions. Expression of CAV1 and TES was not associated with 7q31 alterations. In conclusion, 7q31 amplification can predict unfavorable outcome. Alterations of TES mRNA level may predict the location of metastasis. CAV1 may affect cancer cell invasion.

  18. Identification of functional candidates amongst hypothetical proteins of Treponema pallidum ssp. pallidum.

    PubMed

    Naqvi, Ahmad Abu Turab; Shahbaaz, Mohd; Ahmad, Faizan; Hassan, Md Imtaiyaz

    2015-01-01

    Syphilis is a globally occurring venereal disease, and its infection is propagated through sexual contact. The causative agent of syphilis, Treponema pallidum ssp. pallidum, a Gram-negative spirochaete, is an obligate human parasite. The genome of the T. pallidum ssp. pallidum SS14 strain (RefSeq NC_010741.1) encodes 1,027 proteins, of which 444 proteins are known as hypothetical proteins (HPs), i.e., proteins of unknown function. Here, we performed functional annotation of HPs of T. pallidum ssp. pallidum using various databases, domain architecture predictors, protein function annotators and clustering tools. We have analyzed the sequences of 444 HPs of T. pallidum ssp. pallidum and subsequently predicted the function of 207 HPs with a high level of confidence. However, functions of 237 HPs are predicted with less accuracy. We found various enzymes, transporters, and binding proteins in the annotated group of HPs that may be possible molecular targets, facilitating the survival of the pathogen. Our comprehensive analysis helps to understand the mechanism of pathogenesis and to provide many novel potential therapeutic interventions.

  19. Contribution of Molecular Allergen Analysis in Diagnosis of Milk Allergy.

    PubMed

    Bartuzi, Zbigniew; Cocco, Renata Rodrigues; Muraro, Antonella; Nowak-Węgrzyn, Anna

    2017-07-01

    We sought to describe the available evidence supporting the utilization of the molecular allergen analysis (MAA) for diagnosis and management of cow milk protein allergy (CMPA). Cow milk proteins are among the most common food allergens in IgE- and non-IgE-mediated food allergic disorders in children. Most individuals with CMPA are sensitized to both caseins and whey proteins. Caseins are more resistant to high temperatures compared to whey proteins. MAA is not superior to the conventional diagnostic tests based on the whole allergen extracts for diagnosis of CMPA. However, MAA can be useful in diagnosing tolerance to extensively heated milk proteins in baked foods. Children with CMPA and high levels of casein IgE are less likely to tolerate baked milk compared to children with low levels of casein IgE. Specific IgE-binding patterns to casein and beta-lactoglobulin peptides may predict the natural course of CMPA and differentiate subjects who are more likely to develop CMPA at a younger age versus those with a more persistent CMPA. Specific IgE-binding patterns to casein and beta-lactoglobulin peptides may also predict response to milk OIT and identify patients most likely to benefit from OIT.

  20. Assessment of test methods for evaluating effectiveness of cleaning flexible endoscopes.

    PubMed

    Washburn, Rebecca E; Pietsch, Jennifer J

    2018-06-01

    Strict adherence to each step of reprocessing is imperative to removing potentially infectious agents. Multiple methods for verifying proper reprocessing exist; however, each presents challenges and limitations, and best practice within the industry has not been established. Our goal was to evaluate endoscope cleaning verification tests with particular interest in the evaluation of the manual cleaning step. The results of the cleaning verification tests were compared with microbial culturing to see if a positive cleaning verification test would be predictive of microbial growth. This study was conducted at 2 high-volume endoscopy units within a multisite health care system. Each of the 90 endoscopes was tested for adenosine triphosphate, protein, microbial growth via agar plate, and rapid gram-negative culture via assay. The endoscopes were tested in 3 locations: the instrument channel, control knob, and elevator mechanism. This analysis showed a substantial level of agreement between protein detection post-manual cleaning and protein detection post-high-level disinfection at the control head for scopes sampled sequentially. This study suggests that if protein is detected post-manual cleaning, there is a significant likelihood that protein will also be detected post-high-level disinfection. It also implies that a cleaning verification test is not predictive of microbial growth. Copyright © 2018 Association for Professionals in Infection Control and Epidemiology, Inc. Published by Elsevier Inc. All rights reserved.

  1. Identification of Atg3 as an intrinsically disordered polypeptide yields insights into the molecular dynamics of autophagy-related proteins in yeast.

    PubMed

    Popelka, Hana; Uversky, Vladimir N; Klionsky, Daniel J

    2014-06-01

    The mechanism of autophagy relies on complex cell signaling and regulatory processes. Each cell contains many proteins that lack a rigid 3-dimensional structure under physiological conditions. These dynamic proteins, called intrinsically disordered proteins (IDPs) and protein regions (IDPRs), are predominantly involved in cell signaling and regulation. Yet, very little is known about their presence among proteins of the core autophagy machinery. In this work, we characterized the autophagy protein Atg3 from yeast and human along with 2 variants to show that Atg3 is an IDPRs-containing protein and that disorder/order predicted for these proteins from their amino acid sequence corresponds to their experimental characteristics. Based on this consensus, we applied the same prediction methods to all known Atg proteins from Saccharomyces cerevisiae. The data presented here provide an insight into the structural dynamics of each Atg protein. They also show that intrinsic disorder at various levels has to be taken into consideration for about half of the Atg proteins. This work should become a useful tool that will facilitate and encourage exploration of protein intrinsic disorder in autophagy.

  2. Glareosin: a novel sexually dimorphic urinary lipocalin in the bank vole, Myodes glareolus.

    PubMed

    Loxley, Grace M; Unsworth, Jennifer; Turton, Michael J; Jebb, Alexandra; Lilley, Kathryn S; Simpson, Deborah M; Rigden, Daniel J; Hurst, Jane L; Beynon, Robert J

    2017-09-01

    The urine of bank voles (Myodes glareolus) contains substantial quantities of a small protein that is expressed at much higher levels in males than females, and at higher levels in males in the breeding season. This protein was purified and completely sequenced at the protein level by mass spectrometry. Leucine/isoleucine ambiguity was completely resolved by metabolic labelling, monitoring the incorporation of dietary deuterated leucine into specific sites in the protein. The predicted mass of the sequenced protein was exactly consonant with the mass of the protein measured in bank vole urine samples, correcting for the formation of two disulfide bonds. The sequence of the protein revealed that it was a lipocalin related to aphrodisin and other odorant-binding proteins (OBPs), but differed from all OBPs previously described. The pattern of secretion in urine used for scent marking by male bank voles, and the similarity to other lipocalins used as chemical signals in rodents, suggest that this protein plays a role in male sexual and/or competitive communication. We propose the name glareosin for this novel protein to reflect the origin of the protein and to emphasize the distinction from known OBPs. © 2017 The Authors.

  3. Comparative analysis and assessment of M. tuberculosis H37Rv protein-protein interaction datasets

    PubMed Central

    2011-01-01

    Background M. tuberculosis is a formidable bacterial pathogen. There is thus an increasing demand for understanding the function and relationship of proteins in various strains of M. tuberculosis. Protein-protein interactions (PPIs) data are crucial for this kind of knowledge. However, the quality of the main available M. tuberculosis PPI datasets is unclear. This hampers the effectiveness of research that relies on these PPI datasets. Here, we analyze the two main available M. tuberculosis H37Rv PPI datasets. The first dataset is the high-throughput B2H PPI dataset from Wang et al.'s recent paper in the Journal of Proteome Research. The second dataset is from the STRING database, version 8.3, consisting entirely of H37Rv PPIs predicted using various methods. We find that these two datasets have a surprisingly low level of agreement. We postulate the following causes for this low level of agreement: (i) the H37Rv B2H PPI dataset is of low quality; (ii) the H37Rv STRING PPI dataset is of low quality; and/or (iii) the H37Rv STRING PPIs are predictions of other forms of functional associations rather than direct physical interactions. Results To test the quality of these two datasets, we evaluate them based on correlated gene expression profiles, coherent informative GO term annotations, and conservation in other organisms. We observe a significantly greater portion of PPIs in the H37Rv STRING PPI dataset (with score ≥ 770) having correlated gene expression profiles and coherent informative GO term annotations in both interaction partners than that in the H37Rv B2H PPI dataset. Predicted H37Rv interologs derived from non-M. tuberculosis experimental PPIs are much more similar to the H37Rv STRING functional associations dataset (with score ≥ 770) than the H37Rv B2H PPI dataset. H37Rv predicted physical interologs from IntAct also show extremely low similarity with the H37Rv B2H PPI dataset; and this similarity level is much lower than that between the S. aureus MRSA252 predicted physical interologs from IntAct and S. aureus MRSA252 pull-down PPIs. Comparative analysis with several representative two-hybrid PPI datasets in other species further confirms that the H37Rv B2H PPI dataset is of low quality. Next, to test the possibility that the H37Rv STRING PPIs are not purely direct physical interactions, we compare M. tuberculosis H37Rv protein pairs that catalyze adjacent steps in enzymatic reactions to B2H PPIs and predicted PPIs in STRING, which shows that these pairs have much lower similarity with the B2H PPIs than with the STRING PPIs. This result strongly suggests that the H37Rv STRING PPIs more likely correspond to indirect relationships between protein pairs than to B2H PPIs. For more precise support, we turn to S. cerevisiae for its comprehensively studied interactome. We compare S. cerevisiae predicted PPIs in STRING to three independent protein relationship datasets which respectively comprise PPIs reported in Y2H assays, protein pairs reported to be in the same protein complexes, and protein pairs that catalyze successive reaction steps in enzymatic reactions. Our analysis reveals that S. cerevisiae predicted STRING PPIs have much higher similarity to the latter two types of protein pairs than to two-hybrid PPIs. As H37Rv STRING PPIs are predicted using methods similar to those used for S. cerevisiae predicted STRING PPIs, this suggests that these H37Rv STRING PPIs are more likely to correspond to the latter two types of protein pairs rather than to two-hybrid PPIs as well. Conclusions The H37Rv B2H PPI dataset has low quality. It should not be used as the gold standard to assess the quality of other (possibly predicted) H37Rv PPI datasets. The H37Rv STRING PPI dataset also has low quality; nevertheless, a subset consisting of STRING PPIs with score ≥770 has satisfactory quality. 
However, these STRING “PPIs” should be interpreted as functional associations, which include a substantial portion of indirect protein interactions, rather than direct physical interactions. These two factors cause the strikingly low similarity between these two main H37Rv PPI datasets. The results and conclusions from this comparative analysis provide valuable guidance in using these M. tuberculosis H37Rv PPI datasets in subsequent studies for a wide range of purposes. PMID:22369691
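
    The agreement check described in this abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure the authors used, so the Jaccard index below is an assumption, as are the function names and the toy protein labels; only the STRING score threshold of 770 comes from the text.

```python
# Illustrative only: Jaccard overlap between two PPI datasets.
# The similarity measure and all protein labels are assumptions.

STRING_MIN_SCORE = 770  # score threshold quoted in the abstract


def high_confidence(scored_pairs, min_score=STRING_MIN_SCORE):
    """Keep only (a, b) pairs whose score meets the threshold."""
    return [(a, b) for a, b, score in scored_pairs if score >= min_score]


def normalize(pairs):
    """Treat interactions as unordered pairs and drop self-interactions."""
    return {frozenset(p) for p in pairs if p[0] != p[1]}


def jaccard(pairs_a, pairs_b):
    """Shared interactions divided by all distinct interactions."""
    a, b = normalize(pairs_a), normalize(pairs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy data with hypothetical protein labels (not real H37Rv interactions).
b2h_pairs = [("protA", "protB"), ("protC", "protD"), ("protE", "protF")]
string_scored = [
    ("protB", "protA", 900),  # same interaction as (protA, protB)
    ("protG", "protH", 800),
    ("protI", "protJ", 400),  # below threshold, filtered out
]

string_pairs = high_confidence(string_scored)
agreement = jaccard(b2h_pairs, string_pairs)
print(f"Jaccard agreement: {agreement:.2f}")
```

    Storing each interaction as a `frozenset` makes the comparison independent of the order in which the two partners are listed, which matters when datasets come from different sources.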

  4. Predictive models for customizing chemotherapy in advanced non-small cell lung cancer (NSCLC).

    PubMed

    Bonanno, Laura

    2013-06-01

    The backbone of first-line treatment for epidermal growth factor receptor (EGFR) wild-type (wt) advanced non-small cell lung cancer (NSCLC) patients is the use of a platinum-based chemotherapy combination. The treatment is characterized by great inter-individual variability in outcome. Molecular predictive markers are urgently needed to distinguish patients most likely to benefit from platinum-based treatment from resistant ones, thus optimizing the chemotherapy approach in NSCLC. Several components of DNA repair response (DRR) have been investigated as potential predictive markers. Among them, high levels of expression of ERCC1, both at protein and mRNA levels, have been associated with resistance to cisplatin in NSCLC. In addition, low levels of expression of RRM1, a target for gemcitabine, have been associated with improved overall survival (OS) in advanced NSCLC patients treated with cisplatin and gemcitabine. Preclinical data and retrospective analyses showed that BRCA1 is able to induce resistance to cisplatin and sensitivity to antimicrotubule agents. In addition, the mRNA levels of expression of RAP80, which encodes a protein cooperating with BRCA1 in homologous recombination (HR), have been shown to further sub-classify low BRCA1 NSCLC tumors, improving the predictive model. On the basis of biological knowledge of the DNA repair pathway and recent controversial results from clinical validation of potential molecular markers, integrated analysis of multiple DNA repair components could improve predictive information and pave the way to a new approach to customized chemotherapy clinical trials.

  5. Conservation of coevolving protein interfaces bridges prokaryote–eukaryote homologies in the twilight zone

    PubMed Central

    Rodriguez-Rivas, Juan; Marsili, Simone; Juan, David; Valencia, Alfonso

    2016-01-01

    Protein–protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein–protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein–protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein–protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach. PMID:27965389

  6. Highly sensitive detection of individual HEAT and ARM repeats with HHpred and COACH.

    PubMed

    Kippert, Fred; Gerloff, Dietlind L

    2009-09-24

    HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. 
It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains.

  7. Highly Sensitive Detection of Individual HEAT and ARM Repeats with HHpred and COACH

    PubMed Central

    Kippert, Fred; Gerloff, Dietlind L.

    2009-01-01

    Background HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Methodology and Principal Findings Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. 
Significance A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains. PMID:19777061

  8. Serum Mac-2 binding protein glycosylation isomer predicts grade F4 liver fibrosis in patients with biliary atresia.

    PubMed

    Yamada, Naoya; Sanada, Yukihiro; Tashiro, Masahisa; Hirata, Yuta; Okada, Noriki; Ihara, Yoshiyuki; Urahashi, Taizen; Mizuta, Koichi

    2017-02-01

    Mac-2 Binding Protein Glycosylation Isomer (M2BPGi) is a novel fibrosis marker. We examined the ability of M2BPGi to predict liver fibrosis in patients with biliary atresia (BA). Sixty-four patients who underwent living donor liver transplantation (LDLT) were included [median age, 1.1 years (range 0.4-16.0), male 16 patients (25.0%)]. We examined M2BPGi levels in serum obtained the day before LDLT, and we compared the value of the preoperative M2BPGi levels with the histological evaluation of fibrosis using the METAVIR fibrosis score. Subsequently, we assessed the ability of M2BPGi levels to predict fibrosis. The median M2BPGi level in patients with BA was 6.02 (range, 0.36-20.0), and 0, 1, 1, 11, and 51 patients had METAVIR fibrosis scores of F0, F1, F2, F3, and F4, respectively. In patients with F4 fibrosis, the median M2BPGi level was 6.88 (quartile; 5.235, 12.10), significantly higher than that in patients with F3 fibrosis, who had a median level of 2.42 (quartile; 1.93, 2.895; p < 0.01). Area under the curve analysis for the ability of M2BPGi level to predict grade F4 fibrosis was 0.917, with a specificity and sensitivity of 0.923 and 0.941, respectively. In comparison with other fibrosis markers such as hyaluronic acid, procollagen-III-peptide, type IV collagen 7S, and aspartate aminotransferase platelet ratio index, M2BPGi showed the strongest ability to predict grade F4 fibrosis. M2BPGi is a novel fibrosis marker for evaluating the status of the liver in patients with BA, especially when predicting grade F4 fibrosis.

  9. Serum early prostate cancer antigen (EPCA) as a significant predictor of incidental prostate cancer in patients undergoing transurethral resection of the prostate for benign prostatic hyperplasia.

    PubMed

    Zhao, Zhigang; Zeng, Guohua; Zhong, Wen

    2010-12-01

    Early prostate cancer antigen (EPCA), a nuclear matrix protein, has been recently suggested as a novel biomarker in malignant lesions of the prostate. This study aimed to determine whether preoperative serum EPCA levels predicted the presence of incidental prostate cancer (IPCa) in patients undergoing TURP for BPH. Serum EPCA levels were measured by ELISA in 449 consecutive patients with symptomatic BPH treated with TURP and 112 healthy men. Predictive performance of serum EPCA levels for IPCa was evaluated. With a cutoff of 10 ng/ml, serum EPCA protein had a 100% specificity for the healthy men and a 98% specificity and a 100% sensitivity in separating men with IPCa from those without. Serum EPCA levels in patients with IPCa were significantly higher than in those without and in healthy controls (17.63±2.42 ng/ml vs. 5.58±1.61 ng/ml and 4.95±1.43 ng/ml, all P<0.001), whereas an indwelling transurethral catheter presence and 5α-reductase inhibitor therapy had no effect on EPCA levels (P=0.144 and P=0.238, respectively). The area under ROC curves (AUC) showed that serum EPCA level had the best predictive accuracy of all IPCa (AUC: 0.952, 95% CI: 0.912-0.981, P<0.001). Univariate and multivariate Cox regression analyses further demonstrated the independent predictive performance of preoperative serum EPCA (hazard ratio: 4.23, 95% CI: 3.62-6.46, P<0.001). This study is the first to show that EPCA might be used as a highly sensitive and specific serum biomarker to predict IPCa presence and to help reduce the unnecessary biopsies taken before TURP in patients with BPH. © 2010 Wiley-Liss, Inc.

  10. Predictive Value of IL-8 for Sepsis and Severe Infections after Burn Injury - A Clinical Study

    PubMed Central

    Kraft, Robert; Herndon, David N; Finnerty, Celeste C; Cox, Robert A; Song, Juquan; Jeschke, Marc G

    2014-01-01

    The inflammatory response induced by burn injury contributes to increased incidence of infections, sepsis, organ failure, and mortality. Thus, monitoring post-burn inflammation is of paramount importance, but so far there are no reliable biomarkers available to monitor and/or predict infectious complications after burn. As IL-8 is a major mediator for inflammatory responses, the aim of our study was to determine whether IL-8 expression can be used to predict post-burn sepsis, infections, mortality, and other outcomes. Plasma cytokines, acute phase proteins, constitutive proteins, and hormones were analyzed during the first 60 days post injury from 468 pediatric burn patients. Demographics and clinical outcome variables (length of stay, infection, sepsis, multiorgan failure (MOF), and mortality) were recorded. A cut-off level for IL-8 was determined using receiver operating characteristic (ROC) analysis. Statistical significance was set at p<0.05. ROC analysis identified a cut-off level of 234 pg/ml for IL-8 for survival. Patients were grouped according to their average IL-8 levels relative to this cut-off and stratified into high (H) (n=133) and low (L) (n=335) groups. In the L group, regression analysis revealed a significant predictive value of IL-8 to percent of total body surface area (TBSA) burned and incidence of MOF (p<0.001). In the H group IL-8 levels were able to predict sepsis (p<0.002). In the H group, elevated IL-8 was associated with increased inflammatory and acute phase responses compared to the L group (p<0.05). High levels of IL-8 correlated with increased MOF, sepsis, and mortality. These data suggest that serum levels of IL-8 may be a valid biomarker for monitoring sepsis, infections, and mortality in burn patients. PMID:25514427
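
    A cut-off derived from ROC analysis, as mentioned in this abstract, can be sketched as follows. The abstract does not say which criterion was used to pick the cut-off, so maximizing Youden's J (sensitivity + specificity - 1) is an assumption here. The function name, the convention that values at or above the cut-off count as positive, and all numbers are hypothetical; the values are synthetic and are not study data.

```python
# Illustrative only: choose a biomarker cut-off by maximizing Youden's J.
# The selection criterion and all data below are assumptions.

def roc_cutoff(values, outcomes):
    """Return (cutoff, sensitivity, specificity) with the highest Youden's J.

    A case is called positive when its value is >= cutoff.
    outcomes holds 1 for an event (e.g. death) and 0 otherwise.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one case in each outcome class")

    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, outcomes) if y and v >= cutoff)
        tn = sum(1 for v, y in zip(values, outcomes) if not y and v < cutoff)
        sens, spec = tp / positives, tn / negatives
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1:]


# Synthetic biomarker levels (pg/ml) and outcomes; not taken from the study.
levels = [50, 120, 180, 200, 260, 300, 410, 520]
died = [0, 0, 0, 0, 1, 0, 1, 1]

cutoff, sens, spec = roc_cutoff(levels, died)
groups = ["H" if v >= cutoff else "L" for v in levels]
print(cutoff, sens, spec, groups)
```

    Once a cut-off is fixed, stratifying patients into high and low groups is a single comparison per patient, which mirrors the H/L grouping the abstract reports.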

  11. The value of time-averaged serum high-sensitivity C-reactive protein in prediction of mortality and dropout in peritoneal dialysis patients.

    PubMed

    Liu, Shou-Hsuan; Chen, Chao-Yu; Li, Yi-Jung; Wu, Hsin-Hsu; Lin, Chan-Yu; Chen, Yung-Chang; Chang, Ming-Yang; Hsu, Hsiang-Hao; Ku, Cheng-Lung; Tian, Ya-Chung

    2017-01-01

    C-reactive protein (CRP) is a useful biomarker for prediction of long-term outcomes in patients undergoing chronic dialysis. This observational cohort study evaluated whether the time-averaged serum high-sensitivity CRP (HS-CRP) level was a better predictor of clinical outcomes than a single HS-CRP level in patients undergoing peritoneal dialysis (PD). We classified 335 patients into three tertiles according to the time-averaged serum HS-CRP level and followed up regularly from January 2010 to December 2014. Clinical outcomes such as cardiovascular events, infection episodes, newly developed malignancy, encapsulating peritoneal sclerosis (EPS), dropout (death plus conversion to hemodialysis), and mortality were assessed. During a 5-year follow-up, 164 patients (49.0%) ceased PD; this included 52 patient deaths (15.5%), 100 patients (29.9%) who converted to hemodialysis, and 12 patients (3.6%) who received a kidney transplantation. The Kaplan-Meier survival analysis and log-rank test revealed a significantly worse survival accumulation in patients with high time-average HS-CRP levels. A multivariate Cox regression analysis revealed that a higher time-averaged serum HS-CRP level, older age, and the occurrence of cardiovascular events were independent mortality predictors. A higher time-averaged serum HS-CRP level, the occurrence of cardiovascular events, infection episodes, and EPS were important predictors of dropout. The receiver operating characteristic analysis verified that the value of the time-average HS-CRP level in predicting the 5-year mortality and dropout was superior to a single serum baseline HS-CRP level. This study shows that the time-averaged serum HS-CRP level is a better marker than a single baseline measurement in predicting the 5-year mortality and dropout in PD patients.

  12. SVM-Fold: a tool for discriminative multi-class protein fold and superfamily recognition

    PubMed Central

    Melvin, Iain; Ie, Eugene; Kuang, Rui; Weston, Jason; Stafford, William Noble; Leslie, Christina

    2007-01-01

    Background Predicting a protein's structural class from its amino acid sequence is a fundamental problem in computational biology. Much recent work has focused on developing new representations for protein sequences, called string kernels, for use with support vector machine (SVM) classifiers. However, while some of these approaches exhibit state-of-the-art performance at the binary protein classification problem, i.e. discriminating between a particular protein class and all other classes, few of these studies have addressed the real problem of multi-class superfamily or fold recognition. Moreover, there are only limited software tools and systems for SVM-based protein classification available to the bioinformatics community. Results We present a new multi-class SVM-based protein fold and superfamily recognition system and web server called SVM-Fold, which can be found at . Our system uses an efficient implementation of a state-of-the-art string kernel for sequence profiles, called the profile kernel, where the underlying feature representation is a histogram of inexact matching k-mer frequencies. We also employ a novel machine learning approach to solve the difficult multi-class problem of classifying a sequence of amino acids into one of many known protein structural classes. Binary one-vs-the-rest SVM classifiers that are trained to recognize individual structural classes yield prediction scores that are not comparable, so that standard "one-vs-all" classification fails to perform well. Moreover, SVMs for classes at different levels of the protein structural hierarchy may make useful predictions, but one-vs-all does not try to combine these multiple predictions. To deal with these problems, our method learns relative weights between one-vs-the-rest classifiers and encodes information about the protein structural hierarchy for multi-class prediction. 
In large-scale benchmark results based on the SCOP database, our code weighting approach significantly improves on the standard one-vs-all method for both the superfamily and fold prediction in the remote homology setting and on the fold recognition problem. Moreover, our code weight learning algorithm strongly outperforms nearest-neighbor methods based on PSI-BLAST in terms of prediction accuracy on every structure classification problem we consider. Conclusion By combining state-of-the-art SVM kernel methods with a novel multi-class algorithm, the SVM-Fold system delivers efficient and accurate protein fold and superfamily recognition. PMID:17570145
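
    The difference between a plain one-vs-all decision and a weighted one can be shown with a small sketch. This is not the SVM-Fold algorithm: the abstract says the weights are learned and that hierarchy information is encoded, whereas the weights below are fixed, hypothetical numbers, and the class names, scores, and `predict_class` helper are assumptions made for illustration.

```python
# Illustrative only: weighted vs. unweighted one-vs-rest decision rule.
# Weights are hypothetical constants, not learned as in the described method.

def predict_class(scores, weights):
    """Pick the class whose weighted one-vs-rest score is largest."""
    return max(scores, key=lambda c: weights.get(c, 1.0) * scores[c])


# Hypothetical raw scores from three binary one-vs-rest classifiers.
scores = {"fold_a": 1.2, "fold_b": 0.9, "fold_c": -0.3}

# Hypothetical weights that down-weight an over-confident classifier.
weights = {"fold_a": 0.5, "fold_b": 1.0, "fold_c": 1.0}

unweighted = max(scores, key=scores.get)
weighted = predict_class(scores, weights)
print(unweighted, weighted)
```

    Because the raw scores of independently trained binary classifiers are not on a common scale, rescaling them before taking the maximum can change the predicted class, as it does in this toy case.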

  13. Lowered quality of life in mood disorders is associated with increased neuro-oxidative stress and basal thyroid-stimulating hormone levels and use of anticonvulsant mood stabilizers.

    PubMed

    Nunes, Caroline Sampaio; Maes, Michael; Roomruangwong, Chutima; Moraes, Juliana Brum; Bonifacio, Kamila Landucci; Vargas, Heber Odebrecht; Barbosa, Decio Sabbatini; Anderson, George; de Melo, Luiz Gustavo Piccoli; Drozdstoj, Stoyanov; Moreira, Estefania; Carvalho, André F; Nunes, Sandra Odebrecht Vargas

    2018-04-17

    Major affective disorders including bipolar disorder (BD) and major depressive disorder (MDD) are associated with impaired health-related quality of life (HRQoL). Oxidative stress and subtle thyroid abnormalities may play a pathophysiological role in both disorders. Thus, the current study was performed to examine whether neuro-oxidative biomarkers and thyroid-stimulating hormone (TSH) levels could predict HRQoL in BD and MDD. This cross-sectional study enrolled 68 BD and 37 MDD patients and 66 healthy controls. The World Health Organization (WHO) QoL-BREF scale was used to assess 4 QoL subdomains. Peripheral blood malondialdehyde (MDA), advanced oxidation protein products, paraoxonase/CMPAase activity, a composite index of nitro-oxidative stress, and basal TSH were measured. In the total WHOQoL score, 17.3% of the variance was explained by increased advanced oxidation protein products and TSH levels and lowered CMPAase activity and male gender. Physical HRQoL (14.4%) was associated with increased MDA and TSH levels and lowered CMPAase activity. Social relations HRQoL (17.4%) was predicted by higher nitro-oxidative index and TSH values, while mental and environment HRQoL were independently predicted by CMPAase activity. Finally, 73.0% of the variance in total HRQoL was explained by severity of depressive symptoms, use of anticonvulsants, lower income, early lifetime emotional neglect, MDA levels, the presence of mood disorders, and suicidal ideation. These data show that lowered HRQoL in major affective disorders could at least in part result from the effects of lipid peroxidation, protein oxidation, lowered antioxidant enzyme activities, and higher levels of TSH. © 2018 John Wiley & Sons, Ltd.

  14. Characterization of the effects of trace concentrations of graphene oxide on zebrafish larvae through proteomic and standard methods.

    PubMed

    Zou, Wei; Zhou, Qixing; Zhang, Xingli; Mu, Li; Hu, Xiangang

    2018-09-15

    The effects of graphene oxide (GO) carbon nanomaterials on ecosystems have been well characterized, but the toxicity of GO at predicted environmental concentrations to living organisms at the protein level remains largely unknown. In the present work, the adverse effects and mechanisms of GO at predicted environmental concentrations were evaluated by integrating proteomics and standard analyses for the first time. The abundances of 243 proteins, including proteins involved in endocytosis (e.g., cltcb, arf6, capzb and dnm1a), oxidative stress (e.g., gpx4b, sod2, and prdx1), cytoskeleton assembly (e.g., krt8, krt94, lmna and vim), mitochondrial function (e.g., ndufa10, ndufa8, cox5aa, and cox6b1), Ca2+ handling (e.g., atp1b2a, atp1b1a, atp6v0a1b and ncx4a) and cardiac function (e.g., tpm4a, tpm2, tnni2a.1 and tnnt3b), were found to be notably altered in response to exposure to 100 μg/L GO. The results revealed that GO caused malformation and mortality, likely through the downregulation of proteins related to actin filaments and formation of the cytoskeleton, and induced oxidative stress and mitochondrial disorders by altering the levels of antioxidant enzymes and proteins associated with the mitochondrial membrane respiratory chain. Exposure to GO also increased the heart rate of zebrafish larvae and induced pericardial edema, likely by changing the expression of proteins related to Ca2+ balance and cardiac function. This study provides new proteomic-level insights into GO toxicity against aquatic organisms, which will greatly benefit our understanding of the bio-safety of GO and its toxicity at predicted environmental concentrations. Copyright © 2018 Elsevier Inc. All rights reserved.

  15. LOCSVMPSI: a web server for subcellular localization of eukaryotic proteins using SVM and profile of PSI-BLAST

    PubMed Central

    Xie, Dan; Li, Ao; Wang, Minghui; Fan, Zhewen; Feng, Huanqing

    2005-01-01

    Subcellular location of a protein is one of the key functional characters as proteins must be localized correctly at the subcellular level to have normal biological function. In this paper, a novel method named LOCSVMPSI has been introduced, which is based on the support vector machine (SVM) and the position-specific scoring matrix generated from profiles of PSI-BLAST. With a jackknife test on the RH2427 data set, LOCSVMPSI achieved a high overall prediction accuracy of 90.2%, which is higher than the prediction results by SubLoc and ESLpred on this data set. In addition, prediction performance of LOCSVMPSI was evaluated with 5-fold cross validation test on the PK7579 data set and the prediction results were consistently better than the previous method based on several SVMs using composition of both amino acids and amino acid pairs. Further test on the SWISSPROT new-unique data set showed that LOCSVMPSI also performed better than some widely used prediction methods, such as PSORTII, TargetP and LOCnet. All these results indicate that LOCSVMPSI is a powerful tool for the prediction of eukaryotic protein subcellular localization. An online web server (current version is 1.3) based on this method has been developed and is freely available to both academic and commercial users, which can be accessed by at . PMID:15980436

  16. MobiDB-lite: fast and highly specific consensus prediction of intrinsic disorder in proteins.

    PubMed

    Necci, Marco; Piovesan, Damiano; Dosztányi, Zsuzsanna; Tosatto, Silvio C E

    2017-05-01

    Intrinsic disorder (ID) is established as an important feature of protein sequences. Its use in proteome annotation is however hampered by the availability of many methods with similar performance at the single residue level, which have mostly not been optimized to predict long ID regions of size comparable to domains. Here, we have focused on providing a single consensus-based prediction, MobiDB-lite, optimized for highly specific (i.e. few false positive) predictions of long disorder. The method uses eight different predictors to derive a consensus which is then filtered for spurious short predictions. Consensus prediction is shown to outperform the single methods when annotating long ID regions. MobiDB-lite can be useful in large-scale annotation scenarios and has indeed already been integrated in the MobiDB, DisProt and InterPro databases. MobiDB-lite is available as part of the MobiDB database from URL: http://mobidb.bio.unipd.it/. An executable can be downloaded from URL: http://protein.bio.unipd.it/mobidblite/. Contact: silvio.tosatto@unipd.it. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved.
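
    The two steps named in this abstract, a consensus over several predictors followed by removal of short regions, can be sketched as below. This is not the MobiDB-lite implementation: the vote threshold, the minimum region length, the `consensus_disorder` name, and the three toy predictor outputs (the abstract mentions eight predictors) are all assumptions chosen for illustration.

```python
# Illustrative only: per-residue voting, then a minimum-length filter.
# Thresholds and predictor outputs are hypothetical.

def consensus_disorder(predictions, min_votes, min_length):
    """Return disordered regions as half-open, 0-based (start, end) tuples.

    predictions: equal-length strings, 'D' = disordered, '-' = ordered.
    A residue is called disordered when at least min_votes predictors agree;
    runs shorter than min_length are discarded as spurious.
    """
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same residues")

    votes = [sum(p[i] == "D" for p in predictions) for i in range(length)]
    called = [v >= min_votes for v in votes]

    regions, start = [], None
    for i, flag in enumerate(called + [False]):  # sentinel closes a final run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    return regions


# Toy outputs from three hypothetical predictors for a 12-residue sequence.
preds = [
    "DDDDDD--D---",
    "DDDDD---D---",
    "-DDDDD--D---",
]

regions = consensus_disorder(preds, min_votes=2, min_length=4)
print(regions)
```

    In this toy case the single-residue call at position 8 is unanimous but still removed by the length filter, which is the kind of trade-off that favors specificity for long regions.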

  17. Decreased C-reactive protein levels in Alzheimer disease.

    PubMed

    O'Bryant, Sid E; Waring, Stephen C; Hobson, Valerie; Hall, James R; Moore, Carol B; Bottiglieri, Teodoro; Massman, Paul; Diaz-Arrastia, Ramon

    2010-03-01

    C-reactive protein (CRP) is an acute-phase reactant that has been found to be associated with Alzheimer disease (AD) in histopathological and longitudinal studies; however, little data exist regarding serum CRP levels in patients with established AD. The current study evaluated CRP levels in 192 patients diagnosed with probable AD (mean age = 75.8 +/- 8.2 years; 50% female) as compared to 174 nondemented controls (mean age = 70.6 +/- 8.2 years; 63% female). Mean CRP levels were found to be significantly decreased in AD (2.9 microg/mL) versus controls (4.9 microg/mL; P = .003). In adjusted models, elevated CRP significantly predicted poorer (elevated) Clinical Dementia Rating Scale sum of boxes (CDR SB) scores in patients with AD. In controls, CRP was negatively associated with Mini-Mental State Examination (MMSE) scores and positively associated with CDR SB scores. These findings, together with previously published results, are consistent with the hypothesis that midlife elevations in CRP are associated with increased risk of AD development though elevated CRP levels are not useful for prediction in the immediate prodrome years before AD becomes clinically manifest. However, for a subgroup of patients with AD, elevated CRP continues to predict increased dementia severity suggestive of a possible proinflammatory endophenotype in AD.

  18. Decreased C-Reactive Protein Levels in Alzheimer Disease

    PubMed Central

    O’Bryant, Sid E.; Waring, Stephen C.; Hobson, Valerie; Hall, James R.; Moore, Carol B.; Bottiglieri, Teodoro; Massman, Paul; Diaz-Arrastia, Ramon

    2011-01-01

    C-reactive protein (CRP) is an acute-phase reactant that has been found to be associated with Alzheimer disease (AD) in histo-pathological and longitudinal studies; however, little data exist regarding serum CRP levels in patients with established AD. The current study evaluated CRP levels in 192 patients diagnosed with probable AD (mean age = 75.8 ± 8.2 years; 50% female) as compared to 174 nondemented controls (mean age = 70.6 ± 8.2 years; 63% female). Mean CRP levels were found to be significantly decreased in AD (2.9 µg/mL) versus controls (4.9 µg/mL; P = .003). In adjusted models, elevated CRP significantly predicted poorer (elevated) Clinical Dementia Rating Scale sum of boxes (CDR SB) scores in patients with AD. In controls, CRP was negatively associated with Mini-Mental State Examination (MMSE) scores and positively associated with CDR SB scores. These findings, together with previously published results, are consistent with the hypothesis that midlife elevations in CRP are associated with increased risk of AD development though elevated CRP levels are not useful for prediction in the immediate prodrome years before AD becomes clinically manifest. However, for a subgroup of patients with AD, elevated CRP continues to predict increased dementia severity suggestive of a possible proinflammatory endophenotype in AD. PMID:19933496

  19. A Minimalistic Resource Allocation Model to Explain Ubiquitous Increase in Protein Expression with Growth Rate

    PubMed Central

    Keren, Leeat; Segal, Eran; Milo, Ron

    2016-01-01

    Most proteins show changes in level across growth conditions. Many of these changes seem to be coordinated with the specific growth rate rather than the growth environment or the protein function. Although cellular growth rates, gene expression levels and gene regulation have been at the center of biological research for decades, there are only a few models giving a base line prediction of the dependence of the proteome fraction occupied by a gene with the specific growth rate. We present a simple model that predicts a widely coordinated increase in the fraction of many proteins out of the proteome, proportionally with the growth rate. The model reveals how passive redistribution of resources, due to active regulation of only a few proteins, can have proteome wide effects that are quantitatively predictable. Our model provides a potential explanation for why and how such a coordinated response of a large fraction of the proteome to the specific growth rate arises under different environmental conditions. The simplicity of our model can also be useful by serving as a baseline null hypothesis in the search for active regulation. We exemplify the usage of the model by analyzing the relationship between growth rate and proteome composition for the model microorganism E.coli as reflected in recent proteomics data sets spanning various growth conditions. We find that the fraction out of the proteome of a large number of proteins, and from different cellular processes, increases proportionally with the growth rate. Notably, ribosomal proteins, which have been previously reported to increase in fraction with growth rate, are only a small part of this group of proteins. We suggest that, although the fractions of many proteins change with the growth rate, such changes may be partially driven by a global effect, not necessarily requiring specific cellular control mechanisms. PMID:27073913
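
    As an illustration of the proportionality described above, the following minimal sketch fits a straight line to the proteome fraction of one protein measured at several growth rates. It is not the authors' model: the function name `fit_line` and all numbers are hypothetical, and ordinary least squares is assumed only as a simple way to check for a linear trend.

```python
from statistics import mean


def fit_line(growth_rates, fractions):
    """Ordinary least-squares fit: fraction ~ slope * growth_rate + intercept.

    Returns (slope, intercept, r_squared).
    """
    gx, fy = mean(growth_rates), mean(fractions)
    sxx = sum((g - gx) ** 2 for g in growth_rates)
    sxy = sum((g - gx) * (f - fy) for g, f in zip(growth_rates, fractions))
    slope = sxy / sxx
    intercept = fy - slope * gx
    # Coefficient of determination: share of variance explained by the line.
    ss_res = sum((f - (slope * g + intercept)) ** 2
                 for g, f in zip(growth_rates, fractions))
    ss_tot = sum((f - fy) ** 2 for f in fractions)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical data: growth rate (1/h) and proteome fraction of one protein.
growth_rates = [0.2, 0.4, 0.6, 0.8, 1.0]
fractions = [0.0012, 0.0014, 0.0016, 0.0018, 0.0020]

slope, intercept, r2 = fit_line(growth_rates, fractions)
print(f"slope={slope:.4g}, intercept={intercept:.4g}, R^2={r2:.3f}")
```

    A positive slope with a high R² would be consistent with a fraction that increases linearly with growth rate; deviations from such a baseline are where one might look for active regulation.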

  20. Analysis of Bioactive Amino Acids from Fish Hydrolysates with a New Bioinformatic Intelligent System Approach.

    PubMed

    Elaziz, Mohamed Abd; Hemdan, Ahmed Monem; Hassanien, AboulElla; Oliva, Diego; Xiong, Shengwu

    2017-09-07

    The current economics of the fish protein industry demand rapid, accurate, and expressive prediction algorithms at every step of protein production, especially given the challenge of global climate change. Such algorithms help to predict and analyze functional and nutritional quality and, consequently, to control food allergies in hyperallergic patients. Determining these concentrations through laboratory experiments is expensive and time-consuming, especially for large-scale projects. Therefore, this paper introduces a new intelligent algorithm that uses an adaptive neuro-fuzzy inference system based on the whale optimization algorithm. The algorithm is used to predict the concentration levels of bioactive amino acids in fish protein hydrolysates at different times during the year. The whale optimization algorithm is used to determine the optimal parameters of the adaptive neuro-fuzzy inference system. The results of the proposed algorithm are compared with those of other methods and indicate its higher performance.

  1. 7TMRmine: a Web server for hierarchical mining of 7TMR proteins

    PubMed Central

    Lu, Guoqing; Wang, Zhifang; Jones, Alan M; Moriyama, Etsuko N

    2009-01-01

    Background Seven-transmembrane region-containing receptors (7TMRs) play central roles in eukaryotic signal transduction. Due to their biomedical importance, thorough mining of 7TMRs from diverse genomes has been an active target of bioinformatics and pharmacogenomics research. The need for new and accurate 7TMR/GPCR prediction tools is paramount with the accelerated rate of acquisition of diverse sequence information. Currently available and often used protein classification methods (e.g., profile hidden Markov Models) are highly accurate for identifying their membership information among already known 7TMR subfamilies. However, these alignment-based methods are less effective for identifying remote similarities, e.g., identifying proteins from highly divergent or possibly new 7TMR families. In this regard, more sensitive (e.g., alignment-free) methods are needed to complement the existing protein classification methods. A better strategy would be to combine different classifiers, from more specific to more sensitive methods, to identify a broader spectrum of 7TMR protein candidates. Description We developed a Web server, 7TMRmine, by integrating alignment-free and alignment-based classifiers specifically trained to identify candidate 7TMR proteins as well as transmembrane (TM) prediction methods. This new tool enables researchers to easily assess the distribution of GPCR functionality in diverse genomes or individual newly-discovered proteins. 7TMRmine is easily customized and facilitates exploratory analysis of diverse genomes. Users can integrate various alignment-based, alignment-free, and TM-prediction methods in any combination and in any hierarchical order. Sixteen classifiers (including two TM-prediction methods) are available on the 7TMRmine Web server. Not only can the 7TMRmine tool be used for 7TMR mining, but also for general TM-protein analysis. Users can submit protein sequences for analysis, or explore pre-analyzed results for multiple genomes. 
The server currently includes prediction results and the summary statistics for 68 genomes. Conclusion 7TMRmine facilitates the discovery of 7TMR proteins. By combining prediction results from different classifiers in a multi-level filtering process, prioritized sets of 7TMR candidates can be obtained for further investigation. 7TMRmine can be also used as a general TM-protein classifier. Comparisons of TM and 7TMR protein distributions among 68 genomes revealed interesting differences in evolution of these protein families among major eukaryotic phyla. PMID:19538753
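
    The multi-level filtering idea can be sketched as follows. This is a minimal illustration, not the 7TMRmine implementation: the two classifiers are toy stand-ins (a hydrophobicity check and a length check), and every function name, threshold, and sequence here is hypothetical.

```python
HYDROPHOBIC = set("AILMFVWC")


def toy_tm_filter(seq):
    """Toy stand-in for a TM-prediction method: mostly hydrophobic residues."""
    return sum(aa in HYDROPHOBIC for aa in seq) / len(seq) >= 0.4


def toy_length_filter(seq):
    """Toy stand-in for a stricter classifier: minimum sequence length."""
    return len(seq) >= 20


def hierarchical_filter(sequences, classifiers):
    """Apply classifiers in the given order.

    Each level keeps only the sequences that survived the previous level.
    Returns one sorted list of surviving IDs per level.
    """
    levels = []
    survivors = dict(sequences)
    for classify in classifiers:
        survivors = {sid: s for sid, s in survivors.items() if classify(s)}
        levels.append(sorted(survivors))
    return levels


# Hypothetical input sequences.
sequences = {
    "seqA": "MAILLVVFFAILLVVFFAILLKR",    # long and hydrophobic
    "seqB": "MKRDESTNQKRDESTNQKRDESTNQ",  # long but polar
    "seqC": "AILLVVFF",                   # hydrophobic but short
}

levels = hierarchical_filter(sequences, [toy_tm_filter, toy_length_filter])
for depth, ids in enumerate(levels, start=1):
    print(f"level {depth}: {ids}")
```

    Ordering the classifiers from more sensitive to more specific yields progressively smaller candidate sets, so sequences that survive deeper levels can be treated as higher-priority candidates.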

  2. Molecular cloning, functional identification and expressional analyses of FasL in Tilapia, Oreochromis niloticus.

    PubMed

    Ma, Tai-yang; Wu, Jin-ying; Gao, Xiao-ke; Wang, Jing-yuan; Zhan, Xu-liang; Li, Wen-sheng

    2014-10-01

    FasL is the most extensively studied apoptosis ligand. In 2000, tilapia FasL was identified using anti-human FasL monoclonal antibody by Evans's research group. Recently, a tilapia FasL-like protein of smaller molecular weight was predicted in GenBank (XM_003445156.2). Based on several clues drawn from previous studies, we cast doubt on the authenticity of the formerly identified tilapia FasL. Conversely, using reverse transcription polymerase chain reaction (RT-PCR), the existence of the predicted FasL-like protein was verified at the mRNA level (the GenBank accession number of the FasL mRNA sequence we cloned is KM008610). Through multiple alignments, this FasL-like protein was found to be highly similar to the FasL of the Japanese flounder. Moreover, we artificially expressed the functional region of the predicted protein and later confirmed its apoptosis-inducing activity using a methyl thiazolyl tetrazolium (MTT) assay, Annexin-V/Propidium iodide (PI) double staining, and DNA fragment detection. Supported by this evidence, we suggest that the predicted protein is the authentic tilapia FasL. To advance this research further, tilapia FasL mRNA and its protein across different tissues were quantified. High expression levels were identified in the tilapia immune system and sites where active cell turnover conservatively occurs. In this regard, FasL may assume an active role in the immune system and cell homeostasis maintenance in tilapia, similar to that shown in other species. In addition, because the distribution pattern of FasL mRNA did not synchronize with that of the protein, post-transcriptional expression regulation is suggested. Such regulation may be dominated by potential adenylate- and uridylate-rich elements (AREs) featuring AUUUA repeats found in the 3' untranslated region (UTR) of tilapia FasL mRNA. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Prediction and Testing of Biological Networks Underlying Intestinal Cancer

    PubMed Central

    Mariadason, John M.; Wang, Donghai; Augenlicht, Leonard H.; Chance, Mark R.

    2010-01-01

    Colorectal cancer progresses through an accumulation of somatic mutations, some of which reside in so-called “driver” genes that provide a growth advantage to the tumor. To identify points of intersection between driver gene pathways, we implemented a network analysis framework using protein interactions to predict likely connections – both precedented and novel – between key driver genes in cancer. We applied the framework to find significant connections between two genes, Apc and Cdkn1a (p21), known to be synergistic in tumorigenesis in mouse models. We then assessed the functional coherence of the resulting Apc-Cdkn1a network by engineering in vivo single node perturbations of the network: mouse models mutated individually at Apc (Apc1638N+/−) or Cdkn1a (Cdkn1a−/−), followed by measurements of protein and gene expression changes in intestinal epithelial tissue. We hypothesized that if the predicted network is biologically coherent (functional), then the predicted nodes should associate more specifically with dysregulated genes and proteins than stochastically selected genes and proteins. The predicted Apc-Cdkn1a network was significantly perturbed at the mRNA-level by both single gene knockouts, and the predictions were also strongly supported based on physical proximity and mRNA coexpression of proteomic targets. These results support the functional coherence of the proposed Apc-Cdkn1a network and also demonstrate how network-based predictions can be statistically tested using high-throughput biological data. PMID:20824133

  4. Nonlinear scoring functions for similarity-based ligand docking and binding affinity prediction.

    PubMed

    Brylinski, Michal

    2013-11-25

    A common strategy for virtual screening considers a systematic docking of a large library of organic compounds into the target sites in protein receptors with promising leads selected based on favorable intermolecular interactions. Despite a continuous progress in the modeling of protein-ligand interactions for pharmaceutical design, important challenges still remain, thus the development of novel techniques is required. In this communication, we describe eSimDock, a new approach to ligand docking and binding affinity prediction. eSimDock employs nonlinear machine learning-based scoring functions to improve the accuracy of ligand ranking and similarity-based binding pose prediction, and to increase the tolerance to structural imperfections in the target structures. In large-scale benchmarking using the Astex/CCDC data set, we show that 53.9% (67.9%) of the predicted ligand poses have RMSD of <2 Å (<3 Å). Moreover, using binding sites predicted by recently developed eFindSite, eSimDock models ligand binding poses with an RMSD of 4 Å for 50.0-39.7% of the complexes at the protein homology level limited to 80-40%. Simulations against non-native receptor structures, whose mean backbone rearrangements vary from 0.5 to 5.0 Å Cα-RMSD, show that the ratio of docking accuracy and the estimated upper bound is at a constant level of ∼0.65. Pearson correlation coefficient between experimental and predicted by eSimDock Ki values for a large data set of the crystal structures of protein-ligand complexes from BindingDB is 0.58, which decreases only to 0.46 when target structures distorted to 3.0 Å Cα-RMSD are used. Finally, two case studies demonstrate that eSimDock can be customized to specific applications as well. These encouraging results show that the performance of eSimDock is largely unaffected by the deformations of ligand binding regions, thus it represents a practical strategy for across-proteome virtual screening using protein models. 
eSimDock is freely available to the academic community as a Web server at http://www.brylinski.org/esimdock .
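
    The benchmark quantities quoted above (pose RMSD, the fraction of poses under a cutoff, and the Pearson correlation between experimental and predicted values) can be computed as in the sketch below. It is an illustration only and not part of eSimDock: it assumes both poses share the receptor coordinate frame, so no superposition is performed, and all coordinates and values are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def fraction_within(rmsds, cutoff):
    """Fraction of poses whose RMSD is strictly below the cutoff (in Å)."""
    return sum(r < cutoff for r in rmsds) / len(rmsds)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ligand: the predicted pose is shifted by 1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
predicted = [(x + 1.0, y, z) for x, y, z in reference]
pose_rmsd = rmsd(reference, predicted)

# Hypothetical RMSD values (Å) for a small benchmark set.
benchmark_rmsds = [1.0, 2.5, 3.5, 1.9]
print(pose_rmsd, fraction_within(benchmark_rmsds, 2.0),
      fraction_within(benchmark_rmsds, 3.0))
```

    Reporting the fraction of poses under both a 2 Å and a 3 Å cutoff mirrors the paired percentages given in the abstract.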

  5. Nanoparticles-cell association predicted by protein corona fingerprints

    NASA Astrophysics Data System (ADS)

    Palchetti, S.; Digiacomo, L.; Pozzi, D.; Peruzzi, G.; Micarelli, E.; Mahmoudi, M.; Caracciolo, G.

    2016-06-01

    In a physiological environment (e.g., blood and interstitial fluids) nanoparticles (NPs) will bind proteins shaping a "protein corona" layer. The long-lived protein layer tightly bound to the NP surface is referred to as the hard corona (HC) and encodes information that controls NP bioactivity (e.g. cellular association, cellular signaling pathways, biodistribution, and toxicity). Decrypting this complex code has become a priority to predict the NP biological outcomes. Here, we use a library of 16 lipid NPs of varying size (Ø ~ 100-250 nm) and surface chemistry (unmodified and PEGylated) to investigate the relationships between NP physicochemical properties (nanoparticle size, aggregation state and surface charge), protein corona fingerprints (PCFs), and NP-cell association. We found that none of the NPs' physicochemical properties alone was exclusively able to account for association with human cervical cancer cell line (HeLa). For the entire library of NPs, a total of 436 distinct serum proteins were detected. We developed a predictive-validation modeling approach that provides a means of assessing the relative significance of the identified corona proteins. Interestingly, a minor fraction of the HC, consisting of only 8 PCFs, was identified as the main promoter of NP association with HeLa cells. Remarkably, identified PCFs have several receptors with high level of expression on the plasma membrane of HeLa cells. Electronic supplementary information (ESI) available: Table S1. Cell viability (%) and cell association of the different nanoparticles used. Table S2. Total number of identified proteins on the different nanoparticles used. Tables S3-S18. Top 25 most abundant corona proteins identified in the protein corona of nanoparticles NP2-NP16 following 1 hour incubation with HP. Table S19. List of descriptors used. Table S20. Potential targets of protein corona fingerprints with its own interaction score (mentha) and the expression median value in HeLa cells. Fig. S1 and S2. Effect of exposure to human plasma on size and zeta potential of NPs. Fig. S3. Predictive modeling of nanoparticle-cell association. See DOI: 10.1039/c6nr03898k

  6. Protein docking prediction using predicted protein-protein interface.

    PubMed

    Li, Bin; Kihara, Daisuke

    2012-01-10

    Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations. We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies depending on cases, the challenge is to develop a method which does not deteriorate but improves docking results by using a binding site prediction which may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pair wise protein docking prediction algorithm, LZerD, which we have developed earlier. PI-LZerD starts from performing docking prediction using the provided protein-protein binding interface prediction as constraints, which is followed by the second round of docking with updated docking interface information to further improve docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy as compared with docking without using binding site prediction or using the binding site prediction as post-filtering. We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.

  7. Three reasons protein disorder analysis makes more sense in the light of collagen

    PubMed Central

    Oates, Matt E.; Tompa, Peter; Gough, Julian

    2016-01-01

    Abstract We have identified that the collagen helix has the potential to be disruptive to analyses of intrinsically disordered proteins. The collagen helix is an extended fibrous structure that is both promiscuous and repetitive. Whilst its sequence is predicted to be disordered, this type of protein structure is not typically considered as intrinsic disorder. Here, we show that collagen‐encoding proteins skew the distribution of exon lengths in genes. We find that previous results, demonstrating that exons encoding disordered regions are more likely to be symmetric, are due to the abundance of the collagen helix. Other related results, showing increased levels of alternative splicing in disorder‐encoding exons, still hold after considering collagen‐containing proteins. Aside from analyses of exons, we find that the set of proteins that contain collagen significantly alters the amino acid composition of regions predicted as disordered. We conclude that research in this area should be conducted in the light of the collagen helix. PMID:26941008

  8. Docking-based modeling of protein-protein interfaces for extensive structural and functional characterization of missense mutations.

    PubMed

    Barradas-Bautista, Didier; Fernández-Recio, Juan

    2017-01-01

    Next-generation sequencing (NGS) technologies are providing genomic information for an increasing number of healthy individuals and patient populations. Given the large amount of genomic data being generated, understanding the effect of disease-related mutations at the molecular level can help to close the gap between genotype and phenotype and thus improve prevention, diagnosis, or treatment of a pathological condition. In order to fully characterize the effect of a pathological mutation and have useful information for prediction purposes, it is important first to identify whether the mutation is located at a protein-binding interface, and second to understand the effect on the binding affinity of the affected interaction/s. Computational methods, such as protein docking, are currently used to complement experimental efforts and could help to build the human structural interactome. Here we have extended the original pyDockNIP method to predict the location of disease-associated nsSNPs at protein-protein interfaces, when there is no available structure for the protein-protein complex. We have applied this approach to the pathological interaction networks of six diseases with low structural data on PPIs. This approach can almost double the number of nsSNPs that can be characterized and identify edgetic effects in many nsSNPs that were previously unknown. This can help to annotate and interpret genomic data from large-scale population studies, and to achieve a better understanding of disease at the molecular level.

  9. Docking-based modeling of protein-protein interfaces for extensive structural and functional characterization of missense mutations

    PubMed Central

    2017-01-01

    Next-generation sequencing (NGS) technologies are providing genomic information for an increasing number of healthy individuals and patient populations. Given the large amount of genomic data being generated, understanding the effect of disease-related mutations at the molecular level can help to close the gap between genotype and phenotype and thus improve prevention, diagnosis, or treatment of a pathological condition. In order to fully characterize the effect of a pathological mutation and have useful information for prediction purposes, it is important first to identify whether the mutation is located at a protein-binding interface, and second to understand the effect on the binding affinity of the affected interaction/s. Computational methods, such as protein docking, are currently used to complement experimental efforts and could help to build the human structural interactome. Here we have extended the original pyDockNIP method to predict the location of disease-associated nsSNPs at protein-protein interfaces, when there is no available structure for the protein-protein complex. We have applied this approach to the pathological interaction networks of six diseases with low structural data on PPIs. This approach can almost double the number of nsSNPs that can be characterized and identify edgetic effects in many nsSNPs that were previously unknown. This can help to annotate and interpret genomic data from large-scale population studies, and to achieve a better understanding of disease at the molecular level. PMID:28841721

  10. TargetCrys: protein crystallization prediction by fusing multi-view features with two-layered SVM.

    PubMed

    Hu, Jun; Han, Ke; Li, Yang; Yang, Jing-Yu; Shen, Hong-Bin; Yu, Dong-Jun

    2016-11-01

    The accurate prediction of whether a protein will crystallize plays a crucial role in improving the success rate of protein crystallization projects. A common critical problem in the development of machine-learning-based protein crystallization predictors is how to effectively utilize protein features extracted from different views. In this study, we aimed to improve the efficiency of fusing multi-view protein features by proposing a new two-layered SVM (2L-SVM) which switches the feature-level fusion problem to a decision-level fusion problem: the SVMs in the 1st layer of the 2L-SVM are trained on each of the multi-view feature sets; then, the outputs of the 1st layer SVMs, which are the "intermediate" decisions made based on the respective feature sets, are further ensembled by a 2nd layer SVM. Based on the proposed 2L-SVM, we implemented a sequence-based protein crystallization predictor called TargetCrys. Experimental results on several benchmark datasets demonstrated the efficacy of the proposed 2L-SVM for fusing multi-view features. We also compared TargetCrys with existing sequence-based protein crystallization predictors and demonstrated that the proposed TargetCrys outperformed most of the existing predictors and is competitive with the state-of-the-art predictors. The TargetCrys webserver and datasets used in this study are freely available for academic use at: http://csbio.njust.edu.cn/bioinf/TargetCrys .

  11. Proteome-level interplay between folding and aggregation propensities of proteins.

    PubMed

    Tartaglia, Gian Gaetano; Vendruscolo, Michele

    2010-10-08

    With the advent of proteomics, there is an increasing need of tools for predicting the properties of large numbers of proteins by using the information provided by their amino acid sequences, even in the absence of the knowledge of their structures. One of the most important types of predictions concerns whether proteins will fold or aggregate. Here, we study the competition between these two processes by analyzing the relationship between the folding and aggregation propensity profiles for the human and Escherichia coli proteomes. These profiles are calculated, respectively, using the CamFold method, which we introduce in this work, and the Zyggregator method. Our results indicate that the kinetic behavior of proteins is, to a large extent, determined by the interplay between regions of low folding and high aggregation propensities. Copyright © 2010. Published by Elsevier Ltd.

  12. WRF-TMH: predicting transmembrane helix by fusing composition index and physicochemical properties of amino acids.

    PubMed

    Hayat, Maqsood; Khan, Asifullah

    2013-05-01

    Membrane proteins are prime constituents of a cell and act as mediators between intra- and extracellular processes. The prediction of the transmembrane (TM) helix and its topology provides essential information regarding the function and structure of membrane proteins. However, prediction of the TM helix and its topology is a challenging issue in bioinformatics and computational biology due to experimental complexities and the lack of established structures. Therefore, the location and orientation of TM helix segments are predicted from topogenic sequences. In this regard, we propose the WRF-TMH model for effectively predicting TM helix segments. In this model, information is extracted from membrane protein sequences using compositional index and physicochemical properties. The redundant and irrelevant features are eliminated through singular value decomposition. The selected features provided by these feature extraction strategies are then fused to develop a hybrid model. Weighted random forest is adopted as a classification approach. We have used two benchmark datasets, including low- and high-resolution datasets. Tenfold cross-validation is employed to assess the performance of the WRF-TMH model at different levels, including per protein, per segment, and per residue. The success rates of the WRF-TMH model are quite promising and are the best reported so far on the same datasets. It is observed that the WRF-TMH model might play a substantial role and will provide essential information for further structural and functional studies on membrane proteins. The accompanying web predictor is accessible at http://111.68.99.218/WRF-TMH/ .

  13. Protein pathway activation associated with sustained virologic response in patients with chronic hepatitis C treated with pegylated interferon (PEG-IFN) and ribavirin (RBV).

    PubMed

    Younossi, Zobair M; Limongi, Dolores; Stepanova, Maria; Pierobon, Mariaelena; Afendy, Arian; Mehta, Rohini; Baranova, Ancha; Liotta, Lance; Petricoin, Emanuel

    2011-02-04

    Only half of chronic hepatitis C (CH-C) patients treated with pegylated interferon and ribavirin (PEG-IFN+RBV) achieve sustained virologic response (SVR). In addition to known factors, we postulated that activation of key protein signaling networks in the peripheral blood mononuclear cells (PBMCs) may contribute to SVR due to inherent patient-specific basal immune cell signaling architecture. In this study, we included 92 patients with CH-C. PBMCs were collected while patients were not receiving treatment and used for phosphoprotein-based network profiling. Patients received a full course of PEG-IFN+RBV with overall SVR of 55%. From PBMC, protein lysates were extracted and then used for Reverse Phase Protein Microarray (RPMA) analysis, which quantitatively measured the levels of cytokines and activation levels of 25 key protein signaling molecules involved in immune cell regulation and interferon alpha signaling. Regression models for predicting SVR were generated by stepwise bidirectional selection. Both clinical-laboratory and RPMA parameters were used as predictor variables. Model accuracies were estimated using 10-fold cross-validation. Our results show that by comparing patients who achieved SVR to those who did not, phosphorylation levels of 6 proteins [AKT(T308), JAK1(Y1022/1023), p70 S6 Kinase (S371), PKC zeta/lambda(T410/403), TYK2(Y1054/1055), ZAP-70(Y319)/Syk(Y352)] and overall levels of 6 unmodified proteins [IL2, IL10, IL4, IL5, TNF-alpha, CD5L] were significantly different (P < 0.05). For SVR, the model based on a combination of clinical and proteome parameters was developed, with an AUC = 0.914, sensitivity of 92.16%, and specificity of 85.0%. This model included the following parameters: viral genotype, previous treatment status, BMI, phosphorylated states of STAT2, AKT, LCK, and TYK2 kinases as well as steady state levels of IL4, IL5, and TNF-alpha. In conclusion, SVR could be predicted by a combination of clinical, cytokine, and protein signaling activation profiles. Signaling events elucidated in the study may shed some light into molecular mechanisms of response to anti-HCV treatment.
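
    For readers unfamiliar with the performance measures reported above, the sketch below shows how sensitivity, specificity, and AUC can be computed from predicted scores. It does not reproduce the study's regression models or data: the labels, scores, and the 0.5 decision threshold are hypothetical, and AUC is computed with the pairwise-ranking definition.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = responder, 0 = non-responder) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]

sens, spec = sensitivity_specificity(labels, scores)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}, AUC={auc(labels, scores):.3f}")
```

    Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three are usually reported together.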

  14. Expression Differentiation Is Constrained to Low-Expression Proteins over Ecological Timescales

    PubMed Central

    Margres, Mark J.; Wray, Kenneth P.; Seavy, Margaret; McGivern, James J.; Herrera, Nathanael D.; Rokyta, Darin R.

    2016-01-01

    Protein expression level is one of the strongest predictors of protein sequence evolutionary rate, with high-expression protein sequences evolving at slower rates than low-expression protein sequences largely because of constraints on protein folding and function. Expression evolutionary rates also have been shown to be negatively correlated with expression level across human and mouse orthologs over relatively long divergence times (i.e., ∼100 million years). Long-term evolutionary patterns, however, often cannot be extrapolated to microevolutionary processes (and vice versa), and whether this relationship holds for traits evolving under directional selection within a single species over ecological timescales (i.e., <5000 years) is unknown and not necessarily expected. Expression is a metabolically costly process, and the expression level of a particular protein is predicted to be a tradeoff between the benefit of its function and the costs of its expression. Selection should drive the expression level of all proteins close to values that maximize fitness, particularly for high-expression proteins because of the increased energetic cost of production. Therefore, stabilizing selection may reduce the amount of standing expression variation for high-expression proteins, and in combination with physiological constraints that may place an upper bound on the range of beneficial expression variation, these constraints could severely limit the availability of beneficial expression variants. To determine whether rapid-expression evolution was restricted to low-expression proteins owing to these constraints on highly expressed proteins over ecological timescales, we compared venom protein expression levels across mainland and island populations for three species of pit vipers. We detected significant differentiation in protein expression levels in two of the three species and found that rapid-expression differentiation was restricted to low-expression proteins. 
Our results suggest that various constraints on high-expression proteins reduce the availability of beneficial expression variants relative to low-expression proteins, enabling low-expression proteins to evolve and potentially lead to more rapid adaptation. PMID:26546003

  15. Energy storage and fecundity explain deviations from ecological stoichiometry predictions under global warming and size-selective predation.

    PubMed

    Zhang, Chao; Jansen, Mieke; De Meester, Luc; Stoks, Robby

    2016-11-01

    A key challenge for ecologists is to predict how single and joint effects of global warming and predation risk translate from the individual level up to ecosystem functions. Recently, stoichiometric theory linked these levels through changes in body stoichiometry, predicting that both higher temperatures and predation risk induce shifts in energy storage (increases in C-rich carbohydrates and reductions in N-rich proteins) and body stoichiometry (increases in C : N and C : P). This promising theory, however, is rarely tested and assumes that prey will divert energy away from reproduction under predation risk, while under size-selective predation, prey instead increase fecundity. We exposed the water flea Daphnia magna to 4 °C warming and fish predation risk to test whether C-rich carbohydrates increase and N-rich proteins decrease, and as a result, C : N and C : P increase under warming and predation risk. Unexpectedly, warming decreased body C : N, which was driven by reductions in C-rich fat and sugar contents while the protein content did not change. This reflected a trade-off where the accelerated intrinsic growth rate under warming occurred at the cost of a reduced energy storage. Warming reduced C : N less and only increased C : P and N : P in the fish-period Daphnia. These evolved stoichiometric responses to warming were largely driven by stronger warming-induced reductions in P than in C and N and could be explained by the better ability to deal with warming in the fish-period Daphnia. In contrast to theory predictions, body C : N decreased under predation risk due to a strong increase in the N-rich protein content that offsets the increase in C-rich fat content. The higher investment in fecundity (more N-rich eggs) under predation risk contributed to this stronger increase in protein content. Similarly, the lower body C : N of pre-fish Daphnia also matched their higher fecundity. 
Warming and predation risk independently shaped body stoichiometry, largely by changing levels of energy storage molecules. Our results highlight that two widespread patterns, the trade-off between rapid development and energy storage and the increased investment in reproduction under size-selective predation, cause predictable deviations from current ecological stoichiometry theory. © 2016 The Authors. Journal of Animal Ecology © 2016 British Ecological Society.

  16. Plasma protein biomarkers enhance the clinical prediction of kidney injury recovery in patients undergoing liver transplantation.

    PubMed

    Levitsky, Josh; Baker, Talia B; Jie, Chunfa; Ahya, Shubhada; Levin, Murray; Friedewald, John; Al-Saden, Patrice; Salomon, Daniel R; Abecassis, Michael M

    2014-12-01

    Biomarkers predictive of recovery from acute kidney injury (AKI) after liver transplantation (LT) could enhance decision algorithms regarding the need for liver-kidney transplantation or renal sparing regimens. Multianalyte plasma/urine kidney injury protein panels were performed immediately before and 1 month post-LT in an initial test group divided by reversible pre-LT AKI (rAKI = post-LT renal recovery) versus no AKI (nAKI). This was followed by a larger validation set that included an additional group: irreversible pre-LT AKI (iAKI = no post-LT renal recovery). In the test group (n = 16), six pre-LT plasma (not urine) kidney injury proteins (osteopontin [OPN], neutrophil gelatinase-associated lipocalin, cystatin C, trefoil factor 3, tissue inhibitor of metalloproteinase [TIMP]-1, and β-2-microglobulin) were higher in rAKI versus nAKI (P < 0.05) and returned to normal values with renal recovery post-LT. In the validation set (n = 46), a number of proteins were significantly higher in both rAKI and iAKI versus nAKI. However, only pre-LT plasma OPN (P = 0.009) and TIMP-1 (P = 0.019) levels were significantly higher in rAKI versus iAKI. Logistic regression modeling was used to correlate the probability of post-LT rAKI, factoring in both pre-LT protein markers and clinical variables. A combined model including elevated OPN and TIMP-1 levels, age <57, and absence of diabetes had the highest area under the curve of 0.82, compared to protein-only and clinical variable-only models. These data suggest that plasma protein profiles might improve the prediction of pre-LT kidney injury recovery after LT. However, multicenter, prospective studies are needed to validate these findings and ultimately test the value of such protein panels in perioperative management and decision making. © 2014 by the American Association for the Study of Liver Diseases.
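    The models above are compared by area under the ROC curve (AUC). As a minimal sketch of what that number measures, the snippet below computes AUC as the fraction of (recovered, not recovered) patient pairs in which the recovered patient received the higher predicted probability. The function name `roc_auc` and the eight labels and scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win
    (the Mann-Whitney U formulation of the ROC area).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = renal recovery after LT, 0 = no recovery.
recovered = [1, 1, 1, 0, 0, 1, 0, 0]
# Hypothetical predicted probabilities, e.g. from a logistic regression model.
predicted = [0.9, 0.8, 0.4, 0.5, 0.3, 0.7, 0.2, 0.6]

auc = roc_auc(recovered, predicted)
print(f"AUC = {auc:.3f}")
```

    An AUC of 0.5 corresponds to chance ranking and 1.0 to perfect separation, which is the scale on which the reported 0.82 should be read.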

  17. Predicting domain-domain interaction based on domain profiles with feature selection and support vector machines

    PubMed Central

    2010-01-01

    Background: Protein-protein interaction (PPI) plays essential roles in cellular functions. The cost, time and other limitations associated with the current experimental methods have motivated the development of computational methods for predicting PPIs. As protein interactions generally occur via domains instead of the whole molecules, predicting domain-domain interaction (DDI) is an important step toward PPI prediction. Computational methods developed so far have utilized information from various sources at different levels, from primary sequences, to molecular structures, to evolutionary profiles. Results: In this paper, we propose a computational method to predict DDI using support vector machines (SVMs), based on domains represented as interaction profile hidden Markov models (ipHMM) where interacting residues in domains are explicitly modeled according to the three dimensional structural information available at the Protein Data Bank (PDB). Features about the domains are extracted first as the Fisher scores derived from the ipHMM and then selected using singular value decomposition (SVD). Domain pairs are represented by concatenating their selected feature vectors, and classified by a support vector machine trained on these feature vectors. The method is tested by leave-one-out cross validation experiments with a set of interacting protein pairs adopted from the 3DID database. The prediction accuracy has shown significant improvement as compared to InterPreTS (Interaction Prediction through Tertiary Structure), an existing method for PPI prediction that also uses the sequences and complexes of known 3D structure. Conclusions: We show that domain-domain interaction prediction can be significantly enhanced by exploiting information inherent in the domain profiles via feature selection based on Fisher scores, singular value decomposition and supervised learning based on support vector machines.
Datasets and source code are freely available on the web at http://liao.cis.udel.edu/pub/svdsvm. Implemented in Matlab and supported on Linux and MS Windows. PMID:21034480
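    Two steps of the pipeline described above lend themselves to a short sketch: representing a domain pair by concatenating the two domains' feature vectors, and scoring a classifier by leave-one-out cross validation. The snippet below is only an illustration under stated assumptions. It swaps in a simple nearest-centroid classifier for the SVM, omits the ipHMM Fisher scores and the SVD selection step, and uses made-up two-dimensional feature vectors; all function names are hypothetical.

```python
import math


def pair_features(domain_a, domain_b):
    """Represent a domain pair by concatenating its two feature vectors."""
    return list(domain_a) + list(domain_b)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (NOT an SVM): pick the label with the closest centroid."""
    best_label, best_distance = None, float("inf")
    for label in sorted(set(train_y)):
        members = [v for v, y in zip(train_x, train_y) if y == label]
        distance = math.dist(centroid(members), x)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and count correct calls."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Made-up feature vectors: label 1 = interacting pair, 0 = non-interacting pair.
pairs = [
    (([1.0, 0.9], [0.8, 1.0]), 1),
    (([0.9, 1.1], [1.0, 0.9]), 1),
    (([1.1, 1.0], [0.9, 1.1]), 1),
    (([0.0, 0.1], [0.2, 0.0]), 0),
    (([0.1, 0.0], [0.0, 0.1]), 0),
    (([0.2, 0.1], [0.1, 0.0]), 0),
]
features = [pair_features(a, b) for (a, b), _ in pairs]
labels = [label for _, label in pairs]
accuracy = leave_one_out_accuracy(features, labels)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

    Leave-one-out is a natural choice when the set of known interacting pairs is small, since every sample is used for testing exactly once while the classifier still trains on all remaining samples.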

  18. PARS: a web server for the prediction of Protein Allosteric and Regulatory Sites.

    PubMed

    Panjkovich, Alejandro; Daura, Xavier

    2014-05-01

    The regulation of protein activity is a key aspect of life at the molecular level. Unveiling its details is thus crucial to understanding signalling and metabolic pathways. The most common and powerful mechanism of protein-function regulation is allostery, which has been increasingly calling the attention of medicinal chemists due to its potential for the discovery of novel therapeutics. In this context, PARS is a simple and fast method that queries protein dynamics and structural conservation to identify pockets on a protein structure that may exert a regulatory effect on the binding of a small-molecule ligand.

  19. A computational interactome for prioritizing genes associated with complex agronomic traits in rice (Oryza sativa).

    PubMed

    Liu, Shiwei; Liu, Yihui; Zhao, Jiawei; Cai, Shitao; Qian, Hongmei; Zuo, Kaijing; Zhao, Lingxia; Zhang, Lida

    2017-04-01

    Rice (Oryza sativa) is one of the most important staple foods for more than half of the global population. Many rice traits are quantitative, complex and controlled by multiple interacting genes. Thus, a full understanding of genetic relationships will be critical to systematically identify genes controlling agronomic traits. We developed a genome-wide rice protein-protein interaction network (RicePPINet, http://netbio.sjtu.edu.cn/riceppinet) using machine learning with structural relationship and functional information. RicePPINet contained 708 819 predicted interactions for 16 895 non-transposable element related proteins. The power of the network for discovering novel protein interactions was demonstrated through comparison with other publicly available protein-protein interaction (PPI) prediction methods, and by experimentally determined PPI data sets. Furthermore, global analysis of domain-mediated interactions revealed RicePPINet accurately reflects PPIs at the domain level. Our studies showed the efficiency of the RicePPINet-based method in prioritizing candidate genes involved in complex agronomic traits, such as disease resistance and drought tolerance, was approximately 2-11 times better than random prediction. RicePPINet provides an expanded landscape of computational interactome for the genetic dissection of agronomically important traits in rice. © 2017 The Authors The Plant Journal © 2017 John Wiley & Sons Ltd.

  20. GalaxyHomomer: a web server for protein homo-oligomer structure prediction from a monomer sequence or structure.

    PubMed

    Baek, Minkyung; Park, Taeyong; Heo, Lim; Park, Chiwook; Seok, Chaok

    2017-07-03

    Homo-oligomerization of proteins is abundant in nature, and is often intimately related with the physiological functions of proteins, such as in metabolism, signal transduction or immunity. Information on the homo-oligomer structure is therefore important to obtain a molecular-level understanding of protein functions and their regulation. Currently available web servers predict protein homo-oligomer structures either by template-based modeling using homo-oligomer templates selected from the protein structure database or by ab initio docking of monomer structures resolved by experiment or predicted by computation. The GalaxyHomomer server, freely accessible at http://galaxy.seoklab.org/homomer, carries out template-based modeling, ab initio docking or both depending on the availability of proper oligomer templates. It also incorporates recently developed model refinement methods that can consistently improve model quality. Moreover, the server provides additional options that can be chosen by the user depending on the availability of information on the monomer structure, oligomeric state and locations of unreliable/flexible loops or termini. The performance of the server was better than or comparable to that of other available methods when tested on benchmark sets and in a recent CASP performed in a blind fashion. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Identification and Characterization of LFD-2, a Predicted Fringe Protein Required for Membrane Integrity during Cell Fusion in Neurospora crassa

    PubMed Central

    Palma-Guerrero, Javier; Zhao, Jiuhai; Gonçalves, A. Pedro; Starr, Trevor L.

    2015-01-01

    The molecular mechanisms of membrane merger during somatic cell fusion in eukaryotic species are poorly understood. In the filamentous fungus Neurospora crassa, somatic cell fusion occurs between genetically identical germinated asexual spores (germlings) and between hyphae to form the interconnected network characteristic of a filamentous fungal colony. In N. crassa, two proteins have been identified to function at the step of membrane fusion during somatic cell fusion: PRM1 and LFD-1. The absence of either one of these two proteins results in an increase of germling pairs arrested during cell fusion with tightly appressed plasma membranes and an increase in the frequency of cell lysis of adhered germlings. The level of cell lysis in ΔPrm1 or Δlfd-1 germlings is dependent on the extracellular calcium concentration. An available transcriptional profile data set was used to identify genes encoding predicted transmembrane proteins that showed reduced expression levels in germlings cultured in the absence of extracellular calcium. From these analyses, we identified a mutant (lfd-2, for late fusion defect-2) that showed a calcium-dependent cell lysis phenotype. lfd-2 encodes a protein with a Fringe domain and showed endoplasmic reticulum and Golgi membrane localization. The deletion of an additional gene predicted to encode a low-affinity calcium transporter, fig1, also resulted in a strain that showed a calcium-dependent cell lysis phenotype. Genetic analyses showed that LFD-2 and FIG1 likely function in separate pathways to regulate aspects of membrane merger and repair during cell fusion. PMID:25595444

  2. High-sensitive factor I and C-reactive protein based biomarkers for coronary artery disease.

    PubMed

    Zhao, Qing; Du, Jian-Shi; Han, Dong-Mei; Ma, Ying

    2014-01-01

    High-sensitive factor I and C-reactive protein were analysed as biomarkers for coronary artery disease using data from 19 anticipated cohort studies that included 21,567 participants having no information about coronary artery disease. The clinical implications of statin therapy initiated on the basis of factor I and C-reactive protein assessment were also modeled. The measure of risk discrimination (C-index) increased by 0.0101 under the prognostic model for coronary artery disease based on sex, smoking status, age, blood pressure, total cholesterol level and history of diabetes. The C-index rose further, by 0.0045 and 0.0053, when information on factor I and C-reactive protein, respectively, was added. The resulting model assigned 10-year risk categories of high (> 20%), medium (10% to < 20%), and low (< 10%). We found that 2,254 persons among 15,000 adults (age ≥ 45 years) would initially be classified as being at medium risk for coronary artery disease when only conventional risk factors were used to calculate risk. Statin therapy was initiated for persons with a predicted risk of more than 20% and for persons with other risk factors (i.e. diabetes), irrespective of their predicted 10-year risk. We conclude that, under current treatment guidelines, assessment of factor I and C-reactive protein levels (as biomarkers) in people at medium risk for coronary artery disease could prevent one additional coronary artery disease risk over a period of a decade for every 390-500 people screened.
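    The three 10-year risk categories quoted above can be written as a small lookup. This is a minimal sketch, not the authors' code: the function name is hypothetical, and because the abstract gives the high category as "> 20%" and the medium category as "10% to < 20%", a risk of exactly 20% is not covered. The sketch assumes it counts as high.

```python
def risk_category(ten_year_risk_percent):
    """Map a predicted 10-year risk (in percent) to low / medium / high.

    Thresholds follow the abstract: low < 10%, medium 10% to < 20%, high > 20%.
    Assumption: exactly 20% is treated as high, since the abstract leaves it open.
    """
    if not 0 <= ten_year_risk_percent <= 100:
        raise ValueError("risk must be a percentage between 0 and 100")
    if ten_year_risk_percent < 10:
        return "low"
    if ten_year_risk_percent < 20:
        return "medium"
    return "high"


for risk in (4.0, 10.0, 19.9, 27.5):
    print(f"{risk:5.1f}% -> {risk_category(risk)}")
```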

  3. Quantifying condition-dependent intracellular protein levels enables high-precision fitness estimates.

    PubMed

    Geiler-Samerotte, Kerry A; Hashimoto, Tatsunori; Dion, Michael F; Budnik, Bogdan A; Airoldi, Edoardo M; Drummond, D Allan

    2013-01-01

    Countless studies monitor the growth rate of microbial populations as a measure of fitness. However, an enormous gap separates growth-rate differences measurable in the laboratory from those that natural selection can distinguish efficiently. Taking advantage of the recent discovery that transcript and protein levels in budding yeast closely track growth rate, we explore the possibility that growth rate can be more sensitively inferred by monitoring the proteomic response to growth, rather than growth itself. We find a set of proteins whose levels, in aggregate, enable prediction of growth rate to a higher precision than direct measurements. However, we find little overlap between these proteins and those that closely track growth rate in other studies. These results suggest that, in yeast, the pathways that set the pace of cell division can differ depending on the growth-altering stimulus. Still, with proper validation, protein measurements can provide high-precision growth estimates that allow extension of phenotypic growth-based assays closer to the limits of evolutionary selection.

  4. Levels of uninvolved immunoglobulins predict clinical status and progression-free survival for multiple myeloma patients.

    PubMed

    Harutyunyan, Nika M; Vardanyan, Suzie; Ghermezi, Michael; Gottlieb, Jillian; Berenson, Ariana; Andreu-Vieyra, Claudia; Berenson, James R

    2016-07-01

    Multiple myeloma (MM) is characterized by the enhanced production of the same monoclonal immunoglobulin (M-Ig or M protein). Techniques such as serum protein electrophoresis and nephelometry are routinely used to quantify levels of this protein in the serum of MM patients. However, these methods are not without their shortcomings and problems accurately quantifying M proteins remain. Precise quantification of the types and levels of M-Ig present is critical to monitoring patient response to therapy. In this study, we investigated the ability of the HevyLite (HLC) immunoassay to correlate with clinical status based on levels of involved and uninvolved antibodies. In our cohort of MM patients, we observed that significantly higher ratios and greater differences of involved HLC levels compared to uninvolved HLC levels correlated with a worse clinical status. Similarly, higher absolute levels of involved HLC antibodies and lower levels of uninvolved HLC antibodies also correlated with a worse clinical status and a shorter progression-free survival. These findings suggest that the HLC assay is a useful and a promising tool for determining the clinical status and survival time for patients with multiple myeloma. © 2016 John Wiley & Sons Ltd.

  5. Protein Folding and Structure Prediction from the Ground Up: The Atomistic Associative Memory, Water Mediated, Structure and Energy Model.

    PubMed

    Chen, Mingchen; Lin, Xingcheng; Zheng, Weihua; Onuchic, José N; Wolynes, Peter G

    2016-08-25

    The associative memory, water mediated, structure and energy model (AWSEM) is a coarse-grained force field with transferable tertiary interactions that incorporates local in sequence energetic biases using bioinformatically derived structural information about peptide fragments with locally similar sequences that we call memories. The memory information from the protein data bank (PDB) database guides proper protein folding. The structural information about available sequences in the database varies in quality and can sometimes lead to frustrated free energy landscapes locally. One way out of this difficulty is to construct the input fragment memory information from all-atom simulations of portions of the complete polypeptide chain. In this paper, we investigate this approach first put forward by Kwac and Wolynes in a more complete way by studying the structure prediction capabilities of this approach for six α-helical proteins. This scheme which we call the atomistic associative memory, water mediated, structure and energy model (AAWSEM) amounts to an ab initio protein structure prediction method that starts from the ground up without using bioinformatic input. The free energy profiles from AAWSEM show that atomistic fragment memories are sufficient to guide the correct folding when tertiary forces are included. AAWSEM combines the efficiency of coarse-grained simulations on the full protein level with the local structural accuracy achievable from all-atom simulations of only parts of a large protein. The results suggest that a hybrid use of atomistic fragment memory and database memory in structural predictions may well be optimal for many practical applications.

  6. Identification of Functional Candidates amongst Hypothetical Proteins of Treponema pallidum ssp. pallidum

    PubMed Central

    Naqvi, Ahmad Abu Turab; Shahbaaz, Mohd; Ahmad, Faizan; Hassan, Md. Imtaiyaz

    2015-01-01

    Syphilis is a globally occurring venereal disease, and its infection is propagated through sexual contact. The causative agent of syphilis, Treponema pallidum ssp. pallidum, a Gram-negative spirochaete, is an obligate human parasite. The genome of T. pallidum ssp. pallidum SS14 strain (RefSeq NC_010741.1) encodes 1,027 proteins, of which 444 proteins are known as hypothetical proteins (HPs), i.e., proteins of unknown function. Here, we performed functional annotation of HPs of T. pallidum ssp. pallidum using various databases, domain architecture predictors, protein function annotators and clustering tools. We analyzed the sequences of 444 HPs of T. pallidum ssp. pallidum and subsequently predicted the function of 207 HPs with a high level of confidence. However, the functions of 237 HPs are predicted with less accuracy. We found various enzymes, transporters and binding proteins in the annotated group of HPs that may be possible molecular targets facilitating the survival of the pathogen. Our comprehensive analysis helps to understand the mechanism of pathogenesis and to provide many novel potential therapeutic interventions. PMID:25894582

  7. Prediction of the partitioning behaviour of proteins in aqueous two-phase systems using only their amino acid composition.

    PubMed

    Salgado, J Cristian; Andrews, Barbara A; Ortuzar, Maria Fernanda; Asenjo, Juan A

    2008-01-18

    The prediction of the partition behaviour of proteins in aqueous two-phase systems (ATPS) using mathematical models based on their amino acid composition was investigated. The predictive models are based on the average surface hydrophobicity (ASH). The ASH was estimated by means of models that use the three-dimensional structure of proteins and by models that use only the amino acid composition of proteins. These models were evaluated for a set of 11 proteins with known experimental partition coefficients in four-phase systems: polyethylene glycol (PEG) 4000/phosphate, sulfate, citrate and dextran, considering three levels of NaCl concentration (0.0% w/w, 0.6% w/w and 8.8% w/w). The results indicate that such prediction is feasible, even though the quality of the prediction depends strongly on the ATPS and its operational conditions such as the NaCl concentration. The ATPS 0 model, which uses the three-dimensional structure, obtains similar results to those given by previous models based on variables measured in the laboratory. In addition, it maintains the main characteristics of the hydrophobic resolution and intrinsic hydrophobicity reported before. Three mathematical models, ATPS I-III, based only on the amino acid composition were evaluated. The best results were obtained by the ATPS I model, which assumes that all of the amino acids are completely exposed. The performance of the ATPS I model follows the behaviour reported previously, i.e. its correlation coefficients improve as the NaCl concentration increases in the system and, therefore, the effect of the protein hydrophobicity prevails over other effects such as charge or size. Its best predictive performance was obtained for the PEG/dextran system at high NaCl concentration. An increase in the predictive capacity of at least 54.4% with respect to the models which use the three-dimensional structure of the protein was obtained for that system.
In addition, the ATPS I model exhibits high correlation coefficients in that system being higher than 0.88 on average. The ATPS I model exhibited correlation coefficients higher than 0.67 for the rest of the ATPS at high NaCl concentration. Finally, we tested our best model, the ATPS I model, on the prediction of the partition coefficient of the protein invertase. We found that the predictive capacities of the ATPS I model are better in PEG/dextran systems, where the relative error of the prediction with respect to the experimental value is 15.6%.
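    If every residue is assumed to be fully exposed, as in the ATPS I model, a composition-only hydrophobicity estimate reduces to a composition-weighted mean of per-residue values. The sketch below illustrates that idea together with the relative error used to compare a prediction with an experimental value. It is not the authors' model: the abstract does not name a hydrophobicity scale, so the Kyte-Doolittle hydropathy scale is used here purely as a stand-in, and the function names and the toy sequence are assumptions.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values, used as a stand-in scale (an assumption).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition_hydrophobicity(sequence, scale=KYTE_DOOLITTLE):
    """Mean per-residue hydrophobicity, treating every residue as fully exposed."""
    counts = Counter(sequence.upper())
    if not counts:
        raise ValueError("sequence is empty")
    unknown = set(counts) - set(scale)
    if unknown:
        raise ValueError(f"residues missing from scale: {sorted(unknown)}")
    total = sum(counts.values())
    return sum(scale[aa] * n for aa, n in counts.items()) / total


def relative_error_percent(predicted, experimental):
    """Relative error of a prediction with respect to the experimental value."""
    if experimental == 0:
        raise ValueError("experimental value must be non-zero")
    return abs(predicted - experimental) / abs(experimental) * 100.0


toy_sequence = "AILK"  # made-up four-residue peptide
mean_hydrophobicity = composition_hydrophobicity(toy_sequence)
print(f"mean hydrophobicity of {toy_sequence}: {mean_hydrophobicity:.2f}")
```

    In the study the hydrophobicity estimate feeds a further model of the partition coefficient; that step depends on fitted parameters not given in the abstract and is therefore left out here.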

  8. A systems-level approach for investigating organophosphorus pesticide toxicity.

    PubMed

    Zhu, Jingbo; Wang, Jing; Ding, Yan; Liu, Baoyue; Xiao, Wei

    2018-03-01

    A full understanding of the single and joint toxicity of the many organophosphorus (OP) pesticides is still unavailable because of their extremely complex mechanism of action. This study established a systems-level approach based on systems toxicology to investigate OP pesticide toxicity by incorporating ADME/T properties, protein prediction, and network and pathway analysis. The results showed that most OP pesticides are highly toxic according to the ADME/T parameters, and that, according to the established OP pesticide-protein and protein-disease networks, they can interact with significant receptor proteins to cooperatively lead to various diseases. Furthermore, the finding that multiple OP pesticides potentially act on the same receptor proteins and/or on functionally diverse proteins explains how multiple OP pesticides could produce synergistic or additive toxicity at the molecular/systematic level. Finally, the integrated pathways revealed the mechanism of toxicity of the interaction of OP pesticides and elucidated the pathogenesis induced by OP pesticides. This study demonstrates a systems-level approach for investigating OP pesticide toxicity that can be further applied to risk assessments of various toxins, which is of significant interest to food security and environmental protection. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. Individuality in nutritional preferences: a multi-level approach in field crickets.

    PubMed

    Han, Chang S; Jäger, Heidi Y; Dingemanse, Niels J

    2016-06-30

    Selection may favour individuals of the same population to differ consistently in nutritional preference, for example, because optimal diets covary with morphology or personality. We provided Southern field crickets (Gryllus bimaculatus) with two synthetic food sources (carbohydrates and proteins) and quantified repeatedly how much of each macronutrient was consumed by each individual. We then quantified (i) whether individuals were repeatable in carbohydrate and protein intake rate, (ii) whether an individual's average daily intake of carbohydrates was correlated with its average daily intake of protein, and (iii) whether short-term changes in intake of carbohydrates coincided with changes in intake of protein within individuals. Intake rates were individually repeatable for both macronutrients. However, individuals differed in their relative daily intake of carbohydrates versus proteins (i.e., 'nutritional preference'). By contrast, total consumption varied plastically as a function of body weight within individuals. Body weight-but not personality (i.e., aggression, exploration behaviour)-positively predicted nutritional preference at the individual level as large crickets repeatedly consumed a higher carbohydrate to protein ratio compared to small ones. Our finding of level-specific associations between the consumption of distinct nutritional components demonstrates the merit of applying multivariate and multi-level viewpoints to the study of nutritional preference.

  10. Pretreatment 14-3-3 epsilon level is predictive for advanced extranodal NK/T cell lymphoma therapeutic response to asparaginase-based chemotherapy.

    PubMed

    Qiu, Yajuan; Zhou, Zhiyuan; Li, Zhaoming; Lu, Lisha; Li, Ling; Li, Xin; Wang, Xinhua; Zhang, Mingzhi

    2017-03-01

    The aim of the present study was to identify potentially relevant biomarkers to predict the therapeutic response of advanced extranodal natural killer/T cell lymphoma (ENKTL) to asparaginase-based treatment. Proteomic technology was used to identify differentially expressed proteins between chemotherapy-resistant and chemotherapy-sensitive patients. Enzyme-linked immunosorbent assay was then used to validate the predictive value of selected biomarkers. A total of 61 upregulated and 22 downregulated proteins were identified in chemotherapy-resistant patients compared with chemotherapy-sensitive patients. Furthermore, validation showed that a high pretreatment level of 14-3-3 epsilon (ε) (≥61.95 ng/mL; 84.0 and 95.2% for sensitivity and specificity, respectively) is associated with poor 2-year overall survival (OS) (5.3 vs 68.8%, p<0.0001) and PFS (4.5 vs 76.9%, p<0.0001). In multivariate survival analysis, a high pretreatment level of 14-3-3 epsilon is significantly correlated with both inferior OS (p = 0.033) and PFS (p = 0.005). These findings indicate that a high pretreatment level of 14-3-3 epsilon is an independent predictor of chemotherapy resistance and poor prognosis for patients with advanced ENKTL in the era of asparaginase. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
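    Sensitivity and specificity at a fixed cutoff, as reported above for 14-3-3 epsilon, follow directly from counting patients on either side of the threshold. The snippet below is a minimal sketch of that calculation. Only the 61.95 ng/mL cutoff comes from the abstract; the eight patient levels, their resistance labels and the function name are made up for illustration.

```python
def sensitivity_specificity(levels, resistant, cutoff):
    """Classify level >= cutoff as 'resistant' and compare with the true labels.

    Returns (sensitivity, specificity) as fractions between 0 and 1.
    """
    tp = fn = tn = fp = 0
    for level, is_resistant in zip(levels, resistant):
        predicted_resistant = level >= cutoff
        if is_resistant and predicted_resistant:
            tp += 1
        elif is_resistant:
            fn += 1
        elif predicted_resistant:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both resistant and sensitive patients")
    return tp / (tp + fn), tn / (tn + fp)


CUTOFF_NG_PER_ML = 61.95  # threshold quoted in the abstract

# Made-up pretreatment levels (ng/mL) and chemotherapy-resistance labels.
levels = [80.0, 70.2, 55.0, 61.95, 40.1, 30.0, 65.0, 20.5]
resistant = [True, True, True, True, False, False, False, False]

sensitivity, specificity = sensitivity_specificity(levels, resistant, CUTOFF_NG_PER_ML)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```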

  11. A cherry protein and its gene, abundantly expressed in ripening fruit, have been identified as thaumatin-like.

    PubMed

    Fils-Lycaon, B R; Wiersma, P A; Eastwell, K C; Sautiere, P

    1996-05-01

    A 29-kD polypeptide is the most abundant soluble protein in ripe cherry fruit (Prunus avium L); accumulation begins at the onset of ripening as the fruit turns from yellow to red. This protein was extracted from ripe cherries and purified by size-exclusion and ion-exchange chromatography. Antibodies to the purified protein were used to screen a cDNA library from ripe cherries. Numerous recombinant plaques reacted positively with the antibodies; the DNA sequence of representative clones encoded a polypeptide of 245 amino acid residues. A signal peptide was indicated, and the predicted mature protein corresponded to the purified protein in size (23.3 kD, by mass spectrometry) and isoelectric point (4.2). A search of known protein sequences revealed a strong similarity between this polypeptide and the thaumatin family of pathogenesis-related proteins. The cherry thaumatin-like protein does not have a sweet taste, and no antifungal activity was seen in preliminary assays. Expression of the protein appears to be regulated at the gene level, with mRNA levels at their highest in the ripe fruit.

  12. A cherry protein and its gene, abundantly expressed in ripening fruit, have been identified as thaumatin-like.

    PubMed Central

    Fils-Lycaon, B R; Wiersma, P A; Eastwell, K C; Sautiere, P

    1996-01-01

    A 29-kD polypeptide is the most abundant soluble protein in ripe cherry fruit (Prunus avium L); accumulation begins at the onset of ripening as the fruit turns from yellow to red. This protein was extracted from ripe cherries and purified by size-exclusion and ion-exchange chromatography. Antibodies to the purified protein were used to screen a cDNA library from ripe cherries. Numerous recombinant plaques reacted positively with the antibodies; the DNA sequence of representative clones encoded a polypeptide of 245 amino acid residues. A signal peptide was indicated, and the predicted mature protein corresponded to the purified protein in size (23.3 kD, by mass spectrometry) and isoelectric point (4.2). A search of known protein sequences revealed a strong similarity between this polypeptide and the thaumatin family of pathogenesis-related proteins. The cherry thaumatin-like protein does not have a sweet taste, and no antifungal activity was seen in preliminary assays. Expression of the protein appears to be regulated at the gene level, with mRNA levels at their highest in the ripe fruit. PMID:8685266

  13. Protein loop modeling using a new hybrid energy function and its application to modeling in inaccurate structural environments.

    PubMed

    Park, Hahnbeom; Lee, Gyu Rie; Heo, Lim; Seok, Chaok

    2014-01-01

    Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.

  14. The interface of protein structure, protein biophysics, and molecular evolution

    PubMed Central

    Liberles, David A; Teichmann, Sarah A; Bahar, Ivet; Bastolla, Ugo; Bloom, Jesse; Bornberg-Bauer, Erich; Colwell, Lucy J; de Koning, A P Jason; Dokholyan, Nikolay V; Echave, Julian; Elofsson, Arne; Gerloff, Dietlind L; Goldstein, Richard A; Grahnen, Johan A; Holder, Mark T; Lakner, Clemens; Lartillot, Nicholas; Lovell, Simon C; Naylor, Gavin; Perica, Tina; Pollock, David D; Pupko, Tal; Regan, Lynne; Roger, Andrew; Rubinstein, Nimrod; Shakhnovich, Eugene; Sjölander, Kimmen; Sunyaev, Shamil; Teufel, Ashley I; Thorne, Jeffrey L; Thornton, Joseph W; Weinreich, Daniel M; Whelan, Simon

    2012-01-01

    The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high-throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction. PMID:22528593

  15. First trimesters Pregnancy-Associated Plasma Protein-A levels value to Predict Gestational diabetes Mellitus: A systematic review and meta-analysis of the literature.

    PubMed

    Talasaz, Zahra Hadizadeh; Sadeghi, Ramin; Askari, Fariba; Dadgar, Salmeh; Vatanchi, Atiyeh

    2018-04-01

    Detecting pregnant women at risk of diabetes in the first months of pregnancy allows early intervention to delay or prevent the onset of GDM. In this study, we aimed to assess the predictive value of first-trimester pregnancy-associated plasma protein-A (PAPP-A) levels for detecting gestational diabetes mellitus (GDM). This systematic review and meta-analysis was conducted by searching databases: PubMed, Scopus, Medline and Google Scholar were searched for papers published from 1974 to 2017. Studies were considered eligible if they were cohort or case-control studies, reported GDM rather than other types of diabetes, were conducted on singleton pregnancies, measured serum PAPP-A in the first trimester, and evaluated the relation between first-trimester PAPP-A and GDM. Two reviewers independently assessed quality with the Newcastle-Ottawa scale and extracted data using a predefined checklist. Data were analyzed with "Comprehensive Meta-analysis Version 2 (CAM)" and Metadisc software. Seventeen articles met our inclusion criteria and were considered in the systematic review; 5 studies were included in the meta-analysis. The meta-analysis showed that PAPP-A predicts GDM with 55% sensitivity (53-58), 90% specificity (89-90), LR+ 2.48 (0.83-7.36) and LR- 0.70 (0.45-1.09), with 95% confidence intervals. In our study PAPP-A has low predictive accuracy overall, but it may be useful when combined with other tests, and this is an active area for future research. One limitation of our study is significant heterogeneity because of different adjusted variables and varied diagnostic criteria. Copyright © 2018. Published by Elsevier B.V.
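
As a reading aid for the accuracy measures quoted in the abstract above, the following minimal sketch shows the standard single-study definitions of sensitivity, specificity, and the positive and negative likelihood ratios from a 2x2 table. The function name `diagnostic_metrics` and the counts are hypothetical and are not taken from the review. Pooled likelihood ratios in a meta-analysis are estimated across studies, so they need not equal the values obtained by plugging pooled sensitivity and specificity into these formulas.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Single-study diagnostic accuracy measures from a 2x2 table.

    tp/fn: diseased subjects with a positive/negative test.
    fp/tn: non-diseased subjects with a positive/negative test.
    """
    if min(tp, fn, fp, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or fp == 0 or tn == 0:
        # Avoid division by zero in sensitivity, LR+ or LR-.
        raise ValueError("table is too sparse for likelihood ratios")

    sensitivity = tp / (tp + fn)          # P(test+ | disease)
    specificity = tn / (tn + fp)          # P(test- | no disease)
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = diagnostic_metrics(tp=30, fn=20, fp=10, tn=90)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

An LR+ near 1 means a positive result barely changes the odds of disease, which is why a confidence interval for LR+ that spans 1 is usually read as weak evidence of predictive value.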

  16. Predicting the Effect of Mutations on Protein-Protein Binding Interactions through Structure-Based Interface Profiles

    PubMed Central

    Brender, Jeffrey R.; Zhang, Yang

    2015-01-01

    The formation of protein-protein complexes is essential for proteins to perform their physiological functions in the cell. Mutations that prevent the proper formation of the correct complexes can have serious consequences for the associated cellular processes. Since experimental determination of protein-protein binding affinity remains difficult when performed on a large scale, computational methods for predicting the consequences of mutations on binding affinity are highly desirable. We show that a scoring function based on interface structure profiles collected from analogous protein-protein interactions in the PDB is a powerful predictor of protein binding affinity changes upon mutation. As a standalone feature, the difference between the interface profile scores of the mutant and wild-type proteins has an accuracy equivalent to the best all-atom potentials, despite being two orders of magnitude faster once the profile has been constructed. Due to its unique sensitivity in collecting the evolutionary profiles of analogous binding interactions and the high speed of calculation, the interface profile score has additional advantages as a complementary feature to combine with physics-based potentials for improving the accuracy of composite scoring approaches. By incorporating the sequence-derived and residue-level coarse-grained potentials with the interface structure profile score, a composite model was constructed through random forest training, which generates a Pearson correlation coefficient >0.8 between the predicted and observed binding free-energy changes upon mutation. This accuracy is comparable to, or outperforms in most cases, the current best methods, but does not require high-resolution full-atomic models of the mutant structures. The binding interface profiling approach should find useful application in human-disease mutation recognition and protein interface design studies. PMID:26506533

  17. Biochemical Characterisation of TSC1 and TSC2 Variants Identified in Patients with Tuberous Sclerosis Complex

    DTIC Science & Technology

    2009-07-01

    that pathogenic TSC1 amino acid changes are clustered to a conserved ~300 amino acid region close to the N-terminal of the protein. These substitutions ...Genet. (2009) 18 2378-2387. 15. Ng PC and Henikoff S. Predicting the effects of amino acid substitutions on protein function. Annu. Rev...amino acid substitutions in the N-terminal region of TSC1 that result in reduced steady state levels of the protein and lead to increased mTOR

  18. Dissecting the expression relationships between RNA-binding proteins and their cognate targets in eukaryotic post-transcriptional regulatory networks.

    PubMed

    Nishtala, Sneha; Neelamraju, Yaseswini; Janga, Sarath Chandra

    2016-05-10

    RNA-binding proteins (RBPs) are pivotal in orchestrating several steps in the metabolism of RNA in eukaryotes, thereby controlling an extensive network of RBP-RNA interactions. Here, we employed CLIP (cross-linking immunoprecipitation)-seq datasets for 60 human RBPs and RIP-ChIP (RNP immunoprecipitation-microarray) data for 69 yeast RBPs to construct a network of genome-wide RBP-target RNA interactions for each RBP. We show in humans that the majority (~78%) of the RBPs are strongly associated with their target transcripts at transcript level, while ~95% of the studied RBPs were also found to be strongly associated with expression levels of target transcripts when protein expression levels of RBPs were employed. At transcript level, RBP-RNA interaction data for the yeast genome exhibited a strong association for 63% of the RBPs, confirming the association to be conserved across large phylogenetic distances. Analysis to uncover the features contributing to these associations revealed the number of target transcripts and the length of the selected protein-coding transcript of an RBP to be significant at the transcript level, while the intensity of the CLIP signal, the number of RNA-binding domains, and the location of the binding site on the transcript were significant at the protein level. Our analysis will contribute to improved modelling and prediction of post-transcriptional networks.

  19. Dissecting the expression relationships between RNA-binding proteins and their cognate targets in eukaryotic post-transcriptional regulatory networks

    NASA Astrophysics Data System (ADS)

    Nishtala, Sneha; Neelamraju, Yaseswini; Janga, Sarath Chandra

    2016-05-01

    RNA-binding proteins (RBPs) are pivotal in orchestrating several steps in the metabolism of RNA in eukaryotes, thereby controlling an extensive network of RBP-RNA interactions. Here, we employed CLIP (cross-linking immunoprecipitation)-seq datasets for 60 human RBPs and RIP-ChIP (RNP immunoprecipitation-microarray) data for 69 yeast RBPs to construct a network of genome-wide RBP-target RNA interactions for each RBP. We show in humans that the majority (~78%) of the RBPs are strongly associated with their target transcripts at transcript level, while ~95% of the studied RBPs were also found to be strongly associated with expression levels of target transcripts when protein expression levels of RBPs were employed. At transcript level, RBP-RNA interaction data for the yeast genome exhibited a strong association for 63% of the RBPs, confirming the association to be conserved across large phylogenetic distances. Analysis to uncover the features contributing to these associations revealed the number of target transcripts and the length of the selected protein-coding transcript of an RBP to be significant at the transcript level, while the intensity of the CLIP signal, the number of RNA-binding domains, and the location of the binding site on the transcript were significant at the protein level. Our analysis will contribute to improved modelling and prediction of post-transcriptional networks.

  20. Urinary vitamin D-binding protein, a novel biomarker for lupus nephritis, predicts the development of proteinuric flare.

    PubMed

    Go, D J; Lee, J Y; Kang, M J; Lee, E Y; Lee, E B; Yi, E C; Song, Y W

    2018-01-01

    Lupus nephritis (LN) is a major complication of systemic lupus erythematosus (SLE). Conventional biomarkers for assessing renal disease activity are imperfect in predicting clinical outcomes associated with LN. The aim of this study is to identify urinary protein biomarkers that reliably reflect the disease activity or predict clinical outcomes. A quantitative proteomic analysis was performed to identify protein biomarker candidates that can differentiate between SLE patients with and without LN. Selected biomarker candidates were further verified by enzyme-linked immunosorbent assay using urine samples from a larger cohort of SLE patients (n = 121) to investigate their predictive values for LN activity measure. Furthermore, the association between urinary levels of a selected panel of potential biomarkers and prognosis of LN was assessed with a four-year follow-up study of renal outcomes. Urinary vitamin D-binding protein (VDBP), transthyretin (TTR), retinol binding protein 4 (RBP4), and prostaglandin D synthase (PTGDS) were significantly elevated in SLE patients with LN, especially in patients with active LN (n = 21). Among them, VDBP correlated well with the severity of proteinuria (rho = 0.661, p < 0.001) and renal SLE Disease Activity Index (renal SLEDAI) (rho = 0.520, p < 0.001). In the four-year follow-up, VDBP was a significant risk factor (hazard ratio 9.627, 95% confidence interval 1.698 to 54.571, p = 0.011) for the development of proteinuric flare in SLE patients without proteinuria (n = 100) after adjustments for multiple confounders. Urinary VDBP correlated with proteinuria and renal SLEDAI, and predicted the development of proteinuria.

  1. Protein Structure Validation and Refinement Using Amide Proton Chemical Shifts Derived from Quantum Mechanics

    PubMed Central

    Christensen, Anders S.; Linnet, Troels E.; Borg, Mikael; Boomsma, Wouter; Lindorff-Larsen, Kresten; Hamelryck, Thomas; Jensen, Jan H.

    2013-01-01

    We present the ProCS method for the rapid and accurate prediction of protein backbone amide proton chemical shifts - sensitive probes of the geometry of key hydrogen bonds that determine protein structure. ProCS is parameterized against quantum mechanical (QM) calculations and reproduces high level QM results obtained for a small protein with an RMSD of 0.25 ppm (r = 0.94). ProCS is interfaced with the PHAISTOS protein simulation program and is used to infer statistical protein ensembles that reflect experimentally measured amide proton chemical shift values. Such chemical shift-based structural refinements, starting from high-resolution X-ray structures of Protein G, ubiquitin, and SMN Tudor Domain, result in average chemical shifts, hydrogen bond geometries, and trans-hydrogen bond ($^{h3}J_{NC'}$) spin-spin coupling constants that are in excellent agreement with experiment. We show that the structural sensitivity of the QM-based amide proton chemical shift predictions is needed to obtain this agreement. The ProCS method thus offers a powerful new tool for refining the structures of hydrogen bonding networks to high accuracy with many potential applications such as protein flexibility in ligand binding. PMID:24391900

  2. A 100-Year Review: Protein and amino acid nutrition in dairy cows.

    PubMed

    Schwab, Charles G; Broderick, Glen A

    2017-12-01

    Considerable progress has been made in understanding the protein and amino acid (AA) nutrition of dairy cows. The chemistry of feed crude protein (CP) appears to be well understood, as is the mechanism of ruminal protein degradation by rumen bacteria and protozoa. It has been shown that ammonia released from AA degradation in the rumen is used for bacterial protein formation and that urea can be a useful N supplement when lower protein diets are fed. It is now well documented that adequate rumen ammonia levels must be maintained for maximal synthesis of microbial protein and that a deficiency of rumen-degradable protein can decrease microbial protein synthesis, fiber digestibility, and feed intake. Rumen-synthesized microbial protein accounts for most of the CP flowing to the small intestine and is considered a high-quality protein for dairy cows because of apparent high digestibility and good AA composition. Much attention has been given to evaluating different methods to quantify ruminal protein degradation and escape and for measuring ruminal outflows of microbial protein and rumen-undegraded feed protein. The methods and accompanying results are used to determine the nutritional value of protein supplements and to develop nutritional models and evaluate their predictive ability. Lysine, methionine, and histidine have been identified most often as the most-limiting amino acids, with rumen-protected forms of lysine and methionine available for ration supplementation. Guidelines for protein feeding have evolved from simple feeding standards for dietary CP to more complex nutrition models that are designed to predict supplies and requirements for rumen ammonia and peptides and intestinally absorbable AA. The industry awaits more robust and mechanistic models for predicting supplies and requirements of rumen-available N and absorbed AA. Such models will be useful in allowing for feeding lower protein diets and increased efficiency of microbial protein synthesis. 
    Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  3. Perceptual changes and drivers of liking in high protein extruded snacks.

    PubMed

    Kreger, Joseph W; Lee, Youngsoo; Lee, Soo-Yeun

    2012-04-01

    Increasing the amount of protein in snack foods can add to their satiating ability, which aligns with many health-based trends currently seen in the food industry. Understanding the effect of adding high levels of protein in a food matrix is essential for product development. The objective for this research was to determine the effects of varying protein type and level on the sensory-related aspects of a model extruded snack food. Independent variables in the design of the snacks were the level of total protein and the protein type in the formulation. The level of protein ranged from 28% to 43% (w/w) in 5% increments. The protein type varied in the ratio of whey to soy protein ranging from 0:100 to 100:0, in 25% increments. Descriptive analysis was conducted on the samples to profile their sensory characteristics. Protein type was found to be the predominant variable in differentiating the sensory characteristics of the samples. Soy protein imparted nutty, grainy aromas-by-mouth, and increased expansion during processing, resulting in a lighter, crispier texture. Whey protein imparted dairy related aromas-by-mouth and inhibited expansion during processing, resulting in a more dense, crunchy texture. Separately, 100 consumers rated their acceptance of the samples using the 9-point hedonic scale. It was found that protein type was also the predominant variable in affecting acceptance, with some clusters of consumers preferring samples comprised of soy protein, and others preferring samples with whey. Food product developers can use these findings to predict changes in a similar food product by varying protein level or protein type. This work shows how the perceivable appearance, aroma, and texture characteristics of puffed snack foods change when adding protein or changing the protein type. The type of protein incorporated was shown to have major effects on the characteristics of the snacks, partially because of their impact on how much the snacks puffed during processing. The findings from this research can help develop acceptable products that incorporate high levels of protein to be aligned with current health trends in the market. © 2012 Institute of Food Technologists®

  4. Plasma pro-surfactant protein B and lung function decline in smokers.

    PubMed

    Leung, Janice M; Mayo, John; Tan, Wan; Tammemagi, C Martin; Liu, Geoffrey; Peacock, Stuart; Shepherd, Frances A; Goffin, John; Goss, Glenwood; Nicholas, Garth; Tremblay, Alain; Johnston, Michael; Martel, Simon; Laberge, Francis; Bhatia, Rick; Roberts, Heidi; Burrowes, Paul; Manos, Daria; Stewart, Lori; Seely, Jean M; Gingras, Michel; Pasian, Sergio; Tsao, Ming-Sound; Lam, Stephen; Sin, Don D

    2015-04-01

    Plasma pro-surfactant protein B (pro-SFTPB) levels have recently been shown to predict the development of lung cancer in current and ex-smokers, but the ability of pro-SFTPB to predict measures of chronic obstructive pulmonary disease (COPD) severity is unknown. We evaluated the performance characteristics of pro-SFTPB as a biomarker of lung function decline in a population of current and ex-smokers. Plasma pro-SFTPB levels were measured in 2503 current and ex-smokers enrolled in the Pan-Canadian Early Detection of Lung Cancer Study. Linear regression was performed to determine the relationship of pro-SFTPB levels to changes in forced expiratory volume in 1 s (FEV1) over a 2-year period as well as to baseline FEV1 and the burden of emphysema observed in computed tomography (CT) scans. Plasma pro-SFTPB levels were inversely related to both FEV1 % predicted (p=0.024) and FEV1/forced vital capacity (FVC) (p<0.001), and were positively related to the burden of emphysema on CT scans (p<0.001). Higher plasma pro-SFTPB levels were also associated with a more rapid decline in FEV1 at 1 year (p=0.024) and over 2 years of follow-up (p=0.004). Higher plasma pro-SFTPB levels are associated with increased severity of airflow limitation and accelerated decline in lung function. Pro-SFTPB is a promising biomarker for COPD severity and progression. Copyright ©ERS 2015.

  5. PRISM-EM: template interface-based modelling of multi-protein complexes guided by cryo-electron microscopy density maps.

    PubMed

    Kuzu, Guray; Keskin, Ozlem; Nussinov, Ruth; Gursoy, Attila

    2016-10-01

    The structures of protein assemblies are important for elucidating cellular processes at the molecular level. Three-dimensional electron microscopy (3DEM) is a powerful method to identify the structures of assemblies, especially those that are challenging to study by crystallography. Here, a new approach, PRISM-EM, is reported to computationally generate plausible structural models using a procedure that combines crystallographic structures and density maps obtained from 3DEM. The predictions are validated against seven available structurally different crystallographic complexes. The models display mean deviations in the backbone of <5 Å. PRISM-EM was further tested on different benchmark sets; the accuracy was evaluated with respect to the structure of the complex, and the correlation with EM density maps and interface predictions were evaluated and compared with those obtained using other methods. PRISM-EM was then used to predict the structure of the ternary complex of the HIV-1 envelope glycoprotein trimer, the ligand CD4 and the neutralizing protein m36.

  6. Comprehensive modeling of microRNA targets predicts functional non-conserved and non-canonical sites.

    PubMed

    Betel, Doron; Koppal, Anjali; Agius, Phaedra; Sander, Chris; Leslie, Christina

    2010-01-01

    mirSVR is a new machine learning method for ranking microRNA target sites by a down-regulation score. The algorithm trains a regression model on sequence and contextual features extracted from miRanda-predicted target sites. In a large-scale evaluation, miRanda-mirSVR is competitive with other target prediction methods in identifying target genes and predicting the extent of their downregulation at the mRNA or protein levels. Importantly, the method identifies a significant number of experimentally determined non-canonical and non-conserved sites.

  7. Regulation of the activity of the promoter of RNA-induced silencing, C3PO.

    PubMed

    Sahu, Shriya; Williams, Leo; Perez, Alberto; Philip, Finly; Caso, Giuseppe; Zurawsky, Walter; Scarlata, Suzanne

    2017-09-01

    RNA-induced silencing is a process which allows cells to regulate the synthesis of specific proteins. RNA silencing is promoted by the protein C3PO (component 3 of RISC). We have previously found that phospholipase Cβ, which increases intracellular calcium levels in response to specific G protein signals, inhibits C3PO activity towards certain genes. Understanding the parameters that control C3PO activity and which genes are impacted by G protein activation would help predict which genes are more vulnerable to downregulation. Here, using a library of 10^18 oligonucleotides, we show that C3PO binds oligonucleotides with structural specificity but little sequence specificity. Alternately, C3PO hydrolyzes oligonucleotides with a rate that is sensitive to substrate stability. Importantly, we find that oligonucleotides with higher Tm values are inhibited by bound PLCβ. This finding is supported by microarray analysis in cells over-expressing PLCβ1. Taken together, this study allows predictions of the genes whose post-transcriptional regulation is responsive to the G protein/phospholipase Cβ/calcium signaling pathway. © 2017 The Protein Society.

  8. Predictive and Experimental Approaches for Elucidating Protein–Protein Interactions and Quaternary Structures

    PubMed Central

    Nealon, John Oliver; Philomina, Limcy Seby

    2017-01-01

    The elucidation of protein–protein interactions is vital for determining the function and action of quaternary protein structures. Here, we discuss the difficulty and importance of establishing protein quaternary structure and review in vitro and in silico methods for doing so. Determining the interacting partner proteins of predicted protein structures is very time-consuming when using in vitro methods; this can be somewhat alleviated by use of predictive methods. However, developing reliably accurate predictive tools has proved to be difficult. We review the current state of the art in predictive protein interaction software and discuss the problem of scoring and therefore ranking predictions. Current community-based predictive exercises are discussed in relation to the growth of protein interaction prediction as an area within these exercises. We suggest a fusion of experimental and predictive methods that make use of sparse experimental data to determine higher resolution predicted protein interactions as being necessary to drive forward development. PMID:29206185

  9. Network-based function prediction and interactomics: the case for metabolic enzymes.

    PubMed

    Janga, S C; Díaz-Mejía, J Javier; Moreno-Hagelsieb, G

    2011-01-01

    As sequencing technologies increase in power, determining the functions of unknown proteins encoded by the DNA sequences so produced becomes a major challenge. Functional annotation is commonly done on the basis of amino-acid sequence similarity alone. Long after sequence similarity becomes undetectable by pair-wise comparison, profile-based identification of homologs can often succeed due to the conservation of position-specific patterns, important for a protein's three dimensional folding and function. Nevertheless, prediction of protein function from homology-driven approaches is not without problems. Homologous proteins might evolve different functions and the power of homology detection has already started to reach its maximum. Computational methods for inferring protein function, which exploit the context of a protein in cellular networks, have come to be built on top of homology-based approaches. These network-based functional inference techniques both provide a first-hand hint of a protein's functional role and offer complementary insights to traditional methods for understanding the function of uncharacterized proteins. Most recent network-based approaches aim to integrate diverse kinds of functional interactions to boost both coverage and confidence level. These techniques not only promise to solve the moonlighting aspect of proteins by annotating proteins with multiple functions, but also increase our understanding of the interplay between different functional classes in a cell. In this article we review the state of the art in network-based function prediction and describe some of the underlying difficulties and successes. Given the volume of high-throughput data that is being reported, the time is ripe to employ these network-based approaches, which can be used to unravel the functions of the uncharacterized proteins accumulating in the genomic databases. © 2010 Elsevier Inc. All rights reserved.

  10. [Study on proteomics of familial systemic lupus erythematosus patients in one family from Sichuan, China].

    PubMed

    Wu, Yong-kang; Huang, Zhuo-chun; Shi, Yun-ying; Cai, Bei; Wang, Lan-lan; Ying, Bin-wu; Hu, Chao-jun; Li, Yong-zhe

    2009-05-01

    To investigate the proteomic characteristics of systemic lupus erythematosus (SLE) in an SLE family from Sichuan, China, consisting of 7 members with 3 SLE cases, and to find proteins correlated with the heredity of SLE. A total of 153 serum samples were collected from the 7 family members (including 3 sisters with SLE), 63 individual SLE patients, and 83 healthy controls. The diagnosis of SLE was based on the American College of Rheumatology criteria (1997). All serum samples were analyzed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) combined with magnetic bead technology to obtain serum protein profiles and identify biomarkers predictive of SLE risk. The resulting spectra were analyzed with Biomarker Wizard software 3.1.0. Four discriminative mass/charge (m/z) protein peaks serving as pathogenic biomarkers were identified for familial SLE cases versus individual SLE patients and healthy controls. The peak intensity at m/z 9342.23 was significantly greater in the SLE family group than in individual SLE patients and healthy controls (P<0.05), and that of individual SLE patients was significantly greater than that of healthy controls (P<0.05); the peak intensities at m/z 4094.03, 5905.35 and 7973.53 in the SLE family group were significantly lower than in individual SLE patients and healthy controls (P<0.05), and those of individual SLE patients were significantly lower than those of healthy controls (P<0.05). The proteins at m/z 9342.23, 4094.03, 5905.35 and 7973.53 may play an important role in the pathogenesis of SLE and may predict the risk of SLE. The higher the protein level at m/z 9342.23 and the lower the protein levels at m/z 4094.03, 5905.35 and 7973.53, the higher the risk of SLE.

  11. Accurate secondary structure prediction and fold recognition for circular dichroism spectroscopy

    PubMed Central

    Micsonai, András; Wien, Frank; Kernya, Linda; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2015-01-01

    Circular dichroism (CD) spectroscopy is a widely used technique for the study of protein structure. Numerous algorithms have been developed for the estimation of the secondary structure composition from the CD spectra. These methods often fail to provide acceptable results on α/β-mixed or β-structure–rich proteins. The problem arises from the spectral diversity of β-structures, which has hitherto been considered as an intrinsic limitation of the technique. The predictions are less reliable for proteins of unusual β-structures such as membrane proteins, protein aggregates, and amyloid fibrils. Here, we show that the parallel/antiparallel orientation and the twisting of the β-sheets account for the observed spectral diversity. We have developed a method called β-structure selection (BeStSel) for the secondary structure estimation that takes into account the twist of β-structures. This method can reliably distinguish parallel and antiparallel β-sheets and accurately estimates the secondary structure for a broad range of proteins. Moreover, the secondary structure components applied by the method are characteristic to the protein fold, and thus the fold can be predicted to the level of topology in the CATH classification from a single CD spectrum. By constructing a web server, we offer a general tool for a quick and reliable structure analysis using conventional CD or synchrotron radiation CD (SRCD) spectroscopy for the protein science research community. The method is especially useful when X-ray or NMR techniques fail. Using BeStSel on data collected by SRCD spectroscopy, we investigated the structure of amyloid fibrils of various disease-related proteins and peptides. PMID:26038575

  12. General overview on structure prediction of twilight-zone proteins.

    PubMed

    Khor, Bee Yin; Tye, Gee Jun; Lim, Theam Soon; Choong, Yee Siew

    2015-09-04

    Protein structure prediction from amino acid sequence has been one of the most challenging problems in computational structural biology, despite significant progress in recent years shown by the critical assessment of protein structure prediction (CASP) experiments. When experimentally determined structures are unavailable, predicted structures may serve as starting points for studying a protein. If the target protein contains a homologous region, a high-resolution (typically <1.5 Å) model can be built via comparative modelling. However, when the target protein has low sequence similarity to available templates (a twilight-zone protein, with sequence identity to available templates below 30%), structure prediction has to be initiated from scratch. Traditionally, the structures of twilight-zone proteins are predicted via threading or ab initio methods. Based on the current trend, combining different methods improves success in predicting twilight-zone proteins. In this mini review, the methods, progress and challenges in the prediction of twilight-zone proteins are discussed.
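
The 30% identity cutoff quoted in the abstract above can be illustrated with a short sketch that computes percent identity from a pre-aligned pair of sequences and flags the twilight zone. The function names, the gap character, and the toy sequences are assumptions made for illustration. Published tools differ in how they choose the denominator for identity, and this sketch counts only columns where neither sequence has a gap.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def is_twilight_zone(identity_percent, threshold=30.0):
    """True when identity falls below the threshold (30% in the abstract)."""
    return identity_percent < threshold


# Toy alignment: 8 ungapped columns, 6 of them identical.
target = "MKT-AYIAKQ"
template = "MRTLAYLA-Q"
identity = percent_identity(target, template)
print(identity, is_twilight_zone(identity))
```

In practice the alignment itself would come from a dedicated aligner. The sketch only covers the bookkeeping step that decides whether comparative modelling is a reasonable first choice.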

  13. A traveling salesman approach for predicting protein functions.

    PubMed

    Johnson, Olin; Liu, Jing

    2006-10-12

    Protein-protein interaction information can be used to predict unknown protein functions and to help study biological pathways. Here we present a new approach utilizing the classic Traveling Salesman Problem to study the protein-protein interactions and to predict protein functions in budding yeast Saccharomyces cerevisiae. We apply the global optimization tool from combinatorial optimization algorithms to cluster the yeast proteins based on the global protein interaction information. We then use this clustering information to help us predict protein functions. We use our algorithm together with the direct neighbor algorithm [1] on characterized proteins and compare the prediction accuracy of the two methods. We show our algorithm can produce better predictions than the direct neighbor algorithm, which only considers the immediate neighbors of the query protein. Our method is a promising one to be used as a general tool to predict functions of uncharacterized proteins and a successful sample of using computer science knowledge and algorithms to study biological problems.

  14. A traveling salesman approach for predicting protein functions

    PubMed Central

    Johnson, Olin; Liu, Jing

    2006-01-01

    Background: Protein-protein interaction information can be used to predict unknown protein functions and to help study biological pathways. Results: Here we present a new approach utilizing the classic Traveling Salesman Problem to study the protein-protein interactions and to predict protein functions in budding yeast Saccharomyces cerevisiae. We apply the global optimization tool from combinatorial optimization algorithms to cluster the yeast proteins based on the global protein interaction information. We then use this clustering information to help us predict protein functions. We use our algorithm together with the direct neighbor algorithm [1] on characterized proteins and compare the prediction accuracy of the two methods. We show our algorithm can produce better predictions than the direct neighbor algorithm, which only considers the immediate neighbors of the query protein. Conclusion: Our method is a promising one to be used as a general tool to predict functions of uncharacterized proteins and a successful sample of using computer science knowledge and algorithms to study biological problems. PMID:17147783
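
    The direct neighbor baseline described in this abstract, which considers only the immediate neighbors of the query protein, might be sketched as a simple vote. This is not the authors' Traveling Salesman clustering method, and the protein names, function labels and function signature below are hypothetical placeholders.

```python
# Illustrative sketch of a direct-neighbor vote; not the TSP-based method.
from collections import Counter


def direct_neighbor_predict(query, interactions, known_functions, top_k=1):
    """Rank functions by how many immediate neighbors of `query` carry them."""
    neighbors = set()
    for a, b in interactions:
        if a == query and b != query:
            neighbors.add(b)
        elif b == query and a != query:
            neighbors.add(a)
    votes = Counter()
    for protein in neighbors:
        # Uncharacterized neighbors contribute no votes.
        votes.update(known_functions.get(protein, ()))
    # Sort by descending vote count, then by name for a deterministic order.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [function for function, _ in ranked[:top_k]]


# Hypothetical toy network: P1 is the uncharacterized query protein.
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
known_functions = {
    "P2": {"transport"},
    "P3": {"transport", "signaling"},
    "P4": {"kinase"},
    "P5": {"metabolism"},
}
prediction = direct_neighbor_predict("P1", interactions, known_functions)
```

    In this toy network two of the three neighbors of P1 are annotated with "transport", so that label ranks first; P5 is two steps away and is ignored, which is the limitation the abstract attributes to the direct neighbor approach.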

  15. Twelve Transmembrane Helices Form the Functional Core of Mammalian MATE1 (Multidrug and Toxin Extruder 1) Protein

    PubMed Central

    Zhang, Xiaohong; He, Xiao; Baker, Joseph; Tama, Florence; Chang, Geoffrey; Wright, Stephen H.

    2012-01-01

    The x-ray structure of the prototypic MATE family member, NorM from Vibrio cholerae, reveals a protein fold composed of 12 transmembrane helices (TMHs), confirming hydropathy analyses of the majority of (prokaryotic and plant) MATE transporters. However, the mammalian MATEs are generally predicted to have a 13th TMH and an extracellular C terminus. Here we affirm this prediction, showing that the C termini of epitope-tagged, full-length human, rabbit, and mouse MATE1 were accessible to antibodies from the extracellular face of the membrane. Truncation of these proteins at or near the predicted junction between the 13th TMH and the long cytoplasmic loop that precedes it resulted in proteins that (i) trafficked to the membrane and (ii) interacted with antibodies only after permeabilization of the plasma membrane. CHO cells expressing rbMate1 truncated at residue Gly-545 supported levels of pH-sensitive transport similar to that of cells expressing the full-length protein. Although the high transport rate of the Gly-545 truncation mutant was associated with higher levels of membrane expression (than full-length MATE1), suggesting the 13th TMH may influence substrate translocation, the selectivity profile of the mutant indicated that TMH13 has little impact on ligand binding. We conclude that the functional core of MATE1 consists of 12 (not 13) TMHs. Therefore, we used the x-ray structure of NorM to develop a homology model of the first 12 TMHs of MATE1. The model proved to be stable in molecular dynamic simulations and agreed with topology evident from preliminary cysteine scanning of intracellular versus extracellular loops. PMID:22722930

  16. [Faecal calprotectin as an aid to the diagnosis of non-IgE mediated cow's milk protein allergy].

    PubMed

    Trillo Belizón, Carlos; Ortega Páez, Eduardo; Medina Claros, Antonio F; Rodríguez Sánchez, Isabel; Reina González, Ana; Vera Medialdea, Rafael; Ramón Salguero, José Manuel

    2016-06-01

    The aim of the study was to assess the use of faecal calprotectin (FCP) in infants with signs and symptoms of non-IgE-mediated cow's milk protein allergy (CMA) for both diagnosis and prediction of clinical response at the time of withdrawal of milk proteins. A one-year prospective study was conducted on 82 infants between 1 and 12 months of age in the Eastern area of Málaga-Axarquía, of whom 40 had been diagnosed with non-IgE-mediated CMA (with suggestive symptoms and positive response to milk withdrawal), 12 were not diagnosed with CMA, and 30 formed the control group. FCP was measured at three different times: time of diagnosis, and one and three months later. ANOVA for repeated measures, nominal logistic regression and ROC curves were prepared using the SPSS.20 package and Medcalc. Differences between diagnostic and control groups were assessed: there was a statistically significant relationship (p<.0001) between high FCP levels and infants suffering CMA, as well as the levels at time of diagnosis, 1 and 3 months (p <.001). A ROC curve was constructed between FCP levels and diagnosis of CMA, with the best cut-off being 138 µg/g, with an area under the curve of 0.89. However, the area under the curve was only 0.68 for predicting a clinical response. FCP levels lower than 138 µg/g could be useful to rule out non-IgE-mediated CMA diagnosis. Calprotectin is not a good test to predict clinical response to milk withdrawal. Copyright © 2015 Asociación Española de Pediatría. Published by Elsevier España, S.L.U. All rights reserved.
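
    How an area under the ROC curve and a cut-off relate to measured levels can be sketched as follows. The FCP values are invented for illustration and are not the study's data; the AUC is computed as the fraction of (case, control) pairs in which the case has the higher value, with ties counted as one half, which is one standard way to obtain it.

```python
# Illustrative sketch with invented values; not data from the study.

def auc(case_values, control_values):
    """AUC as the probability that a random case exceeds a random control."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def sensitivity_specificity(case_values, control_values, cutoff):
    """Treat values at or above `cutoff` as a positive test."""
    sensitivity = sum(v >= cutoff for v in case_values) / len(case_values)
    specificity = sum(v < cutoff for v in control_values) / len(control_values)
    return sensitivity, specificity


# Hypothetical FCP levels in µg/g.
cases = [150, 300, 120, 500]
controls = [40, 90, 140, 60]

example_auc = auc(cases, controls)
example_sens, example_spec = sensitivity_specificity(cases, controls, cutoff=138)
```

    With these made-up numbers, 15 of the 16 case/control pairs are ordered correctly, and the 138 µg/g cut-off classifies three of four values correctly in each group.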

  17. MEGADOCK-Web: an integrated database of high-throughput structure-based protein-protein interaction predictions.

    PubMed

    Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka

    2018-05-08

    Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times the number of PPI predictions than previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. 
    MEGADOCK-Web provides a huge amount of comprehensive PPI predictions based on docking calculations with biochemical pathways and enables users to easily and quickly assess PPI feasibilities by archiving PPI predictions. MEGADOCK-Web also promotes the discovery of new PPIs and protein functions and is freely available for use at http://www.bi.cs.titech.ac.jp/megadock-web/.
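
    The two figures quoted in this abstract are consistent with counting unordered pairs of distinct protein chains. That reading is an inference from the arithmetic, not something the abstract states explicitly; a quick check:

```python
# Check that 28,331,628 equals the number of unordered pairs among 7528 chains.
from math import comb

n_chains = 7528
unordered_pairs = comb(n_chains, 2)  # n * (n - 1) / 2

print(unordered_pairs)
```

    The result, 7528 × 7527 / 2 = 28,331,628, matches the number of predicted PPIs reported for the database.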

  18. Conformational Heterogeneity of Unbound Proteins Enhances Recognition in Protein-Protein Encounters.

    PubMed

    Pallara, Chiara; Rueda, Manuel; Abagyan, Ruben; Fernández-Recio, Juan

    2016-07-12

    To understand cellular processes at the molecular level we need to improve our knowledge of protein-protein interactions, from a structural, mechanistic, and energetic point of view. Current theoretical studies and computational docking simulations show that protein dynamics plays a key role in protein association and support the need for including protein flexibility in modeling protein interactions. Assuming the conformational selection binding mechanism, in which the unbound state can sample bound conformers, one possible strategy to include flexibility in docking predictions would be the use of conformational ensembles originated from unbound protein structures. Here we present an exhaustive computational study about the use of precomputed unbound ensembles in the context of protein docking, performed on a set of 124 cases of the Protein-Protein Docking Benchmark 3.0. Conformational ensembles were generated by conformational optimization and refinement with MODELLER and by short molecular dynamics trajectories with AMBER. We identified those conformers providing optimal binding and investigated the role of protein conformational heterogeneity in protein-protein recognition. Our results show that a restricted conformational refinement can generate conformers with better binding properties and improve docking encounters in medium-flexible cases. For more flexible cases, a more extended conformational sampling based on Normal Mode Analysis was proven helpful. We found that successful conformers provide better energetic complementarity to the docking partners, which is compatible with recent views of binding association. In addition to the mechanistic considerations, these findings could be exploited for practical docking predictions of improved efficiency.

  19. The novel ER membrane protein PRO41 is essential for sexual development in the filamentous fungus Sordaria macrospora.

    PubMed

    Nowrousian, Minou; Frank, Sandra; Koers, Sandra; Strauch, Peter; Weitner, Thomas; Ringelberg, Carol; Dunlap, Jay C; Loros, Jennifer J; Kück, Ulrich

    2007-05-01

    The filamentous fungus Sordaria macrospora develops complex fruiting bodies (perithecia) to propagate its sexual spores. Here, we present an analysis of the sterile mutant pro41 that is unable to produce mature fruiting bodies. The mutant carries a deletion of 4 kb and is complemented by the pro41 open reading frame that is contained within the region deleted in the mutant. In silico analyses predict PRO41 to be an endoplasmic reticulum (ER) membrane protein, and a PRO41-EGFP fusion protein colocalizes with ER-targeted DsRED. Furthermore, Western blot analysis shows that the PRO41-EGFP fusion protein is present in the membrane fraction. A fusion of the predicted N-terminal signal sequence of PRO41 with EGFP is secreted out of the cell, indicating that the signal sequence is functional. pro41 transcript levels are upregulated during sexual development. This increase in transcript levels was not observed in the sterile mutant pro1 that lacks a transcription factor gene. Moreover, microarray analysis of gene expression in the mutants pro1, pro41 and the pro1/41 double mutant showed that pro41 is partly epistatic to pro1. Taken together, these data show that PRO41 is a novel ER membrane protein essential for fruiting body formation in filamentous fungi.

  20. The novel ER membrane protein PRO41 is essential for sexual development in the filamentous fungus Sordaria macrospora

    PubMed Central

    Nowrousian, Minou; Frank, Sandra; Koers, Sandra; Strauch, Peter; Weitner, Thomas; Ringelberg, Carol; Dunlap, Jay C.; Loros, Jennifer J.; Kück, Ulrich

    2013-01-01

    Summary: The filamentous fungus Sordaria macrospora develops complex fruiting bodies (perithecia) to propagate its sexual spores. Here, we present an analysis of the sterile mutant pro41 that is unable to produce mature fruiting bodies. The mutant carries a deletion of 4 kb and is complemented by the pro41 open reading frame that is contained within the region deleted in the mutant. In silico analyses predict PRO41 to be an endoplasmic reticulum (ER) membrane protein, and a PRO41–EGFP fusion protein colocalizes with ER-targeted DsRED. Furthermore, Western blot analysis shows that the PRO41–EGFP fusion protein is present in the membrane fraction. A fusion of the predicted N-terminal signal sequence of PRO41 with EGFP is secreted out of the cell, indicating that the signal sequence is functional. pro41 transcript levels are upregulated during sexual development. This increase in transcript levels was not observed in the sterile mutant pro1 that lacks a transcription factor gene. Moreover, microarray analysis of gene expression in the mutants pro1, pro41 and the pro1/41 double mutant showed that pro41 is partly epistatic to pro1. Taken together, these data show that PRO41 is a novel ER membrane protein essential for fruiting body formation in filamentous fungi. PMID:17501918

  1. A Comparative Quantitative Proteomic Study Identifies New Proteins Relevant for Sulfur Oxidation in the Purple Sulfur Bacterium Allochromatium vinosum

    PubMed Central

    Weissgerber, Thomas; Sylvester, Marc; Kröninger, Lena

    2014-01-01

    In the present study, we compared the proteome response of Allochromatium vinosum when growing photoautotrophically in the presence of sulfide, thiosulfate, and elemental sulfur with the proteome response when the organism was growing photoheterotrophically on malate. Applying tandem mass tag analysis as well as two-dimensional (2D) PAGE, we detected 1,955 of the 3,302 predicted proteins by identification of at least two peptides (59.2%) and quantified 1,848 of the identified proteins. Altered relative protein amounts (≥1.5-fold) were observed for 385 proteins, corresponding to 20.8% of the quantified A. vinosum proteome. A significant number of the proteins exhibiting strongly enhanced relative protein levels in the presence of reduced sulfur compounds are well documented essential players during oxidative sulfur metabolism, e.g., the dissimilatory sulfite reductase DsrAB. Changes in protein levels generally matched those observed for the respective relative mRNA levels in a previous study and allowed identification of new genes/proteins participating in oxidative sulfur metabolism. One gene cluster (hyd; Alvin_2036-Alvin_2040) and one hypothetical protein (Alvin_2107) exhibiting strong responses on both the transcriptome and proteome levels were chosen for gene inactivation and phenotypic analyses of the respective mutant strains, which verified the importance of the so-called Isp hydrogenase supercomplex for efficient oxidation of sulfide and a crucial role of Alvin_2107 for the oxidation of sulfur stored in sulfur globules to sulfite. In addition, we analyzed the sulfur globule proteome and identified a new sulfur globule protein (SgpD; Alvin_2515). PMID:24487535

  2. Diagnostic properties of C-reactive protein for detecting pneumonia in children.

    PubMed

    Koster, Madieke J; Broekhuizen, Berna D L; Minnaard, Margaretha C; Balemans, Walter A F; Hopstaken, Rogier M; de Jong, Pim A; Verheij, Theo J M

    2013-07-01

    The diagnostic value of C-reactive protein (CRP) level for pneumonia in children is unknown. As a first step in the assessment of the value of CRP, a diagnostic study was performed in children at an emergency department (ED). In this cross-sectional study, data were retrospectively collected from children presenting with suspected pneumonia at the ED of Antonius Hospital Nieuwegein in The Netherlands between January 2007 and January 2012. Diagnostic outcome was pneumonia yes/no according to an independent radiologist. The unadjusted and adjusted associations between CRP level and pneumonia and the diagnostic value of CRP were calculated. Of 687 presenting children, 286 underwent both CRP measurement and chest radiography; 148 of these had pneumonia (52%). The proportion of pneumonia increased with CRP level. Negative predictive values declined, but positive predictive values increased with higher CRP thresholds. Univariable odds ratio for the association between CRP level and pneumonia was 1.2 (95% CI 1.11-1.21) per 10 mg/L increase. After adjustment for baseline characteristics CRP level remained associated with pneumonia. CRP level has independent diagnostic value for pneumonia in children presenting at the ED with suspected pneumonia, but low levels do not exclude pneumonia in this setting. These results prompt evaluation of CRP in primary care children with LRTI. Copyright © 2013 Elsevier Ltd. All rights reserved.
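
    The pattern reported here, with positive predictive value rising and negative predictive value falling as the CRP threshold increases, can be reproduced on a toy data set. The CRP levels and diagnoses below are invented for illustration and are not the study's data; the function name is hypothetical.

```python
# Illustrative sketch with invented values; not data from the study.

def predictive_values(crp_levels, has_pneumonia, threshold):
    """Return (PPV, NPV), treating CRP >= threshold as a positive test."""
    tp = fp = tn = fn = 0
    for crp, disease in zip(crp_levels, has_pneumonia):
        positive = crp >= threshold
        if positive and disease:
            tp += 1
        elif positive:
            fp += 1
        elif disease:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Hypothetical CRP levels (mg/L) and radiological diagnoses.
crp = [5, 12, 30, 45, 80, 110, 150, 20]
pneumonia = [False, False, True, False, True, True, True, False]

low_threshold = predictive_values(crp, pneumonia, threshold=20)
high_threshold = predictive_values(crp, pneumonia, threshold=100)
```

    On these made-up values the PPV rises from 4/6 to 1.0 and the NPV falls from 1.0 to 4/6 when the threshold moves from 20 to 100 mg/L, the same direction of change the abstract describes.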

  3. From sequence to enzyme mechanism using multi-label machine learning.

    PubMed

    De Ferrari, Luna; Mitchell, John B O

    2014-05-19

    In this work we predict enzyme function at the level of chemical mechanism, providing a finer granularity of annotation than traditional Enzyme Commission (EC) classes. Hence we can predict not only whether a putative enzyme in a newly sequenced organism has the potential to perform a certain reaction, but how the reaction is performed, using which cofactors and with susceptibility to which drugs or inhibitors, details with important consequences for drug and enzyme design. Work that predicts enzyme catalytic activity based on 3D protein structure features limits the prediction of mechanism to proteins already having either a solved structure or a close relative suitable for homology modelling. In this study, we evaluate whether sequence identity, InterPro or Catalytic Site Atlas sequence signatures provide enough information for bulk prediction of enzyme mechanism. By splitting MACiE (Mechanism, Annotation and Classification in Enzymes database) mechanism labels to a finer granularity, which includes the role of the protein chain in the overall enzyme complex, the method can predict at 96% accuracy (and 96% micro-averaged precision, 99.9% macro-averaged recall) the MACiE mechanism definitions of 248 proteins available in the MACiE, EzCatDb (Database of Enzyme Catalytic Mechanisms) and SFLD (Structure Function Linkage Database) databases using an off-the-shelf K-Nearest Neighbours multi-label algorithm. We find that InterPro signatures are critical for accurate prediction of enzyme mechanism. We also find that incorporating Catalytic Site Atlas attributes does not seem to provide additional accuracy. The software code (ml2db), data and results are available online at http://sourceforge.net/projects/ml2db/ and as supplementary files.
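
    The general idea of a K-Nearest Neighbours multi-label prediction from sequence signatures can be sketched in plain Python. This is a simplified majority-vote variant written for illustration, not the ml2db code or the off-the-shelf algorithm the authors used; the signature identifiers ("sigA", ...) and mechanism labels ("mech1", ...) are hypothetical placeholders, and Jaccard similarity is an assumed choice of distance.

```python
# Illustrative sketch of multi-label k-nearest-neighbour voting.

def jaccard(a, b):
    """Jaccard similarity between two sets of signature identifiers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_multilabel(query_signatures, training, k=3):
    """Assign every label carried by more than half of the k nearest proteins.

    `training` is a list of (signature_set, label_set) pairs.
    """
    ranked = sorted(
        training,
        key=lambda item: jaccard(query_signatures, item[0]),
        reverse=True,
    )
    nearest = ranked[:k]
    counts = {}
    for _, labels in nearest:
        for label in labels:
            counts[label] = counts.get(label, 0) + 1
    return {label for label, n in counts.items() if n > len(nearest) / 2}


# Hypothetical training proteins: signature sets with mechanism labels.
training = [
    ({"sigA", "sigB"}, {"mech1"}),
    ({"sigA", "sigB", "sigC"}, {"mech1", "mech2"}),
    ({"sigA"}, {"mech1"}),
    ({"sigX", "sigY"}, {"mech3"}),
]
predicted = knn_multilabel({"sigA", "sigB"}, training, k=3)
```

    Here the three nearest training proteins all carry "mech1" while only one carries "mech2", so the query receives the single label "mech1"; a protein can receive several labels when several pass the vote.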

  4. Calnexin, an ER stress-induced protein, is a prognostic marker and potential therapeutic target in colorectal cancer.

    PubMed

    Ryan, Deborah; Carberry, Steven; Murphy, Áine C; Lindner, Andreas U; Fay, Joanna; Hector, Suzanne; McCawley, Niamh; Bacon, Orna; Concannon, Caoimhin G; Kay, Elaine W; McNamara, Deborah A; Prehn, Jochen H M

    2016-07-01

    Colorectal cancer (CRC) is a leading cause of cancer mortality in the Western world and is commonly treated with genotoxic chemotherapy. Stress in the endoplasmic reticulum (ER) has been implicated in chemotherapeutic resistance. Hence, ER stress-related proteins may be of prognostic or therapeutic significance. The expression levels of ER stress proteins calnexin, calreticulin, GRP78 and GRP94 were determined in n = 23 Stage II and III colon cancer fresh frozen tumour and matched normal tissue samples. Data were validated in a cohort of n = 11 rectal cancer patients treated with radiochemotherapy in the neoadjuvant setting. The calnexin gene was silenced using siRNA in HCT116 cells. There were no increased levels of ER stress proteins in tumour compared to matched normal tissue samples in Stage II or III CRC. However, increased calnexin protein levels were predictive of poor clinical outcome in the patient cohort. Data were validated in the rectal cancer cohort treated in the neoadjuvant setting. Calnexin gene-silencing significantly reduced cell survival and increased cancer cell susceptibility to 5FU chemotherapy. Increased tumour protein levels of calnexin may be of prognostic significance in CRC, and calnexin may represent a potential target for future therapies.

  5. Identification of immune signatures predictive of clinical protection from malaria.

    PubMed

    Valletta, John Joseph; Recker, Mario

    2017-10-01

    Antibodies are thought to play an essential role in naturally acquired immunity to malaria. Prospective cohort studies have frequently shown how continuous exposure to the malaria parasite Plasmodium falciparum causes an accumulation of specific responses against various antigens that correlate with a decreased risk of clinical malaria episodes. However, small effect sizes and the often polymorphic nature of immunogenic parasite proteins make the robust identification of the true targets of protective immunity ambiguous. Furthermore, the degree of individual-level protection conferred by elevated responses to these antigens has not yet been explored. Here we applied a machine learning approach to identify immune signatures predictive of individual-level protection against clinical disease. We find that commonly assumed immune correlates are poor predictors of clinical protection in children. On the other hand, antibody profiles predictive of an individual's malaria protective status can be found in data comprising responses to a large set of diverse parasite proteins. We show that this pattern emerges only after years of continuous exposure to the malaria parasite, whereas susceptibility to clinical episodes in young hosts (< 10 years) cannot be ascertained by measured antibody responses alone.

  6. A novel nano-immunoassay method for quantification of proteins from CD138-purified myeloma cells: biological and clinical utility

    PubMed Central

    Misiewicz-Krzeminska, Irena; Corchete, Luis Antonio; Rojas, Elizabeta A.; Martínez-López, Joaquín; García-Sanz, Ramón; Oriol, Albert; Bladé, Joan; Lahuerta, Juan-José; Miguel, Jesús San; Mateos, María-Victoria; Gutiérrez, Norma C.

    2018-01-01

    Protein analysis in bone marrow samples from patients with multiple myeloma has been limited by the low concentration of proteins obtained after CD138+ cell selection. A novel approach based on capillary nano-immunoassay could make it possible to quantify dozens of proteins from each myeloma sample in an automated manner. Here we present a method for the accurate and robust quantification of the expression of multiple proteins extracted from CD138-purified multiple myeloma samples frozen in RLT Plus buffer, which is commonly used for nucleic acid preservation and isolation. Additionally, the biological and clinical value of this analysis for a panel of 12 proteins essential to the pathogenesis of multiple myeloma was evaluated in 63 patients with newly diagnosed multiple myeloma. The analysis of the prognostic impact of CRBN/Cereblon and IKZF1/Ikaros mRNA/protein showed that only the protein levels were able to predict progression-free survival of patients; mRNA levels were not associated with prognosis. Interestingly, high levels of Cereblon and Ikaros proteins were associated with longer progression-free survival only in patients who received immunomodulatory drugs and not in those treated with other drugs. In conclusion, the capillary nano-immunoassay platform provides a novel opportunity for automated quantification of the expression of more than 20 proteins in CD138+ primary multiple myeloma samples. PMID:29545347

  7. A novel nano-immunoassay method for quantification of proteins from CD138-purified myeloma cells: biological and clinical utility.

    PubMed

    Misiewicz-Krzeminska, Irena; Corchete, Luis Antonio; Rojas, Elizabeta A; Martínez-López, Joaquín; García-Sanz, Ramón; Oriol, Albert; Bladé, Joan; Lahuerta, Juan-José; Miguel, Jesús San; Mateos, María-Victoria; Gutiérrez, Norma C

    2018-05-01

    Protein analysis in bone marrow samples from patients with multiple myeloma has been limited by the low concentration of proteins obtained after CD138+ cell selection. A novel approach based on capillary nano-immunoassay could make it possible to quantify dozens of proteins from each myeloma sample in an automated manner. Here we present a method for the accurate and robust quantification of the expression of multiple proteins extracted from CD138-purified multiple myeloma samples frozen in RLT Plus buffer, which is commonly used for nucleic acid preservation and isolation. Additionally, the biological and clinical value of this analysis for a panel of 12 proteins essential to the pathogenesis of multiple myeloma was evaluated in 63 patients with newly diagnosed multiple myeloma. The analysis of the prognostic impact of CRBN/Cereblon and IKZF1/Ikaros mRNA/protein showed that only the protein levels were able to predict progression-free survival of patients; mRNA levels were not associated with prognosis. Interestingly, high levels of Cereblon and Ikaros proteins were associated with longer progression-free survival only in patients who received immunomodulatory drugs and not in those treated with other drugs. In conclusion, the capillary nano-immunoassay platform provides a novel opportunity for automated quantification of the expression of more than 20 proteins in CD138+ primary multiple myeloma samples. Copyright © 2018 Ferrata Storti Foundation.

  8. A feature-based approach to modeling protein-protein interaction hot spots.

    PubMed

    Cho, Kyu-il; Kim, Dongsup; Lee, Doheon

    2009-05-01

    Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to pi-related interactions, especially pi...pi interactions.

  9. TMFoldWeb: a web server for predicting transmembrane protein fold class.

    PubMed

    Kozma, Dániel; Tusnády, Gábor E

    2015-09-17

    Here we present TMFoldWeb, the web server implementation of TMFoldRec, a transmembrane protein fold recognition algorithm. TMFoldRec uses statistical potentials and utilizes topology filtering and a gapless threading algorithm. It ranks template structures and selects the most likely candidates and estimates the reliability of the obtained lowest energy model. The statistical potential was developed in a maximum likelihood framework on a representative set of the PDBTM database. According to the benchmark test the performance of TMFoldRec is about 77 % in correctly predicting fold class for a given transmembrane protein sequence. An intuitive web interface has been developed for the recently published TMFoldRec algorithm. The query sequence goes through a pipeline of topology prediction and a systematic sequence to structure alignment (threading). Resulting templates are ordered by energy and reliability values and are colored according to their significance level. Besides the graphical interface, a programmatic access is available as well, via a direct interface for developers or for submitting genome-wide data sets. The TMFoldWeb web server is unique and currently the only web server that is able to predict the fold class of transmembrane proteins while assigning reliability scores for the prediction. This method is prepared for genome-wide analysis with its easy-to-use interface, informative result page and programmatic access. Considering the info-communication evolution in the last few years, the developed web server, as well as the molecule viewer, is responsive and fully compatible with the prevalent tablets and mobile devices.

  10. Systematic Proteomic Approach to Characterize the Impacts of ...

    EPA Pesticide Factsheets

    Chemical interactions have posed a big challenge in toxicity characterization and human health risk assessment of environmental mixtures. To characterize the impacts of chemical interactions on protein and cytotoxicity responses to environmental mixtures, we established a systems biology approach integrating proteomics, bioinformatics, statistics, and computational toxicology to measure expression or phosphorylation levels of 21 critical toxicity pathway regulators and 445 downstream proteins in human BEAS-2B cells treated with 4 concentrations of nickel, 2 concentrations each of cadmium and chromium, as well as 12 defined binary and 8 defined ternary mixtures of these metals in vitro. Multivariate statistical analysis and mathematical modeling of the metal-mediated proteomic response patterns showed a high correlation between changes in protein expression or phosphorylation and cellular toxic responses to both individual metals and metal mixtures. Of the identified correlated proteins, only a small set of proteins including HIF-1α is likely to be responsible for selective cytotoxic responses to different metals and metal mixtures. Furthermore, support vector machine learning was utilized to computationally predict protein responses to uncharacterized metal mixtures using experimentally generated protein response profiles corresponding to known metal mixtures. This study provides a novel proteomic approach for characterization and prediction of toxicities of

  11. Classification of G-protein coupled receptors based on a rich generation of convolutional neural network, N-gram transformation and multiple sequence alignments.

    PubMed

    Li, Man; Ling, Cheng; Xu, Qi; Gao, Jingyang

    2018-02-01

    Sequence classification is crucial in predicting the function of newly discovered sequences. In recent years, the prediction of the incremental large-scale and diversity of sequences has heavily relied on the involvement of machine-learning algorithms. To improve prediction accuracy, these algorithms must confront the key challenge of extracting valuable features. In this work, we propose a feature-enhanced protein classification approach, considering the rich generation of multiple sequence alignment algorithms, N-gram probabilistic language model and the deep learning technique. The essence behind the proposed method is that if each group of sequences can be represented by one feature sequence, composed of homologous sites, there should be less loss when the sequence is rebuilt, when a more relevant sequence is added to the group. On the basis of this consideration, the prediction becomes whether a query sequence belonging to a group of sequences can be transferred to calculate the probability that the new feature sequence evolves from the original one. The proposed work focuses on the hierarchical classification of G-protein Coupled Receptors (GPCRs), which begins by extracting the feature sequences from the multiple sequence alignment results of the GPCRs sub-subfamilies. The N-gram model is then applied to construct the input vectors. Finally, these vectors are imported into a convolutional neural network to make a prediction. The experimental results elucidate that the proposed method provides significant performance improvements. The classification error rate of the proposed method is reduced by at least 4.67% (family level I) and 5.75% (family Level II), in comparison with the current state-of-the-art methods. The implementation program of the proposed work is freely available at: https://github.com/alanFchina/CNN.
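
    The step in which an N-gram model turns a sequence into an input vector might look roughly like the sketch below. It shows only overlapping N-gram counting and normalisation over a fixed vocabulary; the authors' probabilistic language model and feature-sequence construction are likely more involved, and the function names, example sequence and vocabulary are hypothetical.

```python
# Illustrative sketch of N-gram counting; not the authors' implementation.
from collections import Counter


def ngram_counts(sequence, n=2):
    """Count overlapping N-grams (substrings of length n) in a sequence."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def ngram_vector(sequence, vocabulary, n=2):
    """Relative frequency of each vocabulary N-gram, in vocabulary order."""
    counts = ngram_counts(sequence, n)
    total = sum(counts.values())
    return [counts[gram] / total if total else 0.0 for gram in vocabulary]


# Hypothetical short sequence and vocabulary.
example_vector = ngram_vector("MKKLM", ["KK", "LM", "ZZ"], n=2)
```

    The sequence "MKKLM" contains four 2-grams (MK, KK, KL, LM), so "KK" and "LM" each get a frequency of 0.25 and the absent "ZZ" gets 0.0. Vectors of this kind, built over a fixed vocabulary, have a constant length and could serve as network input.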

  12. Using the underlying biological organization of the Mycobacterium tuberculosis functional network for protein function prediction.

    PubMed

    Mazandu, Gaston K; Mulder, Nicola J

    2012-07-01

    Despite ever-increasing amounts of sequence and functional genomics data, there is still a deficiency of functional annotation for many newly sequenced proteins. For Mycobacterium tuberculosis (MTB), more than half of its genome is still uncharacterized, which hampers the search for new drug targets within the bacterial pathogen and limits our understanding of its pathogenicity. As for many other genomes, the annotations of proteins in the MTB proteome were generally inferred from sequence homology, which is effective but has limited applicability. We have carried out large-scale biological data integration to produce an MTB protein functional interaction network. Protein functional relationships were extracted from the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database, and additional functional interactions from microarray, sequence and protein signature data. The confidence level of protein relationships in the additional functional interaction data was evaluated using a dynamic data-driven scoring system. This functional network has been used to predict functions of uncharacterized proteins using Gene Ontology (GO) terms, and the semantic similarity between these terms was measured using a state-of-the-art GO similarity metric. To achieve a better trade-off between improvement of quality, genomic coverage and scalability, this prediction is done by observing the key principles driving the biological organization of the functional network. This study yields a new functionally characterized MTB strain CDC1551 proteome, consisting of 3804 and 3698 proteins out of 4195 with annotations in terms of the biological process and molecular function ontologies, respectively. These data can contribute to research into the development of effective anti-tubercular drugs with novel biological mechanisms of action. Copyright © 2011 Elsevier B.V. All rights reserved.

  13. Genome-Wide Tuning of Protein Expression Levels to Rapidly Engineer Microbial Traits.

    PubMed

    Freed, Emily F; Winkler, James D; Weiss, Sophie J; Garst, Andrew D; Mutalik, Vivek K; Arkin, Adam P; Knight, Rob; Gill, Ryan T

    2015-11-20

    The reliable engineering of biological systems requires quantitative mapping of predictable and context-independent expression over a broad range of protein expression levels. However, current techniques for modifying expression levels are cumbersome and are not amenable to high-throughput approaches. Here we present major improvements to current techniques through the design and construction of E. coli genome-wide libraries using synthetic DNA cassettes that can tune expression over a ∼10⁴ range. The cassettes also contain molecular barcodes that are optimized for next-generation sequencing, enabling rapid and quantitative tracking of alleles that have the highest fitness advantage. We show these libraries can be used to determine which genes and expression levels confer greater fitness to E. coli under different growth conditions.

  14. The AMP-activated protein kinase AAK-2 links energy levels and insulin-like signals to lifespan in C. elegans

    PubMed Central

    Apfeld, Javier; O'Connor, Greg; McDonagh, Tom; DiStefano, Peter S.; Curtis, Rory

    2004-01-01

    Although limiting energy availability extends lifespan in many organisms, it is not understood how lifespan is coupled to energy levels. We find that the AMP:ATP ratio, a measure of energy levels, increases with age in Caenorhabditis elegans and can be used to predict life expectancy. The C. elegans AMP-activated protein kinase α subunit AAK-2 is activated by AMP and functions to extend lifespan. In addition, either an environmental stressor that increases the AMP:ATP ratio or mutations that lower insulin-like signaling extend lifespan in an aak-2-dependent manner. Thus, AAK-2 is a sensor that couples lifespan to information about energy levels and insulin-like signals. PMID:15574588

  15. Influence of somatic cell count and breed on capillary electrophoretic protein profiles of ewes' milk: a chemometric study.

    PubMed

    Rodríguez-Nogales, J M; Vivar-Quintana, A M; Revilla, I

    2007-07-01

    Bulk tank ewe milk from the Assaf, Castellana, and Churra breeds categorized into 3 somatic cell count (SCC) groups (<500,000; 1,000,000 to 1,500,000; and >2,500,000 cells/mL) was used to investigate changes in chemical composition and capillary electrophoresis protein profiles. The results obtained indicated that breed affected fat, protein, and total solids levels, and differences were also observed for the following milk proteins: beta-, beta1-, beta2-, and alpha(s1)-III-casein, alpha-lactalbumin, and beta-lactoglobulin. High SCC affected fat and protein contents and bacterial counts. The levels of beta1-, beta2-, and alpha(s1)-I-casein, and alpha-lactalbumin were significantly lower in milk with SCC scores >2,500,000 cells/mL. A preliminary study of the chemical, microbiological, and electrophoretic data was performed by cluster analysis and principal components analysis. By applying discriminant analysis, it was possible to group the milk samples according to breed and level of SCC, with 100 and 97% of the samples correctly predicted, respectively.

  16. Can history and exam alone reliably predict pneumonia?

    PubMed

    Graffelman, A W; le Cessie, S; Knuistingh Neven, A; Wilemssen, F E J A; Zonderland, H M; van den Broek, P J

    2007-06-01

    Prediction rules based on clinical information have been developed to support the diagnosis of pneumonia and help limit the use of expensive diagnostic tests. However, these prediction rules need to be validated in the primary care setting. Adults who met our definition of lower respiratory tract infection (LRTI) were recruited for a prospective study on the causes of LRTI, between November 15, 1998 and June 1, 2001 in the Leiden region of The Netherlands. Clinical information was collected and chest radiography was performed. A literature search was also done to find prediction rules for pneumonia. 129 patients--26 with pneumonia and 103 without--were included, and 6 prediction rules were applied. Only the model with the addition of a test for C-reactive protein had a significant area under the curve of 0.69 (95% confidence interval [CI], 0.58-0.80), with a positive predictive value of 47% (95% CI, 23-71) and a negative predictive value of 84% (95% CI, 77-91). The pretest probabilities for the presence and absence of pneumonia were 20% and 80%, respectively. Models based only on clinical information do not reliably predict the presence of pneumonia. The addition of an elevated C-reactive protein level seems of little value.
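
    The pretest probabilities quoted in the abstract follow directly from the reported group sizes (26 patients with pneumonia, 103 without). A minimal arithmetic sketch, in which the function name `pretest_probability` is an illustrative choice and not something taken from the study:

```python
def pretest_probability(n_with, n_without):
    """Return (probability of disease, probability of no disease) in the study sample."""
    total = n_with + n_without
    return n_with / total, n_without / total


# Group sizes as reported in the abstract: 26 with pneumonia, 103 without.
p_present, p_absent = pretest_probability(26, 103)
print(f"{p_present:.0%} {p_absent:.0%}")  # prints "20% 80%"
```

    Read against these baselines, the reported positive predictive value of 47% and negative predictive value of 84% show how far the best model moves the estimate away from the pretest probabilities.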

  17. C-reactive protein as a predictor of chorioamnionitis.

    PubMed

    Smith, Erik J; Muller, Corinna L; Sartorius, Jennifer A; White, David R; Maslow, Arthur S

    2012-10-01

    Chorioamnionitis (CAM) affects many pregnancies complicated by preterm premature rupture of membranes (PPROM). Finding a serum factor that could accurately predict the presence of CAM could potentially lead to more efficient management of PPROM and improved neonatal outcomes. To determine if C-reactive protein (CRP) is an effective early marker of CAM in patients with PPROM. A retrospective evaluation of pregnant women with PPROM at Geisinger Medical Center in Danville, Pennsylvania, between January 2005 and January 2009. Nonparametric statistical tests (ie, Wilcoxon rank sum and Spearman rank correlation) were used to compare distributions that were skewed. Characteristics of the study population were compared using 2-sample t tests for continuous variables and Fisher exact tests for discrete variables. Logistic regression analysis was used to generate receiver operating characteristic curves and obtain area under the curve estimates in stepwise fashion for predicting histologic CAM. A secondary analysis compared the characteristics among patients with clinical CAM, histologic CAM, or non-CAM. The total population of 73 women was subdivided into patients with histologic CAM (n=26) and patients without histologic CAM (ie, no evidence of CAM on placental pathology; n=47). There was no difference between groups in CRP levels, days of pregnancy latency, white blood cell count, smoking status, antibiotic administration, or steroid benefit. The group with histologic CAM delivered at earlier gestational ages: mean (standard deviation) age was 29.5 (4.4) weeks vs 31.9 (3.5) weeks (P=.02). For our primary analysis, we found no difference in CRP levels (P=.32). Receiver operating characteristic curve plots of CRP levels, temperature at delivery, and white blood cell count resulted in an area under the curve estimate of 0.696, which was 70% predictive of histologic CAM. 
    In the secondary analysis, after adjusting for gestational age, the estimated hazard ratio for CRP change was 1.05 (95% confidence interval, 1.02-1.08; P=.001). Therefore, increasing CRP levels from PPROM was statistically significant in predicting clinical CAM development over time. C-reactive protein levels were not effective independent predictors of clinical or histologic CAM, nor was sequential CRP testing statistically significant for the identification of clinical or histologic CAM in patients with PPROM.

  18. New candidate markers of head and neck squamous cell carcinoma progression

    NASA Astrophysics Data System (ADS)

    Kakurina, G. V.; Kolegova, E. S.; Cheremisina, O. V.; Kulbakin, D. E.; Choinzonov, E. L.

    2017-09-01

    Tumor progression in head and neck squamous cell carcinoma (HNSCC) is one of the main causes of the high mortality of patients with HNSCC. Tumor progression, particularly metastasis, is characterized by changes in the composition, functions and structure of different proteins. We have previously shown that the serum of HNSCC patients contains proteins that regulate various cellular processes, such as adenylyl cyclase associated protein 1 (CAP1) and protein phosphatase 1 B (PPM1B). The levels of CAP1 and PPM1B were determined using enzyme immunoassay. The results of this study show that CAP1 and PPM1B take part in the progression of HNSCC. The levels of CAP1 and PPM1B in the tumor and in morphologically normal tissue depended on the extent of the tumor process. The CAP1 and PPM1B levels were significantly higher in the tumor tissue of patients with regional metastasis. Our data suggest that measuring the tissue level of CAP1 could help predict the outcome of HNSCC.

  19. A partial loss of function allele of Methyl-CpG-binding protein 2 predicts a human neurodevelopmental syndrome

    PubMed Central

    Samaco, Rodney C.; Fryer, John D.; Ren, Jun; Fyffe, Sharyl; Chao, Hsiao-Tuan; Sun, Yaling; Greer, John J.; Zoghbi, Huda Y.; Neul, Jeffrey L.

    2008-01-01

    Rett Syndrome, an X-linked dominant neurodevelopmental disorder characterized by regression of language and hand use, is primarily caused by mutations in methyl-CpG-binding protein 2 (MECP2). Loss of function mutations in MECP2 are also found in other neurodevelopmental disorders such as autism, Angelman-like syndrome and non-specific mental retardation. Furthermore, duplication of the MECP2 genomic region results in mental retardation with speech and social problems. The common features of human neurodevelopmental disorders caused by the loss or increase of MeCP2 function suggest that even modest alterations of MeCP2 protein levels result in neurodevelopmental problems. To determine whether a small reduction in MeCP2 level has phenotypic consequences, we characterized a conditional mouse allele of Mecp2 that expresses 50% of the wild-type level of MeCP2. Upon careful behavioral analysis, mice that harbor this allele display a spectrum of abnormalities such as learning and motor deficits, decreased anxiety, altered social behavior and nest building, decreased pain recognition and disrupted breathing patterns. These results indicate that precise control of MeCP2 is critical for normal behavior and predict that human neurodevelopmental disorders will result from a subtle reduction in MeCP2 expression. PMID:18321864

  20. High MRPS23 expression contributes to hepatocellular carcinoma proliferation and indicates poor survival outcomes.

    PubMed

    Pu, Meng; Wang, Jianlin; Huang, Qike; Zhao, Ge; Xia, Congcong; Shang, Runze; Zhang, Zhuochao; Bian, Zhenyuan; Yang, Xishegn; Tao, Kaishan

    2017-07-01

    Hepatocellular carcinoma is one of the most prevalent neoplasms and the leading cause of cancer-related mortality worldwide. Mitochondrial ribosomal protein S23 is encoded by a nuclear gene and participates in mitochondrial protein translation. Mitochondrial ribosomal protein S23 overexpression has been found in many types of cancer. In this study, we explored mitochondrial ribosomal protein S23 expression in primary hepatocellular carcinoma tissues compared with matched adjacent non-tumoral liver tissues using mitochondrial ribosomal protein S23 messenger RNA and protein levels collected from public databases and clinical samples. Immunohistochemistry was performed to analyze the relationship between mitochondrial ribosomal protein S23 and various clinicopathological features. The results indicated that mitochondrial ribosomal protein S23 was significantly overexpressed in hepatocellular carcinoma. High mitochondrial ribosomal protein S23 expression was correlated with the tumor size and tumor-metastasis-node stage. Moreover, patients with high mitochondrial ribosomal protein S23 expression levels presented poorer survival rates. Mitochondrial ribosomal protein S23 was an independent prognostic factor for survival, especially at the early stage of hepatocellular carcinoma. In addition, the downregulation of mitochondrial ribosomal protein S23 decreased the proliferation of hepatocellular carcinoma in vitro and in vivo. In conclusion, we verified for the first time that mitochondrial ribosomal protein S23 expression was upregulated in hepatocellular carcinoma. High mitochondrial ribosomal protein S23 levels can predict poor clinical outcomes in hepatocellular carcinoma, and this protein plays a key role in tumor proliferation. Therefore, mitochondrial ribosomal protein S23 may be a potential therapeutic target for hepatocellular carcinoma.

  1. Evaluating the validity of using unverified indices of body condition

    USGS Publications Warehouse

    Schamber, J.L.; Esler, Daniel N.; Flint, Paul L.

    2009-01-01

    Condition indices are commonly used in an attempt to link body condition of birds to ecological variables of interest, including demographic attributes such as survival and reproduction. Most indices are based on body mass adjusted for structural body size, calculated as simple ratios or residuals from regressions. However, condition indices are often applied without confirming their predictive value (i.e., without being validated against measured values of fat and protein), which we term ‘unverified’ use. We evaluated the ability of a number of unverified indices frequently found in the literature to predict absolute and proportional levels of fat and protein across five species of waterfowl. Among the indices we considered, those accounting for body size never predicted absolute protein more precisely than body mass; however, some indices improved predictability of fat, although the form of the best index varied by species. Further, the gain in precision by using a condition index to predict either absolute or percent fat was minimal (rise in r² ≤ 0.13), and in many cases model fit was actually reduced. Our data agree with previous assertions that the assumption that indices provide more precise indicators of body condition than body mass alone is often invalid. We strongly discourage the use of unverified indices, because subjectively selecting indices likely does little to improve precision and might in fact decrease predictability relative to using body mass alone.

  2. MicroRNA networks in mouse lung organogenesis.

    PubMed

    Dong, Jie; Jiang, Guoqian; Asmann, Yan W; Tomaszek, Sandra; Jen, Jin; Kislinger, Thomas; Wigle, Dennis A

    2010-05-26

    MicroRNAs (miRNAs) are known to be important regulators of both organ development and tumorigenesis. MiRNA networks and their regulation of messenger RNA (mRNA) translation and protein expression in specific biological processes are poorly understood. We explored the dynamic regulation of miRNAs in mouse lung organogenesis. Comprehensive miRNA and mRNA profiling was performed encompassing all recognized stages of lung development beginning at embryonic day 12 and continuing to adulthood. We analyzed the expression patterns of dynamically regulated miRNAs and mRNAs using a number of statistical and computational approaches, and in an integrated manner with protein levels from an existing mass-spectrometry derived protein database for lung development. In total, 117 statistically significant miRNAs were dynamically regulated during mouse lung organogenesis and clustered into distinct temporal expression patterns. 11,220 mRNA probes were also shown to be dynamically regulated and clustered into distinct temporal expression patterns, with 3 major patterns accounting for 75% of all probes. 3,067 direct miRNA-mRNA correlation pairs were identified involving 37 miRNAs. Two defined correlation patterns were observed upon integration with protein data: 1) increased levels of specific miRNAs directly correlating with downregulation of predicted mRNA targets; and 2) increased levels of specific miRNAs directly correlating with downregulation of translated target proteins without detectable changes in mRNA levels. Of 1345 proteins analyzed, 55% appeared to be regulated in this manner with a direct correlation between miRNA and protein level, but without detectable change in mRNA levels. Systematic analysis of microRNA, mRNA, and protein levels over the time course of lung organogenesis demonstrates dynamic regulation and reveals 2 distinct patterns of miRNA-mRNA interaction. 
    The translation of target proteins affected by miRNAs independent of changes in mRNA level appears to be a prominent mechanism of developmental regulation in lung organogenesis.

  3. Inter-kingdom prediction certainty evaluation of protein subcellular localization tools: microbial pathogenesis approach for deciphering host microbe interaction.

    PubMed

    Khan, Abdul Arif; Khan, Zakir; Kalam, Mohd Abul; Khan, Azmat Ali

    2018-01-01

    Microbial pathogenesis involves several aspects of host-pathogen interactions, including microbial proteins targeting host subcellular compartments and subsequent effects on host physiology. Such studies are supported by experimental data, but the use of computational eukaryotic subcellular targeting prediction tools to detect bacterial protein localization has recently also come into practice. We evaluated the inter-kingdom prediction certainty of these tools. Bacterial proteins experimentally known to target host subcellular compartments were predicted with eukaryotic subcellular targeting prediction tools, and prediction certainty was assessed. The results indicate that these tools alone are not sufficient for inter-kingdom protein targeting prediction. Correct prediction of a pathogen protein's subcellular targeting depends on several factors, including the presence of a localization signal, transmembrane domain and molecular weight, in addition to the approach used for subcellular targeting prediction. The detection of protein targeting in the endomembrane system is comparatively difficult, as the proteins in this location are channelized to different compartments. In addition, the high specificity of the training data set also lowers inter-kingdom prediction accuracy. The current data can help suggest a strategy for correct prediction of a bacterial protein's subcellular localization in the host cell. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  4. Evolutionary conservation of Ebola virus proteins predicts important functions at residue level.

    PubMed

    Arslan, Ahmed; van Noort, Vera

    2017-01-15

    The recent outbreak of Ebola virus disease (EVD) resulted in a large number of human deaths. Due to this devastation, the Ebola virus has attracted renewed interest as a model for virus evolution. Recent literature on Ebola virus (EBOV) has contributed substantially to our understanding of the underlying genetics and its scope with reference to the 2014 outbreak. However, no study has yet focused on the conservation patterns of EBOV proteins. We analyzed the evolution of functional regions of EBOV and highlight the function of conserved residues in protein activities. We apply an array of computational tools to dissect the functions of EBOV proteins in detail: (i) protein sequence conservation, (ii) protein-protein interactome analysis, (iii) structural modeling and (iv) kinase prediction. Our results suggest the presence of novel post-translational modifications in EBOV proteins and their role in the modulation of protein functions and protein interactions. Moreover, on the basis of the presence of ATM recognition motifs in all EBOV proteins we postulate a role of DNA damage response pathways and ATM kinase in EVD. The ATM kinase is put forward, for further evaluation, as a novel potential therapeutic target. http://www.biw.kuleuven.be/CSB/EBOV-PTMs Contact: vera.vannoort@biw.kuleuven.be Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  5. Mapping monomeric threading to protein-protein structure prediction.

    PubMed

    Guerler, Aysam; Govindarajoo, Brandon; Zhang, Yang

    2013-03-25

    The key step of template-based protein-protein structure prediction is the recognition of complexes from experimental structure libraries that have a similar quaternary fold. Maintaining two monomer and dimer structure libraries is however laborious, and inappropriate library construction can degrade template recognition coverage. We propose a novel strategy, SPRING, to identify complexes by mapping monomeric threading alignments to protein-protein interactions based on the original oligomer entries in the PDB, which does not rely on library construction and increases the efficiency and quality of complex template recognition. SPRING is tested on 1838 nonhomologous protein complexes, for which it recognizes correct quaternary template structures with a TM score >0.5 in 1115 cases after excluding homologous proteins. The average TM score of the first model is 60% and 17% higher than that by HHsearch and COTH, respectively, while the number of targets with an interface RMSD <2.5 Å by SPRING is 134% and 167% higher than these competing methods. SPRING is controlled with ZDOCK on 77 docking benchmark proteins. Although the relative performance of SPRING and ZDOCK depends on the level of homology filters, a combination of the two methods can result in a significantly higher model quality than ZDOCK at all homology thresholds. These data demonstrate a new efficient approach to quaternary structure recognition that is ready to use for genome-scale modeling of protein-protein interactions due to the high speed and accuracy.
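
    For readers unfamiliar with the TM score used as the 0.5 threshold above, the following is a minimal sketch of the standard TM-score sum for one fixed superposition. It assumes the per-residue distances between aligned residue pairs are already known; the full score also maximizes over superpositions, which is not shown here. The function name `tm_score` and the 0.5 Å floor on d0 are assumptions of this sketch, not details taken from the abstract.

```python
def tm_score(distances, target_length):
    """TM-score sum for one fixed superposition.

    distances: distances (in angstroms) between aligned residue pairs.
    target_length: number of residues in the target structure.
    """
    if target_length <= 15:
        raise ValueError("d0 is undefined for target_length <= 15")
    # Length-dependent distance scale; the 0.5 floor is an assumption of this sketch.
    d0 = max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)
    # Each aligned pair contributes between 0 and 1; unaligned residues contribute 0.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical example: half of a 100-residue target aligned perfectly.
print(tm_score([0.0] * 50, 100))  # prints 0.5
```

    Because the sum is divided by the target length rather than the number of aligned pairs, a partial alignment cannot reach a score of 1 even if every aligned pair superimposes exactly.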

  6. Elucidating the mechanism of protein water channels by molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Grubmuller, Helmut

    2004-03-01

    Aquaporins are highly selective water channels. Molecular dynamics simulations of multiple water permeation events correctly predict the measured rate and explain at the atomic level why these membrane channels are so efficient, while blocking other small molecules, ions, and even protons. High efficiency is achieved through a carefully tailored balance of hydrogen bonds that the protein substitutes for the bulk interactions; selectivity is achieved mainly by electrostatic barriers.

  7. Use of Biodescriptors and Chemodescriptors in Predictive Toxicology: A Mathematical/Computational Approach

    DTIC Science & Technology

    2005-01-01

    proteomic gel analyses. The research group has explored the use of chemodescriptors calculated using high-level ab initio quantum chemical basis sets...descriptors that characterize the entire proteomics map, local descriptors that characterize a subset of the proteins present in the gel, and spectrum...techniques for analyzing the full set of proteins present in a proteomics map. Subject terms: Topological indices

  8. Tula and Puumala hantavirus NSs ORFs are functional and the products inhibit activation of the interferon-beta promoter.

    PubMed

    Jääskeläinen, Kirsi M; Kaukinen, Pasi; Minskaya, Ekaterina S; Plyusnina, Angelina; Vapalahti, Olli; Elliott, Richard M; Weber, Friedemann; Vaheri, Antti; Plyusnin, Alexander

    2007-10-01

    The S RNA genome segment of hantaviruses carried by Arvicolinae and Sigmodontinae rodents encodes the nucleocapsid (N) protein and has an overlapping (+1) open reading frame (ORF) for a putative nonstructural protein (NSs). The aim of this study was to determine whether the ORF is functional. A protein corresponding to the predicted size of Tula virus (TULV) NSs was detected using coupled in vitro transcription and translation from a cloned S segment cDNA, and a protein corresponding to the predicted size of Puumala virus (PUUV) NSs was detected in infected cells by Western blotting with an anti-peptide serum. The activities of the interferon beta (IFN-beta) promoter, and nuclear factor kappa B (NF-kappaB)- and interferon regulatory factor-3 (IRF-3) responsive promoters, were inhibited in COS-7 cells transiently expressing TULV or PUUV NSs. Also IFN-beta mRNA levels in IFN-competent MRC5 cells either infected with TULV or transiently expressing NSs were decreased. These data demonstrate that Tula and Puumala hantaviruses have a functional NSs ORF. The findings may explain why the NSs ORF has been preserved in the genome of most hantaviruses during their long evolution and why hantavirus-infected cells secrete relatively low levels of IFNs. (c) 2007 Wiley-Liss, Inc.

  9. In silico re-identification of properties of drug target proteins.

    PubMed

    Kim, Baeksoo; Jo, Jihoon; Han, Jonghyun; Park, Chungoo; Lee, Hyunju

    2017-05-31

    Computational approaches in the identification of drug targets are expected to reduce time and effort in drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of druggable genomes. Although several studies have been conducted for distinguishing drug targets from non-drug targets, they mainly focus on the sequences and functional roles of proteins. Many other properties of proteins have not been fully investigated. Using the DrugBank (version 3.0) database containing 6,816 drug entries including 760 FDA-approved drugs and 1822 of their targets and human UniProt/Swiss-Prot databases, we defined 1578 non-redundant drug target and 17,575 non-drug target proteins. To select these non-redundant protein datasets, we built four datasets (A, B, C, and D) by considering clustering of paralogous proteins. We first reassessed the widely used properties of drug target proteins. We confirmed and extended the findings that drug target proteins (1) are likely to have more hydrophobic, less polar, fewer PEST sequences, and more signal peptide sequences and (2) are more involved in enzyme catalysis, oxidation and reduction in cellular respiration, and operational genes. In this study, we proposed new properties (essentiality, expression pattern, PTMs, and solvent accessibility) for effectively identifying drug target proteins. We found that (1) drug targetability and protein essentiality are decoupled, (2) druggable proteins have high expression levels and tissue specificity, and (3) functional post-translational modification residues are enriched in drug target proteins. In addition, to predict the drug targetability of proteins, we exploited two machine learning methods (Support Vector Machine and Random Forest). When we predicted drug targets by combining previously known protein properties and proposed new properties, an F-score of 0.8307 was obtained.
    When the newly proposed properties are integrated, the prediction performance is improved and these properties are related to drug targets. We believe that our study will provide a new aspect in inferring drug-target interactions.
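
    The abstract reports an F-score without saying which variant was used. As a sketch only, assuming the common balanced F-score (F1, the harmonic mean of precision and recall), it can be computed from confusion-matrix counts as below. The counts in the example are hypothetical and are not taken from the study.

```python
def f_score(tp, fp, fn):
    """Balanced F-score (F1) from true positives, false positives and false negatives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)  # fraction of predicted targets that are real targets
    recall = tp / (tp + fn)     # fraction of real targets that were predicted
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only.
print(round(f_score(tp=8, fp=2, fn=2), 4))  # prints 0.8
```

    With 1578 drug targets against 17,575 non-targets, the classes are strongly imbalanced, which is one reason an F-score is more informative here than plain accuracy.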

  10. Large-scale binding ligand prediction by improved patch-based method Patch-Surfer2.0

    PubMed Central

    Zhu, Xiaolei; Xiong, Yi; Kihara, Daisuke

    2015-01-01

    Motivation: Ligand binding is a key aspect of the function of many proteins. Thus, binding ligand prediction provides important insight in understanding the biological function of proteins. Binding ligand prediction is also useful for drug design and examining potential drug side effects. Results: We present a computational method named Patch-Surfer2.0, which predicts binding ligands for a protein pocket. By representing and comparing pockets at the level of small local surface patches that characterize physicochemical properties of the local regions, the method can identify binding pockets of the same ligand even if they do not share globally similar shapes. Properties of local patches are represented by an efficient mathematical representation, 3D Zernike Descriptor. Patch-Surfer2.0 has significant technical improvements over our previous prototype, which includes a new feature that captures approximate patch position with a geodesic distance histogram. Moreover, we constructed a large comprehensive database of ligand binding pockets that will be searched against by a query. The benchmark shows better performance of Patch-Surfer2.0 over existing methods. Availability and implementation: http://kiharalab.org/patchsurfer2.0/ Contact: dkihara@purdue.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25359888

  11. Serum surfactant protein D predicts the outcome of patients with idiopathic pulmonary fibrosis treated with pirfenidone.

    PubMed

    Ikeda, Kimiyuki; Shiratori, Masanori; Chiba, Hirofumi; Nishikiori, Hirotaka; Yokoo, Keiki; Saito, Atsushi; Hasegawa, Yoshihiro; Kuronuma, Koji; Otsuka, Mitsuo; Yamada, Gen; Takahashi, Hiroki

    2017-10-01

    Idiopathic pulmonary fibrosis (IPF) is a fatal pulmonary disease with poor prognosis. Pirfenidone, the first antifibrotic drug, suppresses the decline in forced vital capacity (FVC) and improves prognosis in some, but not all, patients with IPF; therefore, an indicator for identifying improved outcomes in pirfenidone therapy is desirable. This study aims to clarify whether baseline parameters can be predictors of disease progression and prognosis in patients with IPF treated with pirfenidone. We retrospectively investigated patients with IPF who started treatment with pirfenidone between December 2008 and November 2014 at the Sapporo Medical University Hospital. Patients treated with pirfenidone for ≥6 months were enrolled in this study and were observed until November 2015. We investigated the association of clinical characteristics, pulmonary function test results, and blood examination results at the start of pirfenidone with the outcome of patients. Sixty patients were included in this study. In multivariate logistic regression analysis, % predicted FVC and serum surfactant protein (SP)-D levels were predictors of a ≥10% decline in FVC in the initial 12 months. In the Cox proportional hazards model, these two factors predicted progression-free survival. Pack-years, % predicted diffusing capacity for carbon monoxide, and SP-D levels predicted overall survival. The serum SP-D level was a predictor of disease progression and prognosis in patients with IPF treated with pirfenidone. In addition, this analysis describes the relative usefulness of other clinical parameters at baseline in estimating the prognosis of patients with IPF who are candidates for pirfenidone therapy. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Elevation of Plasmin-α2-plasmin Inhibitor Complex Predicts the Diagnosis of Systemic AL Amyloidosis in Patients with Monoclonal Protein.

    PubMed

    Ishiguro, Kazuya; Hayashi, Toshiaki; Yokoyama, Yoshihiro; Aoki, Yuka; Onodera, Kei; Ikeda, Hiroshi; Ishida, Tadao; Nakase, Hiroshi

    2018-03-15

    Objective The complication of systemic immunoglobulin light chain (AL) amyloidosis in patients with monoclonal immunoglobulin affects the prognosis, but amyloid deposition in tissues is sometimes difficult to detect due to bleeding tendencies and preferential distributions. However, fibrinolysis is known to be exacerbated in patients with systemic AL amyloidosis specifically. We therefore explored new biomarkers for predicting a diagnosis of systemic AL amyloidosis focusing on coagulation and fibrinolysis markers. Methods We reviewed the clinical features and treatment outcomes of patients with serum monoclonal protein, including primary systemic AL amyloidosis and multiple myeloma (MM), treated at our hospital between January 2008 and December 2014. Results Among several biomarkers, only the serum level of plasmin-α2-plasmin inhibitor complex (PIC) in patients with systemic AL amyloidosis (n=26) at the diagnosis was significantly higher than in patients with MM without AL amyloidosis (n=26) (mean±standard deviation, 3.69±2.82 μg/mL vs. 1.23±0.97 μg/mL, p<0.01). The cut-off for predicting a diagnosis of systemic AL amyloidosis in patients with serum monoclonal protein was 1.72 μg/mL with 84.6% sensitivity and 80.8% specificity. Hepatic involvement resulted in a significantly higher PIC level than no involvement in patients with systemic AL amyloidosis. The serum PIC level was also associated with the hematological response of systemic AL amyloidosis. Conclusion PIC is a useful biomarker for the diagnosis and management of patients with systemic AL amyloidosis.
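
    A minimal sketch of how a cut-off such as the 1.72 μg/mL PIC threshold above translates into sensitivity and specificity. The PIC values in the example are hypothetical and are not data from the study. Treating a value at or above the cut-off as positive is also an assumption, since the abstract does not state the comparison rule. The reported 84.6% and 80.8% are arithmetically consistent with 22 of 26 and 21 of 26 patients, although the abstract does not give those counts.

```python
def sensitivity_specificity(diseased, controls, cutoff):
    """Return (sensitivity, specificity) for a 'value >= cutoff is positive' rule.

    diseased: marker values from patients who have the condition
    controls: marker values from patients who do not
    """
    true_pos = sum(v >= cutoff for v in diseased)
    false_neg = len(diseased) - true_pos
    true_neg = sum(v < cutoff for v in controls)
    false_pos = len(controls) - true_neg
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical PIC values in ug/mL, for illustration only.
amyloidosis_pic = [0.9, 1.8, 2.5, 3.7, 6.1]
myeloma_pic = [0.4, 0.8, 1.1, 1.5, 2.0]

sens, spec = sensitivity_specificity(amyloidosis_pic, myeloma_pic, cutoff=1.72)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")

# Consistency check on the reported percentages (the counts are inferred, not reported).
print(round(22 / 26 * 100, 1), round(21 / 26 * 100, 1))
```

    Moving the cut-off trades one quantity against the other, which is why a single threshold is normally reported together with both figures.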

  13. HomPPI: a class of sequence homology based protein-protein interface prediction methods

    PubMed Central

    2011-01-01

    Background Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. 
    The partner-specific classifier, PS-HomPPI can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of both the query and the target can be reliably identified. The HomPPI web server is available at http://homppi.cs.iastate.edu/. Conclusions Sequence homology-based methods offer a class of computationally efficient and reliable approaches for predicting the protein-protein interface residues that participate in either obligate or transient interactions. For query proteins involved in transient interactions, the reliability of interface residue prediction can be improved by exploiting knowledge of putative interaction partners. PMID:21682895

  14. Incorporating information on predicted solvent accessibility to the co-evolution-based study of protein interactions.

    PubMed

    Ochoa, David; García-Gutiérrez, Ponciano; Juan, David; Valencia, Alfonso; Pazos, Florencio

    2013-01-27

    A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility to three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it neither improves these methods, nor the original mirrortree method, in predicting other types of interactions. That improvement comes at no cost in terms of applicability since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.
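
    The idea described above can be sketched in a few lines, under loud assumptions. Mirrortree-style methods score co-evolution as the linear correlation between the inter-species distances of two protein families. Here the distances are simple fractions of differing alignment positions rather than distances taken from phylogenetic trees, and the accessibility flags are supplied by hand rather than predicted. All function names, sequences, and flags are hypothetical illustrations, not the authors' pipeline.

```python
import math


def filter_columns(alignment, keep):
    """Keep only the alignment columns whose flag in `keep` is True."""
    return {name: "".join(res for res, k in zip(seq, keep) if k)
            for name, seq in alignment.items()}


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def distance_vector(alignment, species):
    """Pairwise distances for every species pair, in a fixed order."""
    return [p_distance(alignment[s], alignment[t])
            for i, s in enumerate(species) for t in species[i + 1:]]


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def coevolution_score(aln_a, aln_b, species, keep_a, keep_b):
    """Correlate the distance vectors of two families after column filtering."""
    vec_a = distance_vector(filter_columns(aln_a, keep_a), species)
    vec_b = distance_vector(filter_columns(aln_b, keep_b), species)
    return pearson(vec_a, vec_b)


# Toy alignments for two protein families across the same four species.
species = ["sp1", "sp2", "sp3", "sp4"]
family_a = {"sp1": "AAAAAA", "sp2": "AAAAAT", "sp3": "AATTAT", "sp4": "TTTTAT"}
family_b = {"sp1": "GGGGGG", "sp2": "GGGGCG", "sp3": "GCCGCG", "sp4": "CCCGCC"}
exposed_a = [True, True, False, True, False, True]   # hand-set accessibility flags
exposed_b = [True, False, True, True, True, False]

print(coevolution_score(family_a, family_b, species, exposed_a, exposed_b))
```

    Passing all-True flags gives the unfiltered score, so the effect of restricting the comparison to accessible positions can be examined by comparing the two results.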

  15. Cloud prediction of protein structure and function with PredictProtein for Debian.

    PubMed

    Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Staniewski, Cedric; Rost, Burkhard

    2013-01-01

    We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome.

  16. Cloud Prediction of Protein Structure and Function with PredictProtein for Debian

    PubMed Central

    Kaján, László; Yachdav, Guy; Vicedo, Esmeralda; Steinegger, Martin; Mirdita, Milot; Angermüller, Christof; Böhm, Ariane; Domke, Simon; Ertl, Julia; Mertes, Christian; Reisinger, Eva; Rost, Burkhard

    2013-01-01

    We report the release of PredictProtein for the Debian operating system and derivatives, such as Ubuntu, Bio-Linux, and Cloud BioLinux. The PredictProtein suite is available as a standard set of open source Debian packages. The release covers the most popular prediction methods from the Rost Lab, including methods for the prediction of secondary structure and solvent accessibility (profphd), nuclear localization signals (predictnls), and intrinsically disordered regions (norsnet). We also present two case studies that successfully utilize PredictProtein packages for high performance computing in the cloud: the first analyzes protein disorder for whole organisms, and the second analyzes the effect of all possible single sequence variants in protein coding regions of the human genome. PMID:23971032

  17. Predicting motion sickness during parabolic flight

    NASA Technical Reports Server (NTRS)

    Harm, Deborah L.; Schlegel, Todd T.

    2002-01-01

    BACKGROUND: There are large individual differences in susceptibility to motion sickness. Attempts to predict who will become motion sick have had limited success. In the present study, we examined gender differences in resting levels of salivary amylase and total protein, cardiac interbeat intervals (R-R intervals), and a sympathovagal index and evaluated their potential to correctly classify individuals into two motion sickness severity groups. METHODS: Sixteen subjects (10 men and 6 women) flew four sets of 10 parabolas aboard NASA's KC-135 aircraft. Saliva samples for amylase and total protein were collected preflight on the day of the flight and motion sickness symptoms were recorded during each parabola. Cardiovascular parameters were collected in the supine position 1-5 days before the flight. RESULTS: There were no significant gender differences in sickness severity or any of the other variables mentioned above. Discriminant analysis using salivary amylase, R-R intervals and the sympathovagal index produced a significant Wilks' lambda coefficient of 0.36, p=0.006. The analysis correctly classified 87% of the subjects into the none-mild sickness or the moderate-severe sickness group. CONCLUSIONS: The linear combination of resting levels of salivary amylase, high-frequency R-R interval levels, and a sympathovagal index may be useful in predicting motion sickness severity.

  18. Predicting Motion Sickness During Parabolic Flight

    NASA Technical Reports Server (NTRS)

    Harm, Deborah L.; Schlegel, Todd T.

    2002-01-01

    Background: There are large individual differences in susceptibility to motion sickness. Attempts to predict who will become motion sick have had limited success. In the present study we examined gender differences in resting levels of salivary amylase and total protein, cardiac interbeat intervals (R-R intervals), and a sympathovagal index and evaluated their potential to correctly classify individuals into two motion sickness severity groups. Methods: Sixteen subjects (10 men and 6 women) flew 4 sets of 10 parabolas aboard NASA's KC-135 aircraft. Saliva samples for amylase and total protein were collected preflight on the day of the flight and motion sickness symptoms were recorded during each parabola. Cardiovascular parameters were collected in the supine position 1-5 days prior to the flight. Results: There were no significant gender differences in sickness severity or any of the other variables mentioned above. Discriminant analysis using salivary amylase, R-R intervals and the sympathovagal index produced a significant Wilks' lambda coefficient of 0.36, p= 0.006. The analysis correctly classified 87% of the subjects into the none-mild sickness or the moderate-severe sickness group. Conclusions: The linear combination of resting levels of salivary amylase, high frequency R-R interval levels, and a sympathovagal index may be useful in predicting motion sickness severity.

  19. Modeling complexes of modeled proteins.

    PubMed

    Anishchenko, Ivan; Kundrotas, Petras J; Vakser, Ilya A

    2017-03-01

    Structural characterization of proteins is essential for understanding life processes at the molecular level. However, only a fraction of known proteins have experimentally determined structures. This fraction is even smaller for protein-protein complexes. Thus, structural modeling of protein-protein interactions (docking) primarily has to rely on modeled structures of the individual proteins, which typically are less accurate than the experimentally determined ones. Such "double" modeling is the Grand Challenge of structural reconstruction of the interactome. Yet it remains so far largely untested in a systematic way. We present a comprehensive validation of template-based and free docking on a set of 165 complexes, where each protein model has six levels of structural accuracy, from 1 to 6 Å Cα RMSD. Many template-based docking predictions fall into the acceptable quality category, according to the CAPRI criteria, even for highly inaccurate proteins (5-6 Å RMSD), although the number of such models (and, consequently, the docking success rate) drops significantly for models with RMSD > 4 Å. The results show that the existing docking methodologies can be successfully applied to protein models with a broad range of structural accuracy, and the template-based docking is much less sensitive to inaccuracies of protein models than the free docking. Proteins 2017; 85:470-478. © 2016 Wiley Periodicals, Inc.

  20. Investigating the clinical potential for 14-3-3 zeta protein to serve as a biomarker for epithelial ovarian cancer

    PubMed Central

    2013-01-01

    Objective Recently, 14-3-3 zeta protein was identified as a potential serum biomarker of epithelial ovarian cancer (EOC). The goal of this study was to investigate the clinical potential of 14-3-3 zeta protein for monitoring EOC progression compared with CA-125 and HE4. Design Prospective follow-up study. Setting University of Pecs Medical Center Department of Obstetrics and Gynecology/Oncology (Pecs, Hungary). Population Thirteen EOC patients with advanced stage (FIGO IIb-IIIc) epithelial ovarian cancer that underwent radical surgery and received six consecutive cycles of first line chemotherapy (paclitaxel, carboplatin) in 21-day intervals. Methods Pre- and post-chemotherapy computed tomography (CT) scans were performed. Serum levels of CA-125, HE4, and 14-3-3 zeta protein were detected by enzyme-linked immunosorbent assay (ELISA) and quantitative electrochemiluminescence assay (ECLIA). Main outcome measures Serum levels of CA-125, HE4, and 14-3-3 zeta protein, as well as lesion size according to pre- and post-chemotherapy CT scans. Results Serum levels of CA-125 and HE4 were found to significantly decrease following chemotherapy, and this was consistent with the decrease in lesion size detected post-chemotherapy. In contrast, 14-3-3 zeta protein levels did not significantly differ in healthy postmenopausal patients versus EOC patients. Conclusions Determination of CA-125 and HE4 serum levels for the determination of the risk of ovarian malignancy algorithm (ROMA) represents a useful tool for the prediction of chemotherapy efficacy for EOC patients. However, levels of 14-3-3 zeta protein were not found to vary significantly as a consequence of treatment. Therefore we question if 14-3-3 zeta protein is a reliable biomarker, which correlates with the clinical behavior of EOC. PMID:24238270

  1. Interspecific- and acclimation-induced variation in levels of heat-shock proteins 70 (hsp70) and 90 (hsp90) and heat-shock transcription factor-1 (HSF1) in congeneric marine snails (genus Tegula): implications for regulation of hsp gene expression.

    PubMed

    Tomanek, Lars; Somero, George N

    2002-03-01

    In our previous studies of heat-shock protein (hsp) expression in congeneric marine gastropods of the genus Tegula, we observed interspecific and acclimation-induced variation in the temperatures at which heat-shock gene expression is induced (T(on)). To investigate the factors responsible for these inter- and intraspecific differences in T(on), we tested the predictions of the 'cellular thermometer' model for the transcriptional regulation of hsp expression. According to this model, hsps not active in chaperoning unfolded proteins bind to a transcription factor, heat-shock factor-1 (HSF1), thereby reducing the levels of free HSF1 that are available to bind to the heat-shock element, a regulatory element upstream of hsp genes. Under stress, hsps bind to denatured proteins, releasing HSF1, which can now activate hsp gene transcription. Thus, elevated levels of heat-shock proteins of the 40, 70 and 90 kDa families (hsp 40, hsp70 and hsp90, respectively) would be predicted to elevate T(on). Conversely, elevated levels of HSF1 would be predicted to decrease T(on). Following laboratory acclimation to 13, 18 and 23 degrees C, we used solid-phase immunochemistry (western analysis) to quantify endogenous levels of two hsp70 isoforms (hsp74 and hsp72), hsp90 and HSF1 in the low- to mid-intertidal species Tegula funebralis and in two subtidal to low-intertidal congeners, T. brunnea and T. montereyi. We found higher endogenous levels of hsp72 (a strongly heat-induced isoform) at 13 and 18 degrees C in T. funebralis in comparison with T. brunnea and T. montereyi. However, T. funebralis also had higher levels of HSF1 than its congeners. The higher levels of HSF1 in T. funebralis cannot, within the framework of the cellular thermometer model, account for the higher T(on) observed for this species, although they may explain why T. funebralis is able to induce the heat-shock response more rapidly than T. brunnea. 
    However, the cellular thermometer model does appear to explain the cause of the increases in T(on) that occurred during warm acclimation of the two subtidal species, in which warm acclimation was accompanied by increased levels of hsp72, hsp74 and hsp90, whereas levels of HSF1 remained stable. T. funebralis, which experiences greater heat stress than its subtidal congeners, consistently had higher ratios of hsp72 to hsp74 than its congeners, although the sum of levels of the two isoforms was similar for all three species except at the highest acclimation temperature (23 degrees C). The ratio of hsp72 to hsp74 may provide a more accurate estimate of environmental heat stress than the total concentrations of both hsp70 isoforms.

  2. C-reactive protein, waist circumference, and family history of heart attack are independent predictors of body iron stores in apparently healthy premenopausal women.

    PubMed

    Suárez-Ortegón, M F; Arbeláez, A; Mosquera, M; Méndez, F; Aguilar-de Plata, C

    2012-08-01

    Ferritin levels have been associated with metabolic syndrome and insulin resistance. The aim of the present study was to evaluate the prediction of ferritin levels by variables related to cardiometabolic disease risk in a multivariate analysis. For this aim, 123 healthy women (72 premenopausal and 51 postmenopausal) were recruited. Data were collected through anthropometric measurements, questionnaires on personal and family history and dietary intake (24-h recall), and biochemical determinations (ferritin, C-reactive protein (CRP), glucose, insulin, and lipid profile) in blood serum samples. Multiple linear regression analysis was used, and variables that were not normally distributed were log-transformed for this analysis. In premenopausal women, a model to explain log-ferritin levels was found with log-CRP levels, family history of heart attack, and waist circumference as independent predictors. Ferritin behaves like other cardiovascular markers in that its levels are predicted by documented predictors of cardiometabolic disease and related disorders. This is the first report of a relationship between family history of heart attack and ferritin levels. Further research is required to evaluate the mechanism that explains the relationship of central body fat and family history of heart attack with body iron stores.

  3. Predictive Biomarkers of Radiation Sensitivity in Rectal Cancer

    NASA Astrophysics Data System (ADS)

    Tut, Thein Ga

    Colorectal cancer (CRC) is the third most common cancer in the world. Australia, New Zealand, Canada, the United States, and parts of Europe have the highest incidence rates of CRC. China, India, South America and parts of Africa have the lowest risk of CRC. CRC is the second most common cancer in both sexes in Australia. Even though the death rates from CRC involving the colon have diminished, those for cancers arising in the rectum have shown no improvement. The greatest obstacle in attaining a complete surgical resection of large rectal cancers is the close anatomical relation to surrounding structures, as opposed to the free serosal surfaces enfolding the colon. To assist complete resection, pre-operative radiotherapy (DXT) can be applied, but the efficacy of ionising radiation (IR) is extremely variable between individual tumours. Reliable predictive markers that enable patient stratification in the application of this otherwise toxic therapy are still not available. Current therapeutic management of rectal cancer can be improved with the availability of better predictive and prognostic biomarkers. Proteins such as Plk1, gammaH2AX and MMR proteins (MSH2, MSH6, MLH1 and PMS2), involved in the DNA damage response (DDR) pathway, may be possible biomarkers for radiation response prediction and prognostication of rectal cancer. Serine/threonine protein kinase Plk1 is overexpressed in most cancers, including CRC. Plk1 functional activity is essential in the repair of DNA damage following IR, which causes DNA double-strand breaks (DSBs). The earliest manifestation of this reparative process is histone H2AX phosphorylation at serine 139, leading to gammaH2AX. Normal colorectal mucosa showed the lowest level of gammaH2AX, with gradually increasing levels in early adenoma and then in advanced malignant colorectal tissues, leading to the possibility that gammaH2AX may be a prospective biomarker in rectal cancer management.
    There are numerous publications regarding DNA mismatch repair (MMR) proteins, the insufficiency of which is characteristic of CRCs with microsatellite instability (MSI). MSI may enable unlimited replicative potential of malignant cells, leading to resistance to radiation injury. Therefore, these proteins were characterized in both CRC cell lines (MMR proteins) and different (core and invasive front) rectal cancer tissues (Plk1, gammaH2AX and MMR proteins) exposed to radiation. Histopathological grading of tumour regression was performed following radiotherapy in rectal cancer as a marker of radiotherapy response and a surrogate indicator of patient prognosis. Though MMR protein expression correlated with improved in vitro cell survival following radiation, these findings could only be partially replicated in patient tissue samples. This may not be entirely unexpected, given intratumoural heterogeneity in genetic profiles and oxygenation between individual cancer cells, their interaction with the stromal environment and a multitude of other factors that cannot be adequately replicated in cell line experiments. In our rectal cancer patient cohort, histopathological regression following radiotherapy did appear to correlate with better clinical outcome, but it is certainly no replacement for the routine pTNM staging with which it was compared. Overexpression of Plk1 in the primary rectal cancer also correlates with poor tumour regression and reduced overall survival. A high level of gammaH2AX correlates with higher tumour stage, perineural invasion and vascular invasion. However, interpretation of the results is limited by the small number of positive cases in the cohort with respect to gammaH2AX and MMR proteins. The combined analysis of all the proteins examined in this thesis revealed no interactions, possibly suggesting these biomarkers act individually within the DDR pathway, rather than in a demonstrably interdependent manner.
    Though our results are mixed, finding biomarkers predictive of radiation response is nonetheless critical. Enhancing the radiosensitivity of cancers through manipulating the functional activity and/or expression of prospective biomarkers could conceivably enhance tumour response to the level that the extent of consequent surgical resection can be minimized.

  4. Assessment of Protein Side-Chain Conformation Prediction Methods in Different Residue Environments

    PubMed Central

    Peterson, Lenna X.; Kang, Xuejiao; Kihara, Daisuke

    2016-01-01

    Computational prediction of side-chain conformation is an important component of protein structure prediction. Accurate side-chain prediction is crucial for practical applications of protein structure models that need atomic detailed resolution such as protein and ligand design. We evaluated the accuracy of eight side-chain prediction methods in reproducing the side-chain conformations of experimentally solved structures deposited to the Protein Data Bank. Prediction accuracy was evaluated for a total of four different structural environments (buried, surface, interface, and membrane-spanning) in three different protein types (monomeric, multimeric, and membrane). Overall, the highest accuracy was observed for buried residues in monomeric and multimeric proteins. Notably, side-chains at protein interfaces and membrane-spanning regions were better predicted than surface residues even though the methods did not all use multimeric and membrane proteins for training. Thus, we conclude that the current methods are as practically useful for modeling protein docking interfaces and membrane-spanning regions as for modeling monomers. PMID:24619909

  5. Microscale to Manufacturing Scale-up of Cell-Free Cytokine Production—A New Approach for Shortening Protein Production Development Timelines

    PubMed Central

    Zawada, James F; Yin, Gang; Steiner, Alexander R; Yang, Junhao; Naresh, Alpana; Roy, Sushmita M; Gold, Daniel S; Heinsohn, Henry G; Murray, Christopher J

    2011-01-01

    Engineering robust protein production and purification of correctly folded biotherapeutic proteins in cell-based systems is often challenging due to the requirements for maintaining complex cellular networks for cell viability and the need to develop associated downstream processes that reproducibly yield biopharmaceutical products with high product quality. Here, we present an alternative Escherichia coli-based open cell-free synthesis (OCFS) system that is optimized for predictable high-yield protein synthesis and folding at any scale with straightforward downstream purification processes. We describe how the linear scalability of OCFS allows rapid process optimization of parameters affecting extract activation, gene sequence optimization, and redox folding conditions for disulfide bond formation at microliter scales. Efficient and predictable high-level protein production can then be achieved using batch processes in standard bioreactors. We show how a fully bioactive protein produced by OCFS from optimized frozen extract can be purified directly using a streamlined purification process that yields a biologically active cytokine, human granulocyte-macrophage colony-stimulating factor, produced at titers of 700 mg/L in 10 h. These results represent a milestone for in vitro protein synthesis, with potential for the cGMP production of disulfide-bonded biotherapeutic proteins. Biotechnol. Bioeng. 2011; 108:1570–1578. © 2011 Wiley Periodicals, Inc. PMID:21337337

  6. A Quantitative Spatial Proteomics Analysis of Proteome Turnover in Human Cells*

    PubMed Central

    Boisvert, François-Michel; Ahmad, Yasmeen; Gierliński, Marek; Charrière, Fabien; Lamont, Douglas; Scott, Michelle; Barton, Geoff; Lamond, Angus I.

    2012-01-01

    Measuring the properties of endogenous cell proteins, such as expression level, subcellular localization, and turnover rates, on a whole proteome level remains a major challenge in the postgenome era. Quantitative methods for measuring mRNA expression do not reliably predict corresponding protein levels and provide little or no information on other protein properties. Here we describe a combined pulse-labeling, spatial proteomics and data analysis strategy to characterize the expression, localization, synthesis, degradation, and turnover rates of endogenously expressed, untagged human proteins in different subcellular compartments. Using quantitative mass spectrometry and stable isotope labeling with amino acids in cell culture, a total of 80,098 peptides from 8,041 HeLa proteins were quantified, and their spatial distribution between the cytoplasm, nucleus and nucleolus determined and visualized using specialized software tools developed in PepTracker. Using information from ion intensities and rates of change in isotope ratios, protein abundance levels and protein synthesis, degradation and turnover rates were calculated for the whole cell and for the respective cytoplasmic, nuclear, and nucleolar compartments. Expression levels of endogenous HeLa proteins varied by up to seven orders of magnitude. The average turnover rate for HeLa proteins was ∼20 h. Turnover rate did not correlate with either molecular weight or net charge, but did correlate with abundance, with highly abundant proteins showing longer than average half-lives. Fast turnover proteins had overall a higher frequency of PEST motifs than slow turnover proteins but no general correlation was observed between amino or carboxyl terminal amino acid identities and turnover rates. A subset of proteins was identified that exist in pools with different turnover rates depending on their subcellular localization. 
    This strongly correlated with subunits of large, multiprotein complexes, suggesting a general mechanism whereby their assembly is controlled in a different subcellular location to their main site of function. PMID:21937730

  7. Differential correlations between changes to glutathione redox state, protein ubiquitination, and stress-inducible HSPA chaperone expression after different types of oxidative stress.

    PubMed

    Girard, Pierre-Marie; Peynot, Nathalie; Lelièvre, Jean-Marc

    2018-05-12

    In primary bovine fibroblasts with an hspa1b/luciferase transgene, we examined the intensity of heat-shock response (HSR) following four types of oxidative stress or heat stress (HS), and its putative relationship with changes to different cell parameters, including reactive oxygen species (ROS), the redox status of the key molecules glutathione (GSH), NADP(H), and NAD(H), and the post-translational protein modifications carbonylation, S-glutathionylation, and ubiquitination. We determined the sub-lethal condition generating the maximal luciferase activity and inducible HSPA protein level for treatments with hydrogen peroxide (H2O2), UVA-induced oxygen photo-activation, the superoxide-generating agent menadione (MN), and diamide (DA), an electrophilic and sulfhydryl reagent. The level of HSR induced by oxidative stress was the highest after DA and MN, followed by UVA and H2O2 treatments, and was not correlated with the level of ROS production or with the extent of protein S-glutathionylation or carbonylation observed immediately after stress. Following oxidative treatments, we found a correlation between HSR and both the level of GSH/GSSG immediately after stress and the increase in protein ubiquitination during the recovery period. Conversely, HS treatment, which led to the highest HSR level, neither generated ROS nor modified or depended on the GSH redox state. Furthermore, the level of protein ubiquitination was maximal immediately after HS and lower than after MN and DA treatments thereafter. In these cells, heat-induced HSR was therefore clearly different from oxidative stress-induced HSR, in which, conversely, early redox changes of the major cellular thiol predicted the level of HSR and polyubiquitinated proteins.

  8. Proteins Annexin A2 and PSA in Prostate Cancer Biopsies Do Not Predict Biochemical Failure.

    PubMed

    Lamb, David S; Sondhauss, Sven; Dunne, Jonathan C; Woods, Lisa; Delahunt, Brett; Ferguson, Peter; Murray, Judith; Nacey, John N; Denham, James W; Jordan, T William

    2017-12-01

    We previously reported the use of mass spectrometry and western blotting to identify proteins from tumour regions of formalin-fixed paraffin-embedded biopsies from 16 men who presented with apparently localized prostate cancer, and found that annexin A2 (ANXA2) appeared to be a better predictor of subsequent biochemical failure than prostate-specific antigen (PSA). In this follow-up study, ANXA2 and PSA were measured using western blotting of proteins extracted from biopsies from 37 men from a subsequent prostate cancer trial. No significant differences in ANXA2 and PSA levels were observed between men with and without biochemical failure. The statistical effect sizes were small, d=0.116 for ANXA2, and 0.266 for PSA. ANXA2 and PSA proteins measured from biopsy tumour regions are unlikely to be good biomarkers for prediction of the clinical outcome of prostate cancer presenting with apparently localized disease. Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
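
    The effect sizes quoted above are reported only as "d". A minimal sketch of how such a value could be computed, assuming it is Cohen's d with a pooled standard deviation (the abstract does not state the formula); the function name and the sample values are hypothetical and not taken from the study:

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumed definition)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical normalised band intensities for two outcome groups.
with_failure = [2.0, 4.0, 6.0]
without_failure = [1.0, 3.0, 5.0]
print(cohens_d(with_failure, without_failure))
```

    By the usual rule of thumb, values of d near 0.2 or below are read as small effects, which matches how the abstract describes 0.116 and 0.266.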

  9. A label-free approach to detect ligand binding to cell surface proteins in real time.

    PubMed

    Burtscher, Verena; Hotka, Matej; Li, Yang; Freissmuth, Michael; Sandtner, Walter

    2018-04-26

    Electrophysiological recordings allow for monitoring the operation of proteins with high temporal resolution down to the single molecule level. This technique has been exploited to track either ion flow arising from channel opening or the synchronized movement of charged residues and/or ions within the membrane electric field. Here, we describe a novel type of current by using the serotonin transporter (SERT) as a model. We examined transient currents elicited on rapid application of specific SERT inhibitors. Our analysis shows that these currents originate from ligand binding and not from a long-range conformational change. The Gouy-Chapman model predicts that adsorption of charged ligands to surface proteins must produce displacement currents and related apparent changes in membrane capacitance. Here we verified these predictions with SERT. Our observations demonstrate that ligand binding to a protein can be monitored in real time and in a label-free manner by recording the membrane capacitance. © 2018, Burtscher et al.

  10. Tertiary structural propensities reveal fundamental sequence/structure relationships.

    PubMed

    Zheng, Fan; Zhang, Jian; Grigoryan, Gevorg

    2015-05-05

    Extracting useful generalizations from the continually growing Protein Data Bank (PDB) is of central importance. We hypothesize that the PDB contains valuable quantitative information on the level of local tertiary structural motifs (TERMs). We show that by breaking a protein structure into its constituent TERMs, and querying the PDB to characterize the natural ensemble matching each, we can estimate the compatibility of the structure with a given amino acid sequence through a metric we term "structure score." Considering submissions from recent Critical Assessment of Structure Prediction (CASP) experiments, we found a strong correlation (R = 0.69) between structure score and model accuracy, with poorly predicted regions readily identifiable. This performance exceeds that of leading atomistic statistical energy functions. Furthermore, TERM-based analysis of two prototypical multi-state proteins rapidly produced structural insights fully consistent with prior extensive experimental studies. We thus find that TERM-based analysis should have considerable utility for protein structural biology. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Discrete structural features among interface residue-level classes.

    PubMed

    Sowmya, Gopichandran; Ranganathan, Shoba

    2015-01-01

    Protein-protein interaction (PPI) is essential for molecular functions in biological cells. Investigation on protein interfaces of known complexes is an important step towards deciphering the driving forces of PPIs. Each PPI complex is specific, sensitive and selective to binding. Therefore, we have estimated the relative difference in percentage of polar residues between surface and the interface for each complex in a non-redundant heterodimer dataset of 278 complexes to understand the predominant forces driving binding. Our analysis showed ~60% of protein complexes with surface polarity greater than interface polarity (designated as class A). However, a considerable number of complexes (~40%) have interface polarity greater than surface polarity, (designated as class B), with a significantly different p-value of 1.66E-45 from class A. Comprehensive analyses of protein complexes show that interface features such as interface area, interface polarity abundance, solvation free energy gain upon interface formation, binding energy and the percentage of interface charged residue abundance distinguish among class A and class B complexes, while electrostatic visualization maps also help differentiate interface classes among complexes. Class A complexes are classical with abundant non-polar interactions at the interface; however class B complexes have abundant polar interactions at the interface, similar to protein surface characteristics. Five physicochemical interface features analyzed from the protein heterodimer dataset are discriminatory among the interface residue-level classes. These novel observations find application in developing residue-level models for protein-protein binding prediction, protein-protein docking studies and interface inhibitor design as drugs.
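
    The class A / class B split described above rests on comparing the percentage of polar residues on the surface with that at the interface. A rough sketch of that comparison, assuming a particular polar residue set and a plain greater-than test (neither is specified in the abstract); all names and the toy sequences are hypothetical:

```python
# Assumed polar residue set (one-letter codes); the study's definition may differ.
POLAR = set("STNQYCDEKRH")


def polar_percentage(residues):
    """Percentage of residues in the assumed polar set."""
    if not residues:
        raise ValueError("residue list is empty")
    return 100.0 * sum(r in POLAR for r in residues.upper()) / len(residues)


def interface_class(surface_residues, interface_residues):
    """Class A if surface polarity exceeds interface polarity, class B if the reverse."""
    surface = polar_percentage(surface_residues)
    interface = polar_percentage(interface_residues)
    if surface > interface:
        return "A"
    if interface > surface:
        return "B"
    return "undetermined"


# Toy residue strings, not taken from any real complex.
print(interface_class("STKEAV", "AVLIFS"))  # polar surface, mostly non-polar interface
```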

  12. Discrete structural features among interface residue-level classes

    PubMed Central

    2015-01-01

    Background Protein-protein interaction (PPI) is essential for molecular functions in biological cells. Investigation on protein interfaces of known complexes is an important step towards deciphering the driving forces of PPIs. Each PPI complex is specific, sensitive and selective to binding. Therefore, we have estimated the relative difference in percentage of polar residues between surface and the interface for each complex in a non-redundant heterodimer dataset of 278 complexes to understand the predominant forces driving binding. Results Our analysis showed ~60% of protein complexes with surface polarity greater than interface polarity (designated as class A). However, a considerable number of complexes (~40%) have interface polarity greater than surface polarity, (designated as class B), with a significantly different p-value of 1.66E-45 from class A. Comprehensive analyses of protein complexes show that interface features such as interface area, interface polarity abundance, solvation free energy gain upon interface formation, binding energy and the percentage of interface charged residue abundance distinguish among class A and class B complexes, while electrostatic visualization maps also help differentiate interface classes among complexes. Conclusions Class A complexes are classical with abundant non-polar interactions at the interface; however class B complexes have abundant polar interactions at the interface, similar to protein surface characteristics. Five physicochemical interface features analyzed from the protein heterodimer dataset are discriminatory among the interface residue-level classes. These novel observations find application in developing residue-level models for protein-protein binding prediction, protein-protein docking studies and interface inhibitor design as drugs. PMID:26679043

  13. A flow-proteometric platform for analyzing protein concentration (FAP): Proof of concept for quantification of PD-L1 protein in cells and tissues.

    PubMed

    Chou, Chao-Kai; Huang, Po-Jung; Tsou, Pei-Hsiang; Wei, Yongkun; Lee, Heng-Huan; Wang, Ying-Nai; Liu, Yen-Liang; Shi, Colin; Yeh, Hsin-Chih; Kameoka, Jun; Hung, Mien-Chie

    2018-05-29

    Protein expression level is critically related to the cell physiological function. However, current methodologies for analyzing the protein level, such as Western blot (WB) and immunohistochemistry (IHC), are rather semi-quantitative and do not provide the actual protein concentration. We have developed a microfluidic technique termed a "flow-proteometric platform for analyzing protein concentration (FAP)" that can measure the concentration of a target protein in cells or tissues without the requirement of a calibration standard, e.g., the purified target molecules. To validate our method, we tested a number of control samples with known target protein concentrations and showed that the FAP measurement resulted in concentrations that matched the actual concentrations in the control samples well (coefficient of determination [R²], 0.998), demonstrating a dynamic range of concentrations from 0.13 to 130 pM with a detection time of 2 min. We successfully determined the concentration of PD-L1, a biomarker protein for predicting the treatment response of cancer immune check-point therapy, in cancer cell lines (HeLa PD-L1 and MDA-MB-231) and breast cancer patient tumor tissues without any prior process of sample purification and standard line construction. Therefore, FAP is a simple, faster, and reliable method to measure the protein concentration in cells and tissues, which can support the conventional methods such as WB and IHC to determine the actual protein level. Copyright © 2018 Elsevier B.V. All rights reserved.

  14. nGASP - the nematode genome annotation assessment project

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Coghlan, A; Fiedler, T J; McKay, S J

    2008-12-19

    While the C. elegans genome is extensively annotated, relatively little information is available for other Caenorhabditis species. The nematode genome annotation assessment project (nGASP) was launched to objectively assess the accuracy of protein-coding gene prediction software in C. elegans, and to apply this knowledge to the annotation of the genomes of four additional Caenorhabditis species and other nematodes. Seventeen groups worldwide participated in nGASP, and submitted 47 prediction sets for 10 Mb of the C. elegans genome. Predictions were compared to reference gene sets consisting of confirmed or manually curated gene models from WormBase. The most accurate gene-finders were 'combiner' algorithms, which made use of transcript- and protein-alignments and multi-genome alignments, as well as gene predictions from other gene-finders. Gene-finders that used alignments of ESTs, mRNAs and proteins came in second place. There was a tie for third place between gene-finders that used multi-genome alignments and ab initio gene-finders. The median gene level sensitivity of combiners was 78% and their specificity was 42%, which is nearly the same accuracy as reported for combiners in the human genome. C. elegans genes with exons of unusual hexamer content, as well as those with many exons, short exons, long introns, a weak translation start signal, weak splice sites, or poorly conserved orthologs were the most challenging for gene-finders.
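
    Gene-level sensitivity and specificity, as quoted above, can be illustrated with a small sketch. It assumes the convention common in gene-finder evaluation, where a predicted gene counts as correct only if it matches a reference model exactly and "specificity" means the fraction of predictions that are correct; the abstract does not define either term, and the function name and coordinates are hypothetical:

```python
def gene_level_accuracy(predicted, reference):
    """Return (sensitivity, specificity) based on exact gene-model matches.

    Each gene model is a tuple of (start, end) exon coordinates.
    Specificity here is correct predictions / all predictions (assumed convention).
    """
    predicted, reference = set(predicted), set(reference)
    correct = predicted & reference
    sensitivity = len(correct) / len(reference) if reference else 0.0
    specificity = len(correct) / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Hypothetical gene models: the second prediction misses an exon boundary.
reference = [((100, 200), (300, 400)), ((1000, 1200),), ((5000, 5100), (5200, 5300))]
predicted = [((100, 200), (300, 400)), ((1000, 1250),)]
sens, spec = gene_level_accuracy(predicted, reference)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

    Exact matching at the gene level is strict: a single misplaced exon boundary makes the whole gene count as wrong, which is one reason gene-level figures tend to be lower than exon-level or nucleotide-level ones.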

  15. C-reactive Protein is a Useful Marker for Early Prediction of Anastomotic Leakage after Esophageal Reconstruction.

    PubMed

    Edagawa, Eijiro; Matsuda, Yasunori; Gyobu, Ken; Lee, Shigeru; Kishida, Satoru; Fujiwara, Yushi; Hashiba, Ryoya; Osugi, Harushi; Suehiro, Shigefumi

    2015-06-01

    Esophageal anastomotic leakage is one of the most fatal complications after esophagectomy and increases the hospitalization length. We aimed to identify a convenient clinical marker of anastomotic leakage in the early postoperative period. In total, 108 patients who underwent esophagectomy were retrospectively screened, and 96 were used to validate the overall results. All 108 patients underwent physical examinations and determination of their white blood cell count, C-reactive protein level, platelet count, fibrinogen level, fibrin degradation product level, and antithrombin III level until postoperative day 6. Anastomotic leakage occurred in 21 of the 108 patients (median detection, 8 days). The C-reactive protein level on postoperative day 3 and fibrinogen level on postoperative day 4 in the leakage group were significantly higher than those in the nonleakage group. Receiver operating characteristic curves for detection of anastomotic leakage were constructed; the cutoff value of C-reactive protein on postoperative day 3 was 8.62 mg/dL, and that of fibrinogen on postoperative day 4 was 712 mg/dL. Anastomotic leakage occurred in 23 of the 96 patients in the validation group. There was a significant difference between the leakage and nonleakage groups when the C-reactive protein threshold on postoperative day 3 was set at 8.62 mg/dL. However, there was no difference between the groups when the fibrinogen threshold on postoperative day 4 was set at 712 mg/dL. The C-reactive protein level on postoperative day 3 is a valuable predictor of anastomotic leakage after esophagectomy and might allow for earlier management of this complication.

  16. MEGADOCK: An All-to-All Protein-Protein Interaction Prediction System Using Tertiary Structure Data

    PubMed Central

    Ohue, Masahito; Matsuzaki, Yuri; Uchikoga, Nobuyuki; Ishida, Takashi; Akiyama, Yutaka

    2014-01-01

    The elucidation of protein-protein interaction (PPI) networks is important for understanding cellular structure and function and structure-based drug design. However, the development of an effective method to conduct exhaustive PPI screening represents a computational challenge. We have been investigating a protein docking approach based on shape complementarity and physicochemical properties. We describe here the development of the protein-protein docking software package “MEGADOCK” that samples an extremely large number of protein dockings at high speed. MEGADOCK reduces the calculation time required for docking by using several techniques such as a novel scoring function called the real Pairwise Shape Complementarity (rPSC) score. We showed that MEGADOCK is capable of exhaustive PPI screening by completing docking calculations 7.5 times faster than the conventional docking software, ZDOCK, while maintaining an acceptable level of accuracy. When MEGADOCK was applied to a subset of a general benchmark dataset to predict 120 relevant interacting pairs from 120 x 120 = 14,400 combinations of proteins, an F-measure value of 0.231 was obtained. Further, we showed that MEGADOCK can be applied to a large-scale protein-protein interaction-screening problem with accuracy better than random. When our approach is combined with parallel high-performance computing systems, it is now feasible to search and analyze protein-protein interactions while taking into account three-dimensional structures at the interactome scale. MEGADOCK is freely available at http://www.bi.cs.titech.ac.jp/megadock. PMID:23855673
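
    The F-measure reported above combines precision and recall into one number. A minimal sketch, assuming the standard balanced F1 definition (the abstract does not say which variant was used); the function name and counts are hypothetical and unrelated to the benchmark result:

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical screening outcome: 3 correct pairs, 1 false hit, 2 missed pairs.
print(round(f_measure(true_pos=3, false_pos=1, false_neg=2), 3))
```

    In an all-to-all screen, true interacting pairs are a small fraction of all combinations (120 of 14,400 in the benchmark above), so even a modest false-positive rate lowers precision sharply; this helps explain why F-measure values in such settings can look low.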

  17. Proteomic Analyses of Corneal Tissue Subjected to Alkali Exposure

    PubMed Central

    Parikh, Toral; Eisner, Natalie; Venugopalan, Praseeda; Yang, Qin; Lam, Byron L.

    2011-01-01

    Purpose. To determine whether exposure to alkaline chemicals results in predictable changes in corneal protein profile. To determine whether protein profile changes are indicative of severity and duration of alkali exposure. Methods. Enucleated bovine and porcine (n = 59 each) eyes were used for exposure to sodium, ammonium, and calcium hydroxide, respectively. Eyes were subjected to fluorescein staining, 5-bromo-2′-deoxy-uridine (BrdU) labeling. Excised cornea was subjected to protein extraction, spectrophotometric determination of protein amount, dynamic light scattering and SDS-PAGE profiling, mass spectrometric protein identification, and iTRAQ-labeled quantification. Select identified proteins were subjected to Western blot and immunohistochemical analyses. Results. Alkali exposure resulted in lower protein extractability from corneal tissue. Elevated aggregate formation was found with strong alkali exposure (sodium hydroxide>ammonium, calcium hydroxide), even with a short duration of exposure compared with controls. The protein yield after exposure varied as a function of postexposure time. Protein profiles changed because of alkali exposure. Concentration and strength of the alkali affected the profile change significantly. Mass spectrometry identified 15 proteins from different bands with relative quantification. Plexin D1 was identified for the first time in the cornea at a protein level that was further confirmed by Western blot and immunohistochemical analyses. Conclusions. Exposure to alkaline chemicals results in predictable and reproducible changes in corneal protein profile. Stronger alkali, longer durations, or both, of exposure resulted in lower yields and significant protein profile changes compared with controls. PMID:20861482

  18. A hidden markov model derived structural alphabet for proteins.

    PubMed

    Camproux, A C; Gautier, R; Tufféry, P

    2004-06-04

    Understanding and predicting protein structures depends on the complexity and the accuracy of the models used to represent them. We have set up a hidden Markov model that discretizes protein backbone conformation as a series of overlapping fragments (states) four residues in length. This approach learns simultaneously the geometry of the states and their connections. We obtain, using a statistical criterion, an optimal systematic decomposition of the conformational variability of the protein peptidic chain in 27 states with strong connection logic. This result is stable over different protein sets. Our model fits well the previous knowledge related to protein architecture organisation and seems able to capture some subtle details of protein organisation, such as helix sub-level organisation schemes. Taking into account the dependence between the states results in a description of local protein structure of low complexity. On average, the model makes use of only 8.3 states among 27 to describe each position of a protein structure. Although we use short fragments, the learning process on entire protein conformations captures the logic of the assembly on a larger scale. Using such a model, the structure of proteins can be reconstructed with an average accuracy close to 1.1 Å root-mean-square deviation and for a complexity of only 3. Finally, we also observe that sequence specificity increases with the number of states of the structural alphabet. Such models can constitute a very relevant approach to the analysis of protein architecture, in particular for protein structure prediction.

  19. Probing sequence dependence of folding pathway of α-helix bundle proteins through free energy landscape analysis.

    PubMed

    Shao, Qiang

    2014-06-05

    A comparative study on the folding of multiple three-α-helix bundle proteins including α3D, α3W, and the B domain of protein A (BdpA) is presented. The use of integrated-tempering-sampling molecular dynamics simulations achieves reversible folding and unfolding events in individual short trajectories, which thus provides an efficient approach to sufficiently sample the configuration space of protein and delineate the folding pathway of α-helix bundle. The detailed free energy landscape analyses indicate that the folding mechanism of α-helix bundle is not uniform but sequence dependent. A simple model is then proposed to predict folding mechanism of α-helix bundle on the basis of amino acid composition: α-helical proteins containing higher percentage of hydrophobic residues than charged ones fold via nucleation-condensation mechanism (e.g., α3D and BdpA) whereas proteins having opposite tendency in amino acid composition more likely fold via the framework mechanism (e.g., α3W). The model is tested on various α-helix bundle proteins, and the predicted mechanism is similar to the most approved one for each protein. In addition, the common features in the folding pathway of α-helix bundle protein are also deduced. In summary, the present study provides comprehensive, atomic-level picture of the folding of α-helix bundle proteins.
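
    The composition-based rule stated above (more hydrophobic than charged residues suggests nucleation-condensation, the opposite suggests the framework mechanism) can be sketched as follows. The residue groupings and the function name are assumptions for illustration, since the abstract does not list which amino acids count as hydrophobic or charged, and the sequences are toy strings rather than α3D, α3W, or BdpA:

```python
# Assumed residue groupings (one-letter codes); the paper's sets may differ.
HYDROPHOBIC = set("AVLIMFW")
CHARGED = set("DEKR")


def predict_folding_mechanism(sequence):
    """Apply the composition rule: compare hydrophobic and charged percentages."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    hydrophobic_pct = 100.0 * sum(r in HYDROPHOBIC for r in seq) / len(seq)
    charged_pct = 100.0 * sum(r in CHARGED for r in seq) / len(seq)
    if hydrophobic_pct > charged_pct:
        return "nucleation-condensation"
    if charged_pct > hydrophobic_pct:
        return "framework"
    return "undetermined"


# Toy sequences chosen only to exercise both branches.
print(predict_folding_mechanism("ALLVAIKEL"))
print(predict_folding_mechanism("KEEKRDALK"))
```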

  20. Is Phosphoproteomics Ready for Clinical Research?

    PubMed Central

    Iliuk, Anton B.; Tao, W. Andy

    2012-01-01

    Background For many diseases such as cancer where phosphorylation-dependent signaling is the foundation of disease onset and progression, single-gene testing and genomic profiling alone are not sufficient in providing most critical information. The reason for this is that in these activated pathways the signaling changes and drug resistance are often not directly correlated with changes in protein expression levels. In order to obtain the essential information needed to evaluate pathway activation or the effects of certain drugs and therapies on the molecular level, the analysis of changes in protein phosphorylation is critical. Methods Existing approaches do not differentiate clinical disease subtypes on the protein and signaling pathway level, and therefore hamper the predictive management of the disease and the selection of therapeutic targets. Conclusions The mini-review examines the impact of emerging systems biology tools and the possibility of applying phosphoproteomics to clinical research. PMID:23159844

  1. Validation of Molecular Dynamics Simulations for Prediction of Three-Dimensional Structures of Small Proteins.

    PubMed

    Kato, Koichi; Nakayoshi, Tomoki; Fukuyoshi, Shuichi; Kurimoto, Eiji; Oda, Akifumi

    2017-10-12

    Although various higher-order protein structure prediction methods have been developed, almost all of them were developed based on the three-dimensional (3D) structure information of known proteins. Here we predicted short protein structures by molecular dynamics (MD) simulations in which only Newton's equations of motion were used and 3D structural information of known proteins was not required. To evaluate the ability of MD simulation to predict protein structures, we simulated seven short test proteins (10-46 residues) starting from the denatured state and compared their predicted and experimental structures. The predicted structure for Trp-cage (20 residues) was close to the experimental structure after a 200-ns MD simulation. For proteins shorter or longer than Trp-cage, root-mean-square deviation values were larger than those for Trp-cage. However, secondary structures could be reproduced by MD simulations for proteins with 10-34 residues. Simulations by replica exchange MD were performed, but the results were similar to those from normal MD simulations. These results suggest that normal MD simulations can roughly predict short protein structures and that 200-ns simulations are frequently sufficient for estimating the secondary structures of proteins (approximately 20 residues). Structural prediction methods using only fundamental physical laws are useful for investigating non-natural proteins, such as primitive proteins and artificial proteins for peptide-based drug delivery systems.

  2. Molecular Characterization of Protease Activity in Serratia sp. Strain SCBI and Its Importance in Cytotoxicity and Virulence

    PubMed Central

    Petersen, Lauren M.

    2014-01-01

    A newly recognized Serratia species, termed South African Caenorhabditis briggsae isolate (SCBI), is both a mutualist of the nematode Caenorhabditis briggsae KT0001 and a pathogen of lepidopteran insects. Serratia sp. strain SCBI displays high proteolytic activity, and because secreted proteases are known virulence factors for many pathogens, the purpose of this study was to identify genes essential for extracellular protease activity in Serratia sp. strain SCBI and to determine what role proteases play in insect pathogenesis and cytotoxicity. A bank of 2,100 transposon mutants was generated, and six SCBI mutants with defective proteolytic activity were identified. These mutants were also defective in cytotoxicity. The mutants were found defective in genes encoding the following proteins: alkaline metalloprotease secretion protein AprE, a BglB family transcriptional antiterminator, an inosine/xanthosine triphosphatase, GidA, a methyl-accepting chemotaxis protein, and a PIN domain protein. Gene expression analysis on these six mutants showed significant downregulation in mRNA levels of several different types of predicted protease genes. In addition, transcriptome sequencing (RNA-seq) analysis provided insight into how inactivation of AprE, GidA, and a PIN domain protein influences motility and virulence, as well as protease activity. Using quantitative reverse transcription-PCR (qRT-PCR) to further characterize expression of predicted protease genes in wild-type Serratia sp. SCBI, the highest mRNA levels for the alkaline metalloprotease genes (termed prtA1 to prtA4) occurred following the death of an insect host, while two serine protease and two metalloprotease genes had their highest mRNA levels during active infection. Overall, these results indicate that proteolytic activity is essential for cytotoxicity in Serratia sp. SCBI and that its regulation appears to be highly complex. PMID:25182493

  3. Topology-based modeling of intrinsically disordered proteins: balancing intrinsic folding and intermolecular interactions.

    PubMed

    Ganguly, Debabani; Chen, Jianhan

    2011-04-01

    Coupled binding and folding is frequently involved in specific recognition of so-called intrinsically disordered proteins (IDPs), a newly recognized class of proteins that rely on a lack of stable tertiary fold for function. Here, we exploit topology-based Gō-like modeling as an effective tool for the mechanism of IDP recognition within the theoretical framework of minimally frustrated energy landscape. Importantly, substantial differences exist between IDPs and globular proteins in both amino acid sequence and binding interface characteristics. We demonstrate that established Gō-like models designed for folded proteins tend to over-estimate the level of residual structures in unbound IDPs, whereas under-estimating the strength of intermolecular interactions. Such systematic biases have important consequences in the predicted mechanism of interaction. A strategy is proposed to recalibrate topology-derived models to balance intrinsic folding propensities and intermolecular interactions, based on experimental knowledge of the overall residual structure level and binding affinity. Applied to pKID/KIX, the calibrated Gō-like model predicts a dominant multistep sequential pathway for binding-induced folding of pKID that is initiated by KIX binding via the C-terminus in disordered conformations, followed by binding and folding of the rest of C-terminal helix and finally the N-terminal helix. This novel mechanism is consistent with key observations derived from a recent NMR titration and relaxation dispersion study and provides a molecular-level interpretation of kinetic rates derived from dispersion curve analysis. These case studies provide important insight into the applicability and potential pitfalls of topology-based modeling for studying IDP folding and interaction in general. Copyright © 2011 Wiley-Liss, Inc.

  4. BindML/BindML+: Detecting Protein-Protein Interaction Interface Propensity from Amino Acid Substitution Patterns.

    PubMed

    Wei, Qing; La, David; Kihara, Daisuke

    2017-01-01

    Prediction of protein-protein interaction sites in a protein structure provides important information for elucidating the mechanism of protein function and can also be useful in guiding modeling or design procedures for protein complex structures. Since prediction methods essentially assess the propensity of amino acids that are likely to be part of a protein docking interface, they can help in designing protein-protein interactions. Here, we introduce the BindML and BindML+ methods for predicting protein-protein interaction sites. BindML predicts protein-protein interaction sites by identifying mutation patterns found in known protein-protein complexes using phylogenetic substitution models. BindML+ is an extension of BindML for distinguishing permanent and transient types of protein-protein interaction sites. We developed an interactive web-server that provides a convenient interface to assist in structural visualization of protein-protein interaction site predictions. The input for the web-server is a tertiary structure of interest. BindML and BindML+ are available at http://kiharalab.org/bindml/ and http://kiharalab.org/bindml/plus/ .

  5. Systems modeling accurately predicts responses to genotoxic agents and their synergism with BCL-2 inhibitors in triple negative breast cancer cells.

    PubMed

    Lucantoni, Federico; Lindner, Andreas U; O'Donovan, Norma; Düssmann, Heiko; Prehn, Jochen H M

    2018-01-19

    Triple negative breast cancer (TNBC) is an aggressive form of breast cancer which accounts for 15-20% of this disease and is currently treated with genotoxic chemotherapy. The BCL2 (B-cell lymphoma 2) family of proteins controls the process of mitochondrial outer membrane permeabilization (MOMP), which is required for the activation of the mitochondrial apoptosis pathway in response to genotoxic agents. We previously developed a deterministic systems model of BCL2 protein interactions, DR_MOMP, that calculates the sensitivity of cells to undergo mitochondrial apoptosis. Here we determined whether DR_MOMP predicts responses of TNBC cells to genotoxic agents and the re-sensitization of resistant cells by BCL2 inhibitors. Using absolute protein levels of BAX, BAK, BCL2, BCL(X)L and MCL1 as input for DR_MOMP, we found a strong correlation between model predictions and responses of a panel of TNBC cells to 24 and 48 h cisplatin (R² = 0.96 and 0.95, respectively) and paclitaxel treatments (R² = 0.94 and 0.95, respectively). This outperformed single protein correlations (best performer BCL(X)L with R² of 0.69 and 0.50 for cisplatin and paclitaxel treatments, respectively) and BCL2 proteins ratio (R² of 0.50 for cisplatin and 0.49 for paclitaxel). Next we performed synergy studies using the BCL2 selective antagonist Venetoclax/ABT199, the BCL(X)L selective antagonist WEHI-539, or the MCL1 selective antagonist A-1210477 in combination with cisplatin. In silico predictions by DR_MOMP revealed substantial differences in treatment responses of BCL(X)L, BCL2 or MCL1 inhibitors combinations with cisplatin that were successfully validated in cell lines. Our findings provide evidence that DR_MOMP predicts responses of TNBC cells to genotoxic therapy, and can aid in the choice of the optimal BCL2 protein antagonist for combination treatments of resistant cells.

  6. The Prognostic Value of Serum Levels of Heart-Type Fatty Acid Binding Protein and High Sensitivity C-Reactive Protein in Patients With Increased Levels of Amino-Terminal Pro-B Type Natriuretic Peptide.

    PubMed

    Jeong, Ji Hun; Seo, Yiel Hea; Ahn, Jeong Yeal; Kim, Kyung Hee; Seo, Ja Young; Kim, Moon Jin; Lee, Hwan Tae; Park, Pil Whan

    2016-09-01

    Amino-terminal pro-B type natriuretic peptide (NT-proBNP) is a well-established prognostic factor in heart failure (HF). However, numerous causes may lead to elevations in NT-proBNP, and thus, an increased NT-proBNP level alone is not sufficient to predict outcome. The aim of this study was to evaluate the utility of two acute response markers, high sensitivity C-reactive protein (hsCRP) and heart-type fatty acid binding protein (H-FABP), in patients with an increased NT-proBNP level. The 278 patients were classified into three groups by etiology: 1) acute coronary syndrome (ACS) (n=62), 2) non-ACS cardiac disease (n=156), and 3) infectious disease (n=60). Survival was determined on day 1, 7, 14, 21, 28, 60, 90, 120, and 150 after enrollment. H-FABP (P<0.001), NT-proBNP (P=0.006), hsCRP (P<0.001) levels, and survival (P<0.001) were significantly different in the three disease groups. Patients were divided into three classes by using receiver operating characteristic curves for NT-proBNP, H-FABP, and hsCRP. Patients with elevated NT-proBNP (≥3,856 pg/mL) and H-FABP (≥8.8 ng/mL) levels were associated with higher hazard ratio for mortality (5.15 in NT-proBNP and 3.25 in H-FABP). Area under the receiver operating characteristic curve analysis showed H-FABP was a better predictor of 60-day mortality than NT-proBNP. The combined measurement of H-FABP with NT-proBNP provides a highly reliable means of short-term mortality prediction for patients hospitalized for ACS, non-ACS cardiac disease, or infectious disease.

  7. Predictive value of quantitative HER2, HER3 and p95HER2 levels in HER2-positive advanced breast cancer patients treated with lapatinib following progression on trastuzumab.

    PubMed

    Duchnowska, Renata; Sperinde, Jeff; Czartoryska-Arłukowicz, Bogumiła; Myśliwiec, Paulina; Winslow, John; Radecka, Barbara; Petropoulos, Christos; Demlova, Regina; Orlikowska, Marlena; Kowalczyk, Anna; Lang, Istvan; Ziółkowska, Barbara; Dębska-Szmich, Sylwia; Merdalska, Monika; Grela-Wojewoda, Aleksandra; Żawrocki, Anton; Biernat, Wojciech; Huang, Weidong; Jassem, Jacek

    2017-11-28

    Lapatinib is a HER1 and HER2 tyrosine kinase inhibitor (TKI) approved in second line treatment of advanced or metastatic breast cancer following progression on trastuzumab-containing therapy. Biomarkers for activity of lapatinib and other TKIs are lacking. Formalin-fixed, paraffin-embedded primary tumor samples were obtained from 189 HER2-positive patients treated with lapatinib plus capecitabine following progression on trastuzumab. The HERmark® Breast Cancer Assay was used to quantify HER2 protein expression. HER3 and p95HER2 protein expression was quantified using the VeraTag® technology. Overall survival (OS) was inversely correlated with HER2 (HR = 1.9/log; P = 0.009) for patients with tumors above the cut-off positivity level by the HERmark assay. OS was significantly shorter for those with above median HER2 levels (HR = 1.7; P = 0.015) and trended shorter for those below the cut-off level of positivity by the HERmark assay (HR = 1.7; P = 0.057) compared to cases with moderate HER2 overexpression. The relationship between HER2 protein expression and OS was best captured with a U-shaped parabolic function (P = 0.004), with the best prognosis at moderate levels of HER2 protein overexpression. In a multivariate model including HER2, increasing p95HER2 expression was associated with longer OS (HR = 0.35/log; P = 0.027). Continuous HER3 did not significantly correlate with OS. Patients with moderately overexpressed HER2 levels and high p95HER2 expression may have best outcomes while receiving lapatinib following progression on trastuzumab. Further study is warranted to explore the predictive utility of quantitative HER2 and p95HER2 in guiding HER2-directed therapies.

  8. Computational prediction of human salivary proteins from blood circulation and application to diagnostic biomarker identification.

    PubMed

    Wang, Jiaxin; Liang, Yanchun; Wang, Yan; Cui, Juan; Liu, Ming; Du, Wei; Xu, Ying

    2013-01-01

    Proteins can move from blood circulation into salivary glands through active transportation, passive diffusion or ultrafiltration, some of which are then released into saliva and hence can potentially serve as biomarkers for diseases if accurately identified. We present a novel computational method for predicting salivary proteins that come from circulation. The basis for the prediction is a set of physiochemical and sequence features we found to be discerning between human proteins known to be movable from circulation to saliva and proteins deemed to be not in saliva. A classifier was trained based on these features using a support-vector machine to predict protein secretion into saliva. The classifier achieved 88.56% average recall and 90.76% average precision in 10-fold cross-validation on the training data, indicating that the selected features are informative. Considering the possibility that our negative training data may not be highly reliable (i.e., proteins predicted to be not in saliva), we have also trained a ranking method, aiming to rank the known salivary proteins from circulation as the highest among the proteins in the general background, based on the same features. This prediction capability can be used to predict potential biomarker proteins for specific human diseases when coupled with the information of differentially expressed proteins in diseased versus healthy control tissues and a prediction capability for blood-secretory proteins. Using such integrated information, we predicted 31 candidate biomarker proteins in saliva for breast cancer.

  9. Computational Prediction of Human Salivary Proteins from Blood Circulation and Application to Diagnostic Biomarker Identification

    PubMed Central

    Wang, Jiaxin; Liang, Yanchun; Wang, Yan; Cui, Juan; Liu, Ming; Du, Wei; Xu, Ying

    2013-01-01

    Proteins can move from blood circulation into salivary glands through active transportation, passive diffusion or ultrafiltration, some of which are then released into saliva and hence can potentially serve as biomarkers for diseases if accurately identified. We present a novel computational method for predicting salivary proteins that come from circulation. The basis for the prediction is a set of physiochemical and sequence features we found to be discerning between human proteins known to be movable from circulation to saliva and proteins deemed to be not in saliva. A classifier was trained based on these features using a support-vector machine to predict protein secretion into saliva. The classifier achieved 88.56% average recall and 90.76% average precision in 10-fold cross-validation on the training data, indicating that the selected features are informative. Considering the possibility that our negative training data may not be highly reliable (i.e., proteins predicted to be not in saliva), we have also trained a ranking method, aiming to rank the known salivary proteins from circulation as the highest among the proteins in the general background, based on the same features. This prediction capability can be used to predict potential biomarker proteins for specific human diseases when coupled with the information of differentially expressed proteins in diseased versus healthy control tissues and a prediction capability for blood-secretory proteins. Using such integrated information, we predicted 31 candidate biomarker proteins in saliva for breast cancer. PMID:24324552

  10. Clinical prediction model to aid emergency doctors managing febrile children at risk of serious bacterial infections: diagnostic study

    PubMed Central

    Nijman, Ruud G; Vergouwe, Yvonne; Thompson, Matthew; van Veen, Mirjam; van Meurs, Alfred H J; van der Lei, Johan; Steyerberg, Ewout W; Moll, Henriette A

    2013-01-01

    Objective To derive, cross validate, and externally validate a clinical prediction model that assesses the risks of different serious bacterial infections in children with fever at the emergency department. Design Prospective observational diagnostic study. Setting Three paediatric emergency care units: two in the Netherlands and one in the United Kingdom. Participants Children with fever, aged 1 month to 15 years, at three paediatric emergency care units: Rotterdam (n=1750) and the Hague (n=967), the Netherlands, and Coventry (n=487), United Kingdom. A prediction model was constructed using multivariable polytomous logistic regression analysis and included the predefined predictor variables age, duration of fever, tachycardia, temperature, tachypnoea, ill appearance, chest wall retractions, prolonged capillary refill time (>3 seconds), oxygen saturation <94%, and C reactive protein. Main outcome measures Pneumonia, other serious bacterial infections (SBIs, including septicaemia/meningitis, urinary tract infections, and others), and no SBIs. Results Oxygen saturation <94% and presence of tachypnoea were important predictors of pneumonia. A raised C reactive protein level predicted the presence of both pneumonia and other SBIs, whereas chest wall retractions and oxygen saturation <94% were useful to rule out the presence of other SBIs. Discriminative ability (C statistic) to predict pneumonia was 0.81 (95% confidence interval 0.73 to 0.88); for other SBIs this was even better: 0.86 (0.79 to 0.92). Risk thresholds of 10% or more were useful to identify children with serious bacterial infections; risk thresholds less than 2.5% were useful to rule out the presence of serious bacterial infections. External validation showed good discrimination for the prediction of pneumonia (0.81, 0.69 to 0.93); discriminative ability for the prediction of other SBIs was lower (0.69, 0.53 to 0.86). Conclusion A validated prediction model, including clinical signs, symptoms, and C reactive protein level, was useful for estimating the likelihood of pneumonia and other SBIs in children with fever, such as septicaemia/meningitis and urinary tract infections. PMID:23550046

  11. Serum cytokine/chemokine profiles in patients with dengue fever (DF) and dengue hemorrhagic fever (FHD) by using protein array.

    PubMed

    Oliveira, Renato Antonio Dos Santos; Cordeiro, Marli Tenório; Moura, Patrícia Muniz Mendes Freire de; Baptista Filho, Paulo Neves Bapti; Braga-Neto, Ulisses de Mendonça; Marques, Ernesto Torres de Azevedo; Gil, Laura Helena Vega Gonzales

    2017-04-01

    DENV infection can induce different clinical manifestations varying from mild forms to dengue fever (DF) or the severe hemorrhagic fever (DHF). Several factors are involved in the progression from DF to DHF. No marker is available to predict this progression. Such a biomarker could allow suitable medical care at the beginning of the infection, improving patient prognosis. The aim of this study was to compare the serum expression levels of acute phase proteins in a well-established cohort of dengue fever (DF) and dengue hemorrhagic fever (DHF) patients, in order to identify a prognostic marker of disease severity. The serum levels of 36 cytokines, chemokines and acute phase proteins were determined in DF and DHF patients and compared to healthy volunteers using a multiplex protein array and near-infrared (NIR) fluorescence detection. Serum levels of IL-1ra, IL-23, MIF, sCD40 ligand, IP-10 and GRO-α were also determined by ELISA. At the early stages of infection, GRO-α and IP-10 expression levels were different in DF compared to DHF patients. Besides, GRO-α was positively correlated with platelet counts and IP-10 was negatively correlated with total protein levels. These findings suggest that high levels of GRO-α during acute DENV infection may be associated with a good prognosis, while high levels of IP-10 may be a warning sign of infection severity. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. A novel ex vivo immunoproteomic approach characterising Fasciola hepatica tegumental antigens identified using immune antibody from resistant sheep.

    PubMed

    Cameron, Timothy C; Cooke, Ira; Faou, Pierre; Toet, Hayley; Piedrafita, David; Young, Neil; Rathinasamy, Vignesh; Beddoe, Travis; Anderson, Glenn; Dempster, Robert; Spithill, Terry W

    2017-08-01

    A more thorough understanding of the immunological interactions between Fasciola spp. and their hosts is required if we are to develop new immunotherapies to control fasciolosis. Deeper knowledge of the antigens that are the target of the acquired immune responses of definitive hosts against both Fasciola hepatica and Fasciola gigantica will potentially identify candidate vaccine antigens. Indonesian Thin Tail sheep express a high level of acquired immunity to infection by F. gigantica within 4 weeks of infection and antibodies in Indonesian Thin Tail sera can promote antibody-dependent cell-mediated cytotoxicity against the surface tegument of juvenile F. gigantica in vitro. Given the high protein sequence similarity between F. hepatica and F. gigantica, we hypothesised that antibody from F. gigantica-infected sheep could be used to identify the orthologous proteins in the tegument of F. hepatica. Purified IgG from the sera of F. gigantica-infected Indonesian Thin Tail sheep collected pre-infection and 4 weeks p.i. were incubated with live adult F. hepatica ex vivo and the immunosloughate (immunoprecipitate) formed was isolated and analysed via liquid chromatography-electrospray ionisation-tandem mass spectrometry to identify proteins involved in the immune response. A total of 38 proteins were identified at a significantly higher abundance in the immunosloughate using week 4 IgG, including eight predicted membrane proteins, 20 secreted proteins, nine proteins predicted to be associated with either the lysosomes, the cytoplasm or the cytoskeleton and one protein with an unknown cellular localization. Three of the membrane proteins are transporters including a multidrug resistance protein, an amino acid permease and a glucose transporter. Interestingly, a total of 21 of the 38 proteins matched with proteins recently reported to be associated with the proposed small exosome-like extracellular vesicles of adult F. hepatica, suggesting that the Indonesian Thin Tail week 4 IgG is either recognising individual proteins released from extracellular vesicles or is immunoprecipitating intact exosome-like extracellular vesicles. Five extracellular vesicle membrane proteins were identified including two proteins predicted to be associated with vesicle transport/exocytosis (VPS4, vacuolar protein sorting-associated protein 4b and the Niemann-Pick C1 protein). RNAseq analysis of the developmental transcription of the 38 immunosloughate proteins showed that the sequences are expressed over a wide abundance range with 21/38 transcripts expressed at a relatively high level from metacercariae to the adult life cycle stage. A notable feature of the immunosloughates was the absence of cytosolic proteins which have been reported to be secreted markers for damage to adult flukes incubated in vitro, suggesting that the proteins observed are not inadvertent contaminants leaking from damaged flukes ex vivo. The identification of tegument protein antigens shared between F. gigantica and F. hepatica is beneficial in terms of the possible development of a dual purpose vaccine effective against both fluke species. Copyright © 2017 Australian Society for Parasitology. Published by Elsevier Ltd. All rights reserved.

  13. Predicting disease-related proteins based on clique backbone in protein-protein interaction network.

    PubMed

    Yang, Lei; Zhao, Xudong; Tang, Xianglong

    2014-01-01

    Network biology integrates different kinds of data, including physical or functional networks and disease gene sets, to interpret human disease. A clique (maximal complete subgraph) in a protein-protein interaction network is a topological module and possesses inherent biological significance. A disease-related clique possibly associates with complex diseases. Fully identifying disease components in a clique is conducive to uncovering disease mechanisms. This paper proposes an approach for predicting disease proteins based on cliques in a protein-protein interaction network. To tolerate false positive and negative interactions in protein networks, extending cliques and scoring predicted disease proteins with gene ontology terms are introduced to the clique-based method. The precision of the predicted disease proteins, verified against disease phenotypes, stays consistently above 95%. The predicted disease proteins associated with cliques can partly complement mapping between genotype and phenotype, and provide clues for understanding the pathogenesis of serious diseases.
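
As an illustration of the clique notion used in this abstract (a maximal complete subgraph), the following minimal sketch enumerates maximal cliques with the basic Bron-Kerbosch recursion. It is not the authors' method: the clique extension and gene ontology scoring steps are not reproduced, and the toy network, the protein names A to D, and the function name `maximal_cliques` are hypothetical.

```python
from typing import Dict, List, Set


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        # r: current clique, p: candidates that can extend it,
        # x: vertices already explored (used to reject non-maximal cliques).
        if not p and not x:
            cliques.append(set(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical toy interaction network (undirected edges between proteins).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adj: Dict[str, Set[str]] = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cliques = maximal_cliques(adj)
for clique in sorted(sorted(c) for c in cliques):
    print(clique)
```

In this toy network the triangle A, B, C and the edge C, D are the two maximal cliques; in a real interaction network a pivoting variant of the recursion would usually be preferred for speed.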

  14. SeqRate: sequence-based protein folding type classification and rates prediction

    PubMed Central

    2010-01-01

    Background Protein folding rate is an important property of a protein. Predicting protein folding rate is useful for understanding protein folding process and guiding protein design. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. And most methods do not distinguish the different kinetic nature (two-state folding or multi-state folding) of the proteins. Here we developed a method, SeqRate, to predict both protein folding kinetic type (two-state versus multi-state) and real-value folding rate using sequence length, amino acid composition, contact order, contact number, and secondary structure information predicted from only protein sequence with support vector machines. Results We systematically studied the contributions of individual features to folding rate prediction. On a standard benchmark dataset, the accuracy of folding kinetic type classification is 80%. The Pearson correlation coefficient and the mean absolute difference between predicted and experimental folding rates (s⁻¹) in the base-10 logarithmic scale are 0.81 and 0.79 for two-state protein folders, and 0.80 and 0.68 for three-state protein folders. SeqRate is the first sequence-based method for protein folding type classification and its accuracy of fold rate prediction is improved over previous sequence-based methods. Its performance can be further enhanced with additional information, such as structure-based geometric contacts, as inputs. Conclusions Both the web server and software of predicting folding rate are publicly available at http://casp.rnet.missouri.edu/fold_rate/index.html. PMID:20438647
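
The two evaluation measures named in this abstract, the Pearson correlation coefficient and the mean absolute difference on the base-10 logarithmic scale, can be sketched as follows. This is only an illustration of the metrics, not SeqRate itself; the folding rates below are made-up placeholder values and the function names are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def mean_abs_diff(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Mean absolute difference between paired values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Placeholder folding rates in s^-1 (invented for illustration only).
experimental_rates = [1.0e3, 2.5e1, 4.0e4, 6.0e-1]
predicted_rates = [5.0e2, 1.0e2, 1.0e4, 2.0e0]

# Both measures are computed on log10 of the rates, as the abstract describes.
log_experimental = [math.log10(k) for k in experimental_rates]
log_predicted = [math.log10(k) for k in predicted_rates]

print("Pearson r:", round(pearson_r(log_experimental, log_predicted), 3))
print("Mean abs. difference (log10 units):",
      round(mean_abs_diff(log_experimental, log_predicted), 3))
```

Working in log10 units means a mean absolute difference of 1.0 corresponds to predictions that are off by one order of magnitude on average.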

  15. A novel fractal approach for predicting G-protein-coupled receptors and their subfamilies with support vector machines.

    PubMed

    Nie, Guoping; Li, Yong; Wang, Feichi; Wang, Siwen; Hu, Xuehai

    2015-01-01

    G-protein-coupled receptors (GPCRs) are seven membrane-spanning proteins and regulate many important physiological processes, such as vision, neurotransmission, and immune response. GPCRs-related pathways are the targets of a large number of marketed drugs. Therefore, the design of a reliable computational model for predicting GPCRs from amino acid sequence has long been a significant biomedical problem. Chaos game representation (CGR) reveals the fractal patterns hidden in protein sequences, and then fractal dimension (FD) is an important feature of these highly irregular geometries with concise mathematical expression. Here, in order to extract important features from GPCR protein sequences, CGR algorithm, fractal dimension and amino acid composition (AAC) are employed to formulate the numerical features of protein samples. Four groups of features are considered, and each group is evaluated by support vector machine (SVM) and 10-fold cross-validation test. To test the performance of the present method, a new non-redundant dataset was built based on the latest GPCRDB database. Comparing the results of numerical experiments, the group of combined features with AAC and FD gets the best result: the accuracy is 99.22% and the Matthews correlation coefficient (MCC) is 0.9845 for identifying GPCRs from non-GPCRs. Moreover, a sequence classified as a GPCR is passed to a second level, which classifies it into one of the five main subfamilies. At this level, the group of combined features with AAC and FD also gets the best accuracy, 85.73%. Finally, the proposed predictor is also compared with existing methods and shows better performances.
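
For readers unfamiliar with the two scores reported in this abstract, accuracy and the Matthews correlation coefficient (MCC) can both be computed from a binary confusion matrix, as in the minimal sketch below. The counts are hypothetical and are not the confusion matrix behind the 99.22% and 0.9845 figures, which the abstract does not give.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a GPCR versus non-GPCR test set.
tp, tn, fp, fn = 95, 90, 10, 5

print("accuracy:", accuracy(tp, tn, fp, fn))
print("MCC:", round(mcc(tp, tn, fp, fn), 4))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the two classes are of unequal size.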

  16. YKL-40, CCL18 and SP-D predict mortality in patients hospitalized with community-acquired pneumonia.

    PubMed

    Spoorenberg, Simone M C; Vestjens, Stefan M T; Rijkers, Ger T; Meek, Bob; van Moorsel, Coline H M; Grutters, Jan C; Bos, Willem Jan W

    2017-04-01

    The aim of this study was to investigate the prognostic value of four biomarkers, YKL-40, chemokine (C-C motif) ligand 18 (CCL18), surfactant protein-D (SP-D) and CA 15-3, in patients admitted with community-acquired pneumonia (CAP). These markers have been studied extensively in chronic pulmonary disease, but in acute pulmonary disease their prognostic value is unknown. A total of 289 adult patients who were hospitalized with CAP and participated in a randomized controlled trial were enrolled. Biomarker levels were measured on the day of admission. Intensive care unit admission, 30-day, 1-year and long-term mortality (median follow-up of 5.4 years, interquartile range (IQR): 4.7-6.1) were recorded as outcomes. Median YKL-40 and CCL18 levels were significantly higher and levels of SP-D were significantly lower in CAP patients compared to healthy controls. Significantly higher YKL-40, CCL18 and SP-D levels were found in patients classified in pneumonia severity index classes 4-5 and with a CURB-65 score ≥2 compared to patients with less severe pneumonia. Furthermore, these three markers were significant predictors for long-term mortality in multivariate analysis and compared with C-reactive protein and procalcitonin level on admission, area under the curves were higher for 30-day, 1-year and long-term mortality. CA 15-3 levels were less predictive. YKL-40, CCL18 and SP-D levels were higher in patients with more severe pneumonia, possibly reflecting the extent of pulmonary inflammation. Of these, YKL-40 most significantly predicts mortality for CAP. © 2016 Asian Pacific Society of Respirology.

  17. Proteogenomic characterization of human colon and rectal cancer

    PubMed Central

    Zhang, Bing; Wang, Jing; Wang, Xiaojing; Zhu, Jing; Liu, Qi; Shi, Zhiao; Chambers, Matthew C.; Zimmerman, Lisa J.; Shaddox, Kent F.; Kim, Sangtae; Davies, Sherri R.; Wang, Sean; Wang, Pei; Kinsinger, Christopher R.; Rivers, Robert C.; Rodriguez, Henry; Townsend, R. Reid; Ellis, Matthew J.C.; Carr, Steven A.; Tabb, David L.; Coffey, Robert J.; Slebos, Robbert J.C.; Liebler, Daniel C.

    2014-01-01

    Summary We analyzed proteomes of colon and rectal tumors previously characterized by the Cancer Genome Atlas (TCGA) and performed integrated proteogenomic analyses. Somatic variants displayed reduced protein abundance compared to germline variants. mRNA transcript abundance did not reliably predict protein abundance differences between tumors. Proteomics identified five proteomic subtypes in the TCGA cohort, two of which overlapped with the TCGA “MSI/CIMP” transcriptomic subtype, but had distinct mutation, methylation, and protein expression patterns associated with different clinical outcomes. Although copy number alterations showed strong cis- and trans-effects on mRNA abundance, relatively few of these extend to the protein level. Thus, proteomics data enabled prioritization of candidate driver genes. The chromosome 20q amplicon was associated with the largest global changes at both mRNA and protein levels; proteomics data highlighted potential 20q candidates including HNF4A, TOMM34 and SRC. Integrated proteogenomic analysis provides functional context to interpret genomic abnormalities and affords a new paradigm for understanding cancer biology. PMID:25043054

  18. Prediction of troponin-T degradation using color image texture features in 10d aged beef longissimus steaks.

    PubMed

    Sun, X; Chen, K J; Berg, E P; Newman, D J; Schwartz, C A; Keller, W L; Maddock Carlin, K R

    2014-02-01

    The objective was to use digital color image texture features to predict troponin-T degradation in beef. Image texture features, including 88 gray level co-occurrence texture features, 81 two-dimension fast Fourier transformation texture features, and 48 Gabor wavelet filter texture features, were extracted from color images of beef strip steaks (longissimus dorsi, n = 102) aged for 10d obtained using a digital camera and additional lighting. Steaks were designated degraded or not-degraded based on troponin-T degradation determined on d 3 and d 10 postmortem by immunoblotting. Statistical analysis (STEPWISE regression model) and artificial neural network (support vector machine model, SVM) methods were designed to classify protein degradation. The d 3 and d 10 STEPWISE models were 94% and 86% accurate, respectively, while the d 3 and d 10 SVM models were 63% and 71%, respectively, in predicting protein degradation in aged meat. STEPWISE and SVM models based on image texture features show potential to predict troponin-T degradation in meat. © 2013.

  19. Mean of the typical decoding rates: a new translation efficiency index based on the analysis of ribosome profiling data.

    PubMed

    Dana, Alexandra; Tuller, Tamir

    2014-12-01

    Gene translation modeling and prediction is a fundamental problem that has numerous biomedical implementations. In this work we present a novel, user-friendly tool/index for calculating the mean of the typical decoding rates that enables predicting translation elongation efficiency of protein coding genes for different tissue types, developmental stages, and experimental conditions. The suggested translation efficiency index is based on the analysis of the organism's ribosome profiling data. This index could be used for example to predict changes in translation elongation efficiency of lowly expressed genes that usually have relatively low and/or biased ribosomal densities and protein levels measurements, or can be used for example for predicting translation efficiency of new genetically engineered genes. We demonstrate the usability of this index via the analysis of six organisms in different tissues and developmental stages. Distributable cross platform application and guideline are available for download at: http://www.cs.tau.ac.il/~tamirtul/MTDR/MTDR_Install.html. Copyright © 2015 Dana and Tuller.

  20. Quantum-mechanics-derived 13Cα chemical shift server (CheShift) for protein structure validation

    PubMed Central

    Vila, Jorge A.; Arnautova, Yelena A.; Martin, Osvaldo A.; Scheraga, Harold A.

    2009-01-01

    A server (CheShift) has been developed to predict 13Cα chemical shifts of protein structures. It is based on the generation of 696,916 conformations as a function of the φ, ψ, ω, χ1 and χ2 torsional angles for all 20 naturally occurring amino acids. Their 13Cα chemical shifts were computed at the DFT level of theory with a small basis set and extrapolated, with an empirically-determined linear regression formula, to reproduce the values obtained with a larger basis set. Analysis of the accuracy and sensitivity of the CheShift predictions, in terms of both the correlation coefficient R and the conformational-averaged rmsd between the observed and predicted 13Cα chemical shifts, was carried out for 3 sets of conformations: (i) 36 x-ray-derived protein structures solved at 2.3 Å or better resolution, for which sets of 13Cα chemical shifts were available; (ii) 15 pairs of x-ray and NMR-derived sets of protein conformations; and (iii) a set of decoys for 3 proteins showing an rmsd with respect to the x-ray structure from which they were derived of up to 3 Å. Comparative analysis carried out with 4 popular servers, namely SHIFTS, SHIFTX, SPARTA, and PROSHIFT, for these 3 sets of conformations demonstrated that CheShift is the most sensitive server with which to detect subtle differences between protein models and, hence, to validate protein structures determined by either x-ray or NMR methods, if the observed 13Cα chemical shifts are available. CheShift is available as a web server. PMID:19805131

  1. Comparative genome analysis of entomopathogenic fungi reveals a complex set of secreted proteins.

    PubMed

    Staats, Charley Christian; Junges, Angela; Guedes, Rafael Lucas Muniz; Thompson, Claudia Elizabeth; de Morais, Guilherme Loss; Boldo, Juliano Tomazzoni; de Almeida, Luiz Gonzaga Paula; Andreis, Fábio Carrer; Gerber, Alexandra Lehmkuhl; Sbaraini, Nicolau; da Paixão, Rana Louise de Andrade; Broetto, Leonardo; Landell, Melissa; Santi, Lucélia; Beys-da-Silva, Walter Orlando; Silveira, Carolina Pereira; Serrano, Thaiane Rispoli; de Oliveira, Eder Silva; Kmetzsch, Lívia; Vainstein, Marilene Henning; de Vasconcelos, Ana Tereza Ribeiro; Schrank, Augusto

    2014-09-29

    Metarhizium anisopliae is an entomopathogenic fungus used in the biological control of some agricultural insect pests, and efforts are underway to use this fungus in the control of insect-borne human diseases. A large repertoire of proteins must be secreted by M. anisopliae to cope with the various available nutrients as this fungus switches through different lifestyles, i.e., from a saprophytic, to an infectious, to a plant endophytic stage. To further evaluate the predicted secretome of M. anisopliae, we employed genomic and transcriptomic analyses, coupled with phylogenomic analysis, focusing on the identification and characterization of secreted proteins. We determined the M. anisopliae E6 genome sequence and compared this sequence to other entomopathogenic fungi genomes. A robust pipeline was generated to evaluate the predicted secretomes of M. anisopliae and 15 other filamentous fungi, leading to the identification of a core of secreted proteins. Transcriptomic analysis using the tick Rhipicephalus microplus cuticle as an infection model during two periods of infection (48 and 144 h) allowed the identification of several differentially expressed genes. This analysis concluded that a large proportion of the predicted secretome coding genes contained altered transcript levels in the conditions analyzed in this study. In addition, some specific secreted proteins from Metarhizium have an evolutionary history similar to orthologs found in Beauveria/Cordyceps. This similarity suggests that a set of secreted proteins has evolved to participate in entomopathogenicity. The data presented represents an important step to the characterization of the role of secreted proteins in the virulence and pathogenicity of M. anisopliae.

  2. In silico prediction of protein-protein interactions in human macrophages

    PubMed Central

    2014-01-01

    Background Protein-protein interaction (PPI) network analyses are highly valuable in deciphering and understanding the intricate organisation of cellular functions. Nevertheless, the majority of available protein-protein interaction networks are context-less, i.e. without any reference to the spatial, temporal or physiological conditions in which the interactions may occur. In this work, we are proposing a protocol to infer the most likely protein-protein interaction (PPI) network in human macrophages. Results We integrated the PPI dataset from the Agile Protein Interaction DataAnalyzer (APID) with different meta-data to infer a contextualized macrophage-specific interactome using a combination of statistical methods. The obtained interactome is enriched in experimentally verified interactions and in proteins involved in macrophage-related biological processes (i.e. immune response activation, regulation of apoptosis). As a case study, we used the contextualized interactome to highlight the cellular processes induced upon Mycobacterium tuberculosis infection. Conclusion Our work confirms that contextualizing interactomes improves the biological significance of bioinformatic analyses. More specifically, studying such inferred network rather than focusing at the gene expression level only, is informative on the processes involved in the host response. Indeed, important immune features such as apoptosis are solely highlighted when the spotlight is on the protein interaction level. PMID:24636261

  3. Posaconazole in Human Serum: a Greater Pharmacodynamic Effect than Predicted by the Non-Protein-Bound Serum Concentration

    PubMed Central

    Lignell, Anders; Löwdin, Elisabeth; Cars, Otto; Chryssanthou, Erja; Sjölin, Jan

    2011-01-01

    It is generally accepted that only the unbound fraction of a drug is pharmacologically active. Posaconazole is an antifungal agent with a protein binding of 98 to 99%. Taking into account the degree of protein binding, plasma levels in patients, and MIC levels of susceptible strains, it can be assumed that the free concentration of posaconazole sometimes will be too low to exert the expected antifungal effect. The aim was therefore to test the activity of posaconazole in serum in comparison with that of the calculated unbound concentrations in protein-free media. Significant differences (P < 0.05) from the serum control were found at serum concentrations of posaconazole of 1.0 and 0.10 mg/liter, with calculated free concentrations corresponding to 1× MIC and 0.1× MIC, respectively, against one Candida lusitaniae strain selected for proof of principle. In RPMI 1640, the corresponding calculated unbound concentration of 0.015 mg/liter resulted in a significant effect, whereas that of 0.0015 mg/liter did not. Also, against seven additional Candida strains tested, there was an effect of the low posaconazole concentration in serum, in contrast to the results in RPMI 1640. Fluconazole, a low-grade-protein-bound antifungal, was used for comparison at corresponding concentrations in serum and RPMI 1640. No effect was observed at the serum concentration, resulting in a calculated unbound concentration of 0.1× MIC. In summary, there was a substantially greater pharmacodynamic effect of posaconazole in human serum than could be predicted by the non-protein-bound serum concentration. A flux from serum protein-bound to fungal lanosterol 14α-demethylase-bound posaconazole is suggested. PMID:21502622
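
    The free-drug arithmetic behind this abstract can be checked with a short sketch. It assumes a bound fraction of 98.5%, the midpoint of the reported 98 to 99% range; that midpoint is an assumption chosen because it reproduces the calculated unbound concentrations quoted above, not a value stated in the abstract. The helper name `unbound_concentration` is hypothetical.

```python
def unbound_concentration(total_mg_per_l: float, bound_fraction: float) -> float:
    """Return the calculated free (non-protein-bound) drug concentration."""
    if not 0.0 <= bound_fraction <= 1.0:
        raise ValueError("bound_fraction must lie between 0 and 1")
    return total_mg_per_l * (1.0 - bound_fraction)


# Assumed midpoint of the 98-99% protein binding reported for posaconazole.
ASSUMED_BOUND_FRACTION = 0.985

# Serum concentrations tested in the abstract (mg/liter).
for total in (1.0, 0.10):
    free = unbound_concentration(total, ASSUMED_BOUND_FRACTION)
    print(f"{total:.2f} mg/liter total -> {free:.4f} mg/liter calculated free")
```

    Under that assumption, the two serum concentrations map to the 0.015 and 0.0015 mg/liter unbound concentrations that the study tested in RPMI 1640.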

  4. Identification of two novel mammalian genes establishes a subfamily of KH-domain RNA-binding proteins.

    PubMed

    Makeyev, A V; Liebhaber, S A

    2000-08-01

    We have identified two novel human genes encoding proteins with a high level of sequence identity to two previously characterized RNA-binding proteins, alphaCP-1 and alphaCP-2. Both of these novel genes, alphaCP-3 and alphaCP-4, are predicted to encode proteins with triplicated KH domains. The number and organization of the KH domains, their sequences, and the sequences of the contiguous regions are conserved among all four alphaCP proteins. The common evolutionary origin of these proteins is substantiated by conservation of exon-intron organization in the corresponding genes. The map positions of alphaCP-1 and alphaCP-2 (previously reported) and those of alphaCP-3 and alphaCP-4 (present report) reveal that the four alphaCP loci are dispersed in the human genome; alphaCP-3 and alphaCP-4 mapped to 21q22.3 and 3p21, and the respective mouse orthologues mapped to syntenic regions of the mouse genome, 10B5 and 9F1-F2, respectively. Two additional loci in the human genome were identified as alphaCP-2 processed pseudogenes (PCBP2P1, 21q22.3, and PCBP2P2, 8q21-q22). Although the overall levels of alphaCP-3 and alphaCP-4 mRNAs are substantially lower than those of alphaCP-1 and alphaCP-2, transcripts of alphaCP-3 and alphaCP-4 were found in all mouse tissues tested. These data establish a new subfamily of genes predicted to encode closely related KH-containing RNA-binding proteins with potential functions in posttranscriptional controls. Copyright 2000 Academic Press.

  5. Blood Preservation Study.

    DTIC Science & Technology

    1983-01-01

    1977. 3. Wood LA and Beutler E: The effect of periodic mixing on the preservation of 2,3- diphosphoglycerate (2,3-DPG) levels in stored blood. Blood 42:17...ATP levels and the viability of red cells was investigated; and several procedures for protein extraction from red cells were performed. DO 1473... levels are a disappointing parameter with respect to predicting the viability of stored red cells (5-8). There is a great need to identify measurements

  6. Effect of Treatment of Cystic Fibrosis Pulmonary Exacerbations on Systemic Inflammation

    PubMed Central

    Thompson, Valeria; Chmiel, James F.; Montgomery, Gregory S.; Nasr, Samya Z.; Perkett, Elizabeth; Saavedra, Milene T.; Slovis, Bonnie; Anthony, Margaret M.; Emmett, Peggy; Heltshe, Sonya L.

    2015-01-01

    Rationale: In cystic fibrosis (CF), pulmonary exacerbations present an opportunity to define the effect of antibiotic therapy on systemic measures of inflammation. Objectives: Investigate whether plasma inflammatory proteins demonstrate and predict a clinical response to antibiotic therapy and determine which proteins are associated with measures of clinical improvement. Methods: In this multicenter study, a panel of 15 plasma proteins was measured at the onset and end of treatment for pulmonary exacerbation and at a clinically stable visit in patients with CF who were 10 years of age or older. Measurements and Main Results: Significant reductions in 10 plasma proteins were observed in 103 patients who had paired blood collections during antibiotic treatment for pulmonary exacerbations. Plasma C-reactive protein, serum amyloid A, calprotectin, and neutrophil elastase antiprotease complexes correlated most strongly with clinical measures at exacerbation onset. Reductions in C-reactive protein, serum amyloid A, IL-1ra, and haptoglobin were most associated with improvements in lung function with antibiotic therapy. Having higher IL-6, IL-8, and α1-antitrypsin (α1AT) levels at exacerbation onset were associated with an increased risk of being a nonresponder (i.e., failing to recover to baseline FEV1). Baseline IL-8, neutrophil elastase antiprotease complexes, and α1AT along with changes in several plasma proteins with antibiotic treatment, in combination with FEV1 at exacerbation onset, were predictive of being a treatment responder. Conclusions: Circulating inflammatory proteins demonstrate and predict a response to treatment of CF pulmonary exacerbations. A systemic biomarker panel could speed up drug discovery, leading to a quicker, more efficient drug development process for the CF community. PMID:25714657

  7. Molecular Dynamics in Mixed Solvents Reveals Protein-Ligand Interactions, Improves Docking, and Allows Accurate Binding Free Energy Predictions.

    PubMed

    Arcon, Juan Pablo; Defelipe, Lucas A; Modenutti, Carlos P; López, Elias D; Alvarez-Garcia, Daniel; Barril, Xavier; Turjanski, Adrián G; Martí, Marcelo A

    2017-04-24

    One of the most important biological processes at the molecular level is the formation of protein-ligand complexes. Therefore, determining their structure and underlying key interactions is of paramount relevance and has direct applications in drug development. Because of its low cost relative to its experimental sibling, molecular dynamics (MD) simulations in the presence of different solvent probes mimicking specific types of interactions have been increasingly used to analyze protein binding sites and reveal protein-ligand interaction hot spots. However, a systematic comparison of different probes and their real predictive power from a quantitative and thermodynamic point of view is still missing. In the present work, we have performed MD simulations of 18 different proteins in pure water as well as water mixtures of ethanol, acetamide, acetonitrile and methylammonium acetate, leading to a total of 5.4 μs simulation time. For each system, we determined the corresponding solvent sites, defined as space regions adjacent to the protein surface where the probability of finding a probe atom is higher than that in the bulk solvent. Finally, we compared the identified solvent sites with 121 different protein-ligand complexes and used them to perform molecular docking and ligand binding free energy estimates. Our results show that combining solely water and ethanol sites allows sampling over 70% of all possible protein-ligand interactions, especially those that coincide with ligand-based pharmacophoric points. Most important, we also show how the solvent sites can be used to significantly improve ligand docking in terms of both accuracy and precision, and that accurate predictions of ligand binding free energies, along with relative ranking of ligand affinity, can be performed.

  8. Radial spoke proteins of Chlamydomonas flagella

    PubMed Central

    Yang, Pinfen; Diener, Dennis R.; Yang, Chun; Kohno, Takahiro; Pazour, Gregory J.; Dienes, Jennifer M.; Agrin, Nathan S.; King, Stephen M.; Sale, Winfield S.; Kamiya, Ritsu; Rosenbaum, Joel L.; Witman, George B.

    2007-01-01

    Summary: The radial spoke is a ubiquitous component of ‘9+2’ cilia and flagella, and plays an essential role in the control of dynein arm activity by relaying signals from the central pair of microtubules to the arms. The Chlamydomonas reinhardtii radial spoke contains at least 23 proteins, only 8 of which have been characterized at the molecular level. Here, we use mass spectrometry to identify 10 additional radial spoke proteins. Many of the newly identified proteins in the spoke stalk are predicted to contain domains associated with signal transduction, including Ca2+-, AKAP- and nucleotide-binding domains. This suggests that the spoke stalk is both a scaffold for signaling molecules and itself a transducer of signals. Moreover, in addition to the recently described HSP40 family member, a second spoke stalk protein is predicted to be a molecular chaperone, implying that there is a sophisticated mechanism for the assembly of this large complex. Among the 18 spoke proteins identified to date, at least 12 have apparent homologs in humans, indicating that the radial spoke has been conserved throughout evolution. The human genes encoding these proteins are candidates for causing primary ciliary dyskinesia, a severe inherited disease involving missing or defective axonemal structures, including the radial spokes. PMID:16507594

  9. A Survey of Computational Intelligence Techniques in Protein Function Prediction

    PubMed Central

    Tiwari, Arvind Kumar; Srivastava, Rajeev

    2014-01-01

    In the past, knowledge of previously unknown proteins grew massively with the advancement of high-throughput microarray technologies. Protein function prediction is the most challenging problem in bioinformatics. Homology-based approaches were formerly used to predict protein function, but they failed when a new protein differed from previously characterized ones. Therefore, to alleviate the problems associated with traditional homology-based approaches, numerous computational intelligence techniques have been proposed in the recent past. This paper presents a state-of-the-art comprehensive review of various computational intelligence techniques for protein function prediction using sequence, structure, protein-protein interaction network, and gene expression data, applied in wide areas such as prediction of DNA and RNA binding sites, subcellular localization, enzyme functions, signal peptides, catalytic residues, nuclear/G-protein coupled receptors, membrane proteins, and pathway analysis from gene expression datasets. This paper also summarizes the results obtained by many researchers who addressed these problems using computational intelligence techniques with appropriate datasets to improve prediction performance. The summary shows that ensemble classifiers and integration of multiple heterogeneous data are useful for protein function prediction. PMID:25574395

  10. Avoidance of truncated proteins from unintended ribosome binding sites within heterologous protein coding sequences.

    PubMed

    Whitaker, Weston R; Lee, Hanson; Arkin, Adam P; Dueber, John E

    2015-03-20

    Genetic sequences ported into non-native hosts for synthetic biology applications can gain unexpected properties. In this study, we explored sequences functioning as ribosome binding sites (RBSs) within protein coding DNA sequences (CDSs) that cause internal translation, resulting in truncated proteins. Genome-wide prediction of bacterial RBSs, based on biophysical calculations employed by the RBS calculator, suggests a selection against internal RBSs within CDSs in Escherichia coli, but not those in Saccharomyces cerevisiae. Based on these calculations, silent mutations aimed at removing internal RBSs can effectively reduce truncation products from internal translation. However, a solution for complete elimination of internal translation initiation is not always feasible due to constraints of available coding sequences. Fluorescence assays and Western blot analysis showed that in genes with internal RBSs, increasing the strength of the intended upstream RBS had little influence on the internal translation strength. Another strategy to minimize truncated products from an internal RBS is to increase the relative strength of the upstream RBS with a concomitant reduction in promoter strength to achieve the same protein expression level. Unfortunately, lower transcription levels result in increased noise at the single cell level due to stochasticity in gene expression. At the low expression regimes desired for many synthetic biology applications, this problem becomes particularly pronounced. We found that balancing promoter strengths and upstream RBS strengths to intermediate levels can achieve the target protein concentration while avoiding both excessive noise and truncated protein.

  11. Prediction of Body Fluids where Proteins are Secreted into Based on Protein Interaction Network

    PubMed Central

    Hu, Le-Le; Huang, Tao; Cai, Yu-Dong; Chou, Kuo-Chen

    2011-01-01

    Determining the body fluids into which proteins can be secreted is important for protein function annotation and disease biomarker discovery. In this study, we developed a network-based method to predict which kinds of body fluids human proteins can be secreted into. For a newly constructed benchmark dataset that consists of 529 human secreted proteins, the prediction accuracy for the most likely body fluid location predicted by our method via the jackknife test was 79.02%, significantly higher than the success rate of a random guess (29.36%). The likelihood that the four top-ranked predicted body fluids contain all the true body fluids into which the proteins can be secreted is 62.94%. Our method was further demonstrated with two independent datasets: one contains 57 proteins that can be secreted into blood, while the other contains 61 proteins that can be secreted into plasma/serum and are possible biomarkers associated with various cancers. Of the 57 proteins in the first dataset, 55 were correctly predicted as blood-secreted proteins. Of the 61 proteins in the second dataset, 58 were predicted to be most likely secreted into plasma/serum. These encouraging results indicate that the network-based prediction method is quite promising. It is anticipated that the method will benefit the relevant areas of both basic research and drug development. PMID:21829572

  12. Interferon-γ-inducible protein-10 in chronic hepatitis C: Correlations with insulin resistance, histological features & sustained virological response.

    PubMed

    Crisan, Dana; Grigorescu, Mircea Dan; Radu, Corina; Suciu, Alina; Grigorescu, Mircea

    2017-04-01

    One of the multiple factors contributing to virological response in chronic hepatitis C (CHC) is interferon-gamma-inducible protein-10 (IP-10). Its level reflects the status of interferon-stimulated genes, which in turn is associated with virological response to antiviral therapy. The aim of this study was to evaluate the role of serum IP-10 levels on sustained virological response (SVR) and the association of this parameter with insulin resistance (IR) and liver histology. Two hundred and three consecutive biopsy proven CHC patients were included in the study. Serum levels of IP-10 were determined using ELISA method. IR was evaluated by homeostasis model assessment-IR (HOMA-IR). Histological features were assessed invasively by liver biopsy and noninvasively using FibroTest, ActiTest and SteatoTest. Predictive factors for SVR and their interrelations were assessed. A cut-off value for IP-10 of 392 pg/ml was obtained to discriminate between responders and non-responders. SVR was obtained in 107 patients (52.70%). Area under the receiver operating characteristic curve for SVR was 0.875 with a sensitivity of 91.6 per cent, specificity 74.7 per cent, positive predictive value 80.3 per cent and negative predictive value 88.7 per cent. Higher values of IP-10 were associated with increasing stages of fibrosis (P<0.01) and higher grades of inflammation (P=0.02, P=0.07) assessed morphologically and noninvasively through FibroTest and ActiTest. Significant steatosis and IR were also associated with increased levels of IP-10 (P=0.01 and P=0.02). In multivariate analysis, IP-10 levels and fibrosis stages were independently associated with SVR. Our findings showed that the assessment of serum IP-10 level could be a predictive factor for SVR and it was associated with fibrosis, necroinflammatory activity, significant steatosis and IR in patients with chronic HCV infection.
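
    The predictive values reported above follow from sensitivity, specificity, and the proportion of responders through Bayes' rule, which the sketch below illustrates. It assumes that a "positive" test means an IP-10 result that predicts SVR and that the 107 of 203 responders serve as the prevalence; both are assumptions, since the abstract does not spell out how the 2×2 table was built. The function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) computed from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values quoted in the abstract; prevalence is the assumed responder fraction.
prevalence = 107 / 203
ppv, npv = predictive_values(sensitivity=0.916, specificity=0.747, prevalence=prevalence)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

    With these inputs the sketch gives values within about half a percentage point of the reported 80.3 and 88.7 per cent; the small gap is consistent with rounding of the published sensitivity and specificity.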

  13. Systemic inflammatory response and serum lipopolysaccharide levels predict multiple organ failure and death in alcoholic hepatitis.

    PubMed

    Michelena, Javier; Altamirano, José; Abraldes, Juan G; Affò, Silvia; Morales-Ibanez, Oriol; Sancho-Bru, Pau; Dominguez, Marlene; García-Pagán, Juan Carlos; Fernández, Javier; Arroyo, Vicente; Ginès, Pere; Louvet, Alexandre; Mathurin, Philippe; Mehal, Wajahat Z; Caballería, Juan; Bataller, Ramón

    2015-09-01

    Alcoholic hepatitis (AH) frequently progresses to multiple organ failure (MOF) and death. However, the driving factors are largely unknown. At admission, patients with AH often show criteria of systemic inflammatory response syndrome (SIRS) even in the absence of an infection. We hypothesize that the presence of SIRS may predispose to MOF and death. To test this hypothesis, we studied a cohort including 162 patients with biopsy-proven AH. The presence of SIRS and infections was assessed in all patients, and multivariate analyses identified variables independently associated with MOF and 90-day mortality. At admission, 32 (19.8%) patients were diagnosed with a bacterial infection, while 75 (46.3%) fulfilled SIRS criteria; 58 patients (35.8%) developed MOF during hospitalization. Short-term mortality was significantly higher among patients who developed MOF (62.1% versus 3.8%, P < 0.001). The presence of SIRS was a major predictor of MOF (odds ratio = 2.69, P = 0.025) and strongly correlated with mortality. Importantly, the course of patients with SIRS with and without infection was similar in terms of MOF development and short-term mortality. Finally, we sought to identify serum markers that differentiate SIRS with and without infection. We studied serum levels of high-sensitivity C-reactive protein, procalcitonin, and lipopolysaccharide at admission. All of them predicted mortality. Procalcitonin, but not high-sensitivity C-reactive protein, serum levels identified those patients with SIRS and infection. Lipopolysaccharide serum levels predicted MOF and the response to prednisolone. In the presence or absence of infections, SIRS is a major determinant of MOF and mortality in AH, and the mechanisms involved in the development of SIRS should be investigated; procalcitonin serum levels can help to identify patients with infection, and lipopolysaccharide levels may help to predict mortality and the response to steroids. 
    © 2015 by the American Association for the Study of Liver Diseases.

  14. Cerebrospinal fluid markers of neuronal and glial cell damage to monitor disease activity and predict long-term outcome in patients with autoimmune encephalitis.

    PubMed

    Constantinescu, R; Krýsl, D; Bergquist, F; Andrén, K; Malmeström, C; Asztély, F; Axelsson, M; Menachem, E B; Blennow, K; Rosengren, L; Zetterberg, H

    2016-04-01

    Clinical symptoms and long-term outcome of autoimmune encephalitis are variable. Diagnosis requires multiple investigations, and treatment strategies must be individually tailored. Better biomarkers are needed for diagnosis, to monitor disease activity and to predict long-term outcome. The value of cerebrospinal fluid (CSF) markers of neuronal [neurofilament light chain protein (NFL), and total tau protein (T-tau)] and glial cell [glial fibrillary acidic protein (GFAP)] damage in patients with autoimmune encephalitis was investigated. Demographic, clinical, magnetic resonance imaging, CSF and antibody-related data of 25 patients hospitalized for autoimmune encephalitis and followed for 1 year were retrospectively collected. Correlations between these data and consecutive CSF levels of NFL, T-tau and GFAP were investigated. Disability, assessed by the modified Rankin scale, was used for evaluation of disease activity and long-term outcome. The acute stage of autoimmune encephalitis was accompanied by high CSF levels of NFL and T-tau, whereas normal or significantly lower levels were observed after clinical improvement 1 year later. NFL and T-tau reacted in a similar way but at different speeds, with T-tau reacting faster. CSF levels of GFAP were initially moderately increased but did not change significantly later on. Final outcome (disability at 1 year) directly correlated with CSF-NFL and CSF-GFAP levels at all time-points and with CSF-T-tau at 3 ± 1 months. This correlation remained significant after age adjustment for CSF-NFL and T-tau but not for GFAP. In autoimmune encephalitis, CSF levels of neuronal and glial cell damage markers appear to reflect disease activity and long-term disability. © 2016 EAN.

  15. Pentraxin 3 predicts complicated course of febrile neutropenia in haematological patients, but the decision level depends on the underlying malignancy.

    PubMed

    Juutilainen, Auni; Vänskä, Matti; Pulkki, Kari; Hämäläinen, Sari; Nousiainen, Tapio; Jantunen, Esa; Koivula, Irma

    2011-11-01

    This study aimed at assessing the cut-off levels for pentraxin 3 (PTX3) in predicting complications of neutropenic fever (bacteraemia, septic shock) in haematological patients. A prospective study during 2006-2009 was performed at the haematology ward in Kuopio University Hospital. A patient was eligible for the study if they had neutropenic fever after intensive therapy for acute myeloid leukaemia (AML) (n = 32) or non-Hodgkin lymphoma (NHL) (n = 35). Blood cultures were taken, and maximal PTX3 and C-reactive protein (CRP) were evaluated during d0 to d3 from fever onset. The level of PTX3 was associated with both the underlying malignancy and the presence of complications, with the highest level in NHL patients with a complicated course of febrile neutropenia and the lowest in AML patients with a non-complicated course. The cut-off level of PTX3 to predict complications was ten-fold higher in patients with NHL (115 μg/L) than in patients with AML (11.5 μg/L). In combined analysis based on separate cut-offs, PTX3 predicted complications of febrile neutropenia with sensitivity of 0.86, specificity of 0.83, positive predictive value of 0.57 and negative predictive value of 0.96. PTX3 was superior to CRP in predicting a complicated course of febrile neutropenia, but only when the effect of the underlying malignancy had been taken into account. © 2011 John Wiley & Sons A/S.

  16. Protein carbamylation predicts mortality in ESRD.

    PubMed

    Koeth, Robert A; Kalantar-Zadeh, Kamyar; Wang, Zeneng; Fu, Xiaoming; Tang, W H Wilson; Hazen, Stanley L

    2013-04-01

    Traditional risk factors fail to explain the increased risk for cardiovascular morbidity and mortality in ESRD. Cyanate, a reactive electrophilic species in equilibrium with urea, posttranslationally modifies proteins through a process called carbamylation, which promotes atherosclerosis. The plasma level of protein-bound homocitrulline (PBHCit), which results from carbamylation, predicts major adverse cardiac events in patients with normal renal function, but whether this relationship is similar in ESRD is unknown. We quantified serum PBHCit in a cohort of 347 patients undergoing maintenance hemodialysis with 5 years of follow-up. Kaplan-Meier analyses revealed a significant association between elevated PBHCit and death (log-rank P<0.01). After adjustment for patient characteristics, laboratory values, and comorbid conditions, the risk for death among patients with PBHCit values in the highest tertile was more than double the risk among patients with values in the middle tertile (adjusted hazard ratio [HR], 2.4; 95% confidence interval [CI], 1.5-3.9) or the lowest tertile (adjusted HR, 2.3; 95% CI, 1.5-3.7). Including PBHCit significantly improved the multivariable model, with a net reclassification index of 14% (P<0.01). In summary, serum PBHCit, a footprint of protein carbamylation, predicts increased cardiovascular risk in patients with ESRD, supporting a mechanistic link among uremia, inflammation, and atherosclerosis.

  17. Repressing a Repressor

    PubMed Central

    Silverstone, Aron L.; Jung, Hou-Sung; Dill, Alyssa; Kawaide, Hiroshi; Kamiya, Yuji; Sun, Tai-ping

    2001-01-01

    RGA (for repressor of ga1-3) and SPINDLY (SPY) are likely repressors of gibberellin (GA) signaling in Arabidopsis because the recessive rga and spy mutations partially suppressed the phenotype of the GA-deficient mutant ga1-3. We found that neither rga nor spy altered the GA levels in the wild-type or the ga1-3 background. However, expression of the GA biosynthetic gene GA4 was reduced 26% by the rga mutation, suggesting that partial derepression of the GA response pathway by rga resulted in the feedback inhibition of GA4 expression. The green fluorescent protein (GFP)–RGA fusion protein was localized to nuclei in transgenic Arabidopsis. This result supports the predicted function of RGA as a transcriptional regulator based on sequence analysis. Confocal microscopy and immunoblot analyses demonstrated that the levels of both the GFP-RGA fusion protein and endogenous RGA were reduced rapidly by GA treatment. Therefore, the GA signal appears to derepress the GA signaling pathway by degrading the repressor protein RGA. The effect of rga on GA4 gene expression and the effect of GA on RGA protein level allow us to identify part of the mechanism by which GA homeostasis is achieved. PMID:11449051

  18. Prediction of Protein Configurational Entropy (Popcoen).

    PubMed

    Goethe, Martin; Gleixner, Jan; Fita, Ignacio; Rubi, J Miguel

    2018-03-13

    A knowledge-based method for configurational entropy prediction of proteins is presented; this methodology is extremely fast, compared to previous approaches, because it does not involve any type of configurational sampling. Instead, the configurational entropy of a query fold is estimated by evaluating an artificial neural network, which was trained on molecular-dynamics simulations of ∼1000 proteins. The predicted entropy can be incorporated into a large class of protein software based on cost-function minimization/evaluation, in which configurational entropy is currently neglected for performance reasons. Software of this type is used for all major protein tasks such as structure predictions, proteins design, NMR and X-ray refinement, docking, and mutation effect predictions. Integrating the predicted entropy can yield a significant accuracy increase as we show exemplarily for native-state identification with the prominent protein software FoldX. The method has been termed Popcoen for Prediction of Protein Configurational Entropy. An implementation is freely available at http://fmc.ub.edu/popcoen/ .

  19. RaptorX-Angle: real-value prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning.

    PubMed

    Gao, Yujuan; Wang, Sheng; Deng, Minghua; Xu, Jinbo

    2018-05-08

    Protein dihedral angles provide a detailed description of protein local conformation. Predicted dihedral angles can be used to narrow down the conformational space of the whole polypeptide chain significantly, thus aiding protein tertiary structure prediction. However, direct angle prediction from sequence alone is challenging. In this article, we present a novel method (named RaptorX-Angle) to predict real-valued angles by combining clustering and deep learning. Tested on a subset of PDB25 and the targets in the latest two Critical Assessment of protein Structure Prediction (CASP), our method outperforms the existing state-of-art method SPIDER2 in terms of Pearson Correlation Coefficient (PCC) and Mean Absolute Error (MAE). Our result also shows approximately linear relationship between the real prediction errors and our estimated bounds. That is, the real prediction error can be well approximated by our estimated bounds. Our study provides an alternative and more accurate prediction of dihedral angles, which may facilitate protein structure prediction and functional study.
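
    Because dihedral angles are periodic, a mean absolute error over angles is usually computed along the shorter arc, so that 179° and -179° count as 2° apart rather than 358°. The sketch below shows that convention; whether RaptorX-Angle computes its MAE in exactly this way is an assumption, as the abstract does not say. The function name `angular_mae` is hypothetical.

```python
def angular_mae(predicted_deg, actual_deg):
    """Mean absolute error between two lists of angles in degrees, measured along the shorter arc."""
    if len(predicted_deg) != len(actual_deg) or not predicted_deg:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for pred, true in zip(predicted_deg, actual_deg):
        diff = abs(pred - true) % 360.0
        # Wrap around the circle: never count more than 180 degrees of error.
        total += min(diff, 360.0 - diff)
    return total / len(predicted_deg)


# Toy values, not taken from the paper.
print(angular_mae([179.0, -60.0], [-179.0, -65.0]))
```

    In the toy call, the first pair contributes 2° and the second 5°, so a naive absolute difference would badly overstate the error of the first prediction.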

  20. Characterization and molecular modeling of Inositol 1,3,4 tris phosphate 5/6 kinase-2 from Glycine max (L) Merr.: comprehending its evolutionary conservancy at functional level.

    PubMed

    Marathe, Ashish; Krishnan, Veda; Mahajan, Mahesh M; Thimmegowda, Vinutha; Dahuja, Anil; Jolly, Monica; Praveen, Shelly; Sachdev, Archana

    2018-01-01

    The soybean genome encodes a family of four inositol 1,3,4 trisphosphate 5/6 kinases which belong to the ATP-GRASP group of proteins. Inositol 1,3,4 trisphosphate kinase-2 (GmItpk2), catalyzing the ATP-dependent phosphorylation of inositol 1,3,4 trisphosphate (IP3) to inositol 1,3,4,5 tetraphosphate or inositol 1,3,4,6 tetraphosphate, is a key enzyme diverting the flux of the inositol phosphate pool towards phytate biosynthesis. Although considerable research on characterizing genes involved in phytate biosynthesis has been accomplished at the genomic and transcript level, characterization of the proteins is yet to be explored. In the present study, we report the isolation and expression of single-copy Itpk2 (948 bp) from Glycine max cv Pusa-16, predicted to encode a 315-amino-acid protein with an isoelectric point of 5.9. Sequence analysis revealed that GmITPK2 shared highest similarity (80%) with Phaseolus vulgaris. The predicted 3D model confirmed 12 α helices and 14 β barrel sheets, with the ATP-binding site close to a β sheet present towards the C-terminus of the protein molecule. Spatio-temporal transcript profiling showed GmItpk2 to be seed specific, with higher transcript levels in the early stage of seed development. The present study, using various molecular and bio-computational tools, could therefore help improve our understanding of this key enzyme and identify a potential target for generating the low-phytate trait in a nutritionally rich crop like soybean.
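
    The relationship between the 948 bp sequence and the 315-residue protein is simple codon arithmetic, sketched below. It assumes that the 948 bp figure is the full open reading frame including the stop codon; the abstract does not state this, but it is the reading under which the two numbers agree. The function name `protein_length_from_cds` is hypothetical.

```python
def protein_length_from_cds(cds_length_bp: int, includes_stop_codon: bool = True) -> int:
    """Return the number of amino acids encoded by a coding sequence of the given length."""
    if cds_length_bp <= 0 or cds_length_bp % 3 != 0:
        raise ValueError("a coding sequence length must be a positive multiple of 3")
    codons = cds_length_bp // 3
    # The stop codon terminates translation and does not encode a residue.
    return codons - 1 if includes_stop_codon else codons


print(protein_length_from_cds(948))  # 316 codons minus the stop codon -> 315
```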

  1. Platelet-derived growth factor receptor beta: a novel urinary biomarker for recurrence of non-muscle-invasive bladder cancer.

    PubMed

    Feng, Jiayu; He, Weifeng; Song, Yajun; Wang, Ying; Simpson, Richard J; Zhang, Xiaorong; Luo, Gaoxing; Wu, Jun; Huang, Chibing

    2014-01-01

    Non-muscle-invasive bladder cancer (NMIBC) is one of the most common malignant tumors in the urological system with a high risk of recurrence, and effective non-invasive biomarkers for NMIBC relapse are still needed. The human urinary proteome can reflect the status of the microenvironment of the urinary system and is an ideal source for clinical diagnosis of urinary system diseases. Our previous work used proteomics to identify 1643 high-confidence urinary proteins in the urine from a healthy population. Here, we used bioinformatics to construct a cancer-associated protein-protein interaction (PPI) network comprising 16 high-abundance urinary proteins based on the urinary proteome database. As a result, platelet-derived growth factor receptor beta (PDGFRB) was selected for further validation as a candidate biomarker for NMIBC diagnosis and prognosis. Although the levels of urinary PDGFRB showed no significant difference between patients pre- and post-surgery (n = 185, P>0.05), over 3 years of follow-up, urinary PDGFRB was shown to be significantly higher in relapsed patients (n = 68) than in relapse-free patients (n = 117, P<0.001). The levels of urinary PDGFRB were significantly correlated with the risk of 3-year recurrence of NMIBC, and these levels improved the accuracy of a NMIBC recurrence risk prediction model that included age, tumor size, and tumor number (area under the curve, 0.862; 95% CI, 0.809 to 0.914) compared to PDGFR alone. Therefore, we surmise that urinary PDGFRB could serve as a non-invasive biomarker for predicting NMIBC recurrence.

  2. Inflammation and hemostasis biomarkers for predicting stroke in postmenopausal women: The Women’s Health Initiative Observational Study

    PubMed Central

    Kaplan, Robert C; McGinn, Aileen P; Baird, Alison E; Hendrix, Susan L; Kooperberg, Charles; Lynch, John; Rosenbaum, Daniel M; Johnson, Karen C; Strickler, Howard D; Wassertheil-Smoller, Sylvia

    2009-01-01

    Background: Inflammatory and hemostasis-related biomarkers may identify women at risk of stroke. Methods: Hormones and Biomarkers Predicting Stroke is a study of ischemic stroke among postmenopausal women participating in the Women’s Health Initiative Observational Study (n = 972 case-control pairs). A Biomarker Risk Score was derived from levels of seven inflammatory and hemostasis-related biomarkers that appeared individually to predict risk of ischemic stroke: C-reactive protein, interleukin-6, tissue plasminogen activator, D-dimer, white blood cell count, neopterin, and homocysteine. The c index was used to evaluate discrimination. Results: Of all the individual biomarkers examined, C-reactive protein emerged as the only independent single predictor of ischemic stroke (adjusted odds ratio comparing Q4 versus Q1 = 1.64, 95% confidence interval: 1.15–2.32, p = 0.01) after adjustment for other biomarkers and standard stroke risk factors. The Biomarker Risk Score identified a gradient of increasing stroke risk with a greater number of elevated inflammatory/hemostasis biomarkers, and improved the c index significantly compared with standard stroke risk factors (p = 0.02). Among the subset of individuals who met current criteria for “high risk” levels of C-reactive protein (> 3.0 mg/L), the Biomarker Risk Score defined an approximately two-fold gradient of risk. We found no evidence for a relationship between stroke and levels of E-selectin, fibrinogen, tumor necrosis factor-alpha, vascular cell adhesion molecule-1, prothrombin fragment 1+2, Factor VIIC, or plasminogen activator inhibitor-1 antigen (p > 0.15). Discussion: The findings support the further exploration of multiple-biomarker panels to develop approaches for stratifying an individual’s risk of stroke. PMID:18984425
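
    The c index mentioned in this record can be illustrated with a minimal sketch. It assumes a plain binary case/control outcome and ignores the matched-pair design of the study; the function names and the "count of elevated biomarkers" scoring rule are hypothetical and are not taken from the paper.

```python
from itertools import product


def c_index(scores, outcomes):
    """Concordance (c) index for a binary outcome.

    scores   -- predicted risk per subject (higher means more risk)
    outcomes -- 1 for a case, 0 for a control
    Returns the fraction of case/control pairs in which the case has the
    higher score; tied scores count as half-concordant.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    concordant = 0.0
    for case_score, control_score in product(cases, controls):
        if case_score > control_score:
            concordant += 1.0
        elif case_score == control_score:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


def biomarker_risk_score(elevated_flags):
    """Hypothetical score: the number of biomarkers flagged as elevated."""
    return sum(1 for flag in elevated_flags if flag)
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.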

  3. Predictive factors for the sensitivity of radiotherapy and prognosis of esophageal squamous cell carcinoma.

    PubMed

    Wu, Shaobin; Wang, Xianwei; Chen, Jin-Xiang; Chen, Yuxiang

    2014-05-01

    To identify predictive biomarkers for radiosensitization and prognosis of esophageal squamous cell carcinoma (ESCC). A total of 150 advanced stage ESCC patients were treated with preoperative radiotherapy. The protein levels of Dicer 1, DNA methyltransferase 1 (Dnmt1), and DNA-dependent protein kinase catalytic subunit (DNA-PKcs) and the mRNA levels of Dicer 1, Dnmt1, and let-7b microRNA (miRNA) were measured in ESCC tumor tissues before and after radiotherapy. Global DNA methylation was measured and terminal deoxynucleotidyl transferase dUTP nick end labeling (TUNEL) staining was performed. Negative Dicer 1, Dnmt1, and DNA-PKcs protein expression were observed in 72%, 67.3%, and 50.7% of ESCC patients, respectively. Primary Dicer 1 and Dnmt1 expression positively correlated with radiation sensitization and longer survival of ESCC patients, while increased Dicer 1 and Dnmt1 expression after radiation correlated with increased apoptosis in residual tumor tissues. Dicer 1 and Dnmt1 expression correlated with let-7b miRNA expression and global DNA methylation levels, respectively. In contrast, positive DNA-PKcs expression negatively correlated with radiation-induced pathological reactions, and increased DNA-PKcs expression correlated with increased apoptosis after radiation. Global DNA hypomethylation and low miRNA expression are involved in the sensitization of ESCC to radiotherapy and prognosis of patients with ESCC.

  4. PredictProtein—an open resource for online prediction of protein structural and functional features

    PubMed Central

    Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard

    2014-01-01

    PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431

  5. Identification of immune signatures predictive of clinical protection from malaria

    PubMed Central

    2017-01-01

    Antibodies are thought to play an essential role in naturally acquired immunity to malaria. Prospective cohort studies have frequently shown how continuous exposure to the malaria parasite Plasmodium falciparum causes an accumulation of specific responses against various antigens that correlate with a decreased risk of clinical malaria episodes. However, small effect sizes and the often polymorphic nature of immunogenic parasite proteins make the robust identification of the true targets of protective immunity ambiguous. Furthermore, the degree of individual-level protection conferred by elevated responses to these antigens has not yet been explored. Here we applied a machine learning approach to identify immune signatures predictive of individual-level protection against clinical disease. We find that commonly assumed immune correlates are poor predictors of clinical protection in children. On the other hand, antibody profiles predictive of an individual’s malaria protective status can be found in data comprising responses to a large set of diverse parasite proteins. We show that this pattern emerges only after years of continuous exposure to the malaria parasite, whereas susceptibility to clinical episodes in young hosts (< 10 years) cannot be ascertained by measured antibody responses alone. PMID:29065113

  6. Computational design of chimeric protein libraries for directed evolution.

    PubMed

    Silberg, Jonathan J; Nguyen, Peter Q; Stevenson, Taylor

    2010-01-01

    The best approach for creating libraries of functional proteins with large numbers of nondisruptive amino acid substitutions is protein recombination, in which structurally related polypeptides are swapped among homologous proteins. Unfortunately, as more distantly related proteins are recombined, the fraction of variants having a disrupted structure increases. One way to enrich the fraction of folded and potentially interesting chimeras in these libraries is to use computational algorithms to anticipate which structural elements can be swapped without disturbing the integrity of a protein's structure. Herein, we describe how the algorithm Schema uses the sequences and structures of the parent proteins recombined to predict the structural disruption of chimeras, and we outline how dynamic programming can be used to find libraries with a range of amino acid substitution levels that are enriched in variants with low Schema disruption.
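
    As an illustration of the kind of disruption count described in this record, the sketch below scores a chimera by the number of structural contacts whose residue pair is not found at the same positions in any parent. This follows the commonly cited definition of Schema disruption, but the function name, the data layout, and the block mapping are assumptions made for the example; the dynamic-programming library search is not covered.

```python
def schema_disruption(chimera_blocks, parents, contacts, block_of):
    """Count contacts broken in a chimera (Schema-style disruption).

    chimera_blocks -- for each block, the index of the parent it is taken from
    parents        -- aligned parent sequences of equal length
    contacts       -- (i, j) residue index pairs in contact in the structure
    block_of       -- for each residue position, the block it belongs to
    """
    def residue(pos):
        # Amino acid the chimera inherits at this position.
        return parents[chimera_blocks[block_of[pos]]][pos]

    broken = 0
    for i, j in contacts:
        pair = (residue(i), residue(j))
        # A contact is kept if at least one parent has the same residue
        # pair at positions i and j; otherwise it counts as disrupted.
        if not any((p[i], p[j]) == pair for p in parents):
            broken += 1
    return broken
```

    With two parents and two blocks, `schema_disruption((0, 1), parents, contacts, block_of)` scores the chimera that takes block 0 from the first parent and block 1 from the second; a chimera identical to a parent always scores zero.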

  7. Contact Prediction for Beta and Alpha-Beta Proteins Using Integer Linear Optimization and its Impact on the First Principles 3D Structure Prediction Method ASTRO-FOLD

    PubMed Central

    Rajgaria, R.; Wei, Y.; Floudas, C. A.

    2010-01-01

    An integer linear optimization model is presented to predict residue contacts in β, α + β, and α/β proteins. The total energy of a protein is expressed as sum of a Cα – Cα distance dependent contact energy contribution and a hydrophobic contribution. The model selects contacts that assign lowest energy to the protein structure while satisfying a set of constraints that are included to enforce certain physically observed topological information. A new method based on hydrophobicity is proposed to find the β-sheet alignments. These β-sheet alignments are used as constraints for contacts between residues of β-sheets. This model was tested on three independent protein test sets and CASP8 test proteins consisting of β, α + β, α/β proteins and was found to perform very well. The average accuracy of the predictions (separated by at least six residues) was approximately 61%. The average true positive and false positive distances were also calculated for each of the test sets and they are 7.58 Å and 15.88 Å, respectively. Residue contact prediction can be directly used to facilitate the protein tertiary structure prediction. This proposed residue contact prediction model is incorporated into the first principles protein tertiary structure prediction approach, ASTRO-FOLD. The effectiveness of the contact prediction model was further demonstrated by the improvement in the quality of the protein structure ensemble generated using the predicted residue contacts for a test set of 10 proteins. PMID:20225257

  8. Comprehensive predictions of target proteins based on protein-chemical interaction using virtual screening and experimental verifications.

    PubMed

    Kobayashi, Hiroki; Harada, Hiroko; Nakamura, Masaomi; Futamura, Yushi; Ito, Akihiro; Yoshida, Minoru; Iemura, Shun-Ichiro; Shin-Ya, Kazuo; Doi, Takayuki; Takahashi, Takashi; Natsume, Tohru; Imoto, Masaya; Sakakibara, Yasubumi

    2012-04-05

    Identification of the target proteins of bioactive compounds is critical for elucidating the mode of action; however, target identification has been difficult in general, mostly due to the low sensitivity of detection using affinity chromatography followed by CBB staining and MS/MS analysis. We applied our protocol of predicting target proteins combining in silico screening and experimental verification for incednine, which inhibits the anti-apoptotic function of Bcl-xL by an unknown mechanism. One hundred eighty-two target protein candidates were computationally predicted to bind to incednine by the statistical prediction method, and the predictions were verified by in vitro binding of incednine to seven proteins, whose expression can be confirmed in our cell system. As a result, 40% accuracy of the computational predictions was achieved successfully, and we newly found 3 incednine-binding proteins. This study revealed that our proposed protocol of predicting target protein combining in silico screening and experimental verification is useful, and provides new insight into a strategy for identifying target proteins of small molecules.

  9. Quantifying protein interface footprinting by hydroxyl radical oxidation and molecular dynamics simulation: application to galectin-1.

    PubMed

    Charvátová, Olga; Foley, B Lachele; Bern, Marshall W; Sharp, Joshua S; Orlando, Ron; Woods, Robert J

    2008-11-01

    Biomolecular surface mapping methods offer an important alternative method for characterizing protein-protein and protein-ligand interactions in cases in which it is not possible to determine high-resolution three-dimensional (3D) structures of complexes. Hydroxyl radical footprinting offers a significant advance in footprint resolution compared with traditional chemical derivatization. Here we present results of footprinting performed with hydroxyl radicals generated on the nanosecond time scale by laser-induced photodissociation of hydrogen peroxide. We applied this emerging method to a carbohydrate-binding protein, galectin-1. Since galectin-1 occurs as a homodimer, footprinting was employed to characterize the interface of the monomeric subunits. Efficient analysis of the mass spectrometry data for the oxidized protein was achieved with the recently developed ByOnic (Palo Alto, CA) software that was altered to handle the large number of modifications arising from side-chain oxidation. Quantification of the level of oxidation has been achieved by employing spectral intensities for all of the observed oxidation states on a per-residue basis. The level of accuracy achievable from spectral intensities was determined by examination of mixtures of synthetic peptides related to those present after oxidation and tryptic digestion of galectin-1. A direct relationship between side-chain solvent accessibility and level of oxidation emerged, which enabled the prediction of the level of oxidation given the 3D structure of the protein. The precision of this relationship was enhanced through the use of average solvent accessibilities computed from 10 ns molecular dynamics simulations of the protein.

  10. The effects of dietary protein levels on the population growth, performance, and physiology of honey bee workers during early spring.

    PubMed

    Zheng, Benle; Wu, Zaifu; Xu, Baohua

    2014-01-01

    This study was conducted to investigate the effects of dietary protein levels on honey bee colonies, specifically the population growth, physiology, and longevity of honey bee workers during early spring. Diets containing four different levels of crude protein (25.0, 29.5, 34.0, or 38.5%) and pure pollen (control) were evaluated. Twenty-five colonies of honey bees with sister queens were used in the study. We compared the effects of the different bee diets by measuring population growth, emergent worker weight, midgut proteolytic enzyme activity, hypopharyngeal gland development, and survival. After 48 d, the cumulative number of workers produced by the colonies ranged from 22,420 to 29,519, providing a significant fit to a quadratic equation that predicts the maximum population growth when the diet contains 31.7% crude protein. Significantly greater emergent worker weight, midgut proteolytic enzyme activity, hypopharyngeal gland acini, and survival were observed in the colonies that were fed diets containing 34.0% crude protein compared with the other crude protein levels. Although higher emergent worker weight and survival were observed in the colonies that were fed the control diet, there were no significant differences between the control colonies and the colonies that were fed 34.0% crude protein. Based on these results, we concluded that a dietary crude protein content of 29.5-34.0% is recommended to maximize the reproduction rate of honey bee colonies in early spring. © The Author 2014. Published by Oxford University Press on behalf of the Entomological Society of America.
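
    The optimum reported in this record follows from the standard property of a fitted quadratic. The abstract does not give the fitted coefficients, so the symbols a, b and c below are placeholders, with x the dietary crude protein percentage and y the cumulative number of workers:

```latex
y(x) = a x^{2} + b x + c, \quad a < 0
\;\Longrightarrow\;
\frac{dy}{dx} = 2 a x + b = 0
\;\Longrightarrow\;
x^{*} = -\frac{b}{2a}
```

    Under this reading, the reported 31.7% crude protein corresponds to x* for the fitted curve.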

  11. Increased Levels of Markers of Oxidative Stress and Inflammation in Patients with Rheumatic Mitral Stenosis Predispose to Left Atrial Thrombus Formation

    PubMed Central

    Pulimamidi, Vinay Kumar; Murugesan, Vengatesan; Rajappa, Medha; Satheesh, Santhosh; Harichandrakumar, Kottenyen Thazhath

    2013-01-01

    Background: Rheumatic mitral stenosis (MS) causes stagnation of blood flow, leading to thrombus formation in the left atrium (LA), which may lead to systemic thromboembolic complications. We compared alterations in circulating levels of pro-/anti-oxidants and markers of inflammation in patients with severe rheumatic MS with and without LA thrombus and studied their predictive power to detect the presence of LA thrombus in patients with rheumatic MS. Material and Methods: This is a cross-sectional study of 80 patients with rheumatic MS, evaluated for percutaneous mitral commissurotomy. Group 1 comprised patients with rheumatic MS with LA thrombus (n=35) and Group 2 included patients with rheumatic MS without LA thrombus (n=45). The following oxidative stress markers (malondialdehyde (MDA), protein carbonyls, total oxidant status and total antioxidant status) and inflammation markers (high-sensitivity C-reactive protein (hs-CRP), total sialic acid (TSA) and protein-bound sialic acid (PBSA)) were estimated in all study subjects. Results: Levels of plasma MDA, protein carbonyl and total oxidant status were significantly elevated, whilst the total antioxidant status levels were significantly lowered, in Group 1, as compared with Group 2. hs-CRP, TSA and PBSA levels showed a significant rise in Group 1 patients, as compared with Group 2. Conclusion: Our results suggest that circulating levels of MDA, protein carbonyl and PBSA were independent predictors of occurrence of LA thrombus in patients with rheumatic MS. PMID:24392368

  12. SVM-Prot 2016: A Web-Server for Machine Learning Prediction of Protein Functional Families from Sequence Irrespective of Similarity.

    PubMed

    Li, Ying Hong; Xu, Jing Yu; Tao, Lin; Li, Xiao Feng; Li, Shuang; Zeng, Xian; Chen, Shang Ying; Zhang, Peng; Qin, Chu; Zhang, Cheng; Chen, Zhe; Zhu, Feng; Chen, Yu Zong

    2016-01-01

    Knowledge of protein function is important for biological, medical and therapeutic studies, but the functions of many proteins are still unknown. There is a need for improved functional prediction methods. Our SVM-Prot web-server employed a machine learning method for predicting protein functional families from protein sequences irrespective of similarity, which complemented those similarity-based and other methods in predicting diverse classes of proteins including the distantly-related proteins and homologous proteins of different functions. Since its publication in 2003, we have made major improvements to SVM-Prot with (1) expanded coverage from 54 to 192 functional families, (2) more diverse protein descriptors for protein representation, (3) improved predictive performances due to the use of more enriched training datasets and a greater variety of protein descriptors, (4) newly integrated BLAST analysis option for assessing proteins in the SVM-Prot predicted functional families that were similar in sequence to a query protein, and (5) newly added batch submission option for supporting the classification of multiple proteins. Moreover, 2 more machine learning approaches, K nearest neighbor and probabilistic neural networks, were added for facilitating collective assessment of protein functions by multiple methods. SVM-Prot can be accessed at http://bidd2.nus.edu.sg/cgi-bin/svmprot/svmprot.cgi.

  13. Building protein-protein interaction networks for Leishmania species through protein structural information.

    PubMed

    Dos Santos Vasconcelos, Crhisllane Rafaele; de Lima Campos, Túlio; Rezende, Antonio Mauro

    2018-03-06

    Systematic analysis of a parasite interactome is a key approach to understand different biological processes. It makes it possible to elucidate disease mechanisms, to predict protein functions and to select promising targets for drug development. Currently, several approaches for protein interaction prediction for non-model species incorporate only small fractions of the entire proteomes and their interactions. Based on this perspective, this study presents an integration of computational methodologies, protein network predictions and comparative analysis of the protozoan species Leishmania braziliensis and Leishmania infantum. These parasites cause Leishmaniasis, a worldwide distributed and neglected disease, with limited treatment options using currently available drugs. The predicted interactions were obtained from a meta-approach, applying rigid body docking tests and template-based docking on protein structures predicted by different comparative modeling techniques. In addition, we trained a machine-learning algorithm (Gradient Boosting) using docking information performed on a curated set of positive and negative protein interaction data. Our final model obtained an AUC = 0.88, with recall = 0.69, specificity = 0.88 and precision = 0.83. Using this approach, it was possible to confidently predict 681 protein structures and 6198 protein interactions for L. braziliensis, and 708 protein structures and 7391 protein interactions for L. infantum. The predicted networks were integrated with protein interaction data already available, analyzed using several topological features and used to classify proteins as essential for network stability. The present study demonstrated the importance of integrating different methodologies of interaction prediction to increase the coverage of the protein interaction of the studied protocols, and it made available protein structures and interactions not previously reported.
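
    For readers comparing the recall, specificity and precision figures quoted in this record, the sketch below shows how those three quantities are defined from confusion-matrix counts. The function name is hypothetical, and the abstract does not report the underlying counts, so none are reproduced here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return recall, specificity and precision from confusion-matrix counts.

    tp, fp, tn, fn -- true positives, false positives, true negatives,
                      false negatives (non-negative integers)
    """
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each metric needs a non-zero denominator")
    return {
        "recall": tp / (tp + fn),        # share of true interactions found
        "specificity": tn / (tn + fp),   # share of non-interactions rejected
        "precision": tp / (tp + fp),     # share of predictions that are true
    }
```

    AUC is not derived from a single confusion matrix; it summarizes performance over all score thresholds and is therefore left out of this sketch.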

  14. Molecular evidence of stereo-specific lactoferrin dimers in solution.

    PubMed

    Persson, Björn A; Lund, Mikael; Forsman, Jan; Chatterton, Dereck E W; Akesson, Torbjörn

    2010-10-01

    Gathering experimental evidence suggests that bovine as well as human lactoferrin self-associate in aqueous solution. Still, a molecular level explanation is unavailable. Using force field based molecular modeling of the protein-protein interaction free energy we demonstrate (1) that lactoferrin forms highly stereo-specific dimers at neutral pH and (2) that the self-association is driven by a high charge complementarity across the contact surface of the proteins. Our theoretical predictions of dimer formation are verified by electrophoretic mobility and N-terminal sequence analysis on bovine lactoferrin.

  15. Diet and Macronutrient Optimization in Wild Ursids: A Comparison of Grizzly Bears with Sympatric and Allopatric Black Bears.

    PubMed

    Costello, Cecily M; Cain, Steven L; Pils, Shannon; Frattaroli, Leslie; Haroldson, Mark A; van Manen, Frank T

    2016-01-01

    When fed ad libitum, ursids can maximize mass gain by selecting mixed diets wherein protein provides 17 ± 4% of digestible energy, relative to carbohydrates or lipids. In the wild, this ability is likely constrained by seasonal food availability, limits of intake rate as body size increases, and competition. By visiting locations of 37 individuals during 274 bear-days, we documented foods consumed by grizzly (Ursus arctos) and black bears (Ursus americanus) in Grand Teton National Park during 2004-2006. Based on published nutritional data, we estimated foods and macronutrients as percentages of daily energy intake. Using principal components and cluster analyses, we identified 14 daily diet types. Only 4 diets, accounting for 21% of days, provided protein levels within the optimal range. Nine diets (75% of days) led to over-consumption of protein, and 1 diet (3% of days) led to under-consumption. Highest protein levels were associated with animal matter (i.e., insects, vertebrates), which accounted for 46-47% of daily energy for both species. As predicted: 1) daily diets dominated by high-energy vertebrates were positively associated with grizzly bears and mean percent protein intake was positively associated with body mass; 2) diets dominated by low-protein fruits were positively associated with smaller-bodied black bears; and 3) mean protein was highest during spring, when high-energy plant foods were scarce, however it was also higher than optimal during summer and fall. Contrary to our prediction: 4) allopatric black bears did not exhibit food selection for high-energy foods similar to grizzly bears. Although optimal gain of body mass was typically constrained, bears usually opted for the energetically superior trade-off of consuming high-energy, high-protein foods. 
    Given protein digestion efficiency similar to obligate carnivores, this choice likely supported mass gain, consistent with studies showing monthly increases in percent body fat among bears in this region.

  16. Diet and Macronutrient Optimization in Wild Ursids: A Comparison of Grizzly Bears with Sympatric and Allopatric Black Bears

    PubMed Central

    Costello, Cecily M.; Cain, Steven L.; Pils, Shannon; Frattaroli, Leslie; Haroldson, Mark A.; van Manen, Frank T.

    2016-01-01

    When fed ad libitum, ursids can maximize mass gain by selecting mixed diets wherein protein provides 17 ± 4% of digestible energy, relative to carbohydrates or lipids. In the wild, this ability is likely constrained by seasonal food availability, limits of intake rate as body size increases, and competition. By visiting locations of 37 individuals during 274 bear-days, we documented foods consumed by grizzly (Ursus arctos) and black bears (Ursus americanus) in Grand Teton National Park during 2004–2006. Based on published nutritional data, we estimated foods and macronutrients as percentages of daily energy intake. Using principal components and cluster analyses, we identified 14 daily diet types. Only 4 diets, accounting for 21% of days, provided protein levels within the optimal range. Nine diets (75% of days) led to over-consumption of protein, and 1 diet (3% of days) led to under-consumption. Highest protein levels were associated with animal matter (i.e., insects, vertebrates), which accounted for 46–47% of daily energy for both species. As predicted: 1) daily diets dominated by high-energy vertebrates were positively associated with grizzly bears and mean percent protein intake was positively associated with body mass; 2) diets dominated by low-protein fruits were positively associated with smaller-bodied black bears; and 3) mean protein was highest during spring, when high-energy plant foods were scarce, however it was also higher than optimal during summer and fall. Contrary to our prediction: 4) allopatric black bears did not exhibit food selection for high-energy foods similar to grizzly bears. Although optimal gain of body mass was typically constrained, bears usually opted for the energetically superior trade-off of consuming high-energy, high-protein foods. 
    Given protein digestion efficiency similar to obligate carnivores, this choice likely supported mass gain, consistent with studies showing monthly increases in percent body fat among bears in this region. PMID:27192407

  17. Diet and macronutrient optimization in wild ursids: A comparison of grizzly bears with sympatric and allopatric black bears

    USGS Publications Warehouse

    Costello, Cecily M.; Cain, Steven L.; Pils, Shannon R; Frattaroli, Leslie; Haroldson, Mark A.; van Manen, Frank T.

    2016-01-01

    When fed ad libitum, ursids can maximize mass gain by selecting mixed diets wherein protein provides 17 ± 4% of digestible energy, relative to carbohydrates or lipids. In the wild, this ability is likely constrained by seasonal food availability, limits of intake rate as body size increases, and competition. By visiting locations of 37 individuals during 274 bear-days, we documented foods consumed by grizzly (Ursus arctos) and black bears (Ursus americanus) in Grand Teton National Park during 2004–2006. Based on published nutritional data, we estimated foods and macronutrients as percentages of daily energy intake. Using principal components and cluster analyses, we identified 14 daily diet types. Only 4 diets, accounting for 21% of days, provided protein levels within the optimal range. Nine diets (75% of days) led to over-consumption of protein, and 1 diet (3% of days) led to under-consumption. Highest protein levels were associated with animal matter (i.e., insects, vertebrates), which accounted for 46–47% of daily energy for both species. As predicted: 1) daily diets dominated by high-energy vertebrates were positively associated with grizzly bears and mean percent protein intake was positively associated with body mass; 2) diets dominated by low-protein fruits were positively associated with smaller-bodied black bears; and 3) mean protein was highest during spring, when high-energy plant foods were scarce, however it was also higher than optimal during summer and fall. Contrary to our prediction: 4) allopatric black bears did not exhibit food selection for high-energy foods similar to grizzly bears. Although optimal gain of body mass was typically constrained, bears usually opted for the energetically superior trade-off of consuming high-energy, high-protein foods. 
    Given protein digestion efficiency similar to obligate carnivores, this choice likely supported mass gain, consistent with studies showing monthly increases in percent body fat among bears in this region.

  18. Modeling Reactivity to Biological Macromolecules with a Deep Multitask Network

    PubMed Central

    2016-01-01

    Most small-molecule drug candidates fail before entering the market, frequently because of unexpected toxicity. Often, toxicity is detected only late in drug development, because many types of toxicities, especially idiosyncratic adverse drug reactions (IADRs), are particularly hard to predict and detect. Moreover, drug-induced liver injury (DILI) is the most frequent reason drugs are withdrawn from the market and causes 50% of acute liver failure cases in the United States. A common mechanism often underlies many types of drug toxicities, including both DILI and IADRs. Drugs are bioactivated by drug-metabolizing enzymes into reactive metabolites, which then conjugate to sites in proteins or DNA to form adducts. DNA adducts are often mutagenic and may alter the reading and copying of genes and their regulatory elements, causing gene dysregulation and even triggering cancer. Similarly, protein adducts can disrupt their normal biological functions and induce harmful immune responses. Unfortunately, reactive metabolites are not reliably detected by experiments, and it is also expensive to test drug candidates for potential to form DNA or protein adducts during the early stages of drug development. In contrast, computational methods have the potential to quickly screen for covalent binding potential, thereby flagging problematic molecules and reducing the total number of necessary experiments. Here, we train a deep convolution neural network—the XenoSite reactivity model—using literature data to accurately predict both sites and probability of reactivity for molecules with glutathione, cyanide, protein, and DNA. On the site level, cross-validated predictions had area under the curve (AUC) performances of 89.8% for DNA and 94.4% for protein. Furthermore, the model separated molecules electrophilically reactive with DNA and protein from nonreactive molecules with cross-validated AUC performances of 78.7% and 79.8%, respectively. 
    On both the site- and molecule-level, the model’s performances significantly outperformed reactivity indices derived from quantum simulations that are reported in the literature. Moreover, we developed and applied a selectivity score to assess preferential reactions with the macromolecules as opposed to the common screening traps. For the entire data set of 2803 molecules, this approach yielded totals of 257 (9.2%) and 227 (8.1%) molecules predicted to be reactive only with DNA and protein, respectively, and hence those that would be missed by standard reactivity screening experiments. Site of reactivity data is an underutilized resource that can be used to not only predict if molecules are reactive, but also show where they might be modified to reduce toxicity while retaining efficacy. The XenoSite reactivity model is available at http://swami.wustl.edu/xenosite/p/reactivity. PMID:27610414

  19. Ectopic High Expression of E2-EPF Ubiquitin Carrier Protein Indicates a More Unfavorable Prognosis in Brain Glioma.

    PubMed

    Zhang, Xiaohui; Zhao, Fangbo; Zhang, Shujun; Song, Yichun

    2017-04-01

    Ubiquitination of proteins meant for elimination is a primary method of eukaryotic cellular protein degradation. The ubiquitin carrier protein E2-EPF is a key degradation enzyme that is highly expressed in many tumors. However, its expression and prognostic significance in brain glioma are still unclear. The aim of this study was to reveal how the level of E2-EPF relates to prognosis in brain glioma. Thirty low-grade and 30 high-grade brain glioma samples were divided into two tissue microarrays each. Levels of E2-EPF protein were examined by immunohistochemistry and immunofluorescence. Quantitative real-time polymerase chain reaction was used to analyze the level of E2-EPF in 60 glioma and 3 normal brain tissue samples. The relationship between E2-EPF levels and prognosis was analyzed by Kaplan-Meier survival curves. E2-EPF levels were low in normal brain tissue samples but high in glioma nuclei. E2-EPF levels gradually increased as glioma grade increased (p < 0.05). Ectopic E2-EPF levels in high-grade glioma were significantly higher than in low-grade glioma (p < 0.01). The 5-year survival rate of glioma patients with high E2-EPF levels was lower than that of patients with low expression (p < 0.05). Furthermore, the 5-year survival rate of patients with ectopic E2-EPF was significantly lower than that of patients with only nuclear E2-EPF (p < 0.01). These results suggest that higher E2-EPF levels, especially ectopic, are associated with higher grade glioma and shorter survival. E2-EPF levels may play a key role in predicting the prognosis for patients with brain glioma.
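
    The Kaplan-Meier survival curves mentioned above are built with the product-limit estimator. The following is a minimal sketch of that estimator, assuming follow-up times with an event flag (1 = death observed, 0 = censored); the function name and the sample data are hypothetical and do not come from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up durations
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up data in years (illustrative only).
print(kaplan_meier([1, 2, 2, 3, 5], [1, 1, 0, 1, 0]))
```

    Censored subjects leave the risk set without lowering the survival estimate, which is what separates this estimator from a simple proportion of survivors.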

  20. Network inference reveals novel connections in pathways regulating growth and defense in the yeast salt response.

    PubMed

    MacGilvray, Matthew E; Shishkova, Evgenia; Chasman, Deborah; Place, Michael; Gitter, Anthony; Coon, Joshua J; Gasch, Audrey P

    2018-05-01

    Cells respond to stressful conditions by coordinating a complex, multi-faceted response that spans many levels of physiology. Much of the response is coordinated by changes in protein phosphorylation. Although the regulators of transcriptome changes during stress are well characterized in Saccharomyces cerevisiae, the upstream regulatory network controlling protein phosphorylation is less well dissected. Here, we developed a computational approach to infer the signaling network that regulates phosphorylation changes in response to salt stress. We developed an approach to link predicted regulators to groups of likely co-regulated phospho-peptides responding to stress, thereby creating new edges in a background protein interaction network. We then use integer linear programming (ILP) to integrate wild type and mutant phospho-proteomic data and predict the network controlling stress-activated phospho-proteomic changes. The network we inferred predicted new regulatory connections between stress-activated and growth-regulating pathways and suggested mechanisms coordinating metabolism, cell-cycle progression, and growth during stress. We confirmed several network predictions with co-immunoprecipitations coupled with mass-spectrometry protein identification and mutant phospho-proteomic analysis. Results show that the cAMP-phosphodiesterase Pde2 physically interacts with many stress-regulated transcription factors targeted by PKA, and that reduced phosphorylation of those factors during stress requires the Rck2 kinase that we show physically interacts with Pde2. Together, our work shows how a high-quality computational network model can facilitate discovery of new pathway interactions during osmotic stress.

  1. Large-scale binding ligand prediction by improved patch-based method Patch-Surfer2.0.

    PubMed

    Zhu, Xiaolei; Xiong, Yi; Kihara, Daisuke

    2015-03-01

    Ligand binding is a key aspect of the function of many proteins. Thus, binding ligand prediction provides important insight in understanding the biological function of proteins. Binding ligand prediction is also useful for drug design and examining potential drug side effects. We present a computational method named Patch-Surfer2.0, which predicts binding ligands for a protein pocket. By representing and comparing pockets at the level of small local surface patches that characterize physicochemical properties of the local regions, the method can identify binding pockets of the same ligand even if they do not share globally similar shapes. Properties of local patches are represented by an efficient mathematical representation, 3D Zernike Descriptor. Patch-Surfer2.0 has significant technical improvements over our previous prototype, which includes a new feature that captures approximate patch position with a geodesic distance histogram. Moreover, we constructed a large comprehensive database of ligand binding pockets that will be searched against by a query. The benchmark shows better performance of Patch-Surfer2.0 over existing methods. http://kiharalab.org/patchsurfer2.0/ CONTACT: dkihara@purdue.edu Supplementary data are available at Bioinformatics online.

  2. First Principles Predictions of the Structure and Function of G-Protein-Coupled Receptors: Validation for Bovine Rhodopsin

    PubMed Central

    Trabanino, Rene J.; Hall, Spencer E.; Vaidehi, Nagarajan; Floriano, Wely B.; Kam, Victor W. T.; Goddard, William A.

    2004-01-01

    G-protein-coupled receptors (GPCRs) are involved in cell communication processes and with mediating such senses as vision, smell, taste, and pain. They constitute a prominent superfamily of drug targets, but an atomic-level structure is available for only one GPCR, bovine rhodopsin, making it difficult to use structure-based methods to design receptor-specific drugs. We have developed the MembStruk first principles computational method for predicting the three-dimensional structure of GPCRs. In this article we validate the MembStruk procedure by comparing its predictions with the high-resolution crystal structure of bovine rhodopsin. The crystal structure of bovine rhodopsin has the second extracellular (EC-II) loop closed over the transmembrane regions by making a disulfide linkage between Cys-110 and Cys-187, but we speculate that opening this loop may play a role in the activation process of the receptor through the cysteine linkage with helix 3. Consequently we predicted two structures for bovine rhodopsin from the primary sequence (with no input from the crystal structure)—one with the EC-II loop closed as in the crystal structure, and the other with the EC-II loop open. The MembStruk-predicted structure of bovine rhodopsin with the closed EC-II loop deviates from the crystal by 2.84 Å coordinate root mean-square (CRMS) in the transmembrane region main-chain atoms. The predicted three-dimensional structures for other GPCRs can be validated only by predicting binding sites and energies for various ligands. For such predictions we developed the HierDock first principles computational method. We validate HierDock by predicting the binding site of 11-cis-retinal in the crystal structure of bovine rhodopsin. Scanning the whole protein without using any prior knowledge of the binding site, we find that the best scoring conformation in rhodopsin is 1.1 Å CRMS from the crystal structure for the ligand atoms. 
This predicted conformation has the carbonyl O only 2.82 Å from the N of Lys-296. Making this Schiff base bond and minimizing leads to a final conformation only 0.62 Å CRMS from the crystal structure. We also used HierDock to predict the binding site of 11-cis-retinal in the MembStruk-predicted structure of bovine rhodopsin (closed loop). Scanning the whole protein structure leads to a structure in which the carbonyl O is only 2.85 Å from the N of Lys-296. Making this Schiff base bond and minimizing leads to a final conformation only 2.92 Å CRMS from the crystal structure. The good agreement of the ab initio-predicted protein structures and ligand binding site with experiment validates the use of the MembStruk and HierDock first-principles methods. Since these methods are generic and applicable to any GPCR, they should be useful in predicting the structures of other GPCRs and the binding site of ligands to these proteins. PMID:15041637
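
    The coordinate root mean-square (CRMS) deviations quoted above compare matched atoms in two structures. The following is a minimal sketch of that calculation, assuming the two coordinate sets are already superimposed and listed in the same atom order; the function name and the toy coordinates are hypothetical, and the optimal superposition step used in practice is not shown.

```python
import math


def crms(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) coordinates, in the same units as the input (e.g. angstroms)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy coordinates: every atom displaced by 1 unit along z.
predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(crms(predicted, crystal))
```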

  3. Musite, a tool for global prediction of general and kinase-specific phosphorylation sites.

    PubMed

    Gao, Jianjiong; Thelen, Jay J; Dunker, A Keith; Xu, Dong

    2010-12-01

    Reversible protein phosphorylation is one of the most pervasive post-translational modifications, regulating diverse cellular processes in various organisms. High throughput experimental studies using mass spectrometry have identified many phosphorylation sites, primarily from eukaryotes. However, the vast majority of phosphorylation sites remain undiscovered, even in well studied systems. Because mass spectrometry-based experimental approaches for identifying phosphorylation events are costly, time-consuming, and biased toward abundant proteins and proteotypic peptides, in silico prediction of phosphorylation sites is potentially a useful alternative strategy for whole proteome annotation. Because of various limitations, current phosphorylation site prediction tools were not well designed for comprehensive assessment of proteomes. Here, we present a novel software tool, Musite, specifically designed for large scale predictions of both general and kinase-specific phosphorylation sites. We collected phosphoproteomics data in multiple organisms from several reliable sources and used them to train prediction models by a comprehensive machine-learning approach that integrates local sequence similarities to known phosphorylation sites, protein disorder scores, and amino acid frequencies. Application of Musite on several proteomes yielded tens of thousands of phosphorylation site predictions at a high stringency level. Cross-validation tests show that Musite achieves some improvement over existing tools in predicting general phosphorylation sites, and it is at least comparable with those for predicting kinase-specific phosphorylation sites. In Musite V1.0, we have trained general prediction models for six organisms and kinase-specific prediction models for 13 kinases or kinase families. 
Although the current pretrained models were not correlated with any particular cellular conditions, Musite provides a unique functionality for training customized prediction models (including condition-specific models) from users' own data. In addition, with its easily extensible open source application programming interface, Musite is aimed at being an open platform for community-based development of machine learning-based phosphorylation site prediction applications. Musite is available at http://musite.sourceforge.net/.

  4. Rigid-Docking Approaches to Explore Protein-Protein Interaction Space.

    PubMed

    Matsuzaki, Yuri; Uchikoga, Nobuyuki; Ohue, Masahito; Akiyama, Yutaka

    Protein-protein interactions play core roles in living cells, especially in the regulatory systems. As information on proteins has rapidly accumulated in publicly available databases, much effort has been made to obtain a better picture of protein-protein interaction networks using protein tertiary structure data. Predicting relevant interacting partners from their tertiary structure is a challenging task, and computer science methods have the potential to assist with this. Protein-protein rigid docking has been utilized by several projects. Docking-based approaches have two advantages: they can suggest binding poses of predicted binding partners, which helps in understanding the interaction mechanisms, and comparing docking results of binders and non-binders can reveal the specificity of protein-protein interactions from a structural viewpoint. In this review we focus on current computational methods for predicting pairwise direct protein-protein interactions that form protein complexes.

  5. Failure of CRP decline within three days of hospitalization is associated with poor prognosis of Community-acquired Pneumonia.

    PubMed

    Andersen, Stine Bang; Baunbæk Egelund, Gertrud; Jensen, Andreas Vestergaard; Petersen, Pelle Trier; Rohde, Gernot; Ravn, Pernille

    2017-04-01

    C-reactive protein (CRP) is a well-known acute phase protein used to monitor the patient's response during treatment in infectious diseases. Mortality from Community-acquired Pneumonia (CAP) remains high, particularly in hospitalized patients. Better risk prediction during hospitalization could improve management and ultimately reduce mortality levels. The aim of this study was to evaluate CRP on the 3rd day (CRP3) of hospitalization as a predictor of 30-day mortality. This was a retrospective multicentre cohort study of adult patients admitted with CAP at three Danish hospitals. Predictive associations between CRP3 (absolute levels and relative decline) and 30-day mortality were analysed using receiver operating characteristics and logistic regression. Eight hundred and fourteen patients were included and 90 (11%) died within 30 days. The areas under the curve for CRP3 level and decline for predicting 30-day mortality were 0.64 (0.57-0.70) and 0.71 (0.65-0.76), respectively. Risk of death was increased in patients with CRP3 level >75 mg/l (OR 2.44; 95%CI 1.36-4.37) and in patients with a CRP3 decline <50% (OR 4.25; 95%CI 2.30-7.83). In the multivariate analysis, the highest mortality risk was seen in patients whose CRP failed to decline by 50%, irrespective of the actual level of CRP (OR 7.8; 95%CI 3.2-19.3). Mortality risk increased significantly according to CRP decline for all strata of CURB-65 score. The CRP response on day 3 is a valuable predictor of 30-day mortality in hospitalized CAP patients. Failure of CRP to decline was associated with a poor prognosis irrespective of the actual level of CRP or CURB-65.
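
    The two CRP3 criteria reported above (level >75 mg/l and decline <50%) can be expressed as a short calculation. The following is a minimal sketch, assuming the relative decline is measured against the admission CRP value, which the abstract does not state explicitly; the function names and sample values are hypothetical, and this is an illustration rather than a clinical tool.

```python
def crp_relative_decline(crp_baseline, crp_day3):
    """Fractional fall in CRP from baseline to day 3 (0.5 means a 50% fall)."""
    if crp_baseline <= 0:
        raise ValueError("baseline CRP must be positive")
    return (crp_baseline - crp_day3) / crp_baseline


def crp3_flags(crp_baseline, crp_day3, level_cutoff=75.0, decline_cutoff=0.50):
    """Apply the two thresholds quoted in the abstract (mg/l and fraction)."""
    decline = crp_relative_decline(crp_baseline, crp_day3)
    return {
        "high_level": crp_day3 > level_cutoff,
        "insufficient_decline": decline < decline_cutoff,
    }


# Hypothetical patients: CRP in mg/l at baseline and on day 3.
print(crp3_flags(200.0, 120.0))
print(crp3_flags(200.0, 60.0))
```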

  6. The Popeye Domain Containing Genes and Their Function as cAMP Effector Proteins in Striated Muscle.

    PubMed

    Brand, Thomas

    2018-03-13

    The Popeye domain containing (POPDC) genes encode transmembrane proteins, which are abundantly expressed in striated muscle cells. Hallmarks of the POPDC proteins are the presence of three transmembrane domains and the Popeye domain, which makes up a large part of the cytoplasmic portion of the protein and functions as a cAMP-binding domain. Interestingly, despite the predicted structural similarity between the Popeye domain and other cAMP-binding domains, they differ strongly at the protein sequence level, suggesting an independent evolutionary origin of POPDC proteins. Loss-of-function experiments in zebrafish and mouse established an important role of POPDC proteins in cardiac conduction and heart rate adaptation after stress. Loss-of-function mutations in patients have been associated with limb-girdle muscular dystrophy and AV-block. These data suggest an important role of these proteins in the maintenance of structure and function of striated muscle cells.

  7. Predicting the helix packing of globular proteins by self-correcting distance geometry.

    PubMed

    Mumenthaler, C; Braun, W

    1995-05-01

    A new self-correcting distance geometry method for predicting the three-dimensional structure of small globular proteins was assessed with a test set of 8 helical proteins. With the knowledge of the amino acid sequence and the helical segments, our completely automated method calculated the correct backbone topology of six proteins. The accuracy of the predicted structures ranged from 2.3 Å to 3.1 Å for the helical segments compared to the experimentally determined structures. For two proteins, the predicted constraints were not restrictive enough to yield a conclusive prediction. The method can be applied to all small globular proteins, provided the secondary structure is known from NMR analysis or can be predicted with high reliability.

  8. Exploring Human Diseases and Biological Mechanisms by Protein Structure Prediction and Modeling.

    PubMed

    Wang, Juexin; Luttrell, Joseph; Zhang, Ning; Khan, Saad; Shi, NianQing; Wang, Michael X; Kang, Jing-Qiong; Wang, Zheng; Xu, Dong

    2016-01-01

    Protein structure prediction and modeling provide a tool for understanding protein functions by computationally constructing protein structures from amino acid sequences and analyzing them. With help from protein prediction tools and web servers, users can obtain the three-dimensional protein structure models and gain knowledge of functions from the proteins. In this chapter, we will provide several examples of such studies. As an example, structure modeling methods were used to investigate the relation between mutation-caused misfolding of protein and human diseases including epilepsy and leukemia. Protein structure prediction and modeling were also applied in nucleotide-gated channels and their interaction interfaces to investigate their roles in brain and heart cells. In molecular mechanism studies of plants, rice salinity tolerance mechanism was studied via structure modeling on crucial proteins identified by systems biology analysis; trait-associated protein-protein interactions were modeled, which sheds some light on the roles of mutations in soybean oil/protein content. In the age of precision medicine, we believe protein structure prediction and modeling will play more and more important roles in investigating biomedical mechanism of diseases and drug design.

  9. Novel nonlinear knowledge-based mean force potentials based on machine learning.

    PubMed

    Dong, Qiwen; Zhou, Shuigeng

    2011-01-01

    The prediction of 3D structures of proteins from amino acid sequences is one of the most challenging problems in molecular biology. An essential task for solving this problem with coarse-grained models is to deduce effective interaction potentials. The development and evaluation of new energy functions is critical to accurately modeling the properties of biological macromolecules. Knowledge-based mean force potentials are derived from statistical analysis of proteins of known structures. Almost all current knowledge-based potentials take the form of a weighted linear sum of interaction pairs. In this study, a class of novel nonlinear knowledge-based mean force potentials is presented. The potential parameters are obtained by nonlinear classifiers, instead of relative frequencies of interaction pairs against a reference state or linear classifiers. The support vector machine is used to derive the potential parameters on data sets that contain both native structures and decoy structures. Five knowledge-based mean force Boltzmann-based or linear potentials are introduced and their corresponding nonlinear potentials are implemented. They are the DIH potential (single-body residue-level Boltzmann-based potential), the DFIRE-SCM potential (two-body residue-level Boltzmann-based potential), the FS potential (two-body atom-level Boltzmann-based potential), the HR potential (two-body residue-level linear potential), and the T32S3 potential (two-body atom-level linear potential). Experiments are performed on well-established decoy sets, including the LKF data set, the CASP7 data set, and the Decoys 'R' Us data set. The evaluation metrics include the energy Z score and the ability of each potential to discriminate native structures from a set of decoy structures. 
Experimental results show that all nonlinear potentials significantly outperform the corresponding Boltzmann-based or linear potentials, and the proposed discriminative framework is effective in developing knowledge-based mean force potentials. The nonlinear potentials can be widely used for ab initio protein structure prediction, model quality assessment, protein docking, and other challenging problems in computational biology.
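
    The two evaluation metrics named above can be illustrated with a short calculation. The following is a minimal sketch, assuming the commonly used definition of the energy Z score (native energy minus the mean decoy energy, divided by the decoy standard deviation) and a simple rank of the native structure among the decoys; the abstract does not give the exact formulas, and the function names and energies are hypothetical.

```python
import statistics


def energy_z_score(native_energy, decoy_energies):
    """Z score of the native energy relative to the decoy distribution.
    More negative values mean the native is better separated from decoys.
    Uses the population standard deviation (an assumption)."""
    mean = statistics.mean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies must not all be identical")
    return (native_energy - mean) / spread


def native_rank(native_energy, decoy_energies):
    """1-based rank of the native structure; rank 1 means no decoy has a
    lower energy than the native."""
    return 1 + sum(1 for e in decoy_energies if e < native_energy)


# Hypothetical energies in arbitrary units (illustrative only).
decoys = [-1.0, -3.0]
print(energy_z_score(-5.0, decoys), native_rank(-5.0, decoys))
```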

  10. Dietary potassium intake and mortality in long-term hemodialysis patients.

    PubMed

    Noori, Nazanin; Kalantar-Zadeh, Kamyar; Kovesdy, Csaba P; Murali, Sameer B; Bross, Rachelle; Nissenson, Allen R; Kopple, Joel D

    2010-08-01

    Hyperkalemia has been associated with higher mortality in long-term hemodialysis (HD) patients. There are few data concerning the relationship between dietary potassium intake and outcome. The mortality predictability of dietary potassium intake from reported food items estimated using the Block Food Frequency Questionnaire (FFQ) at the start of the cohort was examined in a 5-year (2001-2006) cohort of 224 HD patients in Southern California using Cox proportional hazards regression. Participants were 224 long-term HD patients from 8 DaVita dialysis clinics. The predictor was dietary potassium intake ranking using the Block FFQ, and the outcome was 5-year survival. HD patients with higher potassium intake had greater dietary energy, protein, and phosphorus intakes and higher predialysis serum potassium and phosphorus levels. Greater dietary potassium intake was associated with significantly increased death HRs in unadjusted models and after incremental adjustments for case-mix, nutritional factors (including 3-month averaged predialysis serum creatinine, potassium, and phosphorus levels; body mass index; normalized protein nitrogen appearance; and energy, protein, and phosphorus intake) and inflammatory marker levels. HRs for death across the 3 higher quartiles of dietary potassium intake in the fully adjusted model (compared with the lowest quartile) were 1.4 (95% CI, 0.6-3.0), 2.2 (95% CI, 0.9-5.4), and 2.4 (95% CI, 1.1-7.5), respectively (P for trend = 0.03). Restricted cubic spline analyses confirmed the incremental mortality predictability of higher potassium intake. FFQs may underestimate individual potassium intake and should be used to rank dietary intake across the population. Higher dietary potassium intake is associated with increased death risk in long-term HD patients, even after adjustments for serum potassium level; dietary protein, energy, and phosphorus intake; and nutritional and inflammatory marker levels. 
The potential role of dietary potassium in the high mortality rate of HD patients warrants clinical trials.

  11. Towards quantitative classification of folded proteins in terms of elementary functions.

    PubMed

    Hu, Shuangwei; Krokhotin, Andrei; Niemi, Antti J; Peng, Xubiao

    2011-04-01

    A comparative classification scheme provides a good basis for several approaches to understand proteins, including prediction of relations between their structure and biological function. But it remains a challenge to combine a classification scheme that describes a protein starting from its well-organized secondary structures and often involves direct human involvement, with an atomary-level physics-based approach where a protein is fundamentally nothing more than an ensemble of mutually interacting carbon, hydrogen, oxygen, and nitrogen atoms. In order to bridge these two complementary approaches to proteins, conceptually novel tools need to be introduced. Here we explain how an approach toward geometric characterization of entire folded proteins can be based on a single explicit elementary function that is familiar from nonlinear physical systems where it is known as the kink soliton. Our approach enables the conversion of hierarchical structural information into a quantitative form that allows for a folded protein to be characterized in terms of a small number of global parameters that are in principle computable from atomary-level considerations. As an example we describe in detail how the native fold of the myoglobin 1M6C emerges from a combination of kink solitons with a very high atomary-level accuracy. We also verify that our approach describes longer loops and loops connecting α helices with β strands, with the same overall accuracy.

  12. [Cloning, mutagenesis and symbiotic phenotype of three lipid transfer protein encoding genes from Mesorhizobium huakuii 7653R].

    PubMed

    Li, Yanan; Zeng, Xiaobo; Zhou, Xuejuan; Li, Youguo

    2016-12-04

    The lipid transfer protein superfamily is involved in lipid transport and metabolism. This study aimed to construct mutants of three lipid transfer protein encoding genes in Mesorhizobium huakuii 7653R, and to study the phenotypes and function of the mutations during symbiosis with Astragalus sinicus. We used bioinformatics to predict the structural characteristics and biological functions of the lipid transfer proteins, and conducted semi-quantitative and fluorescent quantitative real-time PCR to analyze the expression levels of the target genes in free-living and symbiotic conditions. We used pK19mob insertion mutagenesis to construct mutants and carried out pot plant experiments to observe symbiotic phenotypes. The proteins encoded by the MCHK-5577, MCHK-2172 and MCHK-2779 genes belong to the START/RHO alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) superfamily, which is involved in lipid transport or metabolism, and shared 95% identity with M. loti. The relative transcription levels of all three genes increased under symbiotic conditions compared with the free-living condition. We obtained three mutants. Compared with wild-type 7653R, the above-ground biomass and nodule nitrogenase activity of plants inoculated with the three mutants decreased significantly. The results indicate that the lipid transfer protein encoding genes of Mesorhizobium huakuii 7653R may play important roles in symbiotic nitrogen fixation, and that the mutations significantly affected the symbiotic phenotypes. The present work provides a basis for further study of the symbiotic function and mechanism of lipid transfer proteins from rhizobia.

  13. An inventory of the Aspergillus niger secretome by combining in silico predictions with shotgun proteomics data.

    PubMed

    Braaksma, Machtelt; Martens-Uzunova, Elena S; Punt, Peter J; Schaap, Peter J

    2010-10-19

    The ecological niche occupied by a fungal species, its pathogenicity and its usefulness as a microbial cell factory to a large degree depend on its secretome. Protein secretion usually requires the presence of an N-terminal signal peptide (SP) and by scanning for this feature using available highly accurate SP-prediction tools, the fraction of potentially secreted proteins can be directly predicted. However, prediction of an SP does not guarantee that the protein is actually secreted and current in silico prediction methods suffer from gene-model errors introduced during genome annotation. A majority rule based classifier that also evaluates signal peptide predictions from the best homologs of three neighbouring Aspergillus species was developed to create an improved list of potential signal peptide containing proteins encoded by the Aspergillus niger genome. As a complement to these in silico predictions, the secretome associated with growth and upon carbon source depletion was determined using a shotgun proteomics approach. Overall, some 200 proteins with a predicted signal peptide were identified to be secreted proteins. Concordant changes in the secretome state were observed as a response to changes in growth/culture conditions. Additionally, two proteins secreted via a non-classical route operating in A. niger were identified. We were able to improve the in silico inventory of A. niger secretory proteins by combining different gene-model predictions from neighbouring Aspergilli and thereby avoiding prediction conflicts associated with inaccurate gene-models. The expected accuracy of signal peptide prediction for proteins that lack homologous sequences in the proteomes of related species is 85%. An experimental validation of the predicted proteome confirmed in silico predictions.
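
    The majority rule idea described above can be sketched as a simple vote over signal peptide calls for a query protein and its best homologs. The following is a hypothetical illustration only: the abstract does not specify the exact rule, so the tie-breaking choice (fall back to the query's own call) and the handling of missing homologs are assumptions, as are all names.

```python
def majority_signal_peptide(query_call, homolog_calls):
    """Combine signal peptide calls by majority vote.

    query_call    -- True/False call for the query protein
    homolog_calls -- calls for the best homolog in each related species;
                     None marks a species with no homolog found
    """
    votes = [query_call] + [call for call in homolog_calls if call is not None]
    yes = sum(1 for call in votes if call)
    no = len(votes) - yes
    if yes == no:
        return query_call  # tie: keep the query's own call (assumption)
    return yes > no


# A query called negative is overruled by three positive homologs.
print(majority_signal_peptide(False, [True, True, True]))
```

    A vote of this kind can correct a call that is wrong only because the query's own gene model is mis-annotated, which is the failure mode the abstract highlights.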

  14. An inventory of the Aspergillus niger secretome by combining in silico predictions with shotgun proteomics data

    PubMed Central

    2010-01-01

    Background: The ecological niche occupied by a fungal species, its pathogenicity and its usefulness as a microbial cell factory to a large degree depend on its secretome. Protein secretion usually requires the presence of an N-terminal signal peptide (SP) and by scanning for this feature using available highly accurate SP-prediction tools, the fraction of potentially secreted proteins can be directly predicted. However, prediction of an SP does not guarantee that the protein is actually secreted and current in silico prediction methods suffer from gene-model errors introduced during genome annotation. Results: A majority rule based classifier that also evaluates signal peptide predictions from the best homologs of three neighbouring Aspergillus species was developed to create an improved list of potential signal peptide containing proteins encoded by the Aspergillus niger genome. As a complement to these in silico predictions, the secretome associated with growth and upon carbon source depletion was determined using a shotgun proteomics approach. Overall, some 200 proteins with a predicted signal peptide were identified to be secreted proteins. Concordant changes in the secretome state were observed as a response to changes in growth/culture conditions. Additionally, two proteins secreted via a non-classical route operating in A. niger were identified. Conclusions: We were able to improve the in silico inventory of A. niger secretory proteins by combining different gene-model predictions from neighbouring Aspergilli and thereby avoiding prediction conflicts associated with inaccurate gene-models. The expected accuracy of signal peptide prediction for proteins that lack homologous sequences in the proteomes of related species is 85%. An experimental validation of the predicted proteome confirmed in silico predictions. PMID:20959013

  15. Proteins and Their Interacting Partners: An Introduction to Protein-Ligand Binding Site Prediction Methods.

    PubMed

    Roche, Daniel Barry; Brackenridge, Danielle Allison; McGuffin, Liam James

    2015-12-15

    Elucidating the biological and biochemical roles of proteins, and subsequently determining their interacting partners, can be difficult and time consuming using in vitro and/or in vivo methods, and consequently the majority of newly sequenced proteins will have unknown structures and functions. However, in silico methods for predicting protein-ligand binding sites and protein biochemical functions offer an alternative practical solution. The characterisation of protein-ligand binding sites is essential for investigating new functional roles, which can impact the major biological research spheres of health, food, and energy security. In this review we discuss the role in silico methods play in 3D modelling of protein-ligand binding sites, along with their role in predicting biochemical functionality. In addition, we describe in detail some of the key alternative in silico prediction approaches that are available, as well as discussing the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) projects, and their impact on developments in the field. Furthermore, we discuss the importance of protein function prediction methods for tackling 21st century problems.

  16. A computational tool to predict the evolutionarily conserved protein-protein interaction hot-spot residues from the structure of the unbound protein.

    PubMed

    Agrawal, Neeraj J; Helk, Bernhard; Trout, Bernhardt L

    2014-01-21

    Identifying hot-spot residues - residues that are critical to protein-protein binding - can help to elucidate a protein's function and assist in designing therapeutic molecules to target those residues. We present a novel computational tool, termed spatial-interaction-map (SIM), to predict the hot-spot residues of an evolutionarily conserved protein-protein interaction from the structure of an unbound protein alone. SIM can predict the protein hot-spot residues with an accuracy of 36-57%. Thus, the SIM tool can be used to predict the yet unknown hot-spot residues for many proteins for which the structure of the protein-protein complexes are not available, thereby providing a clue to their functions and an opportunity to design therapeutic molecules to target these proteins.

  17. Computational prediction of host-pathogen protein-protein interactions.

    PubMed

    Dyer, Matthew D; Murali, T M; Sobral, Bruno W

    2007-07-01

    Infectious diseases such as malaria result in millions of deaths each year. An important aspect of any host-pathogen system is the mechanism by which a pathogen can infect its host. One method of infection is via protein-protein interactions (PPIs) where pathogen proteins target host proteins. Developing computational methods that identify which PPIs enable a pathogen to infect a host has great implications for identifying potential targets for therapeutics. We present a method that integrates known intra-species PPIs with protein-domain profiles to predict PPIs between host and pathogen proteins. Given a set of intra-species PPIs, we identify the functional domains in each of the interacting proteins. For every pair of functional domains, we use Bayesian statistics to assess the probability that two proteins with that pair of domains will interact. We apply our method to the Homo sapiens-Plasmodium falciparum host-pathogen system. Our system predicts 516 PPIs between proteins from these two organisms. We show that pairs of human proteins we predict to interact with the same Plasmodium protein are close to each other in the human PPI network and that Plasmodium pairs predicted to interact with the same human protein are co-expressed in DNA microarray datasets measured during various stages of the Plasmodium life cycle. Finally, we identify functionally enriched sub-networks spanned by the predicted interactions and discuss the plausibility of our predictions. Supplementary data are available at http://staff.vbi.vt.edu/dyermd/publications/dyer2007a.html and at Bioinformatics online.
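
    The domain-pair scoring step in this abstract can be illustrated with a minimal sketch. It assumes a plain Bayes-rule estimate with add-one smoothing over counts of interacting and non-interacting protein pairs; the function name, the toy counts, and the prior are hypothetical and are not taken from the paper, whose exact statistical model the abstract does not give.

```python
def domain_pair_posterior(n_int_with_pair, n_int_total,
                          n_non_with_pair, n_non_total,
                          prior_interact=0.01):
    """Estimate P(interact | domain pair) with Bayes' rule (illustrative only)."""
    # Smoothed likelihood of observing the domain pair among interacting
    # and among non-interacting protein pairs.
    p_pair_given_int = (n_int_with_pair + 1) / (n_int_total + 2)
    p_pair_given_non = (n_non_with_pair + 1) / (n_non_total + 2)
    numerator = p_pair_given_int * prior_interact
    denominator = numerator + p_pair_given_non * (1 - prior_interact)
    return numerator / denominator

# Hypothetical counts: one domain pair enriched among known interactions,
# one domain pair seen mostly among non-interacting proteins.
enriched = domain_pair_posterior(40, 1000, 5, 100000)
depleted = domain_pair_posterior(1, 1000, 500, 100000)
print(f"enriched pair: {enriched:.3f}, depleted pair: {depleted:.3f}")
```

    A domain pair that is over-represented among known intra-species interactions receives a high posterior even under a small prior, which is the intuition behind transferring such scores to host-pathogen protein pairs.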

  18. Can Serum Surfactant Protein D or CC-Chemokine Ligand 18 Predict Outcome of Interstitial Lung Disease in Patients with Early Systemic Sclerosis?

    PubMed Central

    Elhaj, Mona; Charles, Julio; Pedroza, Claudia; Liu, Xiaochun; Zhou, Xiaodong; Estrada-Y-Martin, Rosa M.; Gonzalez, Emilio B.; Lewis, Dorothy E.; Draeger, Hilda T.; Kim, Sarah; Arnett, Frank C.; Mayes, Maureen D.; Assassi, Shervin

    2013-01-01

    Objective To examine the predictive significance of 2 pneumoproteins, surfactant protein D (SP-D) and CC-chemokine ligand 18 (CCL18), for the course of systemic sclerosis (SSc)-related interstitial lung disease. Methods The pneumoproteins were determined in the baseline plasma samples of 266 patients with early SSc enrolled in the GENISOS observational cohort. They also were measured in 83 followup patient samples. Pulmonary function tests were obtained annually. The primary outcome was decline in forced vital capacity (FVC percentage predicted) over time. The predictive significance for longterm change in FVC was investigated by a joint analysis of longitudinal measurements (sequentially obtained FVC percentage predicted) and survival data. Results SP-D and CCL18 levels were both higher in patients with SSc than in matched controls (p < 0.001 and p = 0.015, respectively). Baseline SP-D levels correlated with lower concomitantly obtained FVC (r = −0.27, p < 0.001), but did not predict the short-term decline in FVC at 1 year followup visit or its longterm decline rate. CCL18 showed a significant correlation with steeper short-term decline in FVC (p = 0.049), but was not a predictor of its longterm decline rate. Similarly, a composite score of SP-D and CCL18 was a significant predictor of short-term decline in FVC but did not predict its longterm decline rate. Further, the longitudinal change in these 2 pneumoproteins did not correlate with the concomitant percentage change in FVC. Conclusion SP-D correlated with concomitantly obtained FVC, while CCL18 was a predictor of short-term decline in FVC. However, neither SP-D nor CCL18 was a longterm predictor of FVC course in patients with early SSc. PMID:23588945

  19. Benchmarking protein-protein interface predictions: why you should care about protein size.

    PubMed

    Martin, Juliette

    2014-07-01

    A number of predictive methods have been developed to predict protein-protein binding sites. Each new method is traditionally benchmarked using sets of protein structures of various sizes, and global statistics are used to assess the quality of the prediction. Little attention has been paid to the potential bias due to protein size on these statistics. Indeed, small proteins involve proportionally more residues at interfaces than large ones. If a predictive method is biased toward small proteins, this can lead to an over-estimation of its performance. Here, we investigate the bias due to the size effect when benchmarking protein-protein interface prediction on the widely used docking benchmark 4.0. First, we simulate random scores that favor small proteins over large ones. Instead of the 0.5 AUC (Area Under the Curve) value expected by chance, these biased scores result in an AUC equal to 0.6 using hypergeometric distributions, and up to 0.65 using constant scores. We then use real prediction results to illustrate how to detect the size bias by shuffling, and subsequently correct it using a simple conversion of the scores into normalized ranks. In addition, we investigate the scores produced by eight published methods and show that they are all affected by the size effect, which can change their relative ranking. The size effect also has an impact on linear combination scores by modifying the relative contributions of each method. In the future, systematic corrections should be applied when benchmarking predictive methods using data sets with mixed protein sizes. © 2014 Wiley Periodicals, Inc.
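
    The rank-based correction mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes scores are ranked within each protein and divided by the number of residues, with ties sharing their average rank; the abstract does not specify the exact conversion, and the function name and toy scores are hypothetical.

```python
def normalized_ranks(scores):
    """Convert the raw scores of one protein to ranks in (0, 1]."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / n
        i = j + 1
    return ranks

# Two toy proteins of different sizes and with different score scales.
small = normalized_ranks([0.9, 0.2, 0.5])
large = normalized_ranks([9.0, 2.0, 5.0, 2.0, 7.0, 1.0])
print(small)
print(large)
```

    After the conversion, the top-scoring residue of every protein maps to 1.0 regardless of protein size or score scale, so pooled statistics such as the AUC no longer reward a method simply for scoring small proteins higher.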

  20. Small, synthetic, GC-rich mRNA stem-loop modules 5' proximal to the AUG start-codon predictably tune gene expression in yeast.

    PubMed

    Lamping, Erwin; Niimi, Masakazu; Cannon, Richard D

    2013-07-29

    A large range of genetic tools has been developed for the optimal design and regulation of complex metabolic pathways in bacteria. However, fewer tools exist in yeast that can precisely tune the expression of individual enzymes in novel metabolic pathways suitable for industrial-scale production of non-natural compounds. Tuning expression levels is critical for reducing the metabolic burden of over-expressed proteins, the accumulation of toxic intermediates, and for redirecting metabolic flux from native pathways involving essential enzymes without negatively affecting the viability of the host. We have developed a yeast membrane protein hyper-expression system with critical advantages over conventional, plasmid-based, expression systems. However, expression levels are sometimes so high that they adversely affect protein targeting/folding or the growth and/or phenotype of the host. Here we describe the use of small synthetic mRNA control modules that allowed us to predictably tune protein expression levels to any desired level. Down-regulation of expression was achieved by engineering small GC-rich mRNA stem-loops into the 5' UTR that inhibited translation initiation of the yeast ribosomal 43S preinitiation complex (PIC). Exploiting the fact that the yeast 43S PIC has great difficulty scanning through GC-rich mRNA stem-loops, we created yeast strains containing 17 different RNA stem-loop modules in the 5' UTR that expressed varying amounts of the fungal multidrug efflux pump reporter Cdr1p from Candida albicans. Increasing the length of mRNA stem-loops (that contained only GC-pairs) near the AUG start-codon led to a surprisingly large decrease in Cdr1p expression; ~2.7-fold for every additional GC-pair added to the stem, while the mRNA levels remained largely unaffected. 
An mRNA stem-loop of seven GC-pairs (∆G = -15.8 kcal/mol) reduced Cdr1p expression levels by >99%, and even the smallest possible stem-loop of only three GC-pairs (∆G = -4.4 kcal/mol) inhibited Cdr1p expression by ~50%. We have developed a simple cloning strategy to fine-tune protein expression levels in yeast that has many potential applications in metabolic engineering and the optimization of protein expression in yeast. This study also highlights the importance of considering the use of multiple cloning-sites carefully to preclude unwanted effects on gene expression.
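
    The figures reported in this abstract can be checked against each other with a short calculation. The sketch below assumes that the ~2.7-fold decrease per GC pair applies uniformly from three to seven pairs, which is an extrapolation of the stated trend rather than the authors' fitted model; the constant and function names are illustrative.

```python
FOLD_PER_GC_PAIR = 2.7   # reported fold decrease per additional GC pair
BASE_PAIRS = 3           # smallest stem-loop described
BASE_REMAINING = 0.5     # ~50% of expression remains with three GC pairs

def remaining_expression(gc_pairs):
    """Fraction of Cdr1p expression left under the assumed uniform trend."""
    extra_pairs = gc_pairs - BASE_PAIRS
    return BASE_REMAINING / FOLD_PER_GC_PAIR ** extra_pairs

for n in range(BASE_PAIRS, 8):
    print(n, f"{remaining_expression(n):.4f}")
```

    Under this assumption a seven-pair stem leaves a little under 1% of the expression, which agrees with the reported reduction of more than 99%.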

  1. Small, synthetic, GC-rich mRNA stem-loop modules 5′ proximal to the AUG start-codon predictably tune gene expression in yeast

    PubMed Central

    2013-01-01

    Background A large range of genetic tools has been developed for the optimal design and regulation of complex metabolic pathways in bacteria. However, fewer tools exist in yeast that can precisely tune the expression of individual enzymes in novel metabolic pathways suitable for industrial-scale production of non-natural compounds. Tuning expression levels is critical for reducing the metabolic burden of over-expressed proteins, the accumulation of toxic intermediates, and for redirecting metabolic flux from native pathways involving essential enzymes without negatively affecting the viability of the host. We have developed a yeast membrane protein hyper-expression system with critical advantages over conventional, plasmid-based, expression systems. However, expression levels are sometimes so high that they adversely affect protein targeting/folding or the growth and/or phenotype of the host. Here we describe the use of small synthetic mRNA control modules that allowed us to predictably tune protein expression levels to any desired level. Down-regulation of expression was achieved by engineering small GC-rich mRNA stem-loops into the 5′ UTR that inhibited translation initiation of the yeast ribosomal 43S preinitiation complex (PIC). Results Exploiting the fact that the yeast 43S PIC has great difficulty scanning through GC-rich mRNA stem-loops, we created yeast strains containing 17 different RNA stem-loop modules in the 5′ UTR that expressed varying amounts of the fungal multidrug efflux pump reporter Cdr1p from Candida albicans. Increasing the length of mRNA stem-loops (that contained only GC-pairs) near the AUG start-codon led to a surprisingly large decrease in Cdr1p expression; ~2.7-fold for every additional GC-pair added to the stem, while the mRNA levels remained largely unaffected. 
An mRNA stem-loop of seven GC-pairs (∆G = −15.8 kcal/mol) reduced Cdr1p expression levels by >99%, and even the smallest possible stem-loop of only three GC-pairs (∆G = −4.4 kcal/mol) inhibited Cdr1p expression by ~50%. Conclusion We have developed a simple cloning strategy to fine-tune protein expression levels in yeast that has many potential applications in metabolic engineering and the optimization of protein expression in yeast. This study also highlights the importance of considering the use of multiple cloning-sites carefully to preclude unwanted effects on gene expression. PMID:23895661

  2. A proteomic analysis of the chromoplasts isolated from sweet orange fruits [Citrus sinensis (L.) Osbeck].

    PubMed

    Zeng, Yunliu; Pan, Zhiyong; Ding, Yuduan; Zhu, Andan; Cao, Hongbo; Xu, Qiang; Deng, Xiuxin

    2011-11-01

    Here, a comprehensive proteomic analysis of the chromoplasts purified from sweet orange using Nycodenz density gradient centrifugation is reported. A GeLC-MS/MS shotgun approach was used to identify the proteins of pooled chromoplast samples. A total of 493 proteins were identified from purified chromoplasts, of which 418 are putative plastid proteins based on in silico sequence homology and functional analyses. Based on the predicted functions of these identified plastid proteins, a large proportion (∼60%) of the chromoplast proteome of sweet orange is constituted by proteins involved in carbohydrate metabolism, amino acid/protein synthesis, and secondary metabolism. Of note, HDS (hydroxymethylbutenyl 4-diphosphate synthase), PAP (plastid-lipid-associated protein), and psHSPs (plastid small heat shock proteins) involved in the synthesis or storage of carotenoid and stress response are among the most abundant proteins identified. A comparison of chromoplast proteomes between sweet orange and tomato suggested a high level of conservation in a broad range of metabolic pathways. However, the citrus chromoplast was characterized by more extensive carotenoid synthesis, extensive amino acid synthesis without nitrogen assimilation, and evidence for lipid metabolism concerning jasmonic acid synthesis. In conclusion, this study provides an insight into the major metabolic pathways as well as some unique characteristics of the sweet orange chromoplasts at the whole proteome level.

  3. Local Geometry and Evolutionary Conservation of Protein Surfaces Reveal the Multiple Recognition Patches in Protein-Protein Interactions

    PubMed Central

    Laine, Elodie; Carbone, Alessandra

    2015-01-01

    Protein-protein interactions (PPIs) are essential to all biological processes and they represent increasingly important therapeutic targets. Here, we present a new method for accurately predicting protein-protein interfaces, understanding their properties, origins and binding to multiple partners. Contrary to machine learning approaches, our method combines in a rational and very straightforward way three sequence- and structure-based descriptors of protein residues: evolutionary conservation, physico-chemical properties and local geometry. The implemented strategy yields very precise predictions for a wide range of protein-protein interfaces and discriminates them from small-molecule binding sites. Beyond its predictive power, the approach makes it possible to dissect interaction surfaces and unravel their complexity. We show how the analysis of the predicted patches can foster new strategies for PPI modulation and interaction surface redesign. The approach is implemented in JET2, an automated tool based on the Joint Evolutionary Trees (JET) method for sequence-based protein interface prediction. JET2 is freely available at www.lcqb.upmc.fr/JET2. PMID:26690684

  4. Periodontal inflamed surface area and C-reactive protein as predictors of HbA1c: a study in Indonesia.

    PubMed

    Susanto, Hendri; Nesse, Willem; Dijkstra, Pieter U; Hoedemaker, Evelien; van Reenen, Yvonne Huijser; Agustina, Dewi; Vissink, Arjan; Abbas, Frank

    2012-08-01

    Periodontitis may exert an infectious and inflammatory burden, evidenced by increased C-reactive protein (CRP). This burden may impair blood glucose control (HbA1c). The aim of our study was to analyze whether periodontitis severity as measured with the periodontal inflamed surface area (PISA) and CRP predict HbA1c levels in a group of healthy Indonesians and a group of Indonesians treated for type 2 diabetes mellitus (DM2). A full-mouth periodontal examination, including probing pocket depth, gingival recession, clinical attachment loss, plaque index and bleeding on probing, was performed in 132 healthy Indonesians and 101 Indonesians treated for DM2. Using these data, PISA was calculated. In addition, HbA1c and CRP were analyzed. A validated questionnaire was used to assess smoking, body mass index (BMI), education and medical conditions. In regression analyses, it was assessed whether periodontitis severity and CRP predict HbA1c, controlling for confounding and effect modification (i.e., age, sex, BMI, pack years, and education). In healthy Indonesians, PISA and CRP predicted HbA1c as did age, sex, and smoking. In Indonesians treated for DM2, PISA did not predict HbA1c. Periodontitis may impair blood glucose regulation in healthy Indonesians in conjunction with elevated CRP levels. The potential effect of periodontitis on glucose control in DM2 patients may be masked by DM2 treatment. Periodontitis may impair blood glucose control by exerting an inflammatory and infectious burden, evidenced by increased levels of CRP.

  5. Accurate Prediction of Contact Numbers for Multi-Spanning Helical Membrane Proteins

    PubMed Central

    Li, Bian; Mendenhall, Jeffrey; Nguyen, Elizabeth Dong; Weiner, Brian E.; Fischer, Axel W.; Meiler, Jens

    2017-01-01

    Prediction of the three-dimensional (3D) structures of proteins by computational methods is acknowledged as an unsolved problem. Accurate prediction of important structural characteristics such as contact number is expected to accelerate the otherwise slow progress being made in the prediction of 3D structure of proteins. Here, we present a dropout neural network-based method, TMH-Expo, for predicting the contact number of transmembrane helix (TMH) residues from sequence. Neuronal dropout is a strategy where certain neurons of the network are excluded from back-propagation to prevent co-adaptation of hidden-layer neurons. By using neuronal dropout, overfitting was significantly reduced and performance was noticeably improved. For multi-spanning helical membrane proteins, TMH-Expo achieved a remarkable Pearson correlation coefficient of 0.69 between predicted and experimental values and a mean absolute error of only 1.68. In addition, among those membrane protein–membrane protein interface residues, 76.8% were correctly predicted. Mapping of predicted contact numbers onto structures indicates that contact numbers predicted by TMH-Expo reflect the exposure patterns of TMHs and reveal membrane protein–membrane protein interfaces, reinforcing the potential of predicted contact numbers to be used as restraints for 3D structure prediction and protein–protein docking. TMH-Expo can be accessed via a Web server at www.meilerlab.org. PMID:26804342
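
    The dropout strategy described in this abstract can be illustrated with a minimal sketch. It assumes the common "inverted dropout" variant, in which surviving activations are rescaled during training; the abstract does not say which variant TMH-Expo uses, and the function name, drop probability, and toy activations are hypothetical.

```python
import random

def dropout(activations, drop_prob, rng):
    """Zero each unit with probability drop_prob and rescale the survivors."""
    keep_prob = 1.0 - drop_prob
    # A unit whose mask entry is 0 outputs 0, so no gradient flows through it
    # during back-propagation for this training step.
    mask = [1 if rng.random() < keep_prob else 0 for _ in activations]
    dropped = [a * m / keep_prob for a, m in zip(activations, mask)]
    return dropped, mask

rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden = [0.2, 1.5, 0.7, 0.9, 0.3, 1.1]
out, mask = dropout(hidden, drop_prob=0.5, rng=rng)
print(mask)
print(out)
```

    Because a different random subset of hidden units is silenced at every training step, no unit can rely on the presence of a particular partner, which is the co-adaptation the abstract refers to.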

  6. Predicting disulfide connectivity from protein sequence using multiple sequence feature vectors and secondary structure.

    PubMed

    Song, Jiangning; Yuan, Zheng; Tan, Hao; Huber, Thomas; Burrage, Kevin

    2007-12-01

    Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins that play critical roles in stabilizing the protein structures and are commonly found in extracytoplasmic or secreted proteins. In protein folding prediction, the localization of disulfide bonds can greatly reduce the search in conformational space. Therefore, there is a great need to develop computational methods capable of accurately predicting disulfide connectivity patterns in proteins that could have potentially important applications. We have developed a novel method to predict disulfide connectivity patterns from protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and predicted secondary structure by the PSIPRED program. The results indicate that our method could achieve a prediction accuracy of 74.4% and 77.9%, respectively, when averaged on proteins with two to five disulfide bridges using 4-fold cross-validation, measured on the protein and cysteine pair on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. It has been shown that the sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure can significantly improve the prediction accuracy, thus enabling our method to outperform most other currently available predictors. Our work provides a complementary approach to the current algorithms that should be useful in computationally assigning disulfide connectivity patterns and helps in the annotation of protein sequences generated by large-scale whole-genome projects. The prediction web server and Supplementary Material are accessible at http://foo.maths.uq.edu.au/~huber/disulfide
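
    To give a sense of the size of the prediction problem in this abstract, the number of possible connectivity patterns can be counted directly. The sketch below is a combinatorial illustration, not part of the authors' SVR method, and it assumes that every cysteine under consideration takes part in exactly one bridge; the function name is hypothetical.

```python
def connectivity_patterns(n_bonds):
    """Number of ways to pair 2 * n_bonds cysteines into n_bonds bridges."""
    count = 1
    # Product of the odd numbers 1 * 3 * ... * (2 * n_bonds - 1).
    for k in range(1, 2 * n_bonds, 2):
        count *= k
    return count

counts = {n: connectivity_patterns(n) for n in range(2, 6)}
print(counts)
```

    For proteins with two to five bridges this gives 3, 15, 105 and 945 candidate patterns, so a predictor has to single out one pattern from a rapidly growing set.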

  7. Association between pregnancy-associated plasma protein-A levels in the first trimester and gestational diabetes mellitus in Chinese women.

    PubMed

    Cheuk, Q Ky; Lo, T K; Wong, S F; Lee, C P

    2016-02-01

    Several studies have shown that women with pre-existing diabetes mellitus have significantly lower pregnancy-associated plasma protein-A levels than those without. This study aimed to evaluate whether first-trimester pregnancy-associated plasma protein-A multiple of median is associated with gestational diabetes mellitus in Chinese pregnant women. This prospectively collected case series was conducted in a regional hospital in Hong Kong. All consecutive Chinese women with a singleton pregnancy who attended the hospital for their first antenatal visit (before 14 weeks' gestation) from April to July 2014 were included. Pregnancy-associated plasma protein-A multiple of median was compared between the gestational diabetic (especially for early-onset gestational diabetes) and non-diabetic groups. The correlation between pregnancy-associated plasma protein-A level and glycosylated haemoglobin level in women with gestational diabetes was also examined. Of the 520 women recruited, gestational diabetes was diagnosed in 169 (32.5%). Among them, 43 (25.4%) had an early diagnosis, and 167 (98.8%) with the disease were managed by diet alone. The gestational diabetic group did not differ significantly from the non-diabetic group in pregnancy-associated plasma protein-A (0.97 vs 0.99, P=0.40) or free β-human chorionic gonadotrophin multiple of median (1.05 vs 1.02, P=0.29). Compared with the non-gestational diabetic group, women with early diagnosis of gestational diabetes had a non-significant reduction in pregnancy-associated plasma protein-A multiple of median (median, interquartile range: 0.86, 0.57-1.23 vs 0.99, 0.67-1.44; P=0.11). Pregnancy-associated plasma protein-A and glycosylated haemoglobin levels were not correlated in women with gestational diabetes (r=0.027; P=0.74). Chinese women with non-insulin-dependent gestational diabetes did not exhibit significant changes in pregnancy-associated plasma protein-A multiple of median or a correlation between pregnancy-associated plasma protein-A and glycosylated haemoglobin levels. Pregnancy-associated plasma protein-A multiple of median was not predictive of non-insulin-dependent gestational diabetes or early onset of gestational diabetes. There was a high prevalence of gestational diabetes in the Chinese population.

  8. A Method for WD40 Repeat Detection and Secondary Structure Prediction

    PubMed Central

    Wang, Yang; Jiang, Fan; Zhuo, Zhu; Wu, Xian-Hui; Wu, Yun-Dong

    2013-01-01

    WD40-repeat proteins (WD40s), as one of the largest protein families in eukaryotes, play vital roles in assembling protein-protein/DNA/RNA complexes. WD40s fold into similar β-propeller structures despite diversified sequences. A program, WDSP (WD40 repeat protein Structure Predictor), has been developed to accurately identify WD40 repeats and predict their secondary structures. The method is designed specifically for WD40 proteins by incorporating both local residue information and non-local family-specific structural features. It overcomes the problem of highly diversified protein sequences and variable loops. In addition, WDSP achieves a better prediction in identifying multiple WD40-domain proteins by taking the global combination of repeats into consideration. In secondary structure prediction, the average Q3 accuracy of WDSP in a jack-knife test reaches 93.7%. A disease-related protein, LRRK2, was used as a representative example to demonstrate the structure prediction. PMID:23776530
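
    The Q3 figure quoted in this abstract is the share of residues assigned to the correct one of three secondary-structure states. A minimal sketch of that calculation follows; the state letters (H, E, C), the function name, and the toy sequences are illustrative assumptions and do not come from WDSP.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose three-state label is predicted correctly."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Toy per-residue states: E = strand, C = coil (H = helix is not used here).
observed = "CCEEEEECCCEEEEEC"
predicted = "CCEEEECCCCEEEEEC"
score = q3_accuracy(predicted, observed)
print(f"Q3 = {score:.2f}%")
```

    In a jack-knife (leave-one-out) test, each protein is predicted by a model built without it, and the per-protein Q3 values are then averaged.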

  9. PPCM: Combing multiple classifiers to improve protein-protein interaction prediction

    DOE PAGES

    Yao, Jianzhuang; Guo, Hong; Yang, Xiaohan

    2015-08-01

    Determining protein-protein interaction (PPI) in biological systems is of considerable importance, and prediction of PPI has become a popular research area. Although different classifiers have been developed for PPI prediction, no single classifier seems to be able to predict PPI with high confidence. We postulated that by combining individual classifiers the accuracy of PPI prediction could be improved. We developed a method called protein-protein interaction prediction classifiers merger (PPCM), and this method combines output from two PPI prediction tools, GO2PPI and Phyloprof, using Random Forests algorithm. The performance of PPCM was tested by area under the curve (AUC) using an assembled Gold Standard database that contains both positive and negative PPI pairs. Our AUC test showed that PPCM significantly improved the PPI prediction accuracy over the corresponding individual classifiers. We found that additional classifiers incorporated into PPCM could lead to further improvement in the PPI prediction accuracy. Furthermore, cross species PPCM could achieve competitive and even better prediction accuracy compared to the single species PPCM. This study established a robust pipeline for PPI prediction by integrating multiple classifiers using Random Forests algorithm. Ultimately, this pipeline will be useful for predicting PPI in nonmodel species.
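
    The AUC comparison described in this abstract can be sketched in a few lines. The example assumes binary gold-standard labels and computes the AUC as the probability that a positive pair outscores a negative pair. The score merger shown is a plain average used only as a stand-in; PPCM itself merges classifier outputs with Random Forests, which is not reproduced here. All names and scores are invented for illustration.

```python
def auc(scores, labels):
    """Probability that a random positive pair outscores a random negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count pairwise wins; a tie counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy gold standard: 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 1, 1, 0, 0, 0]
clf_a = [0.9, 0.4, 0.6, 0.5, 0.2, 0.7]  # hypothetical scores, classifier A
clf_b = [0.8, 0.7, 0.5, 0.4, 0.6, 0.1]  # hypothetical scores, classifier B
merged = [(a + b) / 2 for a, b in zip(clf_a, clf_b)]  # stand-in merger only

for name, s in [("A", clf_a), ("B", clf_b), ("merged", merged)]:
    print(name, round(auc(s, labels), 3))
```

    The toy scores are constructed so that the two classifiers make different mistakes, which is the situation in which merging their outputs can help.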

  10. Plasma Shh levels reduced in pancreatic cancer patients

    PubMed Central

    El-Zaatari, Mohamad; Daignault, Stephanie; Tessier, Art; Kelsey, Gail; Travnikar, Lisa A.; Cantu, Esperanza F.; Lee, Jamie; Plonka, Caitlyn M.; Simeone, Diane M.; Anderson, Michelle A.; Merchant, Juanita L.

    2012-01-01

    Objectives Normally, sonic hedgehog (Shh) is expressed in the pancreas during fetal development and transiently after tissue injury. Although pancreatic cancers express Shh, it is not known if the protein is secreted into the blood and whether its plasma levels change with pancreatic transformation. The goal of this study was to develop an ELISA to detect human Shh in blood, and determine the levels in subjects with and without pancreatic cancer. Methods A human Shh ELISA assay was developed, and plasma Shh levels were measured in blood samples from normal volunteers and subjects with pancreatitis or pancreatic cancer. The biological activity of plasma Shh was tested using NIH-3T3 cells. Results The average levels of Shh in human blood were lower in pancreatitis and pancreatic cancer patients than in normal individuals. Hematopoietic cells did not express Shh suggesting that Shh is secreted into the bloodstream. Plasma fractions enriched for Shh did not induce Gli-1 mRNA suggesting that the protein was not biologically active. Conclusions Shh is secreted from tissues and organs into the circulation but its activity is blocked by plasma proteins. Reduced plasma levels were found in pancreatic cancer patients, but alone were not sufficient to predict pancreatic cancer. PMID:22513293

  11. Plasma Shh levels reduced in pancreatic cancer patients.

    PubMed

    El-Zaatari, Mohamad; Daignault, Stephanie; Tessier, Art; Kelsey, Gail; Travnikar, Lisa A; Cantu, Esperanza F; Lee, Jamie; Plonka, Caitlyn M; Simeone, Diane M; Anderson, Michelle A; Merchant, Juanita L

    2012-10-01

    Normally, sonic hedgehog (Shh) is expressed in the pancreas during fetal development and transiently after tissue injury. Although pancreatic cancers express Shh, it is not known if the protein is secreted into the blood and whether its plasma levels change with pancreatic transformation. The goal of this study was to develop an enzyme-linked immunosorbent assay to detect human Shh in blood and determine its levels in subjects with and without pancreatic cancer. A human Shh enzyme-linked immunosorbent assay was developed, and plasma Shh levels were measured in blood samples from healthy subjects and patients with pancreatitis or pancreatic cancer. The biological activity of plasma Shh was tested using NIH-3T3 cells. The mean levels of Shh in human blood were lower in patients with pancreatitis and pancreatic cancer than in healthy subjects. Hematopoietic cells did not express Shh, suggesting that Shh is secreted into the bloodstream. Plasma fractions enriched with Shh did not induce Gli-1 messenger RNA, suggesting that the protein was not biologically active. Shh is secreted from tissues and organs into the circulation, but its activity is blocked by plasma proteins. Reduced plasma levels were found in pancreatic cancer patients, but alone were not sufficient to predict pancreatic cancer.

  12. [Near-infrared reflectance spectroscopy predicts protein, moisture and ash in beans].

    PubMed

    Gao, Huiyu; Wang, Guodong; Men, Jianhua; Wang, Zhu

    2017-05-01

    To explore the potential of near-infrared reflectance (NIR) spectroscopy to determine macronutrient contents in beans. NIR spectra and analytical measurements of protein, moisture and ash were collected from 70 kinds of beans. Reference methods were used to analyze all the ground beans samples. NIR spectra on intact and ground beans samples were registered. Partial least-squares (PLS) regression models were developed with principal components analysis (PCA) to assign 49 bean accessions to a calibration data set and 21 accessions to an external validation set. For intact beans, the relative predictive determinant (RPD) values for protein and ash (3.67 and 3.97, respectively) were good for screening. RPD value for moisture was only 1.39, which was not recommended. For ground beans, the RPD values for protein, moisture and ash (6.63, 5.25 and 3.57, respectively) were good enough for screening. The protein, moisture and ash levels for intact and ground beans were all significantly correlated (P < 0.001) between the NIR and reference method and there was no statistically significant difference in the mean with these three traits. This research demonstrates that NIR is a promising technique for simultaneous sorting of multiple traits in beans with no or easy sample preparation.
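
    The RPD values in this abstract can be put in context with a minimal sketch. It assumes the definition commonly used in NIR calibration work, the standard deviation of the reference values divided by the root-mean-square error of prediction on the validation set; the abstract does not state the formula, and the function name and toy values are hypothetical.

```python
import math
import statistics

def rpd(reference, predicted):
    """Assumed definition: SD of reference values / RMSE of prediction."""
    sd = statistics.stdev(reference)  # sample standard deviation
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    rmsep = math.sqrt(sum(squared_errors) / len(reference))
    return sd / rmsep

# Toy validation set: reference protein content (%) and NIR predictions.
reference = [20.0, 22.0, 24.0, 26.0, 28.0]
predicted = [21.0, 21.0, 25.0, 25.0, 29.0]
value = rpd(reference, predicted)
print(f"RPD = {value:.2f}")
```

    Under this definition, a larger RPD means the prediction error is small relative to the natural spread of the trait, which is why a value near 1.4 for moisture in intact beans is treated as inadequate.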

  13. Heterodimer Binding Scaffolds Recognition via the Analysis of Kinetically Hot Residues.

    PubMed

    Perišić, Ognjen

    2018-03-16

    Physical interactions between proteins are often difficult to decipher. The aim of this paper is to present an algorithm that is designed to recognize binding patches and supporting structural scaffolds of interacting heterodimer proteins using the Gaussian Network Model (GNM). The recognition is based on the (self) adjustable identification of kinetically hot residues and their connection to possible binding scaffolds. The kinetically hot residues are residues with the lowest entropy, i.e., the highest contribution to the weighted sum of the fastest modes per chain extracted via GNM. The algorithm adjusts the number of fast modes in the GNM's weighted sum calculation using the ratio of predicted and expected numbers of target residues (contact and the neighboring first-layer residues). This approach produces very good results when applied to dimers with high protein sequence length ratios. The protocol's ability to recognize near-native decoys was compared to the ability of the residue-level statistical potential of Lu and Skolnick using the Sternberg and Vakser decoy dimer sets. The statistical potential produced better overall results, but in a number of cases its predictive ability was comparable, or even inferior, to that of the adjustable GNM approach. The results presented in this paper suggest that in heterodimers at least one protein has an interacting scaffold determined by the immovable, kinetically hot residues. In many cases, interacting proteins (especially those of noticeably different sizes) either behave as a rigid lock and key or, presumably, exhibit the opposite dynamic behavior. While the binding surface of one protein is rigid and stable, its partner's interacting scaffold is more flexible and adaptable.

  14. Protein-RNA interface residue prediction using machine learning: an assessment of the state of the art.

    PubMed

    Walia, Rasna R; Caragea, Cornelia; Lewis, Benjamin A; Towfic, Fadi; Terribilini, Michael; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant

    2012-05-10

    RNA molecules play diverse functional and structural roles in cells. They function as messengers for transferring genetic information from DNA to proteins, as the primary genetic material in many viruses, as catalysts (ribozymes) important for protein synthesis and RNA processing, and as essential and ubiquitous regulators of gene expression in living organisms. Many of these functions depend on precisely orchestrated interactions between RNA molecules and specific proteins in cells. Understanding the molecular mechanisms by which proteins recognize and bind RNA is essential for comprehending the functional implications of these interactions, but the recognition 'code' that mediates interactions between proteins and RNA is not yet understood. Success in deciphering this code would dramatically impact the development of new therapeutic strategies for intervening in devastating diseases such as AIDS and cancer. Because of the high cost of experimental determination of protein-RNA interfaces, there is an increasing reliance on statistical machine learning methods for training predictors of RNA-binding residues in proteins. However, because of differences in the choice of datasets, performance measures, and data representations used, it has been difficult to obtain an accurate assessment of the current state of the art in protein-RNA interface prediction. We provide a review of published approaches for predicting RNA-binding residues in proteins and a systematic comparison and critical assessment of protein-RNA interface residue predictors trained using these approaches on three carefully curated non-redundant datasets. We directly compare two widely used machine learning algorithms (Naïve Bayes (NB) and Support Vector Machine (SVM)) using three different data representations in which features are encoded using either sequence- or structure-based windows. 
Our results show that (i) Sequence-based classifiers that use a position-specific scoring matrix (PSSM)-based representation (PSSMSeq) outperform those that use an amino acid identity based representation (IDSeq) or a smoothed PSSM (SmoPSSMSeq); (ii) Structure-based classifiers that use smoothed PSSM representation (SmoPSSMStr) outperform those that use PSSM (PSSMStr) as well as sequence identity based representation (IDStr). PSSMSeq classifiers, when tested on an independent test set of 44 proteins, achieve performance that is comparable to that of three state-of-the-art structure-based predictors (including those that exploit geometric features) in terms of Matthews Correlation Coefficient (MCC), although the structure-based methods achieve substantially higher Specificity (albeit at the expense of Sensitivity) compared to sequence-based methods. We also find that the expected performance of the classifiers on a residue level can be markedly different from that on a protein level. Our experiments show that the classifiers trained on three different non-redundant protein-RNA interface datasets achieve comparable cross-validation performance. However, we find that the results are significantly affected by differences in the distance threshold used to define interface residues. Our results demonstrate that protein-RNA interface residue predictors that use a PSSM-based encoding of sequence windows outperform classifiers that use other encodings of sequence windows. While structure-based methods that exploit geometric features can yield significant increases in the Specificity of protein-RNA interface residue predictions, such increases are offset by decreases in Sensitivity. 
These results underscore the importance of comparing alternative methods using rigorous statistical procedures, multiple performance measures, and datasets that are constructed based on several alternative definitions of interface residues and redundancy cutoffs as well as including evaluations on independent test sets into the comparisons.
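    The performance measures named in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the textbook definitions of Sensitivity (TP/(TP+FN)) and Specificity (TN/(TN+FP)), whereas the paper may define Specificity differently (some interface-prediction studies use the term for precision), and the two proteins and their labels are made-up toy data. It also shows why a residue-level score (all residues pooled) can differ from a protein-level score (one score per protein, then averaged).

```python
import math

def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = interface residue)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def sensitivity(tp, fp, tn, fn):
    """Fraction of true interface residues that were predicted (textbook definition)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def specificity(tp, fp, tn, fn):
    """Fraction of non-interface residues correctly rejected (textbook definition)."""
    return tn / (tn + fp) if (tn + fp) else 0.0

def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical toy data: (true labels, predicted labels) per protein.
proteins = {
    "protA": ([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]),
    "protB": ([1, 0, 0, 0], [1, 0, 0, 0]),
}

def residue_level_mcc(proteins):
    """Pool the residues of all proteins, then compute a single MCC."""
    all_true = [t for y_true, _ in proteins.values() for t in y_true]
    all_pred = [p for _, y_pred in proteins.values() for p in y_pred]
    return mcc(*confusion_counts(all_true, all_pred))

def protein_level_mcc(proteins):
    """Compute one MCC per protein, then average over proteins."""
    scores = [mcc(*confusion_counts(t, p)) for t, p in proteins.values()]
    return sum(scores) / len(scores)

if __name__ == "__main__":
    print("residue-level MCC:", residue_level_mcc(proteins))
    print("protein-level MCC:", protein_level_mcc(proteins))
```

    Reporting both aggregation levels side by side makes it visible when a few large proteins dominate the pooled score.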

  15. In-depth analysis of the thylakoid membrane proteome of Arabidopsis thaliana chloroplasts: new proteins, new functions, and a plastid proteome database.

    PubMed

    Friso, Giulia; Giacomelli, Lisa; Ytterberg, A Jimmy; Peltier, Jean-Benoit; Rudella, Andrea; Sun, Qi; van Wijk, Klaas J

    2004-02-01

    An extensive analysis of the Arabidopsis thaliana peripheral and integral thylakoid membrane proteome was performed by sequential extractions with salt, detergent, and organic solvents, followed by multidimensional protein separation steps (reverse-phase HPLC and one- and two-dimensional electrophoresis gels), different enzymatic and nonenzymatic protein cleavage techniques, mass spectrometry, and bioinformatics. Altogether, 154 proteins were identified, of which 76 (49%) were alpha-helical integral membrane proteins. Twenty-seven new proteins without known function but with predicted chloroplast transit peptides were identified, of which 17 (63%) are integral membrane proteins. These new proteins, likely important in thylakoid biogenesis, include two rubredoxins, a potential metallochaperone, and a new DnaJ-like protein. The data were integrated with our analysis of the lumenal-enriched proteome. We identified 83 out of 100 known proteins of the thylakoid localized photosynthetic apparatus, including several new paralogues and some 20 proteins involved in protein insertion, assembly, folding, or proteolysis. An additional 16 proteins are involved in translation, demonstrating that the thylakoid membrane surface is an important site for protein synthesis. The high coverage of the photosynthetic apparatus and the identification of known hydrophobic proteins with low expression levels, such as cpSecE, Ohp1, and Ohp2, indicate an excellent dynamic resolution of the analysis. The sequential extraction process proved very helpful to validate transmembrane prediction. Our data also were cross-correlated to chloroplast subproteome analyses by other laboratories. All data are deposited in a new curated plastid proteome database (PPDB) with multiple search functions (http://cbsusrv01.tc.cornell.edu/users/ppdb/). This PPDB will serve as an expandable resource for the plant community.

  16. Predicting Ligand Binding Sites on Protein Surfaces by 3-Dimensional Probability Density Distributions of Interacting Atoms

    PubMed Central

    Jian, Jhih-Wei; Elumalai, Pavadai; Pitti, Thejkiran; Wu, Chih Yuan; Tsai, Keng-Chang; Chang, Jeng-Yih; Peng, Hung-Pin; Yang, An-Suei

    2016-01-01

    Predicting ligand binding sites (LBSs) on protein structures, which are obtained either from experimental or computational methods, is a useful first step in functional annotation or structure-based drug design for the protein structures. In this work, the structure-based machine learning algorithm ISMBLab-LIG was developed to predict LBSs on protein surfaces with input attributes derived from the three-dimensional probability density maps of interacting atoms, which were reconstructed on the query protein surfaces and were relatively insensitive to local conformational variations of the tentative ligand binding sites. The prediction accuracy of the ISMBLab-LIG predictors is comparable to that of the best LBS predictors benchmarked on several well-established testing datasets. More importantly, the ISMBLab-LIG algorithm has substantial tolerance to the prediction uncertainties of computationally derived protein structure models. As such, the method is particularly useful for predicting LBSs not only on experimental protein structures without known LBS templates in the database but also on computationally predicted model protein structures with structural uncertainties in the tentative ligand binding sites. PMID:27513851

  17. Comparison of the performances of copeptin and multiple biomarkers in long-term prognosis of severe traumatic brain injury.

    PubMed

    Zhang, Zu-Yong; Zhang, Li-Xin; Dong, Xiao-Qiao; Yu, Wen-Hua; Du, Quan; Yang, Ding-Bo; Shen, Yong-Feng; Wang, Hao; Zhu, Qiang; Che, Zhi-Hao; Liu, Qun-Jie; Jiang, Li; Du, Yuan-Feng

    2014-10-01

    Enhanced blood levels of copeptin correlate with poor clinical outcomes after acute critical illness. This study aimed to compare the prognostic performances of plasma concentrations of copeptin and other biomarkers like myelin basic protein, glial fibrillary astrocyte protein, S100B, neuron-specific enolase, phosphorylated axonal neurofilament subunit H, Tau and ubiquitin carboxyl-terminal hydrolase L1 in severe traumatic brain injury. We recruited 102 healthy controls and 102 acute patients with severe traumatic brain injury. Plasma concentrations of these biomarkers were determined using enzyme-linked immunosorbent assay. Their performances in predicting 6-month mortality and unfavorable outcome (Glasgow Outcome Scale score of 1-3) were compared. Plasma concentrations of these biomarkers were statistically significantly higher in all patients than in healthy controls, in non-survivors than in survivors and in patients with unfavorable outcome than with favorable outcome. Areas under receiver operating characteristic curves of plasma concentrations of these biomarkers were similar to those of Glasgow Coma Scale score for prognostic prediction. Except for plasma copeptin concentration, the plasma concentrations of the other biomarkers did not statistically significantly improve the prognostic predictive value of the Glasgow Coma Scale score. Copeptin levels may be a useful tool to predict long-term clinical outcomes after severe traumatic brain injury and have a potential to assist clinicians.

  18. Ligand and structure-based methodologies for the prediction of the activity of G protein-coupled receptor ligands

    NASA Astrophysics Data System (ADS)

    Costanzi, Stefano; Tikhonova, Irina G.; Harden, T. Kendall; Jacobson, Kenneth A.

    2009-11-01

    Accurate in silico models for the quantitative prediction of the activity of G protein-coupled receptor (GPCR) ligands would greatly facilitate the process of drug discovery and development. Several methodologies have been developed based on the properties of the ligands, the direct study of the receptor-ligand interactions, or a combination of both approaches. Ligand-based three-dimensional quantitative structure-activity relationships (3D-QSAR) techniques, not requiring knowledge of the receptor structure, have been historically the first to be applied to the prediction of the activity of GPCR ligands. They are generally endowed with robustness and good ranking ability; however they are highly dependent on training sets. Structure-based techniques generally do not provide the level of accuracy necessary to yield meaningful rankings when applied to GPCR homology models. However, they are essentially independent from training sets and have a sufficient level of accuracy to allow an effective discrimination between binders and nonbinders, thus qualifying as viable lead discovery tools. The combination of ligand and structure-based methodologies in the form of receptor-based 3D-QSAR and ligand and structure-based consensus models results in robust and accurate quantitative predictions. The contribution of the structure-based component to these combined approaches is expected to become more substantial and effective in the future, as more sophisticated scoring functions are developed and more detailed structural information on GPCRs is gathered.

  19. Physicochemical characteristics of structurally determined metabolite-protein and drug-protein binding events with respect to binding specificity.

    PubMed

    Korkuć, Paula; Walther, Dirk

    2015-01-01

    To better understand and ultimately predict both the metabolic activities as well as the signaling functions of metabolites, a detailed understanding of the physical interactions of metabolites with proteins is highly desirable. Focusing in particular on protein binding specificity vs. promiscuity, we performed a comprehensive analysis of the physicochemical properties of compound-protein binding events as reported in the Protein Data Bank (PDB). We compared the molecular and structural characteristics obtained for metabolites to those of the well-studied interactions of drug compounds with proteins. Promiscuously binding metabolites and drugs are characterized by low molecular weight and high structural flexibility. Unlike reported for drug compounds, low rather than high hydrophobicity appears associated, albeit weakly, with promiscuous binding for the metabolite set investigated in this study. Across several physicochemical properties, drug compounds exhibit characteristic binding propensities that are distinguishable from those associated with metabolites. Prediction of target diversity and compound promiscuity using physicochemical properties was possible at modest accuracy levels only, but was consistently better for drugs than for metabolites. Compound properties capturing structural flexibility and hydrogen-bond formation descriptors proved most informative in PLS-based prediction models. With regard to diversity of enzymatic activities of the respective metabolite target enzymes, the metabolites benzylsuccinate, hypoxanthine, trimethylamine N-oxide, oleoylglycerol, and resorcinol showed very narrow process involvement, while glycine, imidazole, tryptophan, succinate, and glutathione were identified to possess broad enzymatic reaction scopes. 
Promiscuous metabolites were found to mainly serve as general energy currency compounds, but were identified to also be involved in signaling processes and to appear in diverse organismal systems (digestive and nervous system) suggesting specific molecular and physiological roles of promiscuous metabolites.

  20. Protein complex prediction for large protein protein interaction networks with the Core&Peel method.

    PubMed

    Pellegrini, Marco; Baglioni, Miriam; Geraci, Filippo

    2016-11-08

    Biological networks play an increasingly important role in the exploration of functional modularity and cellular organization at a systemic level. Quite often the first tools used to analyze these networks are clustering algorithms. We concentrate here on the specific task of predicting protein complexes (PC) in large protein-protein interaction networks (PPIN). Currently, many state-of-the-art algorithms work well for networks of small or moderate size. However, their performance on much larger networks, which are becoming increasingly common in modern proteome-wide studies, needs to be re-assessed. We present a new fast algorithm for clustering large sparse networks: Core&Peel, which runs essentially in time and storage O(a(G)m+n) for a network G of n nodes and m arcs, where a(G) is the arboricity of G (which is roughly proportional to the maximum average degree of any induced subgraph in G). We evaluated Core&Peel on five PPI networks of large size and one of medium size from both yeast and Homo sapiens, comparing its performance against those of ten state-of-the-art methods. We demonstrate that Core&Peel consistently outperforms the ten competitors in its ability to identify known protein complexes and in the functional coherence of its predictions. Our method is remarkably robust, being quite insensitive to the injection of random interactions. Core&Peel is also empirically efficient, attaining the second best running time over large networks among the tested algorithms. Our algorithm Core&Peel pushes forward the state-of-the-art in PPIN clustering, providing an algorithmic solution with polynomial running time that attains experimentally demonstrable good output quality and speed on challenging large real networks.
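    The abstract bounds the running time by the arboricity a(G) but does not describe the algorithm's steps. As a loosely related illustration only, the sketch below computes standard core numbers by repeatedly removing a minimum-degree node. This is not the authors' Core&Peel implementation; it is shown because the largest core number (the degeneracy) is a well-known proxy for arboricity, lying between a(G) and 2a(G) - 1. The function name and the toy graph are hypothetical.

```python
def core_numbers(adj):
    """Return {node: core number} for an undirected graph.

    adj maps each node to the set of its neighbours. This simple version
    scans for the minimum-degree node each round, so it is quadratic in the
    number of nodes; a bucket-based variant runs in O(n + m).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel off a node of minimum remaining degree.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core

# Hypothetical toy network: a triangle (a, b, c) with a pendant node d.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

if __name__ == "__main__":
    cores = core_numbers(toy)
    print(cores)
    print("degeneracy:", max(cores.values()))
```

    Dense regions of a sparse interaction network show up as nodes with high core numbers, which is why core decomposition is a common first filter before more expensive clustering.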

  1. Physicochemical characteristics of structurally determined metabolite-protein and drug-protein binding events with respect to binding specificity

    PubMed Central

    Korkuć, Paula; Walther, Dirk

    2015-01-01

    To better understand and ultimately predict both the metabolic activities as well as the signaling functions of metabolites, a detailed understanding of the physical interactions of metabolites with proteins is highly desirable. Focusing in particular on protein binding specificity vs. promiscuity, we performed a comprehensive analysis of the physicochemical properties of compound-protein binding events as reported in the Protein Data Bank (PDB). We compared the molecular and structural characteristics obtained for metabolites to those of the well-studied interactions of drug compounds with proteins. Promiscuously binding metabolites and drugs are characterized by low molecular weight and high structural flexibility. Unlike reported for drug compounds, low rather than high hydrophobicity appears associated, albeit weakly, with promiscuous binding for the metabolite set investigated in this study. Across several physicochemical properties, drug compounds exhibit characteristic binding propensities that are distinguishable from those associated with metabolites. Prediction of target diversity and compound promiscuity using physicochemical properties was possible at modest accuracy levels only, but was consistently better for drugs than for metabolites. Compound properties capturing structural flexibility and hydrogen-bond formation descriptors proved most informative in PLS-based prediction models. With regard to diversity of enzymatic activities of the respective metabolite target enzymes, the metabolites benzylsuccinate, hypoxanthine, trimethylamine N-oxide, oleoylglycerol, and resorcinol showed very narrow process involvement, while glycine, imidazole, tryptophan, succinate, and glutathione were identified to possess broad enzymatic reaction scopes. 
Promiscuous metabolites were found to mainly serve as general energy currency compounds, but were identified to also be involved in signaling processes and to appear in diverse organismal systems (digestive and nervous system) suggesting specific molecular and physiological roles of promiscuous metabolites. PMID:26442281

  2. As Simple As Possible, but Not Simpler: Exploring the Fidelity of Coarse-Grained Protein Models for Simulated Force Spectroscopy

    PubMed Central

    Rottler, Jörg; Plotkin, Steven S.

    2016-01-01

    Mechanical unfolding of a single domain of loop-truncated superoxide dismutase protein has been simulated via force spectroscopy techniques with both all-atom (AA) models and several coarse-grained models having different levels of resolution: A Gō model containing all heavy atoms in the protein (HA-Gō), the associative memory, water mediated, structure and energy model (AWSEM) which has 3 interaction sites per amino acid, and a Gō model containing only one interaction site per amino acid at the Cα position (Cα-Gō). To systematically compare results across models, the scales of time, energy, and force had to be suitably renormalized in each model. Surprisingly, the HA-Gō model gives the softest protein, exhibiting much smaller force peaks than all other models after the above renormalization. Clustering to render a structural taxonomy as the protein unfolds showed that the AA, HA-Gō, and Cα-Gō models exhibit a single pathway for early unfolding, which eventually bifurcates repeatedly to multiple branches only after the protein is about half-unfolded. The AWSEM model shows a single dominant unfolding pathway over the whole range of unfolding, in contrast to all other models. TM alignment, clustering analysis, and native contact maps show that the AWSEM pathway has however the most structural similarity to the AA model at high nativeness, but the least structural similarity to the AA model at low nativeness. In comparison to the AA model, the sequence of native contact breakage is best predicted by the HA-Gō model. All models consistently predict a similar unfolding mechanism for early force-induced unfolding events, but diverge in their predictions for late stage unfolding events when the protein is more significantly disordered. PMID:27898663

  3. As Simple As Possible, but Not Simpler: Exploring the Fidelity of Coarse-Grained Protein Models for Simulated Force Spectroscopy.

    PubMed

    Habibi, Mona; Rottler, Jörg; Plotkin, Steven S

    2016-11-01

    Mechanical unfolding of a single domain of loop-truncated superoxide dismutase protein has been simulated via force spectroscopy techniques with both all-atom (AA) models and several coarse-grained models having different levels of resolution: A Gō model containing all heavy atoms in the protein (HA-Gō), the associative memory, water mediated, structure and energy model (AWSEM) which has 3 interaction sites per amino acid, and a Gō model containing only one interaction site per amino acid at the Cα position (Cα-Gō). To systematically compare results across models, the scales of time, energy, and force had to be suitably renormalized in each model. Surprisingly, the HA-Gō model gives the softest protein, exhibiting much smaller force peaks than all other models after the above renormalization. Clustering to render a structural taxonomy as the protein unfolds showed that the AA, HA-Gō, and Cα-Gō models exhibit a single pathway for early unfolding, which eventually bifurcates repeatedly to multiple branches only after the protein is about half-unfolded. The AWSEM model shows a single dominant unfolding pathway over the whole range of unfolding, in contrast to all other models. TM alignment, clustering analysis, and native contact maps show that the AWSEM pathway has however the most structural similarity to the AA model at high nativeness, but the least structural similarity to the AA model at low nativeness. In comparison to the AA model, the sequence of native contact breakage is best predicted by the HA-Gō model. All models consistently predict a similar unfolding mechanism for early force-induced unfolding events, but diverge in their predictions for late stage unfolding events when the protein is more significantly disordered.

  4. Designing and benchmarking the MULTICOM protein structure prediction system

    PubMed Central

    2013-01-01

    Background: Predicting protein structure from sequence is one of the most significant and challenging problems in bioinformatics. Numerous bioinformatics techniques and tools have been developed to tackle almost every aspect of protein structure prediction ranging from structural feature prediction, template identification and query-template alignment to structure sampling, model quality assessment, and model refinement. How to synergistically select, integrate and improve the strengths of the complementary techniques at each prediction stage and build a high-performance system is becoming a critical issue for constructing a successful, competitive protein structure predictor. Results: Over the past several years, we have constructed a standalone protein structure prediction system MULTICOM that combines multiple sources of information and complementary methods at all five stages of the protein structure prediction process including template identification, template combination, model generation, model assessment, and model refinement. The system was blindly tested during the ninth Critical Assessment of Techniques for Protein Structure Prediction (CASP9) in 2010 and yielded very good performance. In addition to studying the overall performance on the CASP9 benchmark, we thoroughly investigated the performance and contributions of each component at each stage of prediction. Conclusions: Our comprehensive and comparative study not only provides useful and practical insights about how to select, improve, and integrate complementary methods to build a cutting-edge protein structure prediction system but also identifies a few new sources of information that may help improve the design of a protein structure prediction system. Several components used in the MULTICOM system are available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/. PMID:23442819

  5. A Theoretical Lower Bound for Selection on the Expression Levels of Proteins

    DOE PAGES

    Price, Morgan N.; Arkin, Adam P.

    2016-06-11

    We use simple models of the costs and benefits of microbial gene expression to show that changing a protein's expression away from its optimum by 2-fold should reduce fitness by at least [Formula: see text], where P is the fraction of the cell's protein that the gene accounts for. As microbial genes are usually expressed at above 5 parts per million, and effective population sizes are likely to be above 10^6, this implies that 2-fold changes to gene expression levels are under strong selection, as [Formula: see text], where Ne is the effective population size and s is the selection coefficient. Thus, most gene duplications should be selected against. On the other hand, we predict that for most genes, small changes in the expression will be effectively neutral.

  6. A Theoretical Lower Bound for Selection on the Expression Levels of Proteins

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Price, Morgan N.; Arkin, Adam P.

    We use simple models of the costs and benefits of microbial gene expression to show that changing a protein's expression away from its optimum by 2-fold should reduce fitness by at least [Formula: see text], where P is the fraction of the cell's protein that the gene accounts for. As microbial genes are usually expressed at above 5 parts per million, and effective population sizes are likely to be above 10^6, this implies that 2-fold changes to gene expression levels are under strong selection, as [Formula: see text], where Ne is the effective population size and s is the selection coefficient. Thus, most gene duplications should be selected against. On the other hand, we predict that for most genes, small changes in the expression will be effectively neutral.

  7. A Particle Swarm Optimization-Based Approach with Local Search for Predicting Protein Folding.

    PubMed

    Yang, Cheng-Hong; Lin, Yu-Shiun; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2017-10-01

    The hydrophobic-polar (HP) model is commonly used for predicting protein folding structures and hydrophobic interactions. This study developed a particle swarm optimization (PSO)-based algorithm combined with local search algorithms; specifically, the high exploration PSO (HEPSO) algorithm (which can execute global search processes) was combined with three local search algorithms (hill-climbing algorithm, greedy algorithm, and Tabu table), yielding the proposed HE-L-PSO algorithm. By using 20 known protein structures, we evaluated the performance of the HE-L-PSO algorithm in predicting protein folding in the HP model. The proposed HE-L-PSO algorithm exhibited favorable performance in predicting both short and long amino acid sequences with high reproducibility and stability, compared with seven reported algorithms. The HE-L-PSO algorithm yielded optimal solutions for all predicted protein folding structures. All HE-L-PSO-predicted protein folding structures possessed a hydrophobic core that is similar to normal protein folding.

  8. Relationship between global structural parameters and Enzyme Commission hierarchy: implications for function prediction.

    PubMed

    Boareto, Marcelo; Yamagishi, Michel E B; Caticha, Nestor; Leite, Vitor B P

    2012-10-01

    In protein databases there is a substantial number of proteins structurally determined but without function annotation. Understanding the relationship between function and structure can be useful to predict function on a large scale. We have analyzed the similarities in global physicochemical parameters for a set of enzymes which were classified according to the four Enzyme Commission (EC) hierarchical levels. Using relevance theory we introduced a distance between proteins in the space of physicochemical characteristics. This was done by minimizing a cost function of the metric tensor built to reflect the EC classification system. Using an unsupervised clustering method on a set of 1025 enzymes, we obtained no relevant clustering formation compatible with EC classification. The distance distributions between enzymes from the same EC group and from different EC groups were compared by histograms. Such analysis was also performed using sequence alignment similarity as a distance. Our results suggest that global structure parameters are not sufficient to segregate enzymes according to EC hierarchy. This indicates that features essential for function are local rather than global. Consequently, methods for predicting function based on global attributes should not obtain high accuracy in main EC classes prediction without relying on similarities between enzymes from training and validation datasets. Furthermore, these results are consistent with a substantial number of studies suggesting that function evolves fundamentally by recruitment, i.e., the same protein motif or fold can be used to perform different enzymatic functions and a few specific amino acids (AAs) are actually responsible for enzyme activity. These essential amino acids should belong to active sites and an effective method for predicting function should be able to recognize them.

  9. Absolute Measurements of Macrophage Migration Inhibitory Factor and Interleukin-1-β mRNA Levels Accurately Predict Treatment Response in Depressed Patients.

    PubMed

    Cattaneo, Annamaria; Ferrari, Clarissa; Uher, Rudolf; Bocchio-Chiavetto, Luisella; Riva, Marco Andrea; Pariante, Carmine M

    2016-10-01

    Increased levels of inflammation have been associated with a poorer response to antidepressants in several clinical samples, but these findings have been limited by low reproducibility of biomarker assays across laboratories, difficulty in predicting response probability on an individual basis, and unclear molecular mechanisms. Here we measured absolute mRNA values (a reliable quantitation of number of molecules) of Macrophage Migration Inhibitory Factor and interleukin-1β in a previously published sample from a randomized controlled trial comparing escitalopram vs nortriptyline (GENDEP) as well as in an independent, naturalistic replication sample. We then used linear discriminant analysis to calculate mRNA values cutoffs that best discriminated between responders and nonresponders after 12 weeks of antidepressants. As Macrophage Migration Inhibitory Factor and interleukin-1β might be involved in different pathways, we constructed a protein-protein interaction network by the Search Tool for the Retrieval of Interacting Genes/Proteins. We identified cutoff values for the absolute mRNA measures that accurately predicted response probability on an individual basis, with positive predictive values and specificity for nonresponders of 100% in both samples (negative predictive value=82% to 85%, sensitivity=52% to 61%). Using network analysis, we identified different clusters of targets for these 2 cytokines, with Macrophage Migration Inhibitory Factor interacting predominantly with pathways involved in neurogenesis, neuroplasticity, and cell proliferation, and interleukin-1β interacting predominantly with pathways involved in the inflammasome complex, oxidative stress, and neurodegeneration. 
We believe that these data provide a clinically suitable approach to the personalization of antidepressant therapy: patients who have absolute mRNA values above the suggested cutoffs could be directed toward earlier access to more assertive antidepressant strategies, including the addition of other antidepressants or antiinflammatory drugs.

  10. Improved method for predicting protein fold patterns with ensemble classifiers.

    PubMed

    Chen, W; Liu, X; Huang, Y; Jiang, Y; Zou, Q; Lin, C

    2012-01-27

    Protein folding is recognized as a critical problem in the field of biophysics in the 21st century. Predicting protein-folding patterns is challenging due to the complex structure of proteins. In an attempt to solve this problem, we employed ensemble classifiers to improve prediction accuracy. In our experiments, 188-dimensional features were extracted based on the composition and physical-chemical property of proteins and 20-dimensional features were selected using a coupled position-specific scoring matrix. Compared with traditional prediction methods, these methods were superior in terms of prediction accuracy. The 188-dimensional feature-based method achieved 71.2% accuracy in five cross-validations. The accuracy rose to 77% when we used a 20-dimensional feature vector. These methods were used on recent data, with 54.2% accuracy. Source codes and dataset, together with web server and software tools for prediction, are available at: http://datamining.xmu.edu.cn/main/~cwc/ProteinPredict.html.

  11. Protein-Protein Interface Predictions by Data-Driven Methods: A Review

    PubMed Central

    Xue, Li C; Dobbs, Drena; Bonvin, Alexandre M.J.J.; Honavar, Vasant

    2015-01-01

    Reliably pinpointing which specific amino acid residues form the interface(s) between a protein and its binding partner(s) is critical for understanding the structural and physicochemical determinants of protein recognition and binding affinity, and has wide applications in modeling and validating protein interactions predicted by high-throughput methods, in engineering proteins, and in prioritizing drug targets. Here, we review the basic concepts, principles and recent advances in computational approaches to the analysis and prediction of protein-protein interfaces. We point out caveats for objectively evaluating interface predictors, and discuss various applications of data-driven interface predictors for improving energy model-driven protein-protein docking. Finally, we stress the importance of exploiting binding partner information in reliably predicting interfaces and highlight recent advances in this emerging direction. PMID:26460190

  12. Automated identification of protein-ligand interaction features using Inductive Logic Programming: a hexose binding case study.

    PubMed

    A Santos, Jose C; Nassif, Houssam; Page, David; Muggleton, Stephen H; E Sternberg, Michael J

    2012-07-11

    There is a need for automated methods to learn general features of the interactions of a ligand class with its diverse set of protein receptors. An appropriate machine learning approach is Inductive Logic Programming (ILP), which automatically generates comprehensible rules in addition to prediction. The development of ILP systems which can learn rules of the complexity required for studies on protein structure remains a challenge. In this work we use a new ILP system, ProGolem, and demonstrate its performance on learning features of hexose-protein interactions. The rules induced by ProGolem detect interactions mediated by aromatics and by planar-polar residues, in addition to less common features such as the aromatic sandwich. The rules also reveal a previously unreported dependency for residues cys and leu. They also specify interactions involving aromatic and hydrogen bonding residues. This paper shows that Inductive Logic Programming implemented in ProGolem can derive rules giving structural features of protein/ligand interactions. Several of these rules are consistent with descriptions in the literature. In addition to confirming literature results, ProGolem's model has a 10-fold cross-validated predictive accuracy that is superior, at the 95% confidence level, to another ILP system previously used to study protein/hexose interactions and is comparable with state-of-the-art statistical learners.

  13. Phylo_dCor: distance correlation as a novel metric for phylogenetic profiling.

    PubMed

    Sferra, Gabriella; Fratini, Federica; Ponzi, Marta; Pizzi, Elisabetta

    2017-09-05

    Elaboration of powerful methods to predict functional and/or physical protein-protein interactions from genome sequence is one of the main tasks in the post-genomic era. Phylogenetic profiling allows the prediction of protein-protein interactions at a whole genome level in both Prokaryotes and Eukaryotes. For this reason it is considered one of the most promising methods. Here, we propose an improvement of phylogenetic profiling that enables the handling of large genomic datasets and the inference of global protein-protein interactions. This method uses the distance correlation as a new measure of phylogenetic profile similarity. We constructed robust reference sets and developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation that makes it applicable to large genomic data. Using Saccharomyces cerevisiae and Escherichia coli genome datasets, we showed that Phylo-dCor outperforms phylogenetic profiling methods previously described based on the mutual information and Pearson's correlation as measures of profile similarity. In this work, we constructed and assessed robust reference sets and propose the distance correlation as a measure for comparing phylogenetic profiles. To make it applicable to large genomic data, we developed Phylo-dCor, a parallelized version of the algorithm for calculating the distance correlation. Two R scripts that can be run on a wide range of machines are available upon request.
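
    As an illustration of the profile-similarity measure named in this abstract, the sketch below computes the sample distance correlation between two phylogenetic profiles in plain Python. It is a minimal O(n^2) version of the standard estimator (double-centred pairwise distance matrices), not the authors' parallelized R implementation; the function names and the toy profiles are assumptions made for this example.

```python
import math


def _centered_distances(values):
    """Return the double-centred matrix of pairwise absolute differences."""
    n = len(values)
    dist = [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in dist]  # equals the column means (symmetric matrix)
    grand_mean = sum(row_mean) / n
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric profiles (0 to 1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    dvar2_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0  # a constant profile carries no information
    return math.sqrt(max(dcov2, 0.0) / denom)


if __name__ == "__main__":
    # Hypothetical presence/absence profiles of two proteins across six genomes.
    profile_a = [1, 0, 1, 1, 0, 0]
    profile_b = [1, 0, 1, 0, 0, 0]
    print(distance_correlation(profile_a, profile_b))
```

    Unlike Pearson's correlation, this measure is also sensitive to non-linear dependence: for x = [-2, -1, 0, 1, 2] and y = x squared, Pearson's correlation is zero while the distance correlation is clearly positive.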

  14. Knowledge-based computational intelligence development for predicting protein secondary structures from sequences.

    PubMed

    Shen, Hong-Bin; Yi, Dong-Liang; Yao, Li-Xiu; Yang, Jie; Chou, Kuo-Chen

    2008-10-01

    In the postgenomic age, with the avalanche of protein sequences generated and relatively slow progress in determining their structures by experiments, it is important to develop automated methods to predict the structure of a protein from its sequence. The membrane proteins are a special group in the protein family that accounts for approximately 30% of all proteins; however, solved membrane protein structures only represent less than 1% of known protein structures to date. Although a great success has been achieved for developing computational intelligence techniques to predict secondary structures in both globular and membrane proteins, there is still much challenging work in this regard. In this review article, we firstly summarize the recent progress of automation methodology development in predicting protein secondary structures, especially in membrane proteins; we will then give some future directions in this research field.

  15. C-reactive protein, procalcitonin, clinical pulmonary infection score, and pneumonia severity scores in nursing home acquired pneumonia.

    PubMed

    Porfyridis, Ilias; Georgiadis, Georgios; Vogazianos, Paris; Mitis, Georgios; Georgiou, Andreas

    2014-04-01

    Patients with nursing home acquired pneumonia (NHAP) present a distinct group of lower respiratory tract infections with different risk factors, clinical presentation, and mortality rates. The aim was to evaluate the diagnostic value of clinical pulmonary infection score (CPIS), C-reactive protein, and procalcitonin and to compare the accuracy of pneumonia severity scores (confusion, urea nitrogen, breathing frequency, blood pressure, ≥ 65 y of age [CURB-65]; pneumonia severity index; NHAP index; systolic blood pressure, multilobar involvement, albumin, breathing frequency, tachycardia, confusion, oxygen, arterial pH [SMART-COP]; and systolic blood pressure, oxygen, age > 65 y, breathing frequency [SOAR]) in predicting in-patient mortality from NHAP. Nursing home residents admitted to the hospital with acute respiratory illness were enrolled in the study. Subjects were classified as having NHAP (Group A) or other pulmonary disorders (Group B). Clinical, imaging, and laboratory data were assessed to compute CPIS and severity scores. C-reactive protein and procalcitonin were measured by immunonephelometry and immunoassay, respectively. Fifty-eight subjects were diagnosed with NHAP (Group A) and 29 with other pulmonary disorders (Group B). The mean C-reactive protein ± SD was 16.38 ± 8.6 mg/dL in Group A and 5.2 ± 5.6 mg/dL in Group B (P < .001). The mean procalcitonin ± SD was 1.52 ± 2.75 ng/mL in Group A and 0.24 ± 0.21 ng/mL in Group B (P = .001). The mean CPIS ± SD was 5.4 ± 1.2 in Group A and 2.3 ± 1.5 in Group B (P < .001). At a cutoff value of 0.475 ng/mL, procalcitonin had a sensitivity of 83% and a specificity of 72%. At a cutoff value of 8.05 mg/dL, C-reactive protein had a sensitivity of 81% and a specificity of 79%. Procalcitonin and C-reactive protein levels were significantly higher in Gram-positive NHAP. The in-patient mortality was 17.2% in Group A. Procalcitonin levels were 4.67 ± 5.4 ng/mL in non-survivors and 0.86 ± 0.9 ng/mL in survivors (P < .001).
The area under the curve for procalcitonin in predicting in-patient mortality was 0.84 (95% CI 0.70-0.98, P = .001). A procalcitonin level upon admission > 1.1 ng/mL was an independent predictor of in-patient mortality. Of the pneumonia severity scores, CURB-65 showed greater accuracy in predicting in-patient mortality (area under the curve of 0.68, 95% CI 0.53-0.84, P = .06). CPIS, procalcitonin, and C-reactive protein are reliable for the diagnosis of NHAP. Procalcitonin and CURB-65 are accurate in predicting in-patient mortality in NHAP.
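
    The cutoff-based figures reported in this abstract (for example, a sensitivity of 83% and a specificity of 72% for procalcitonin at 0.475 ng/mL) come from a simple two-by-two classification of subjects. The sketch below shows that calculation; the function name, the measurements, and the diagnosis labels are hypothetical and are not the study's data.

```python
def sensitivity_specificity(values, has_disease, cutoff):
    """Treat value > cutoff as test-positive; return (sensitivity, specificity)."""
    if len(values) != len(has_disease):
        raise ValueError("values and has_disease must have the same length")
    tp = fp = tn = fn = 0
    for value, diseased in zip(values, has_disease):
        positive = value > cutoff
        if diseased and positive:
            tp += 1  # diseased, correctly flagged
        elif diseased:
            fn += 1  # diseased, missed by the test
        elif positive:
            fp += 1  # not diseased, wrongly flagged
        else:
            tn += 1  # not diseased, correctly cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased subject")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up biomarker values (ng/mL) for four diseased and four non-diseased subjects.
    levels = [1.9, 0.8, 0.6, 0.3, 0.2, 0.5, 0.1, 0.4]
    diseased = [True, True, True, True, False, False, False, False]
    print(sensitivity_specificity(levels, diseased, cutoff=0.475))
```

    Raising the cutoff generally trades sensitivity for specificity; sweeping it over all observed values traces the ROC curve whose area is reported as the AUC.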

  16. Identification and Characterization of a Gene stp17 Located on the Linear Plasmid pBSSB1 as an Enhanced Gene of Growth and Motility in Salmonella enterica Serovar Typhi

    PubMed Central

    Zhang, Haifang; Zhu, Yunxia; Xie, Xiaofang; Wang, Min; Du, Hong; Xu, Shungao; Zhang, Ying; Gong, Mingyu; Ni, Bin; Xu, Huaxi; Huang, Xinxiang

    2016-01-01

    The linear plasmid pBSSB1 mediates the flagellar phase variation in H:z66 positive Salmonella enterica serovar Typhi (S. Typhi). The gene named stp17 (S. Typhi plasmid number 17 gene) is located on pBSSB1 and encodes the protein STP17. The expression pattern at the protein level and the function of STP17 remain unknown. In this study, the recombinant protein STP17His6 was expressed, purified and used to prepare the polyclonal anti-STP17 antibody. We detected protein-level expression of stp17 in S. Typhi and further investigated the protein expression characteristics of stp17 in different growth phases by western blot analysis. The effects of STP17 on bacterial growth and motility were analyzed. In addition, the structure of STP17 was predicted and the active site of STP17 was identified by site-directed mutagenesis. The results showed that STP17 was expressed stably in the wild type strain of S. Typhi. STP17 expression at the protein level peaks when cultures reach an OD600 value of 1.2. The growth rate and motility of the Δstp17 strain were significantly decreased compared with the wild type strain (P < 0.05) and this phenotype was restored in the stp17 complementary strain. Moreover, the growth rate and motility of the stp17 over-expression strain were greater than those of the wild type strain. STP17 contains nine Helix segments, six Strand segments and some Coil segments at the secondary structure level. The top-ranked 3-D structure of STP17 predicted by I-TASSER contains a putative ATPase domain and the amino acid residues of GLY16, GLY19, LYS20, ASN133, LYS157, and LYS158 may be the active site residues of STP17. Finally, STP17 was able to catalyze the ATP to ADP reaction, suggesting that STP17 may be an ATPase. To our knowledge, this is the first report describing the protein expression characteristics of STP17 in S. Typhi, showing that STP17 promotes bacterial growth and motility, which may be associated with its potential ATPase activity. PMID:27761429

  17. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes.

    PubMed

    Srinivasulu, Yerukala Sathipati; Wang, Jyun-Rong; Hsu, Kai-Ti; Tsai, Ming-Ju; Charoenkwan, Phasit; Huang, Wen-Lin; Huang, Hui-Ling; Ho, Shinn-Ying

    2015-01-01

    Protein-protein interactions (PPIs) are involved in various biological processes, and underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded the training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. Prediction performance of a Jackknife test was the correlation coefficient of 0.34 and mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. 
The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes.

  18. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes

    PubMed Central

    2015-01-01

    Background: Protein-protein interactions (PPIs) are involved in various biological processes, and underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. Results: This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded the training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. Prediction performance of a Jackknife test was the correlation coefficient of 0.34 and mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn.
Conclusions: The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes. PMID:26681483

  19. Protein complex prediction in large ontology attributed protein-protein interaction networks.

    PubMed

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian; Li, Yanpeng; Xu, Bo

    2013-01-01

    Protein complexes are important for unraveling the secrets of cellular organization and function. Many computational approaches have been developed to predict protein complexes in protein-protein interaction (PPI) networks. However, most existing approaches focus mainly on the topological structure of PPI networks, and largely ignore the gene ontology (GO) annotation information. In this paper, we constructed ontology attributed PPI networks with PPI data and GO resource. After constructing ontology attributed networks, we proposed a novel approach called CSO (clustering based on network structure and ontology attribute similarity). Structural information and GO attribute information are complementary in ontology attributed networks. CSO can effectively take advantage of the correlation between frequent GO annotation sets and the dense subgraph for protein complex prediction. Our proposed CSO approach was applied to four different yeast PPI data sets and predicted many well-known protein complexes. The experimental results showed that CSO was valuable in predicting protein complexes and achieved state-of-the-art performance.

  20. Temperature sensing in Yersinia pestis: translation of the LcrF activator protein is thermally regulated.

    PubMed Central

    Hoe, N P; Goguen, J D

    1993-01-01

    The lcrF gene of Yersinia pestis encodes a transcription activator responsible for inducing expression of several virulence-related proteins in response to temperature. The mechanism of this thermoregulation was investigated. An lcrF clone was found to produce much lower levels of LcrF protein at 26 than at 37 degrees C in Y. pestis, although it was transcribed at similar levels at both temperatures. High-level T7 polymerase-directed transcription of the lcrF gene in Escherichia coli also resulted in temperature-dependent production of the LcrF protein. Pulse-chase experiments showed that the LcrF protein was stable at 26 and 37 degrees C, suggesting that translation rate or message degradation is thermally controlled. The lcrF mRNA appears to be highly unstable and could not be reliably detected in Y. pestis. Insertion of the lcrF gene into plasmid pET4a, which produces high levels of plasmid-length RNA, aided detection of lcrF-specific message in E. coli. Comparison of the amount of LcrF protein produced per unit of message at 26 and 37 degrees C indicated that the efficiency of translation of lcrF message increased with temperature. mRNA secondary structure predictions suggest that the lcrF Shine-Dalgarno sequence is sequestered in a stem-loop. A model in which decreased stability of this stem-loop with increasing temperature leads to increased efficiency of translation initiation of lcrF message is presented. PMID:7504666

  1. A Coarse-Grained Elastic Network Atom Contact Model and Its Use in the Simulation of Protein Dynamics and the Prediction of the Effect of Mutations

    PubMed Central

    Frappier, Vincent; Najmanovich, Rafael J.

    2014-01-01

    Normal mode analysis (NMA) methods are widely used to study dynamic aspects of protein structures. Two critical components of NMA methods are the level of coarse-graining used to represent protein structures and the choice of potential energy functional form. There is a trade-off between speed and accuracy in different choices. At one extreme are accurate but slow molecular-dynamics based methods with all-atom representations and detailed atom potentials. At the other extreme are fast elastic network model (ENM) methods with Cα-only representations and simplified potentials that are based on geometry alone and are thus oblivious to protein sequence. Here we present ENCoM, an Elastic Network Contact Model that employs a potential energy function that includes a pairwise atom-type non-bonded interaction term and thus makes it possible to consider the effect of the specific nature of amino-acids on dynamics within the context of NMA. ENCoM is as fast as existing ENM methods and outperforms such methods in the generation of conformational ensembles. Here we introduce a new application for NMA methods with the use of ENCoM in the prediction of the effect of mutations on protein stability. While existing methods are based on machine learning or enthalpic considerations, the use of ENCoM, based on vibrational normal modes, is based on entropic considerations. This represents a novel area of application for NMA methods and a novel approach for the prediction of the effect of mutations. We compare ENCoM to a large number of methods in terms of accuracy and self-consistency. We show that the accuracy of ENCoM is comparable to that of the best existing methods. We show that existing methods are biased towards the prediction of destabilizing mutations and that ENCoM is less biased at predicting stabilizing mutations. PMID:24762569

  2. Pooled Results From 5 Validation Studies of Dietary Self-Report Instruments Using Recovery Biomarkers for Energy and Protein Intake

    PubMed Central

    Freedman, Laurence S.; Commins, John M.; Moler, James E.; Arab, Lenore; Baer, David J.; Kipnis, Victor; Midthune, Douglas; Moshfegh, Alanna J.; Neuhouser, Marian L.; Prentice, Ross L.; Schatzkin, Arthur; Spiegelman, Donna; Subar, Amy F.; Tinker, Lesley F.; Willett, Walter

    2014-01-01

    We pooled data from 5 large validation studies of dietary self-report instruments that used recovery biomarkers as references to clarify the measurement properties of food frequency questionnaires (FFQs) and 24-hour recalls. The studies were conducted in widely differing US adult populations from 1999 to 2009. We report on total energy, protein, and protein density intakes. Results were similar across sexes, but there was heterogeneity across studies. Using a FFQ, the average correlation coefficients for reported versus true intakes for energy, protein, and protein density were 0.21, 0.29, and 0.41, respectively. Using a single 24-hour recall, the coefficients were 0.26, 0.40, and 0.36, respectively, for the same nutrients and rose to 0.31, 0.49, and 0.46 when three 24-hour recalls were averaged. The average rate of under-reporting of energy intake was 28% with a FFQ and 15% with a single 24-hour recall, but the percentages were lower for protein. Personal characteristics related to under-reporting were body mass index, educational level, and age. Calibration equations for true intake that included personal characteristics provided improved prediction. This project establishes that FFQs have stronger correlations with truth for protein density than for absolute protein intake, that the use of multiple 24-hour recalls substantially increases the correlations when compared with a single 24-hour recall, and that body mass index strongly predicts under-reporting of energy and protein intakes. PMID:24918187

  3. Dynamics of Heat Shock Protein 70 Serum Levels As a Predictor of Clinical Response in Non-Small-Cell Lung Cancer and Correlation with the Hypoxia-Related Marker Osteopontin

    PubMed Central

    Ostheimer, Christian; Gunther, Sophie; Bache, Matthias; Vordermark, Dirk; Multhoff, Gabriele

    2017-01-01

    Hypoxia mediates resistance to radio(chemo)therapy (RT) by stimulating the synthesis of hypoxia-related genes, such as osteopontin (OPN) and stress proteins, including the major stress-inducible heat shock protein 70 (Hsp70). Apart from its intracellular localization, Hsp70 is also present on the plasma membrane of viable tumor cells that actively release it in lipid vesicles with biophysical characteristics of exosomes. Exosomal Hsp70 contributes to radioresistance while Hsp70 derived from dying tumor cells can serve as a stimulator of immune cells. Given these opposing traits of extracellular Hsp70 and the unsatisfactory outcome of locally advanced lung tumors, we investigated the role of Hsp70 in the plasma of patients with advanced, non-metastasized non-small-cell lung cancer (NSCLC) before (T1) and 4–6 weeks after RT (T2) in relation to OPN as potential biomarkers for clinical response. Plasma levels of Hsp70 correlate with those of OPN at T1, and high OPN levels are significantly associated with a decreased overall survival (OS). Due to a therapy-induced reduction in viable tumor mass after RT, Hsp70 plasma levels dropped significantly at T2 (p = 0.016). However, with respect to the immunostimulatory capacity of Hsp70 derived from dying tumor cells, patients with higher post-therapeutic Hsp70 levels showed a significantly better response to RT (p = 0.034) than those with lower levels at T2. In summary, high OPN plasma levels at T1 are indicative of poor OS, whereas elevated post-therapeutic Hsp70 plasma levels, together with a drop of Hsp70 between T1 and T2, successfully predict favorable responses to RT. Monitoring the dynamics of Hsp70 in NSCLC patients before and after RT can provide additional predictive information for clinical outcome and therefore might allow a more rapid therapy adaptation. PMID:29093708

  4. Construction of ontology augmented networks for protein complex prediction.

    PubMed

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian

    2013-01-01

    Protein complexes are of great importance in understanding the principles of cellular organization and function. The increase in available protein-protein interaction data, gene ontology and other resources make it possible to develop computational methods for protein complex prediction. Most existing methods focus mainly on the topological structure of protein-protein interaction networks, and largely ignore the gene ontology annotation information. In this article, we constructed ontology augmented networks with protein-protein interaction data and gene ontology, which effectively unified the topological structure of protein-protein interaction networks and the similarity of gene ontology annotations into unified distance measures. After constructing ontology augmented networks, a novel method (clustering based on ontology augmented networks) was proposed to predict protein complexes, which was capable of taking into account the topological structure of the protein-protein interaction network, as well as the similarity of gene ontology annotations. Our method was applied to two different yeast protein-protein interaction datasets and predicted many well-known complexes. The experimental results showed that (i) ontology augmented networks and the unified distance measure can effectively combine the structure closeness and gene ontology annotation similarity; (ii) our method is valuable in predicting protein complexes and has higher F1 and accuracy compared to other competing methods.

  5. Proteotranscriptomic Profiling of 231-BR Breast Cancer Cells: Identification of Potential Biomarkers and Therapeutic Targets for Brain Metastasis*

    PubMed Central

    Dun, Matthew D.; Chalkley, Robert J.; Faulkner, Sam; Keene, Sheridan; Avery-Kiejda, Kelly A.; Scott, Rodney J.; Falkenby, Lasse G.; Cairns, Murray J.; Larsen, Martin R.; Bradshaw, Ralph A.; Hondermarck, Hubert

    2015-01-01

    Brain metastases are a devastating consequence of cancer and currently there are no specific biomarkers or therapeutic targets for risk prediction, diagnosis, and treatment. Here the proteome of the brain metastatic breast cancer cell line 231-BR has been compared with that of the parental cell line MDA-MB-231, which is also metastatic but has no organ selectivity. Using SILAC and nanoLC-MS/MS, 1957 proteins were identified in reciprocal labeling experiments and 1584 were quantified in the two cell lines. A total of 152 proteins were confidently determined to be up- or down-regulated by more than twofold in 231-BR. Of note, 112/152 proteins were decreased as compared with only 40/152 that were increased, suggesting that down-regulation of specific proteins is an important part of the mechanism underlying the ability of breast cancer cells to metastasize to the brain. When matched against transcriptomic data, 43% of individual protein changes were associated with corresponding changes in mRNA, indicating that the transcript level is a limited predictor of protein level. In addition, differential miRNA analyses showed that most miRNA changes in 231-BR were up- (36/45) as compared with down-regulations (9/45). Pathway analysis revealed that proteome changes were mostly related to cell signaling and cell cycle, metabolism and extracellular matrix remodeling. The major protein changes in 231-BR were confirmed by parallel reaction monitoring mass spectrometry and consisted in increases (by more than fivefold) in the matrix metalloproteinase-1, ephrin-B1, stomatin, myc target-1, and decreases (by more than 10-fold) in transglutaminase-2, the S100 calcium-binding protein A4, and l-plastin. The clinicopathological significance of these major proteomic changes to predict the occurrence of brain metastases, and their potential value as therapeutic targets, warrants further investigation. PMID:26041846

  6. Analysis of EZH2: micro-RNA network in low and high grade astrocytic tumors.

    PubMed

    Sharma, Vikas; Purkait, Suvendu; Takkar, Sonam; Malgulwar, Prit Benny; Kumar, Anupam; Pathak, Pankaj; Suri, Vaishali; Sharma, Mehar C; Suri, Ashish; Kale, Shashank Sharad; Kulshreshtha, Ritu; Sarkar, Chitra

    2016-04-01

    Enhancer of Zeste homologue 2 (EZH2) is an epigenetic regulator that functions as an oncogene in astrocytic tumors; however, EZH2 regulation remains little studied. In this study, we measured EZH2 levels in low (Gr-II, DA) and high grade (Gr-IV, GBM) astrocytic tumors and found a significantly increased EZH2 transcript level with grade (median DA 8.5, GBM 28.9). However, a different trend was reflected in protein levels, with GBMs showing high EZH2 LI (median 26.5) compared to DA (median 0.3). This difference in correlation of EZH2 protein and RNA levels suggested post-transcriptional regulation of EZH2, likely mediated by miRNAs. We selected eleven miRNAs that were strongly predicted to target EZH2 and measured their expression. Three miRNAs (miR-26a-5p, miR-27a-3p and miR-498) showed significant correlation with EZH2 protein, suggesting them as regulators of EZH2; however, miR-26a-5p levels decreased with grade. ChIP analyses revealed H3K27me3 modifications in the miR-26a promoter, suggesting a feedback loop between EZH2 and miR-26a. We further measured six downstream miRNA targets of EZH2 and found significant downregulation of four (miR-181a/b and 200b/c) in GBM. Interestingly, EZH2 associated miRNAs were predicted to target 25 genes in the glioma pathway, suggesting their role in tumor formation or progression. Collectively, our work suggests EZH2 and its miRNA interactors may serve as promising biomarkers for progression of astrocytic tumors and may offer novel therapeutic strategies.

  7. Post processing of protein-compound docking for fragment-based drug discovery (FBDD): in-silico structure-based drug screening and ligand-binding pose prediction.

    PubMed

    Fukunishi, Yoshifumi

    2010-01-01

    For fragment-based drug development, both hit (active) compound prediction and docking-pose (protein-ligand complex structure) prediction of the hit compound are important, since chemical modification (fragment linking, fragment evolution) subsequent to the hit discovery must be performed based on the protein-ligand complex structure. However, the naïve protein-compound docking calculation shows poor accuracy in terms of docking-pose prediction. Thus, post-processing of the protein-compound docking is necessary. Recently, several methods for the post-processing of protein-compound docking have been proposed. In FBDD, the compounds are smaller than those for conventional drug screening. This makes it difficult to perform the protein-compound docking calculation. A method to avoid this problem has been reported. Protein-ligand binding free energy estimation is useful to reduce the procedures involved in the chemical modification of the hit fragment. Several prediction methods have been proposed for high-accuracy estimation of protein-ligand binding free energy. This paper summarizes the various computational methods proposed for docking-pose prediction and their usefulness in FBDD.

  8. Nonparametric Simulation of Signal Transduction Networks with Semi-Synchronized Update

    PubMed Central

    Nassiri, Isar; Masoudi-Nejad, Ali; Jalili, Mahdi; Moeini, Ali

    2012-01-01

    Simulating signal transduction in cellular signaling networks provides predictions of network dynamics by quantifying the changes in concentration and activity-level of the individual proteins. Since numerical values of kinetic parameters might be difficult to obtain, it is imperative to develop non-parametric approaches that combine the connectivity of a network with the response of individual proteins to signals which travel through the network. The activity levels of signaling proteins computed through existing non-parametric modeling tools do not show significant correlations with the observed values in experimental results. In this work we developed a non-parametric computational framework to describe the profile of the evolving process and the time course of the proportion of active form of molecules in the signal transduction networks. The model is also capable of incorporating perturbations. The model was validated on four signaling networks showing that it can effectively uncover the activity levels and trends of response during signal transduction process. PMID:22737250

  9. Predicting protein-protein interactions from protein domains using a set cover approach.

    PubMed

    Huang, Chengbang; Morcos, Faruck; Kanaan, Simon P; Wuchty, Stefan; Chen, Danny Z; Izaguirre, Jesús A

    2007-01-01

    One goal of contemporary proteome research is the elucidation of cellular protein interactions. Based on currently available protein-protein interaction and domain data, we introduce a novel method, Maximum Specificity Set Cover (MSSC), for the prediction of protein-protein interactions. In our approach, we map the relationship between interactions of proteins and their corresponding domain architectures to a generalized weighted set cover problem. The application of a greedy algorithm provides sets of domain interactions which explain the presence of protein interactions to the largest degree of specificity. Utilizing domain and protein interaction data of S. cerevisiae, MSSC enables prediction of previously unknown protein interactions, links that are well supported by a high tendency of coexpression and functional homogeneity of the corresponding proteins. Focusing on concrete examples, we show that MSSC reliably predicts protein interactions in well-studied molecular systems, such as the 26S proteasome and RNA polymerase II of S. cerevisiae. We also show that the quality of the predictions is comparable to the Maximum Likelihood Estimation while MSSC is faster. This new algorithm and all data sets used are accessible through a Web portal at http://ppi.cse.nd.edu.
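
    The abstract above describes mapping protein interactions onto a weighted set cover problem and solving it greedily. As an illustration only, the following is a minimal sketch of the textbook greedy heuristic for weighted set cover; it is not the MSSC specificity weighting itself. The function name, the toy protein and domain pairs, and the costs are all hypothetical.

```python
def greedy_weighted_set_cover(universe, candidates):
    """Textbook greedy heuristic for weighted set cover.

    universe   -- set of elements to cover (here: observed protein pairs)
    candidates -- dict: name -> (cost, set of elements that candidate covers)
    Returns the chosen candidate names in selection order.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best_name, best_ratio = None, float("inf")
        for name in sorted(candidates):  # sorted so ties break deterministically
            cost, covers = candidates[name]
            gain = len(covers & uncovered)
            if gain == 0:
                continue  # this candidate explains nothing new
            ratio = cost / gain  # cost per newly covered element
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError("some elements cannot be covered")
        chosen.append(best_name)
        uncovered -= candidates[best_name][1]
    return chosen


# Hypothetical toy data: protein pairs "explained" by domain-domain pairs.
protein_pairs = {("P1", "P2"), ("P1", "P3"), ("P2", "P4"), ("P3", "P4")}
domain_pairs = {
    "D1-D2": (1.0, {("P1", "P2"), ("P1", "P3")}),
    "D2-D3": (1.0, {("P2", "P4")}),
    "D1-D4": (3.0, {("P1", "P2"), ("P2", "P4"), ("P3", "P4")}),
    "D3-D4": (1.5, {("P3", "P4")}),
}
selected = greedy_weighted_set_cover(protein_pairs, domain_pairs)
print(selected)
```

    In this sketch a lower cost stands in for a more specific domain pair; how MSSC actually derives its weights is not stated in the abstract.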

  10. Global, quantitative and dynamic mapping of protein subcellular localization.

    PubMed

    Itzhak, Daniel N; Tyanova, Stefka; Cox, Jürgen; Borner, Georg Hh

    2016-06-09

    Subcellular localization critically influences protein function, and cells control protein localization to regulate biological processes. We have developed and applied Dynamic Organellar Maps, a proteomic method that allows global mapping of protein translocation events. We initially used maps statically to generate a database with localization and absolute copy number information for over 8700 proteins from HeLa cells, approaching comprehensive coverage. All major organelles were resolved, with exceptional prediction accuracy (estimated at >92%). Combining spatial and abundance information yielded an unprecedented quantitative view of HeLa cell anatomy and organellar composition, at the protein level. We subsequently demonstrated the dynamic capabilities of the approach by capturing translocation events following EGF stimulation, which we integrated into a quantitative model. Dynamic Organellar Maps enable the proteome-wide analysis of physiological protein movements, without requiring any reagents specific to the investigated process, and will thus be widely applicable in cell biology.

  11. Apolipoprotein C-II Is a Potential Serum Biomarker as a Prognostic Factor of Locally Advanced Cervical Cancer After Chemoradiation Therapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Harima, Yoko, E-mail: harima@takii.kmu.ac.jp; Ikeda, Koshi; Utsunomiya, Keita

    Purpose: To determine pretreatment serum protein levels for generally applicable measurement to predict chemoradiation treatment outcomes in patients with locally advanced squamous cell cervical carcinoma (CC). Methods and Materials: In a screening study, measurements were conducted twice. At first, 6 serum samples from CC patients (3 with no evidence of disease [NED] and 3 with cancer-caused death [CD]) and 2 from healthy controls were tested. Next, 12 serum samples from different CC patients (8 NED, 4 CD) and 4 from healthy controls were examined. Subsequently, 28 different CC patients (18 NED, 10 CD) and 9 controls were analyzed in the validation study. Protein chips were treated with the sample sera, and the serum protein pattern was detected by surface-enhanced laser desorption and ionization–time-of-flight mass spectrometry (SELDI-TOF MS). Then, single MS-based peptide mass fingerprinting (PMF) and tandem MS (MS/MS)-based peptide/protein identification methods were used to identify the protein corresponding to the detected peak. Finally, a turbidimetric assay was used to measure the levels of the protein that best matched this peptide peak. Results: The same peak at 8918 m/z was identified in both screening studies. Neither the screening study nor the validation study showed significant differences in the appearance of this peak between the controls and NED. However, the intensity of the peak in CD was significantly lower than that of controls and NED in both pilot studies (P=.02, P=.04) and the validation study (P=.01, P=.001). The protein that best matched this peptide peak at 8918 m/z was identified as apolipoprotein C-II (ApoC-II) using PMF and MS/MS methods. The turbidimetric assay showed that the mean serum levels of ApoC-II tended to decrease in the CD group when compared with the NED group (P=.078). Conclusion: ApoC-II could be used as a biomarker for predicting and estimating the radiation treatment outcome of patients with CC.

  12. The RNA-binding protein PCBP2 facilitates gastric carcinoma growth by targeting miR-34a

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hu, Cheng-En; Liu, Yong-Chao; Zhang, Hui-Dong

    Highlights: • PCBP2 is overexpressed in human gastric cancer. • PCBP2 high expression predicts poor survival. • PCBP2 regulates gastric cancer growth in vitro and in vivo. • PCBP2 regulates gastric cancer apoptosis by targeting miR-34a. Abstract: Gastric carcinoma is the fourth most common cancer worldwide, with a high rate of death and low 5-year survival rate. However, the mechanism underlying gastric cancer is still not fully understood. In the present study, we identify the RNA-binding protein PCBP2 as an oncogenic protein in human gastric carcinoma. Our results show that PCBP2 is up-regulated in human gastric cancer tissues compared to adjacent normal tissues, and that a high level of PCBP2 predicts poor overall and disease-free survival. Knockdown of PCBP2 in gastric cancer cells inhibits cell proliferation and colony formation in vitro, whereas opposing results are obtained when PCBP2 is overexpressed. Our in vivo subcutaneous xenograft results also show that PCBP2 can critically regulate gastric cancer cell growth. In addition, we find that PCBP2 depletion induces apoptosis in gastric cancer cells by up-regulating expression of pro-apoptotic proteins and down-regulating anti-apoptotic proteins. Mechanistically, we identify miR-34a as a target of PCBP2, and show that miR-34a is essential for the function of PCBP2. In summary, PCBP2 promotes gastric carcinoma development by regulating the level of miR-34a.

  13. Altered Plasma Profile of Antioxidant Proteins as an Early Correlate of Pancreatic β Cell Dysfunction

    PubMed Central

    Kuo, Taiyi; Kim-Muller, Ja Young; McGraw, Timothy E.; Accili, Domenico

    2016-01-01

    Insulin resistance and β cell dysfunction contribute to the pathogenesis of type 2 diabetes. Unlike insulin resistance, β cell dysfunction remains difficult to predict and monitor, because of the inaccessibility of the endocrine pancreas, the integrated relationship with insulin sensitivity, and the paracrine effects of incretins. The goal of our study was to survey the plasma response to a metabolic challenge in order to identify factors predictive of β cell dysfunction. To this end, we combined (i) the power of unbiased iTRAQ (isobaric tag for relative and absolute quantification) mass spectrometry with (ii) direct sampling of the portal vein following an intravenous glucose/arginine challenge (IVGATT) in (iii) mice with a genetic β cell defect. By so doing, we excluded the effects of peripheral insulin sensitivity as well as those of incretins on β cells, and focused on the first phase of insulin secretion to capture the early pathophysiology of β cell dysfunction. We compared plasma protein profiles with ex vivo islet secretome and transcriptome analyses. We detected changes to 418 plasma proteins in vivo, and detected changes to 262 proteins ex vivo. The impairment of insulin secretion was associated with greater overall changes in the plasma response to IVGATT, possibly reflecting metabolic instability. Reduced levels of proteins regulating redox state and neuronal stress markers, as well as increased levels of coagulation factors, antedated the loss of insulin secretion in diabetic mice. These results suggest that a reduced complement of antioxidants in response to a mixed secretagogue challenge is an early correlate of future β cell failure. PMID:26917725

  14. Characterization of a gene product (Sec53p) required for protein assembly in the yeast endoplasmic reticulum

    PubMed Central

    1985-01-01

    SEC53, a gene that is required for completion of assembly of proteins in the endoplasmic reticulum in yeast, has been cloned, sequenced, and the product localized by cell fractionation. Complementation of a sec53 mutation is achieved with unique plasmids from genomic or cDNA expression banks. These inserts contain the authentic gene, a cloned copy of which integrates at the sec53 locus. An open reading frame in the insert predicts a 29-kD protein with no significant hydrophobic character. This prediction is confirmed by detection of a 28-kD protein overproduced in cells that carry SEC53 on a multicopy plasmid. To follow Sec53p more directly, a LacZ-SEC53 gene fusion has been constructed which allows the isolation of a hybrid protein for use in production of antibody. With such an antibody, quantitative immune decoration has shown that the sec53-6 mutation decreases the level of Sec53p at 37 degrees C, while levels comparable to wild-type are seen at 24 degrees C. An eightfold overproduction of Sec53p accompanies transformation of cells with a multicopy plasmid containing SEC53. Cell fractionation, performed with conditions that preserve the lumenal content of the endoplasmic reticulum (ER), shows Sec53p highly enriched in the cytosol fraction. We suggest that Sec53p acts indirectly to facilitate assembly in the ER, possibly by interacting with a stable ER component, or by providing a small molecule, other than an oligosaccharide precursor, necessary for the assembly event. PMID:3905826

  15. Predicting protein-binding RNA nucleotides with consideration of binding partners.

    PubMed

    Tuvshinjargal, Narankhuu; Lee, Wook; Park, Byungkyu; Han, Kyungsook

    2015-06-01

    In recent years several computational methods have been developed to predict RNA-binding sites in protein. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in protein, the problem of predicting protein-binding sites in RNA has received little attention mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, a NPV of 84.2% and MCC of 0.48 in independent testing. For comparative purpose, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10 fold-cross validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, a NPV of 92.2% and MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, a NPV of 85.2% and MCC of 0.45. 
In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in most performance measures. To the best of our knowledge, this is the first sequence-based prediction of protein-binding nucleotides in RNA which considers the binding partner of RNA. The new model will provide valuable information for designing biochemical experiments to find putative protein-binding sites in RNA with unknown structure. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
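
    The performance measures quoted in the abstract above (sensitivity, specificity, PPV, NPV and MCC) all follow from the four cells of a binary confusion matrix. A minimal sketch of those standard definitions is given here; the counts in the example call are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. both classes occur in the
    data and the classifier predicts both classes at least once.
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    # Matthews correlation coefficient
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "mcc": mcc,
    }


# Hypothetical counts of binding / non-binding nucleotides, for illustration.
metrics = binary_metrics(tp=80, fp=20, tn=180, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Because non-binding nucleotides usually outnumber binding ones, MCC is a useful single-number summary: unlike accuracy, it stays low when a model simply predicts the majority class.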

  16. Understanding and Manipulating Electrostatic Fields at the Protein-Protein Interface Using Vibrational Spectroscopy and Continuum Electrostatics Calculations.

    PubMed

    Ritchie, Andrew W; Webb, Lauren J

    2015-11-05

    Biological function emerges in large part from the interactions of biomacromolecules in the complex and dynamic environment of the living cell. For this reason, macromolecular interactions in biological systems are now a major focus of interest throughout the biochemical and biophysical communities. The affinity and specificity of macromolecular interactions are the result of both structural and electrostatic factors. Significant advances have been made in characterizing structural features of stable protein-protein interfaces through the techniques of modern structural biology, but much less is understood about how electrostatic factors promote and stabilize specific functional macromolecular interactions over all possible choices presented to a given molecule in a crowded environment. In this Feature Article, we describe how vibrational Stark effect (VSE) spectroscopy is being applied to measure electrostatic fields at protein-protein interfaces, focusing on measurements of guanosine triphosphate (GTP)-binding proteins of the Ras superfamily binding with structurally related but functionally distinct downstream effector proteins. In VSE spectroscopy, spectral shifts of a probe oscillator's energy are related directly to that probe's local electrostatic environment. By performing this experiment repeatedly throughout a protein-protein interface, an experimental map of measured electrostatic fields generated at that interface is determined. These data can be used to rationalize selective binding of similarly structured proteins in both in vitro and in vivo environments. Furthermore, these data can be used to compare to computational predictions of electrostatic fields to explore the level of simulation detail that is necessary to accurately predict our experimental findings.

  17. Predictive and comparative analysis of Ebolavirus proteins

    PubMed Central

    Cong, Qian; Pei, Jimin; Grishin, Nick V

    2015-01-01

    Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus. PMID:26158395

  18. Predictive and comparative analysis of Ebolavirus proteins.

    PubMed

    Cong, Qian; Pei, Jimin; Grishin, Nick V

    2015-01-01

    Ebolavirus is the pathogen for Ebola Hemorrhagic Fever (EHF). This disease exhibits a high fatality rate and has recently reached a historically epidemic proportion in West Africa. Out of the 5 known Ebolavirus species, only Reston ebolavirus has lost human pathogenicity, while retaining the ability to cause EHF in long-tailed macaque. Significant efforts have been spent to determine the three-dimensional (3D) structures of Ebolavirus proteins, to study their interaction with host proteins, and to identify the functional motifs in these viral proteins. Here, in light of these experimental results, we apply computational analysis to predict the 3D structures and functional sites for Ebolavirus protein domains with unknown structure, including a zinc-finger domain of VP30, the RNA-dependent RNA polymerase catalytic domain and a methyltransferase domain of protein L. In addition, we compare sequences of proteins that interact with Ebolavirus proteins from RESTV-resistant primates with those from RESTV-susceptible monkeys. The host proteins that interact with GP and VP35 show an elevated level of sequence divergence between the RESTV-resistant and RESTV-susceptible species, suggesting that they may be responsible for host specificity. Meanwhile, we detect variable positions in protein sequences that are likely associated with the loss of human pathogenicity in RESTV, map them onto the 3D structures and compare their positions to known functional sites. VP35 and VP30 are significantly enriched in these potential pathogenicity determinants and the clustering of such positions on the surfaces of VP35 and GP suggests possible uncharacterized interaction sites with host proteins that contribute to the virulence of Ebolavirus.

  19. Computational prediction of protein-protein interactions in Leishmania predicted proteomes.

    PubMed

    Rezende, Antonio M; Folador, Edson L; Resende, Daniela de M; Ruiz, Jeronimo C

    2012-01-01

    The trypanosomatid parasites Leishmania braziliensis, Leishmania major and Leishmania infantum are important human pathogens. Despite years of study and genome availability, an effective vaccine has not yet been developed, and the chemotherapy is highly toxic. Therefore, it is clear that only interdisciplinary, integrated studies will succeed in finding new targets for vaccine and drug development. An essential part of this rationale is the study of protein-protein interaction (PPI) networks, which can provide a better understanding of complex protein interactions in biological systems. Thus, we modeled PPIs for trypanosomatids through computational methods using sequence comparison against public databases of protein or domain interactions for interaction prediction (interolog mapping) and developed a dedicated combined system score to assess the robustness of the predictions. The confidence of the network prediction approach was evaluated using gold-standard positive and negative datasets, and the AUC value obtained was 0.94. As a result, 39,420, 43,531 and 45,235 interactions were predicted for L. braziliensis, L. major and L. infantum, respectively. For each predicted network, the top 20 proteins were ranked by the MCC topological index. In addition, information related to immunological potential, degree of protein sequence conservation among orthologs and degree of identity compared to proteins of potential parasite hosts was integrated. This information integration provides a better understanding and usefulness of the predicted networks, which can be valuable for selecting new potential biological targets for drug and vaccine development. Network modularity analysis, which is key when one is interested in destabilizing the PPIs for drug or vaccine purposes, along with multiple alignments of the predicted PPIs, was performed, revealing patterns associated with protein turnover. In addition, around 50% of the hypothetical proteins present in the networks received some degree of functional annotation, which represents an important contribution since approximately 60% of the Leishmania predicted proteomes have no predicted function.
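
    Interolog mapping, named in the abstract above, transfers a known interaction between two proteins in a reference organism to their counterparts in a target organism. The following is a minimal sketch of that idea, assuming an ortholog table has already been computed by sequence comparison. The function name and all protein identifiers are hypothetical, and the study's combined confidence score is not reproduced.

```python
from itertools import product


def map_interologs(reference_interactions, orthologs):
    """Transfer reference interactions to a target species via orthologs.

    reference_interactions -- iterable of (a, b) interacting reference proteins
    orthologs              -- dict: reference protein -> set of target proteins
    Returns a set of predicted target pairs, each stored as a sorted tuple so
    that (x, y) and (y, x) count as the same undirected interaction.
    """
    predicted = set()
    for a, b in reference_interactions:
        # Every combination of orthologs of a and b is a candidate interolog;
        # a pair is dropped if either partner has no ortholog in the target.
        for ta, tb in product(orthologs.get(a, ()), orthologs.get(b, ())):
            predicted.add(tuple(sorted((ta, tb))))
    return predicted


# Hypothetical reference interactions and ortholog assignments.
reference = [("RefA", "RefB"), ("RefB", "RefC"), ("RefC", "RefD")]
ortholog_table = {
    "RefA": {"LmjA"},
    "RefB": {"LmjB1", "LmjB2"},
    "RefC": {"LmjC"},
    # RefD has no ortholog in the target proteome.
}
predicted_pairs = map_interologs(reference, ortholog_table)
for pair in sorted(predicted_pairs):
    print(pair)
```

    A real pipeline would attach a confidence score to each transferred pair, for example based on sequence identity, which is the role the abstract assigns to its combined system score.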

  20. Parallel mRNA, proteomics and miRNA expression analysis in cell line models of the intestine.

    PubMed

    O'Sullivan, Finbarr; Keenan, Joanne; Aherne, Sinead; O'Neill, Fiona; Clarke, Colin; Henry, Michael; Meleady, Paula; Breen, Laura; Barron, Niall; Clynes, Martin; Horgan, Karina; Doolan, Padraig; Murphy, Richard

    2017-11-07

    To identify miRNA-regulated proteins differentially expressed between Caco2 and HT-29: two principal cell line models of the intestine. Exponentially growing Caco-2 and HT-29 cells were harvested and prepared for mRNA, miRNA and proteomic profiling. mRNA microarray profiling analysis was carried out using the Affymetrix GeneChip Human Gene 1.0 ST array. miRNA microarray profiling analysis was carried out using the Affymetrix Genechip miRNA 3.0 array. Quantitative Label-free LC-MS/MS proteomic analysis was performed using a Dionex Ultimate 3000 RSLCnano system coupled to a hybrid linear ion trap/Orbitrap mass spectrometer. Peptide identities were validated in Proteome Discoverer 2.1 and were subsequently imported into Progenesis QI software for further analysis. Hierarchical cluster analysis for all three parallel datasets (miRNA, proteomics, mRNA) was conducted in the R software environment using the Euclidean distance measure and Ward's clustering algorithm. The prediction of miRNA and oppositely correlated protein/mRNA interactions was performed using TargetScan 6.1. GO biological process, molecular function and cellular component enrichment analysis was carried out for the DE miRNA, protein and mRNA lists via the Pathway Studio 11.3 Web interface using their Mammalian database. Differential expression (DE) profiling comparing the intestinal cell lines HT-29 and Caco-2 identified 1795 Genes, 168 Proteins and 160 miRNAs as DE between the two cell lines. At the gene level, 1084 genes were upregulated and 711 were downregulated in the Caco-2 cell line relative to the HT-29 cell line. At the protein level, 57 proteins were found to be upregulated and 111 downregulated in the Caco-2 cell line relative to the HT-29 cell line. Finally, at the miRNAs level, 104 were upregulated and 56 downregulated in the Caco-2 cell line relative to the HT-29 cell line. 
Gene ontology (GO) analysis of the DE mRNA identified cell adhesion, migration and ECM organization, cellular lipid and cholesterol metabolic processes, small molecule transport and a range of responses to external stimuli, while similar analysis of the DE protein list identified gene expression/transcription, epigenetic mechanisms, DNA replication, differentiation and translation ontology categories. The DE protein and gene lists were found to share 15 biological processes, including, for example, epithelial cell differentiation [P value ≤ 1.81613E-08 (protein list); P ≤ 0.000434311 (gene list)] and actin filament bundle assembly [P value ≤ 0.001582797 (protein list); P ≤ 0.002733714 (gene list)]. Analysis of the three data streams acquired in parallel, conducted to identify targets undergoing potential miRNA translational repression, identified 34 proteins whose respective mRNAs were detected but showed no change in expression. Of these 34 proteins, 27 were downregulated in the Caco-2 cell line relative to the HT-29 cell line and predicted to be targeted by 19 unique anti-correlated/upregulated microRNAs, and 7 were upregulated in the Caco-2 cell line relative to the HT-29 cell line and predicted to be targeted by 15 unique anti-correlated/downregulated microRNAs. This first study providing "tri-omics" analysis of the principal intestinal cell line models Caco-2 and HT-29 has identified 34 proteins potentially undergoing miRNA translational repression.
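
    The selection logic described in the abstract above (a differentially expressed protein, an unchanged mRNA, and a predicted targeting miRNA that changes in the opposite direction) can be sketched as a simple filter. This is an illustration under stated assumptions, not the authors' pipeline: the function name, the "up"/"down"/"unchanged" labels and all gene and miRNA names are hypothetical.

```python
def candidate_repression_targets(protein_change, mrna_change,
                                 mirna_change, mirna_targets):
    """Find proteins that may be under miRNA translational repression.

    protein_change, mrna_change, mirna_change
        -- dict: name -> "up", "down" or "unchanged"
    mirna_targets
        -- dict: miRNA -> set of predicted target gene names
    Returns dict: gene -> sorted list of anti-correlated targeting miRNAs.
    """
    opposite = {"up": "down", "down": "up"}
    hits = {}
    for gene, direction in protein_change.items():
        if direction not in opposite:
            continue  # protein is not differentially expressed
        if mrna_change.get(gene) != "unchanged":
            continue  # mRNA must be detected but show no change
        mirnas = sorted(
            mirna
            for mirna, targets in mirna_targets.items()
            if gene in targets and mirna_change.get(mirna) == opposite[direction]
        )
        if mirnas:
            hits[gene] = mirnas
    return hits


# Hypothetical toy data.
protein_change = {"GENE1": "down", "GENE2": "down", "GENE3": "up", "GENE4": "up"}
mrna_change = {"GENE1": "unchanged", "GENE2": "down",
               "GENE3": "unchanged", "GENE4": "unchanged"}
mirna_change = {"miR-a": "up", "miR-b": "down", "miR-c": "up"}
mirna_targets = {"miR-a": {"GENE1", "GENE2"}, "miR-b": {"GENE3"}, "miR-c": {"GENE4"}}

hits = candidate_repression_targets(
    protein_change, mrna_change, mirna_change, mirna_targets)
print(hits)
```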

  1. Parallel mRNA, proteomics and miRNA expression analysis in cell line models of the intestine

    PubMed Central

    O’Sullivan, Finbarr; Keenan, Joanne; Aherne, Sinead; O’Neill, Fiona; Clarke, Colin; Henry, Michael; Meleady, Paula; Breen, Laura; Barron, Niall; Clynes, Martin; Horgan, Karina; Doolan, Padraig; Murphy, Richard

    2017-01-01

    AIM To identify miRNA-regulated proteins differentially expressed between Caco2 and HT-29: two principal cell line models of the intestine. METHODS Exponentially growing Caco-2 and HT-29 cells were harvested and prepared for mRNA, miRNA and proteomic profiling. mRNA microarray profiling analysis was carried out using the Affymetrix GeneChip Human Gene 1.0 ST array. miRNA microarray profiling analysis was carried out using the Affymetrix Genechip miRNA 3.0 array. Quantitative Label-free LC-MS/MS proteomic analysis was performed using a Dionex Ultimate 3000 RSLCnano system coupled to a hybrid linear ion trap/Orbitrap mass spectrometer. Peptide identities were validated in Proteome Discoverer 2.1 and were subsequently imported into Progenesis QI software for further analysis. Hierarchical cluster analysis for all three parallel datasets (miRNA, proteomics, mRNA) was conducted in the R software environment using the Euclidean distance measure and Ward’s clustering algorithm. The prediction of miRNA and oppositely correlated protein/mRNA interactions was performed using TargetScan 6.1. GO biological process, molecular function and cellular component enrichment analysis was carried out for the DE miRNA, protein and mRNA lists via the Pathway Studio 11.3 Web interface using their Mammalian database. RESULTS Differential expression (DE) profiling comparing the intestinal cell lines HT-29 and Caco-2 identified 1795 Genes, 168 Proteins and 160 miRNAs as DE between the two cell lines. At the gene level, 1084 genes were upregulated and 711 were downregulated in the Caco-2 cell line relative to the HT-29 cell line. At the protein level, 57 proteins were found to be upregulated and 111 downregulated in the Caco-2 cell line relative to the HT-29 cell line. Finally, at the miRNAs level, 104 were upregulated and 56 downregulated in the Caco-2 cell line relative to the HT-29 cell line. 
Gene ontology (GO) analysis of the DE mRNA identified cell adhesion, migration and ECM organization, cellular lipid and cholesterol metabolic processes, small molecule transport and a range of responses to external stimuli, while similar analysis of the DE protein list identified gene expression/transcription, epigenetic mechanisms, DNA replication, differentiation and translation ontology categories. The DE protein and gene lists were found to share 15 biological processes, including, for example, epithelial cell differentiation [P value ≤ 1.81613E-08 (protein list); P ≤ 0.000434311 (gene list)] and actin filament bundle assembly [P value ≤ 0.001582797 (protein list); P ≤ 0.002733714 (gene list)]. Analysis of the three data streams acquired in parallel, conducted to identify targets undergoing potential miRNA translational repression, identified 34 proteins whose respective mRNAs were detected but showed no change in expression. Of these 34 proteins, 27 were downregulated in the Caco-2 cell line relative to the HT-29 cell line and predicted to be targeted by 19 unique anti-correlated/upregulated microRNAs, and 7 were upregulated in the Caco-2 cell line relative to the HT-29 cell line and predicted to be targeted by 15 unique anti-correlated/downregulated microRNAs. CONCLUSION This first study providing “tri-omics” analysis of the principal intestinal cell line models Caco-2 and HT-29 has identified 34 proteins potentially undergoing miRNA translational repression. PMID:29151691

  2. ACLAME: a CLAssification of Mobile genetic Elements, update 2010.

    PubMed

    Leplae, Raphaël; Lima-Mendez, Gipsi; Toussaint, Ariane

    2010-01-01

    The ACLAME database is dedicated to the collection, analysis and classification of sequenced mobile genetic elements (MGEs, in particular phages and plasmids). In addition to providing information on the MGEs content, classifications are available at various levels of organization. At the gene/protein level, families group similar sequences that are expected to share the same function. Families of four or more proteins are manually assigned with a functional annotation using the GeneOntology and the locally developed ontology MeGO dedicated to MGEs. At the genome level, evolutionary cohesive modules group sets of protein families shared among MGEs. At the population level, networks display the reticulate evolutionary relationships among MGEs. To increase the coverage of the phage sequence space, ACLAME version 0.4 incorporates 760 high-quality predicted prophages selected from the Prophinder database. Most of the data can be downloaded from the freely accessible ACLAME web site (http://aclame.ulb.ac.be). The BLAST interface for querying the database has been extended and numerous tools for in-depth analysis of the results have been added.

  3. Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

    PubMed

    Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

    2014-11-01

    As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. The other SVM model, which uses both DNA and protein sequences, achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
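    The three figures of merit quoted in this record (accuracy, F-measure and MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' evaluation code; the function name and the example counts are hypothetical and are not taken from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    # MCC uses all four cells, so it stays informative on unbalanced classes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc

# Hypothetical counts for binding (positive) vs non-binding nucleotides.
acc, f1, mcc = binary_metrics(tp=70, fp=30, tn=65, fn=35)
print(f"accuracy={acc:.3f} F-measure={f1:.3f} MCC={mcc:.3f}")
```

    An accuracy near 67% combined with an MCC near 0.34, as reported above, is consistent with a roughly balanced two-class problem in which the predictor is clearly, but only moderately, better than chance.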

  4. Thermodynamic prediction of protein neutrality.

    PubMed

    Bloom, Jesse D; Silberg, Jonathan J; Wilke, Claus O; Drummond, D Allan; Adami, Christoph; Arnold, Frances H

    2005-01-18

    We present a simple theory that uses thermodynamic parameters to predict the probability that a protein retains the wild-type structure after one or more random amino acid substitutions. Our theory predicts that for large numbers of substitutions the probability that a protein retains its structure will decline exponentially with the number of substitutions, with the severity of this decline determined by properties of the structure. Our theory also predicts that a protein can gain extra robustness to the first few substitutions by increasing its thermodynamic stability. We validate our theory with simulations on lattice protein models and by showing that it quantitatively predicts previously published experimental measurements on subtilisin and our own measurements on variants of TEM1 beta-lactamase. Our work unifies observations about the clustering of functional proteins in sequence space, and provides a basis for interpreting the response of proteins to substitutions in protein engineering applications.

  5. Thermodynamic prediction of protein neutrality

    PubMed Central

    Bloom, Jesse D.; Silberg, Jonathan J.; Wilke, Claus O.; Drummond, D. Allan; Adami, Christoph; Arnold, Frances H.

    2005-01-01

    We present a simple theory that uses thermodynamic parameters to predict the probability that a protein retains the wild-type structure after one or more random amino acid substitutions. Our theory predicts that for large numbers of substitutions the probability that a protein retains its structure will decline exponentially with the number of substitutions, with the severity of this decline determined by properties of the structure. Our theory also predicts that a protein can gain extra robustness to the first few substitutions by increasing its thermodynamic stability. We validate our theory with simulations on lattice protein models and by showing that it quantitatively predicts previously published experimental measurements on subtilisin and our own measurements on variants of TEM1 β-lactamase. Our work unifies observations about the clustering of functional proteins in sequence space, and provides a basis for interpreting the response of proteins to substitutions in protein engineering applications. PMID:15644440

  6. Recovering Protein-Protein and Domain-Domain Interactions from Aggregation of IP-MS Proteomics of Coregulator Complexes

    PubMed Central

    Mazloom, Amin R.; Dannenfelser, Ruth; Clark, Neil R.; Grigoryan, Arsen V.; Linder, Kathryn M.; Cardozo, Timothy J.; Bond, Julia C.; Boran, Aislyn D. W.; Iyengar, Ravi; Malovannaya, Anna; Lanz, Rainer B.; Ma'ayan, Avi

    2011-01-01

    Coregulator proteins (CoRegs) are part of multi-protein complexes that transiently assemble with transcription factors and chromatin modifiers to regulate gene expression. In this study we analyzed data from 3,290 immuno-precipitations (IP) followed by mass spectrometry (MS) applied to human cell lines aimed at identifying CoRegs complexes. Using the semi-quantitative spectral counts, we scored binary protein-protein and domain-domain associations with several equations. Unlike previous applications, our methods scored prey-prey protein-protein interactions regardless of the baits used. We also predicted domain-domain interactions underlying predicted protein-protein interactions. The quality of predicted protein-protein and domain-domain interactions was evaluated using known binary interactions from the literature, whereas one protein-protein interaction, between STRN and CTTNBP2NL, was validated experimentally; and one domain-domain interaction, between the HEAT domain of PPP2R1A and the Pkinase domain of STK25, was validated using molecular docking simulations. The scoring schemes presented here recovered known, and predicted many new, complexes, protein-protein, and domain-domain interactions. The networks that resulted from the predictions are provided as a web-based interactive application at http://maayanlab.net/HT-IP-MS-2-PPI-DDI/. PMID:22219718

  7. Prediction of beta-turns from amino acid sequences using the residue-coupled model.

    PubMed

    Guruprasad, K; Shukla, S

    2003-04-01

    We evaluated the prediction of beta-turns from amino acid sequences using the residue-coupled model with an enlarged representative protein data set selected from the Protein Data Bank. Our results show that the probability values derived from a data set comprising 425 protein chains yielded an overall beta-turn prediction accuracy of 68.74%, compared with 94.7% reported earlier on a data set of 30 proteins using the same method. However, we noted that the overall beta-turn prediction accuracy using probability values derived from the 30-protein data set reduces to 40.74% when tested on the data set comprising 425 protein chains. In contrast, using probability values derived from the 425-chain data set used in this analysis, the overall beta-turn prediction accuracy yielded consistent results when tested on either the 30-protein data set (64.62%) used earlier or a more recent representative data set comprising 619 protein chains (64.66%) or on a jackknife data set comprising 476 representative protein chains (63.38%). We therefore recommend the use of probability values derived from the 425 representative protein chains data set reported here, which gives more realistic and consistent predictions of beta-turns from amino acid sequences.

  8. Hill-Climbing search and diversification within an evolutionary approach to protein structure prediction.

    PubMed

    Chira, Camelia; Horvath, Dragos; Dumitrescu, D

    2011-07-30

    Proteins are complex structures made of amino acids having a fundamental role in the correct functioning of living cells. The structure of a protein is the result of the protein folding process. However, the general principles that govern the folding of natural proteins into a native structure are unknown. The problem of predicting a protein structure with minimum energy starting from the unfolded amino acid sequence is a highly complex and important task in molecular and computational biology. Protein structure prediction has important applications in fields such as drug design and disease prediction. The protein structure prediction problem is NP-hard even in simplified lattice protein models. An evolutionary model based on hill-climbing genetic operators is proposed for protein structure prediction in the hydrophobic-polar (HP) model. Problem-specific search operators are implemented and applied using a steepest-ascent hill-climbing approach. Furthermore, the proposed model enforces an explicit diversification stage during the evolution in order to avoid local optima. The main features of the resulting evolutionary algorithm (hill-climbing mechanism and diversification strategy) are evaluated in a set of numerical experiments for the protein structure prediction problem to assess their impact on the efficiency of the search process. Furthermore, the emerging consolidated model is compared to relevant algorithms from the literature for a set of difficult bidimensional instances from lattice protein models. The results obtained by the proposed algorithm are promising and competitive with those of related methods.
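    To make the HP lattice model in this record concrete, the sketch below scores a two-dimensional fold by counting hydrophobic (H) contacts between residues that are lattice neighbours but not chain neighbours, and then applies a plain steepest-descent step on that energy. The move encoding (U/D/L/R), the function names and the single-move neighbourhood are assumptions made for illustration; they are not the problem-specific operators of the paper.

```python
# Unit steps on the square lattice for each move letter (assumed encoding).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def fold(moves):
    """Turn a move string into lattice coordinates, or None if the walk overlaps."""
    x, y = 0, 0
    coords = [(x, y)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = x + dx, y + dy
        if (x, y) in coords:
            return None
        coords.append((x, y))
    return coords

def hp_energy(sequence, moves):
    """Energy = -(number of H-H lattice contacts between non-consecutive residues)."""
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")
    coords = fold(moves)
    if coords is None:
        return None
    index = {pos: i for i, pos in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so each neighbouring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                contacts += 1
    return -contacts

def hill_climb(sequence, moves):
    """Repeatedly take the best energy-lowering single-move change (valid start assumed)."""
    best, best_e = moves, hp_energy(sequence, moves)
    while True:
        candidates = []
        for i in range(len(best)):
            for m in MOVES:
                if m != best[i]:
                    cand = best[:i] + m + best[i + 1:]
                    e = hp_energy(sequence, cand)
                    if e is not None and e < best_e:
                        candidates.append((e, cand))
        if not candidates:
            return best, best_e
        best_e, best = min(candidates)

print(hill_climb("HPPH", "RUU"))  # one move change closes the square
print(hill_climb("HPPH", "RRR"))  # stuck: no single change helps
```

    The second call illustrates why the abstract stresses diversification: starting from the straight chain, no single-move change lowers the energy, so pure hill climbing stops in a local optimum.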

  9. Study on grain quality forecasting method and indicators by using hyperspectral data in wheat

    NASA Astrophysics Data System (ADS)

    Huang, Wenjiang; Wang, Jihua; Liu, Liangyun; Wang, Zhijie; Tan, Changwei; Song, Xiaoyu; Wang, Jingdi

    2005-01-01

    Field experiments were conducted to examine the influence of cultivar, nitrogen application and irrigation on grain protein content, gluten content and grain hardness in three winter wheat cultivars under four levels of nitrogen and irrigation treatments. First, the influence of cultivar and environmental factors on grain quality was studied; the effective factors included cultivar, irrigation and fertilization. Second, total nitrogen content around the anthesis stage of winter wheat was shown to be significantly correlated with grain protein content, so spectral vegetation indices significantly correlated with total nitrogen content around anthesis were potential indicators of grain protein content. Accumulation of total nitrogen and its transfer to the grain is the physical link that produces the final grain protein, and total nitrogen content at anthesis was shown to be an indicator of final grain protein content. The selected normalized photochemical reflectance index (NPRI) was shown to be able to predict grain protein content, based on the close correlation between the ratio of total carotenoid to chlorophyll a and total nitrogen content. The method contributes towards developing optimal procedures for predicting wheat grain quality through analysis of the canopy reflectance spectrum at anthesis. Regression equations were established for forecasting grain protein and dry gluten content from total nitrogen content at anthesis, so it is feasible to forecast grain quality by establishing correlation equations between biochemical constituents and the canopy reflectance spectrum.

  10. Expression and localization of a novel phosducin-like protein from amphioxus Branchiostoma belcheri

    NASA Astrophysics Data System (ADS)

    Saren, Gaowa; Zhao, Yonggang

    2009-05-01

    A full-length amphioxus cDNA, encoding a novel phosducin-like protein (Amphi-PhLP), was identified for the first time from the gut cDNA library of Branchiostoma belcheri. It is comprised of 1 550 bp and an open reading frame (ORF) of 241 amino acids, with a predicted molecular mass of approximately 28 kDa. In situ hybridization histochemistry revealed a tissue-specific expression pattern of Amphi-PhLP, with high levels in the ovary and lower levels in the hind gut and testis, hepatic caecum, gill, endostyle, and epipharyngeal groove, while it was absent in the muscle, neural tube and notochord. In Chinese Hamster Ovary (CHO) cells transfected with the expression plasmid pEGFP-N1/Amphi-PhLP, the fusion protein was targeted to the cytoplasm, suggesting that Amphi-PhLP is a cytosolic protein. This work may provide a framework for further understanding of the physiological function of Amphi-PhLP in B. belcheri.

  11. How may targeted proteomics complement genomic data in breast cancer?

    PubMed

    Guerin, Mathilde; Gonçalves, Anthony; Toiron, Yves; Baudelet, Emilie; Audebert, Stéphane; Boyer, Jean-Baptiste; Borg, Jean-Paul; Camoin, Luc

    2017-01-01

    Breast cancer (BC) is the most common female cancer in the world and was recently deconstructed into different molecular entities. Although most of the recent assays to characterize tumors at the molecular level are genomic-based, proteins are the actual executors of cellular functions and represent the vast majority of targets for anticancer drugs. Accumulated data have demonstrated an important level of quantitative and qualitative discrepancies between genomic/transcriptomic alterations and their protein counterparts, mostly related to the large number of post-translational modifications. Areas covered: This review will present novel proteomics technologies such as Reverse Phase Protein Array (RPPA) or mass-spectrometry (MS) based approaches that have emerged and that could progressively replace old-fashioned methods (e.g. immunohistochemistry, ELISA, etc.) to validate proteins as diagnostic, prognostic or predictive biomarkers, and eventually monitor them in routine practice. Expert commentary: These different targeted proteomic approaches, able to complement genomic data in BC and characterize tumors more precisely, will allow a more personalized treatment for each patient and tumor.

  12. Predictive factors of clinical response in steroid-refractory ulcerative colitis treated with granulocyte-monocyte apheresis

    PubMed Central

    D'Ovidio, Valeria; Meo, Donatella; Viscido, Angelo; Bresci, Giampaolo; Vernia, Piero; Caprilli, Renzo

    2011-01-01

    AIM: To identify factors predicting the clinical response of ulcerative colitis patients to granulocyte-monocyte apheresis (GMA). METHODS: Sixty-nine ulcerative colitis patients (39 F, 30 M) dependent upon/refractory to steroids were treated with GMA. Steroid dependency, clinical activity index (CAI), C reactive protein (CRP) level, erythrocyte sedimentation rate (ESR), values at baseline, use of immunosuppressant, duration of disease, and age and extent of disease were considered for statistical analysis as predictive factors of clinical response. Univariate and multivariate logistic regression models were used. RESULTS: In the univariate analysis, CAI (P = 0.039) and ESR (P = 0.017) levels at baseline were singled out as predictive of clinical remission. In the multivariate analysis steroid dependency [Odds ratio (OR) = 0.390, 95% Confidence interval (CI): 0.176-0.865, Wald 5.361, P = 0.0160] and low CAI levels at baseline (4 < CAI < 7) (OR = 0.770, 95% CI: 0.425-1.394, Wald 3.747, P = 0.028) proved to be effective as factors predicting clinical response. CONCLUSION: GMA may be a valid therapeutic option for steroid-dependent ulcerative colitis patients with mild-moderate disease and its clinical efficacy seems to persist for 12 mo. PMID:21528055

  13. Systematic Prediction of Scaffold Proteins Reveals New Design Principles in Scaffold-Mediated Signal Transduction

    PubMed Central

    Hu, Jianfei; Neiswinger, Johnathan; Zhang, Jin; Zhu, Heng; Qian, Jiang

    2015-01-01

    Scaffold proteins play a crucial role in facilitating signal transduction in eukaryotes by bringing together multiple signaling components. In this study, we performed a systematic analysis of scaffold proteins in signal transduction by integrating protein-protein interaction and kinase-substrate relationship networks. We predicted 212 scaffold proteins that are involved in 605 distinct signaling pathways. The computational prediction was validated using a protein microarray-based approach. The predicted scaffold proteins showed several interesting characteristics, as we expected from the functionality of scaffold proteins. We found that the scaffold proteins are likely to interact with each other, which is consistent with previous finding that scaffold proteins tend to form homodimers and heterodimers. Interestingly, a single scaffold protein can be involved in multiple signaling pathways by interacting with other scaffold protein partners. Furthermore, we propose two possible regulatory mechanisms by which the activity of scaffold proteins is coordinated with their associated pathways through phosphorylation process. PMID:26393507

  14. Applications of Protein Thermodynamic Database for Understanding Protein Mutant Stability and Designing Stable Mutants.

    PubMed

    Gromiha, M Michael; Anoosha, P; Huang, Liang-Tsung

    2016-01-01

    Protein stability is the free energy difference between unfolded and folded states of a protein, which lies in the range of 5-25 kcal/mol. Experimentally, protein stability is measured with circular dichroism, differential scanning calorimetry, and fluorescence spectroscopy using thermal and denaturant denaturation methods. These experimental data have been accumulated in the form of a database, ProTherm, thermodynamic database for proteins and mutants. It also contains sequence and structure information of a protein, experimental methods and conditions, and literature information. Different features such as search, display, and sorting options and visualization tools have been incorporated in the database. ProTherm is a valuable resource for understanding/predicting the stability of proteins and it can be accessed at http://www.abren.net/protherm/ . ProTherm has been effectively used to examine the relationship among thermodynamics, structure, and function of proteins. We describe the recent progress on the development of methods for understanding/predicting protein stability, such as (1) general trends on mutational effects on stability, (2) relationship between the stability of protein mutants and amino acid properties, (3) applications of protein three-dimensional structures for predicting their stability upon point mutations, (4) prediction of protein stability upon single mutations from amino acid sequence, and (5) prediction methods for addressing double mutants. A list of online resources for predicting has also been provided.

  15. Fast computational methods for predicting protein structure from primary amino acid sequence

    DOEpatents

    Agarwal, Pratul Kumar [Knoxville, TN]

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  16. A benchmark testing ground for integrating homology modeling and protein docking.

    PubMed

    Bohnuud, Tanggis; Luo, Lingqi; Wodak, Shoshana J; Bonvin, Alexandre M J J; Weng, Zhiping; Vajda, Sandor; Schueler-Furman, Ora; Kozakov, Dima

    2017-01-01

    Protein docking procedures carry out the task of predicting the structure of a protein-protein complex starting from the known structures of the individual protein components. More often than not, however, the structure of one or both components is not known, but can be derived by homology modeling on the basis of known structures of related proteins deposited in the Protein Data Bank (PDB). Thus, the problem is to develop methods that optimally integrate homology modeling and docking with the goal of predicting the structure of a complex directly from the amino acid sequences of its component proteins. One possibility is to use the best available homology modeling and docking methods. However, the models built for the individual subunits often differ to a significant degree from the bound conformation in the complex, often much more so than the differences observed between free and bound structures of the same protein, and therefore additional conformational adjustments, both at the backbone and side chain levels need to be modeled to achieve an accurate docking prediction. In particular, even homology models of overall good accuracy frequently include localized errors that unfavorably impact docking results. The predicted reliability of the different regions in the model can also serve as a useful input for the docking calculations. Here we present a benchmark dataset that should help to explore and solve combined modeling and docking problems. This dataset comprises a subset of the experimentally solved 'target' complexes from the widely used Docking Benchmark from the Weng Lab (excluding antibody-antigen complexes). This subset is extended to include the structures from the PDB related to those of the individual components of each complex, and hence represent potential templates for investigating and benchmarking integrated homology modeling and docking approaches. 
    Template sets can be dynamically customized by specifying ranges in sequence similarity and in PDB release dates, or using other filtering options, such as excluding sets of specific structures from the template list. Multiple sequence alignments, as well as structural alignments of the templates to their corresponding subunits in the target are also provided. The resource is accessible online or can be downloaded at http://cluspro.org/benchmark, and is updated on a weekly basis in synchrony with new PDB releases. Proteins 2016; 85:10-16. © 2016 Wiley Periodicals, Inc.

  17. Comparing the Levels of Acute-Phase Reactants Between Smoker and Nonsmoker Diabetic Patients: More Predicted Risk for Cardiovascular Diseases in Smoker Compared to Nonsmoker Diabetics.

    PubMed

    Rezaei-Adl, Sepideh; Ghahroudi Tali, Arash; Saffar, Hiva; Rajabiani, Afsaneh; Abdollahi, Alireza

    2017-09-01

    Owing to the close link between cardiovascular disorders and increased acute-phase responses, a relation has been proposed between total sialic acid (TSA) and C-reactive protein (CRP), as main components of the acute-phase proteins, and cardiovascular risk profiles such as diabetes mellitus and smoking. We hypothesized that the elevation in the level of TSA, along with other prototype acute-phase reactants such as CRP, is greater when diabetes and smoking coexist than in diabetes mellitus alone. Ninety diabetic patients were randomly selected and entered into this case-control study. Using the block randomization method, the patients were randomly assigned into smokers (n=45) and nonsmokers (n=45). A group of ten healthy individuals was also included as the control. The serum levels of TSA, CRP, iron, and hemoglobin were measured by the specific techniques. Comparing laboratory parameters across the three groups indicated significantly higher levels of TSA and CRP in smoker diabetics as compared to non-smoker diabetics and the healthy controls, while there was no difference in other parameters including serum iron and hemoglobin. A significant positive correlation was also revealed between TSA and CRP (r=0.324, P=0.030), but no significant association was found between other parameters. Against a background of smoking, the levels of both TSA and CRP are predicted to rise more than with diabetes mellitus alone. In fact, the increase in these biomarkers is more predictable in smoker than in nonsmoker diabetics. This finding emphasizes the increased risk for cardiovascular disorders in smoker compared to non-smoker diabetics.

  18. Understanding the underlying mechanism of HA-subtyping in the level of physic-chemical characteristics of protein.

    PubMed

    Ebrahimi, Mansour; Aghagolzadeh, Parisa; Shamabadi, Narges; Tahmasebi, Ahmad; Alsharifi, Mohammed; Adelson, David L; Hemmatzadeh, Farhid; Ebrahimie, Esmaeil

    2014-01-01

    The evolution of the influenza A virus to increase its host range is a major concern worldwide. Molecular mechanisms of increasing host range are largely unknown. Influenza surface proteins play determining roles in reorganization of host-sialic acid receptors and host range. In an attempt to uncover the physic-chemical attributes which govern HA subtyping, we performed a large scale functional analysis of over 7000 sequences of 16 different HA subtypes. A large number (896) of physic-chemical protein characteristics were calculated for each HA sequence. Then, 10 different attribute weighting algorithms were used to find the key characteristics distinguishing HA subtypes. Furthermore, to discover machine learning models which can predict HA subtypes, various Decision Tree, Support Vector Machine, Naïve Bayes, and Neural Network models were trained on the calculated protein characteristics dataset as well as 10 trimmed datasets generated by attribute weighting algorithms. The prediction accuracies of the machine learning methods were evaluated by 10-fold cross validation. The results highlighted the frequency of Gln (selected by 80% of attribute weighting algorithms), percentage/frequency of Tyr, percentage of Cys, and frequencies of Try and Glu (selected by 70% of attribute weighting algorithms) as the key features that are associated with HA subtyping. Random Forest tree induction algorithm and RBF kernel function of SVM (scaled by grid search) showed high accuracy of 98% in clustering and predicting HA subtypes based on protein attributes. Decision tree models were successful in monitoring the short mutation/reassortment paths by which influenza virus can gain the key protein structure of another HA subtype and increase its host range in a short period of time with less energy consumption. Extracting and mining a large number of amino acid attributes of HA subtypes of influenza A virus through supervised algorithms represents a new avenue for understanding and predicting possible future structure of influenza pandemics.

  19. Understanding the Underlying Mechanism of HA-Subtyping in the Level of Physic-Chemical Characteristics of Protein

    PubMed Central

    Ebrahimi, Mansour; Aghagolzadeh, Parisa; Shamabadi, Narges; Tahmasebi, Ahmad; Alsharifi, Mohammed; Adelson, David L.

    2014-01-01

    The evolution of the influenza A virus to increase its host range is a major concern worldwide. Molecular mechanisms of increasing host range are largely unknown. Influenza surface proteins play determining roles in reorganization of host-sialic acid receptors and host range. In an attempt to uncover the physic-chemical attributes which govern HA subtyping, we performed a large scale functional analysis of over 7000 sequences of 16 different HA subtypes. A large number (896) of physic-chemical protein characteristics were calculated for each HA sequence. Then, 10 different attribute weighting algorithms were used to find the key characteristics distinguishing HA subtypes. Furthermore, to discover machine learning models which can predict HA subtypes, various Decision Tree, Support Vector Machine, Naïve Bayes, and Neural Network models were trained on the calculated protein characteristics dataset as well as 10 trimmed datasets generated by attribute weighting algorithms. The prediction accuracies of the machine learning methods were evaluated by 10-fold cross validation. The results highlighted the frequency of Gln (selected by 80% of attribute weighting algorithms), percentage/frequency of Tyr, percentage of Cys, and frequencies of Try and Glu (selected by 70% of attribute weighting algorithms) as the key features that are associated with HA subtyping. Random Forest tree induction algorithm and RBF kernel function of SVM (scaled by grid search) showed high accuracy of 98% in clustering and predicting HA subtypes based on protein attributes. Decision tree models were successful in monitoring the short mutation/reassortment paths by which influenza virus can gain the key protein structure of another HA subtype and increase its host range in a short period of time with less energy consumption. Extracting and mining a large number of amino acid attributes of HA subtypes of influenza A virus through supervised algorithms represents a new avenue for understanding and predicting possible future structure of influenza pandemics. PMID:24809455

  20. The diagnostic and prognostic importance of oxidative stress biomarkers and acute phase proteins in Urinary Tract Infection (UTI) in camels.

    PubMed

    El-Deeb, Wael M; Buczinski, Sébastien

    2015-01-01

    The present study aimed to investigate the diagnostic and prognostic importance of oxidative stress biomarkers and acute phase proteins in urinary tract infection (UTI) in camels. We describe the clinical, bacteriological and biochemical findings in 89 camels. Blood and urine samples from diseased (n = 74) and control camels (n = 15) were submitted to laboratory investigations. The urine analysis revealed a high number of RBCs and pus cells. The concentrations of serum and erythrocytic malondialdehyde (sMDA & eMDA), haptoglobin (Hp), serum amyloid A (SAA), ceruloplasmin (Cp), fibrinogen (Fb), albumin, globulin and interleukin 6 (IL-6) were higher in diseased camels when compared to healthy ones. Catalase, superoxide dismutase and glutathione levels were lower in diseased camels when compared with the control group. Forty-one of 74 camels with UTI were successfully treated. The levels of malondialdehyde, catalase, superoxide dismutase, glutathione, Hp, SAA, Fb, total protein, globulin and IL-6 were associated with the odds of treatment failure. The MDA showed a great sensitivity (Se) and specificity (Sp) in predicting treatment failure (Se 85%/Sp 100%), as did the SAA (Se 92%/Sp 87%) and globulin levels (Se 85%/Sp 100%), when using the cutoffs that maximize the sum of Se + Sp. Multivariate logistic regression analysis revealed that two models had a high accuracy to predict failure, with the first model including sex, sMDA and Hp as covariates (area under the receiver operating characteristic curve (AUC) = 0.92) and a second model using sex, SAA and Hp (AUC = 0.89). In conclusion, the oxidative stress biomarkers and acute phase proteins could be used as diagnostic and prognostic biomarkers in camel UTI management. Efforts should be made to investigate such biomarkers in other species with UTI.
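    The cutoff rule mentioned in this record, choosing the threshold that maximizes Se + Sp, can be sketched in a few lines. This is an illustrative implementation under stated assumptions: a higher biomarker value is taken to predict treatment failure, candidate cutoffs are the observed values, ties keep the lowest cutoff, and both outcome classes are present. The data are made up and do not come from the study.

```python
def sens_spec(values, failed, cutoff):
    """Sensitivity and specificity when value >= cutoff predicts failure."""
    tp = sum(1 for v, f in zip(values, failed) if f and v >= cutoff)
    fn = sum(1 for v, f in zip(values, failed) if f and v < cutoff)
    tn = sum(1 for v, f in zip(values, failed) if not f and v < cutoff)
    fp = sum(1 for v, f in zip(values, failed) if not f and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def best_cutoff(values, failed):
    """Return (cutoff, Se, Sp) for the observed value maximizing Se + Sp."""
    best = None
    for c in sorted(set(values)):
        se, sp = sens_spec(values, failed, c)
        # Strict comparison: on ties the lowest cutoff is kept.
        if best is None or se + sp > best[1] + best[2]:
            best = (c, se, sp)
    return best

# Made-up biomarker levels and outcomes (True = treatment failure).
levels = [1.0, 1.2, 1.5, 2.0, 2.4, 2.6, 3.1, 3.5]
failed = [False, False, False, True, False, True, True, True]
cutoff, se, sp = best_cutoff(levels, failed)
print(cutoff, se, sp)
```

    Maximizing Se + Sp is equivalent to maximizing Youden's J statistic (Se + Sp - 1), which weighs false positives and false negatives equally.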

  1. An ensemble framework for identifying essential proteins.

    PubMed

    Zhang, Xue; Xiao, Wangxin; Acencio, Marcio Luis; Lemke, Ney; Wang, Xujing

    2016-08-25

    Many centrality measures have been proposed to mine and characterize the correlations between network topological properties and protein essentiality. However, most of them show limited prediction accuracy, and the number of common predicted essential proteins by different methods is very small. In this paper, an ensemble framework is proposed which integrates gene expression data and protein-protein interaction networks (PINs). It aims to improve the prediction accuracy of basic centrality measures. The idea behind this ensemble framework is that different protein-protein interactions (PPIs) may show different contributions to protein essentiality. Five standard centrality measures (degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and subgraph centrality) are integrated into the ensemble framework respectively. We evaluated the performance of the proposed ensemble framework using yeast PINs and gene expression data. The results show that it can considerably improve the prediction accuracy of the five centrality measures individually. It can also remarkably increase the number of common predicted essential proteins among those predicted by each centrality measure individually and enable each centrality measure to find more low-degree essential proteins. This paper demonstrates that it is valuable to differentiate the contributions of different PPIs for identifying essential proteins based on network topological characteristics. The proposed ensemble framework is a successful paradigm to this end.
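    Two of the centrality measures named in this record, and the idea that individual interactions may contribute unequally, can be illustrated on a toy network. The sketch below is not the ensemble framework of the paper: the graph, the protein names and the co-expression weights are hypothetical, and the weighted degree is only one simple way to let edge weights influence a centrality score. Closeness assumes a connected graph with at least two nodes.

```python
from collections import deque

def degree_centrality(graph):
    """Number of interaction partners of each protein."""
    return {node: len(neigh) for node, neigh in graph.items()}

def closeness_centrality(graph):
    """(n - 1) / sum of shortest-path distances, for a connected graph."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives unweighted shortest paths
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[source] = (len(graph) - 1) / sum(dist.values())
    return result

def weighted_degree(graph, weight):
    """Degree in which each edge counts by a weight, e.g. a co-expression score."""
    return {u: sum(weight[frozenset((u, v))] for v in graph[u]) for u in graph}

# Toy undirected network with hypothetical protein names and weights.
ppi = {"A": {"B", "C", "D"}, "B": {"A"}, "C": {"A", "D"}, "D": {"A", "C"}}
coexpr = {frozenset("AB"): 0.9, frozenset("AC"): 0.2,
          frozenset("AD"): 0.4, frozenset("CD"): 0.8}

print(degree_centrality(ppi))
print(closeness_centrality(ppi))
print(weighted_degree(ppi, coexpr))
```

    In this toy example C and D have the same plain degree, but their weighted degrees differ once edges are scored by co-expression, which is the kind of distinction the abstract argues can help find low-degree essential proteins.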

  2. CONFOLD2: improved contact-driven ab initio protein structure modeling.

    PubMed

    Adhikari, Badri; Cheng, Jianlin

    2018-01-25

    Contact-guided protein structure prediction methods are becoming more and more successful because of the latest advances in residue-residue contact prediction. To support contact-driven structure prediction, effective tools that can quickly build tertiary structural models of good quality from predicted contacts need to be developed. We develop an improved contact-driven protein modelling method, CONFOLD2, and study how it may be effectively used for ab initio protein structure prediction with predicted contacts as input. It builds models using various subsets of input contacts to explore the fold space under the guidance of a soft square energy function, and then clusters the models to obtain the top five models. CONFOLD2 obtains an average reconstruction accuracy of 0.57 TM-score for the 150 proteins in the PSICOV contact prediction dataset. When benchmarked on the CASP11 contacts predicted using CONSIP2 and CASP12 contacts predicted using Raptor-X, CONFOLD2 achieves a mean TM-score of 0.41 on both datasets. CONFOLD2 quickly generates the top five structural models for a protein sequence when its secondary structure and contact predictions are at hand. The source code of CONFOLD2 is publicly available at https://github.com/multicom-toolbox/CONFOLD2/ .
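    The TM-score used to report accuracy in this record is not defined in the abstract. The sketch below follows the commonly cited definition, in which each aligned residue pair contributes 1 / (1 + (d / d0)^2) and d0 depends on the target length. It assumes the model has already been superimposed on the native structure and that per-residue distances in Å are available; the search over superpositions performed by real TM-score programs is omitted, and the lower bound of 0.5 Å on d0 for short chains is an assumption of this sketch.

```python
def tm_score(distances, target_length):
    """TM-score from distances (Å) between aligned residue pairs.

    `distances` holds one value per aligned pair after superposition;
    unaligned residues simply contribute nothing to the sum.
    """
    if target_length > 15:
        # Length-dependent distance scale; clamped for short chains (assumption).
        d0 = max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)
    else:
        d0 = 0.5
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# Hypothetical 100-residue target: a perfect model, and one covering half of it.
print(tm_score([0.0] * 100, 100))
print(tm_score([0.0] * 50, 100))
```

    Because the sum is divided by the target length rather than the number of aligned residues, a model that covers only part of the chain is penalized, which makes mean values such as 0.41 and 0.57 comparable across proteins of different sizes.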

  3. ProTSAV: A protein tertiary structure analysis and validation server.

    PubMed

    Singh, Ankita; Kaushik, Rahul; Mishra, Avinash; Shanker, Asheesh; Jayaram, B

    2016-01-01

    Quality assessment of predicted model structures of proteins is as important as the protein tertiary structure prediction. A highly efficient quality assessment of predicted model structures directs further research on function. Here we present a new server ProTSAV, capable of evaluating predicted model structures based on some popular online servers and standalone tools. ProTSAV furnishes the user with a single quality score in case of individual protein structure along with a graphical representation and ranking in case of multiple protein structure assessment. The server is validated on ~64,446 protein structures including experimental structures from RCSB and predicted model structures for CASP targets and from public decoy sets. ProTSAV succeeds in predicting quality of protein structures with a specificity of 100% and a sensitivity of 98% on experimentally solved structures and achieves a specificity of 88% and a sensitivity of 91% on predicted protein structures of CASP11 targets under 2Å. The server overcomes the limitations of any single server/method and is seen to be robust in helping in quality assessment. ProTSAV is freely available at http://www.scfbio-iitd.res.in/software/proteomics/protsav.jsp. Copyright © 2015 Elsevier B.V. All rights reserved.

  4. SPRINT: ultrafast protein-protein interaction prediction of the entire human interactome.

    PubMed

    Li, Yiwei; Ilie, Lucian

    2017-11-15

    Proteins perform their functions usually by interacting with other proteins. Predicting which proteins interact is a fundamental problem. Experimental methods are slow, expensive, and have a high rate of error. Many computational methods have been proposed among which sequence-based ones are very promising. However, so far no such method is able to predict effectively the entire human interactome: they require too much time or memory. We present SPRINT (Scoring PRotein INTeractions), a new sequence-based algorithm and tool for predicting protein-protein interactions. We comprehensively compare SPRINT with state-of-the-art programs on seven most reliable human PPI datasets and show that it is more accurate while running orders of magnitude faster and using very little memory. SPRINT is the only sequence-based program that can effectively predict the entire human interactome: it requires between 15 and 100 min, depending on the dataset. Our goal is to transform the very challenging problem of predicting the entire human interactome into a routine task. The source code of SPRINT is freely available from https://github.com/lucian-ilie/SPRINT/ and the datasets and predicted PPIs from www.csd.uwo.ca/faculty/ilie/SPRINT/ .

  5. Accurate prediction of subcellular location of apoptosis proteins combining Chou's PseAAC and PsePSSM based on wavelet denoising.

    PubMed

    Yu, Bin; Li, Shan; Qiu, Wen-Ying; Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Wang, Ming-Hui; Zhang, Yan

    2017-12-08

    Subcellular localization information for apoptosis proteins is very important for understanding the mechanism of programmed cell death and the development of drugs. The prediction of subcellular localization of an apoptosis protein is still a challenging task because the prediction of apoptosis proteins subcellular localization can help to understand their function and the role of metabolic processes. In this paper, we propose a novel method for protein subcellular localization prediction. Firstly, the features of the protein sequence are extracted by combining Chou's pseudo amino acid composition (PseAAC) and pseudo-position specific scoring matrix (PsePSSM), then the extracted feature information is denoised by two-dimensional (2-D) wavelet denoising. Finally, the optimal feature vectors are input to the SVM classifier to predict subcellular location of apoptosis proteins. Quite promising predictions are obtained using the jackknife test on three widely used datasets and compared with other state-of-the-art methods. The results indicate that the method proposed in this paper can remarkably improve the prediction accuracy of apoptosis protein subcellular localization, which will be a supplementary tool for future proteomics research.

  6. Accurate prediction of subcellular location of apoptosis proteins combining Chou’s PseAAC and PsePSSM based on wavelet denoising

    PubMed Central

    Chen, Cheng; Chen, Rui-Xin; Wang, Lei; Wang, Ming-Hui; Zhang, Yan

    2017-01-01

    Subcellular localization information for apoptosis proteins is very important for understanding the mechanism of programmed cell death and the development of drugs. The prediction of subcellular localization of an apoptosis protein is still a challenging task because the prediction of apoptosis proteins subcellular localization can help to understand their function and the role of metabolic processes. In this paper, we propose a novel method for protein subcellular localization prediction. Firstly, the features of the protein sequence are extracted by combining Chou's pseudo amino acid composition (PseAAC) and pseudo-position specific scoring matrix (PsePSSM), then the extracted feature information is denoised by two-dimensional (2-D) wavelet denoising. Finally, the optimal feature vectors are input to the SVM classifier to predict subcellular location of apoptosis proteins. Quite promising predictions are obtained using the jackknife test on three widely used datasets and compared with other state-of-the-art methods. The results indicate that the method proposed in this paper can remarkably improve the prediction accuracy of apoptosis protein subcellular localization, which will be a supplementary tool for future proteomics research. PMID:29296195

  7. Systematically Differentiating Functions for Alternatively Spliced Isoforms through Integrating RNA-seq Data

    PubMed Central

    Menon, Rajasree; Wen, Yuchen; Omenn, Gilbert S.; Kretzler, Matthias; Guan, Yuanfang

    2013-01-01

    Integrating large-scale functional genomic data has significantly accelerated our understanding of gene functions. However, no algorithm has been developed to differentiate functions for isoforms of the same gene using high-throughput genomic data. This is because standard supervised learning requires ‘ground-truth’ functional annotations, which are lacking at the isoform level. To address this challenge, we developed a generic framework that interrogates public RNA-seq data at the transcript level to differentiate functions for alternatively spliced isoforms. For a specific function, our algorithm identifies the ‘responsible’ isoform(s) of a gene and generates classifying models at the isoform level instead of at the gene level. Through cross-validation, we demonstrated that our algorithm is effective in assigning functions to genes, especially the ones with multiple isoforms, and robust to gene expression levels and removal of homologous gene pairs. We identified genes in the mouse whose isoforms are predicted to have disparate functionalities and experimentally validated the ‘responsible’ isoforms using data from mammary tissue. With protein structure modeling and experimental evidence, we further validated the predicted isoform functional differences for the genes Cdkn2a and Anxa6. Our generic framework is the first to predict and differentiate functions for alternatively spliced isoforms, instead of genes, using genomic data. It is extendable to any base machine learner and other species with alternatively spliced isoforms, and shifts the current gene-centered function prediction to isoform-level predictions. PMID:24244129

  8. Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening

    PubMed Central

    Mu, Lin

    2018-01-01

    This work introduces a number of algebraic topology approaches, including multi-component persistent homology, multi-level persistent homology, and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. In contrast to the conventional persistent homology, multi-component persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for protein-ligand binding analysis and virtual screening of small molecules. Extensive numerical experiments involving 4,414 protein-ligand complexes from the PDBBind database and 128,374 ligand-target and decoy-target pairs in the DUD database are performed to test respectively the scoring power and the discriminatory power of the proposed topological learning strategies. It is demonstrated that the present topological learning outperforms other existing methods in protein-ligand binding affinity prediction and ligand-decoy discrimination. PMID:29309403

  9. PreSSAPro: a software for the prediction of secondary structure by amino acid properties.

    PubMed

    Costantini, Susan; Colonna, Giovanni; Facchiano, Angelo M

    2007-10-01

    PreSSAPro is software, available to the scientific community as a free web service, designed to provide predictions of secondary structures starting from the amino acid sequence of a given protein. Predictions are based on our recently published work on the amino acid propensities for secondary structures in large but non-homogeneous protein data sets, as well as in smaller but homogeneous data sets corresponding to protein structural classes, i.e. all-alpha, all-beta, or alpha-beta proteins. Predictions are improved by the use of propensities evaluated for the right protein class. PreSSAPro predicts the secondary structure according to the right protein class, if known, or gives a multiple prediction with reference to the different structural classes. The comparison of these predictions represents a novel tool to evaluate which sequence regions can assume different secondary structures depending on the structural class assignment, with a view to identifying proteins able to fold in different conformations. The service is available at the URL http://bioinformatica.isa.cnr.it/PRESSAPRO/.

  10. Predicted RNA Binding Proteins Pes4 and Mip6 Regulate mRNA Levels, Translation, and Localization during Sporulation in Budding Yeast.

    PubMed

    Jin, Liang; Zhang, Kai; Sternglanz, Rolf; Neiman, Aaron M

    2017-05-01

    In response to starvation, diploid cells of Saccharomyces cerevisiae undergo meiosis and form haploid spores, a process collectively referred to as sporulation. The differentiation into spores requires extensive changes in gene expression. The transcriptional activator Ndt80 is a central regulator of this process, which controls many genes essential for sporulation. Ndt80 induces ∼300 genes coordinately during meiotic prophase, but different mRNAs within the NDT80 regulon are translated at different times during sporulation. The protein kinase Ime2 and RNA binding protein Rim4 are general regulators of meiotic translational delay, but how differential timing of individual transcripts is achieved was not known. This report describes the characterization of two related NDT80-induced genes, PES4 and MIP6, encoding predicted RNA binding proteins. These genes are necessary to regulate the steady-state expression, translational timing, and localization of a set of mRNAs that are transcribed by NDT80 but not translated until the end of meiosis II. Mutations in the predicted RNA binding domains within PES4 alter the stability of target mRNAs. PES4 and MIP6 affect only a small portion of the NDT80 regulon, indicating that they act as modulators of the general Ime2/Rim4 pathway for specific transcripts. Copyright © 2017 American Society for Microbiology.

  11. Generation of oscillations by the p53-Mdm2 feedback loop: A theoretical and experimental study

    PubMed Central

    Lev Bar-Or, Ruth; Maya, Ruth; Segel, Lee A.; Alon, Uri; Levine, Arnold J.; Oren, Moshe

    2000-01-01

    The intracellular activity of the p53 tumor suppressor protein is regulated through a feedback loop involving its transcriptional target, mdm2. We present a simple mathematical model suggesting that, under certain circumstances, oscillations in p53 and Mdm2 protein levels can emerge in response to a stress signal. A delay in p53-dependent induction of Mdm2 is predicted to be required, albeit not sufficient, for this oscillatory behavior. In line with the predictions of the model, oscillations of both p53 and Mdm2 indeed occur on exposure of various cell types to ionizing radiation. Such oscillations may allow cells to repair their DNA without risking the irreversible consequences of continuous excessive p53 activation. PMID:11016968

  12. Serum and cerebrospinal fluid levels of visinin-like protein-1 in acute encephalopathy with biphasic seizures and late reduced diffusion.

    PubMed

    Hasegawa, Shunji; Matsushige, Takeshi; Inoue, Hirofumi; Takahara, Midori; Kajimoto, Madoka; Momonaka, Hiroshi; Oka, Momoko; Isumi, Hiroshi; Emi, Sakie; Hayashi, Megumi; Ichiyama, Takashi

    2014-08-01

    Acute encephalopathy with biphasic seizures and late reduced diffusion (AESD) has recently been recognized as an encephalopathy subtype. Typical clinical symptoms of AESD are biphasic seizures, and MRI findings show reduced subcortical diffusion during clustering seizures with unconsciousness after the acute phase. Visinin-like protein-1 (VILIP-1) is a recently discovered protein that is abundant in the central nervous system, and some reports have shown that VILIP-1 may be a prognostic biomarker of conditions such as Alzheimer's disease, stroke, and brain injury. However, there have been no reports regarding serum and cerebrospinal fluid (CSF) levels of VILIP-1 in patients with AESD. We measured the serum and CSF levels of VILIP-1 in patients with AESD, and compared the levels to those in patients with prolonged febrile seizures (FS). Both serum and CSF levels of VILIP-1 were significantly higher in patients with AESD than in patients with prolonged FS. Serum and CSF VILIP-1 levels were normal on day 1 of AESD. Our results suggest that both serum and CSF levels of VILIP-1 may be predictive markers of AESD. Copyright © 2013 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.

  13. Computational prediction of protein interactions related to the invasion of erythrocytes by malarial parasites.

    PubMed

    Liu, Xuewu; Huang, Yuxiao; Liang, Jiao; Zhang, Shuai; Li, Yinghui; Wang, Jun; Shen, Yan; Xu, Zhikai; Zhao, Ya

    2014-11-30

    The invasion of red blood cells (RBCs) by malarial parasites is an essential step in the life cycle of Plasmodium falciparum. Human-parasite surface protein interactions play a critical role in this process. Although several interactions between human and parasite proteins have been discovered, the mechanism related to invasion remains poorly understood because numerous human-parasite protein interactions have not yet been identified. High-throughput screening experiments are not feasible for malarial parasites due to difficulty in expressing the parasite proteins. Here, we performed computational prediction of the PPIs involved in malaria parasite invasion to elucidate the mechanism by which invasion occurs. In this study, an expectation maximization algorithm was used to estimate the probabilities of domain-domain interactions (DDIs). Estimates of DDI probabilities were then used to infer PPI probabilities. We found that our prediction performance was better than that based on the information of D. melanogaster alone when information related to the six species was used. Prediction performance was assessed using protein interaction data from S. cerevisiae, indicating that the predicted results were reliable. We then used the estimates of DDI probabilities to infer interactions between 490 parasite and 3,787 human membrane proteins. A small-scale dataset was used to illustrate the usability of our method in predicting interactions between human and parasite proteins. The positive predictive value (PPV) was lower than that observed in S. cerevisiae. We integrated gene expression data to improve prediction accuracy and to reduce false positives. We identified 80 membrane proteins highly expressed in the schizont stage by fast Fourier transform method. Approximately 221 erythrocyte membrane proteins were identified using published mass spectral datasets. A network consisting of 205 interactions was predicted. Results of network analysis suggest that SNARE proteins of parasites and APP of humans may function in the invasion of RBCs by parasites. We predicted a small-scale PPI network that may be involved in parasite invasion of RBCs by integrating DDI information and expression profiles. Experimental studies should be conducted to validate the predicted interactions. The predicted PPIs help elucidate the mechanism of parasite invasion and provide directions for future experimental investigations.
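
    The step from DDI probabilities to PPI probabilities can be sketched as follows. The abstract does not state the combination rule, so this example assumes the independence ("noisy-OR") model often used in domain-based inference: two proteins interact if at least one of their domain pairs interacts. The function name `ppi_probability` and the domain identifiers are hypothetical, and the EM estimation of the DDI probabilities themselves is not shown.

```python
def ppi_probability(domains_a, domains_b, ddi_prob):
    """Probability that two proteins interact, given DDI probabilities.

    domains_a, domains_b: domain identifiers found in each protein
    ddi_prob: dict mapping (domain_m, domain_n) -> estimated interaction probability
    Assumption of this sketch: domain pairs act independently, so
    P(interaction) = 1 - product over all pairs of (1 - P(domain pair interacts)).
    """
    prob_no_pair_interacts = 1.0
    for m in domains_a:
        for n in domains_b:
            # DDIs are unordered, so accept the key in either orientation.
            p = ddi_prob.get((m, n), ddi_prob.get((n, m), 0.0))
            prob_no_pair_interacts *= 1.0 - p
    return 1.0 - prob_no_pair_interacts


if __name__ == "__main__":
    # Illustrative values only, not estimates from the study.
    toy_ddi = {("D1", "D3"): 0.5, ("D3", "D2"): 0.2}
    print(ppi_probability(["D1", "D2"], ["D3"], toy_ddi))
```

    Domain pairs without an estimated probability are treated as non-interacting here, which is a simplification that lowers the inferred PPI probability for poorly annotated proteins.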

  14. Surface energetics and protein-protein interactions: analysis and mechanistic implications

    PubMed Central

    Peri, Claudio; Morra, Giulia; Colombo, Giorgio

    2016-01-01

    Understanding protein-protein interactions (PPI) at the molecular level is a fundamental task in the design of new drugs, the prediction of protein function and the clarification of the mechanisms of (dis)regulation of biochemical pathways. In this study, we use a novel computational approach to investigate the energetics of amino acid networks located on the surface of proteins, isolated and in complex with their respective partners. Interestingly, the analysis of individual proteins identifies patches of surface residues that, when mapped on the structure of their respective complexes, reveal regions of residue-pair couplings that extend across the binding interfaces, forming continuous motifs. An enhanced effect is visible across the proteins of the dataset forming larger quaternary assemblies. The method indicates the presence of energetic signatures in the isolated proteins that are retained in the bound form, which we hypothesize to determine binding orientation upon complex formation. We propose our method, BLUEPRINT, as a complement to different approaches ranging from the ab-initio characterization of PPIs, to protein-protein docking algorithms, for the physico-chemical and functional investigation of protein-protein interactions. PMID:27050828

  15. Vascular endothelial growth factor and protein level in pleural effusion for differentiating malignant from benign pleural effusion.

    PubMed

    Wu, Da-Wei; Chang, Wei-An; Liu, Kuan-Ting; Yen, Meng-Chi; Kuo, Po-Lin

    2017-09-01

    Pleural effusion is associated with multiple benign and malignant conditions. Currently no biomarkers differentiate malignant pleural effusion (MPE) and benign pleural effusion (BPE) sensitively and specifically. The present study identified a novel combination of biomarkers in pleural effusion for differentiating MPE from BPE by enrolling 75 patients, 34 with BPE and 41 with MPE. The levels of lactate dehydrogenase, glucose, protein, and total cell, neutrophil, monocyte and lymphocyte counts in the pleural effusion were measured. The concentrations of interleukin (IL)-1β, IL-4, IL-5, IL-6, IL-8, IL-10, IL-12, tumor necrosis factor-α, interferon γ, transforming growth factor-β1, colony stimulating factor 2, monocyte chemoattractant protein-1 and vascular endothelial growth factor (VEGF) were detected using cytometric bead arrays. Protein and VEGF levels differed significantly between patients with BPE and those with MPE. The optimal cutoff value of VEGF and protein was 214 pg/ml and 3.35 g/dl respectively, according to the receiver operating characteristic curve. A combination of VEGF >214 pg/ml and protein >3.35 g/dl in pleural effusion presented a sensitivity of 92.6% and an accuracy of 78.6% for MPE, but was not associated with a decreased survival rate. These results suggested that this novel combination strategy may provide useful biomarkers for predicting MPE and facilitating early diagnosis.
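
    The combined decision rule reported above (VEGF >214 pg/ml and protein >3.35 g/dl) can be written as a small classifier and scored for sensitivity and accuracy. This is a minimal sketch: the two cutoffs come from the abstract, while the function names and the sample measurements are hypothetical and do not reproduce the study's data.

```python
VEGF_CUTOFF_PG_ML = 214.0      # cutoff reported in the abstract
PROTEIN_CUTOFF_G_DL = 3.35     # cutoff reported in the abstract


def predicts_mpe(vegf_pg_ml, protein_g_dl):
    """Apply the combined rule: both markers must exceed their cutoffs."""
    return vegf_pg_ml > VEGF_CUTOFF_PG_ML and protein_g_dl > PROTEIN_CUTOFF_G_DL


def sensitivity_and_accuracy(samples):
    """samples: list of (vegf_pg_ml, protein_g_dl, is_malignant) tuples."""
    tp = fn = tn = fp = 0
    for vegf, protein, is_malignant in samples:
        predicted = predicts_mpe(vegf, protein)
        if is_malignant and predicted:
            tp += 1
        elif is_malignant:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(samples) if samples else 0.0
    return sensitivity, accuracy


if __name__ == "__main__":
    # Made-up measurements for illustration only.
    toy_samples = [
        (300.0, 4.0, True),
        (250.0, 3.0, True),
        (100.0, 3.5, False),
        (400.0, 3.6, False),
    ]
    print(sensitivity_and_accuracy(toy_samples))
```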

  16. Expression and Secretion of Endostar Protein by Escherichia Coli: Optimization of Culture Conditions Using the Response Surface Methodology.

    PubMed

    Mohajeri, Abbas; Abdolalizadeh, Jalal; Pilehvar-Soltanahmadi, Younes; Kiafar, Farhad; Zarghami, Nosratollah

    2016-10-01

    Endostar, a specific drug for the treatment of non-small cell lung cancer, is produced using an Escherichia coli expression system. Plackett-Burman design (PBD) and response surface methodology (RSM) are statistical tools for experimental design and optimization of biotechnological processes. This investigation aimed to predict and develop the optimal culture condition and its components for expression and secretion of endostar into the culture medium of E. coli. The synthetic endostar coding sequence was fused with PhoA signal peptide. The nine factors involved in the production of recombinant protein (postinduction temperature, cell density, rotation speed, postinduction time, concentration of glycerol, IPTG, peptone, glycine, and triton X-100) were evaluated using PBD. Four significant factors were selected based on PBD results for optimizing culture condition using RSM. Endostar was purified using cation exchange chromatography and size exclusion chromatography. The maximum level of endostar was obtained under the following condition: 13.57-h postinduction time, 0.76 % glycine, 0.7 % triton X-100, and 4.87 % glycerol. The predicted levels of endostar were significantly correlated with experimental levels (R² = 0.982, P = 0.00). The obtained results indicated that PBD and RSM are effective tools for optimization of culture condition and its components for endostar production in E. coli. The most important factors in the enhancement of the protein production are glycerol, glycine, and postinduction time.

  17. Identification of a melanosomal membrane protein encoded by the pink-eyed dilution (type II oculocutaneous albinism) gene.

    PubMed Central

    Rosemblat, S; Durham-Pierre, D; Gardner, J M; Nakatsu, Y; Brilliant, M H; Orlow, S J

    1994-01-01

    The pink-eyed dilution (p) locus in the mouse is critical to melanogenesis; mutations in the homologous locus in humans, P, are a cause of type II oculocutaneous albinism. Although a cDNA encoded by the p gene has recently been identified, nothing is known about the protein product of this gene. To characterize the protein encoded by the p gene, we performed immunoblot analysis of extracts of melanocytes cultured from wild-type mice with an antiserum from rabbits immunized with a peptide corresponding to amino acids 285-298 of the predicted protein product of the murine p gene. This antiserum recognized a 110-kDa protein. The protein was absent from extracts of melanocytes cultured from mice with two mutations (pcp and p) in which transcripts of the p gene are absent or greatly reduced. Introduction of the cDNA for the p gene into pcp melanocytes by electroporation resulted in expression of the 3.3-kb mRNA and the 110-kDa protein. Upon subcellular fractionation of cultured melanocytes, the 110-kDa protein was found to be present in melanosomes but absent from the vesicular fraction; phase separation performed with the nonionic detergent Triton X-114 confirmed the predicted hydrophobic nature of the protein. These results demonstrate that the p gene encodes a 110-kDa integral melanosomal membrane protein and establish a framework by which mutations at this locus, which diminish pigmentation, can be analyzed at the cellular and biochemical levels. PMID:7991586

  18. Toward the accurate first-principles prediction of ionization equilibria in proteins.

    PubMed

    Khandogin, Jana; Brooks, Charles L

    2006-08-08

    The calculation of pK(a) values for ionizable sites in proteins has been traditionally based on numerical solutions of the Poisson-Boltzmann equation carried out using a high-resolution protein structure. In this paper, we present a method based on continuous constant pH molecular dynamics (CPHMD) simulations, which allows the first-principles description of protein ionization equilibria. Our method utilizes an improved generalized Born implicit solvent model with an approximate Debye-Hückel screening function to account for salt effects and the replica-exchange (REX) protocol for enhanced conformational and protonation state sampling. The accuracy and robustness of the present method are demonstrated by 1 ns REX-CPHMD titration simulations of 10 proteins, which exhibit anomalously large pK(a) shifts for the carboxylate and histidine side chains. The experimental pK(a) values of these proteins are reliably reproduced with a root-mean-square error ranging from 0.6 unit for proteins containing few buried ionizable side chains to 1.0 unit or slightly higher for proteins containing ionizable side chains deeply buried in the core and experiencing strong charge-charge interactions. This unprecedented level of agreement with experimental benchmarks for the de novo calculation of pK(a) values suggests that the CPHMD method is maturing into a practical tool for the quantitative prediction of protein ionization equilibria, and this, in turn, opens a door to atomistic simulations of a wide variety of pH-coupled conformational phenomena in biological macromolecules such as protein folding or misfolding, aggregation, ligand binding, membrane interaction, and catalysis.

  19. TRIM16 inhibits proliferation and migration through regulation of interferon beta 1 in melanoma cells

    PubMed Central

    Sutton, Selina K.; Koach, Jessica; Tan, Owen; Liu, Bing; Carter, Daniel R.; Wilmott, James S.; Yosufi, Benafsha; Haydu, Lauren E.; Mann, Graham J.; Thompson, John F.; Long, Georgina V.; Liu, Tao; McArthur, Grant; Zhang, Xu Dong; Scolyer, Richard A.; Cheung, Belamy B.; Marshall, Glenn M.

    2014-01-01

    High basal or induced expression of the tripartite motif protein, TRIM16, leads to reduced cell growth and migration of neuroblastoma and skin squamous cell carcinoma cells. However, the role of TRIM16 in melanoma is currently unknown. TRIM16 protein levels were markedly reduced in human melanoma cell lines, compared with normal human epidermal melanocytes, due to both DNA methylation and reduced protein stability. TRIM16 knockdown strongly increased cell migration in normal human epidermal melanocytes, while TRIM16 overexpression reduced cell migration and proliferation of melanoma cells in an interferon beta 1 (IFNβ1)-dependent manner. Chromatin immunoprecipitation assays revealed that TRIM16 directly bound the IFNβ1 gene promoter. Low-level TRIM16 expression in 91 melanoma patient samples strongly correlated with lymph node metastasis and predicted poor patient prognosis in a separate cohort of 170 melanoma patients with lymph node metastasis. The BRAF inhibitor, vemurafenib, increased TRIM16 protein levels in melanoma cells in vitro, and induced growth arrest in BRAF-mutant melanoma cells in a TRIM16-dependent manner. High levels of TRIM16 in melanoma tissues from patients treated with vemurafenib correlated with clinical response. Our data demonstrate, for the first time, that TRIM16 is a marker of cell migration and metastasis, and a novel treatment target in melanoma. PMID:25333256

  20. A cross sectional study of two independent cohorts identifies serum biomarkers for facioscapulohumeral muscular dystrophy (FSHD)

    PubMed Central

    Petek, Lisa M.; Rickard, Amanda M.; Budech, Christopher; Poliachik, Sandra L.; Shaw, Dennis; Ferguson, Mark R.; Tawil, Rabi; Friedman, Seth D.; Miller, Daniel G.

    2016-01-01

    Measuring the severity and progression of facioscapulohumeral muscular dystrophy (FSHD) is particularly challenging because muscle weakness progresses over long periods of time and can be sporadic. Biomarkers are essential for measuring disease burden and testing treatment strategies. We utilized the sensitive, specific, high-throughput SomaLogic proteomics platform of 1129 proteins to identify proteins with levels that correlate with FSHD severity in a cross-sectional study of two independent cohorts. We discovered biomarkers that correlate with clinical severity and disease burden measured by magnetic resonance imaging. Sixty-eight proteins in the Rochester cohort (n = 48) and 51 proteins in the Seattle cohort (n = 30) had significantly different levels in FSHD-affected individuals when compared with controls (p-value ≤ .005). A subset of these varied by at least 1.5 fold and four biomarkers were significantly elevated in both cohorts. Levels of creatine kinase MM and MB isoforms, carbonic anhydrase III, and troponin I type 2 reliably predicted the disease state and correlated with disease severity. Other novel biomarkers were also discovered that may reveal mechanisms of disease pathology. Assessing the levels of these biomarkers during clinical trials may add significance to other measures of quantifying disease progression or regression. PMID:27185459
