Sample records for accurate protein identification

  1. Mass spectrometry-based protein identification with accurate statistical significance assignment.

    PubMed

    Alves, Gelio; Yu, Yi-Kuo

    2015-03-01

    Assigning statistical significance accurately has become increasingly important as metadata of many types, often assembled in hierarchies, are constructed and combined for further biological analyses. Statistical inaccuracy of metadata at any level may propagate to downstream analyses, undermining the validity of scientific conclusions thus drawn. From the perspective of mass spectrometry-based proteomics, even though accurate statistics for peptide identification can now be achieved, accurate protein-level statistics remain challenging. We have constructed a protein ID method that combines the peptide evidence for a candidate protein based on a rigorous formula derived earlier; in this formula the database P-value of every peptide is weighted, prior to the final combination, according to the number of proteins it maps to. We have also shown that this protein ID method provides accurate protein-level E-values, eliminating the need for empirical post-processing methods for type-I error control. Using a known protein mixture, we find that this protein ID method, when combined with the Sorić formula, yields accurate values for the proportion of false discoveries. In terms of retrieval efficacy, the results from our method are comparable with other methods tested. The source code, implemented in C++ on a Linux system, is available for download at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbp/qmbp_ms/RAId/RAId_Linux_64Bit. Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.
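
As a rough illustration of why accurate E-values matter for the proportion of false discoveries (PFD) mentioned in the record above: if E-values are accurate, roughly `cutoff` false identifications are expected among those with an E-value at or below `cutoff`, so dividing by the number of accepted identifications gives a simple estimate. This is a minimal sketch under that assumption. It is not the RAId implementation and may not match the exact form of the Sorić formula the authors use; the function name and the sample values are hypothetical.

```python
def estimate_pfd(e_values, cutoff):
    """Estimate the proportion of false discoveries at an E-value cutoff.

    Assumption (not taken from the record): E-values are accurate, so about
    `cutoff` false identifications are expected among those accepted.
    """
    accepted = sum(1 for e in e_values if e <= cutoff)
    if accepted == 0:
        return 0.0
    # A proportion cannot exceed 1, so the ratio is capped.
    return min(1.0, cutoff / accepted)


# Hypothetical protein-level E-values, for illustration only.
example_e_values = [1e-6, 1e-4, 0.01, 0.2, 0.5, 3.0]
print(estimate_pfd(example_e_values, cutoff=0.5))
```

Five of the six hypothetical identifications pass the cutoff of 0.5, so the estimate is 0.5 divided by 5.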

  2. Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation

    PubMed Central

    Ojha, Sunil; Watson, Douglas S.; Bomar, Martha G.; Galande, Amit K.; Shearer, Alexander G.

    2013-01-01

    The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass spectrometry-based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation. PMID:24386392

  3. Rapid identification of sequences for orphan enzymes to power accurate protein annotation.

    PubMed

    Ramkissoon, Kevin R; Miller, Jennifer K; Ojha, Sunil; Watson, Douglas S; Bomar, Martha G; Galande, Amit K; Shearer, Alexander G

    2013-01-01

    The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology: "orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass spectrometry-based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.

  4. Carbene footprinting accurately maps binding sites in protein-ligand and protein-protein interactions

    NASA Astrophysics Data System (ADS)

    Manzi, Lucio; Barrow, Andrew S.; Scott, Daniel; Layfield, Robert; Wright, Timothy G.; Moses, John E.; Oldham, Neil J.

    2016-11-01

    Specific interactions between proteins and their binding partners are fundamental to life processes. The ability to detect protein complexes, and map their sites of binding, is crucial to understanding basic biology at the molecular level. Methods that employ sensitive analytical techniques such as mass spectrometry have the potential to provide valuable insights with very little material and on short time scales. Here we present a differential protein footprinting technique employing an efficient photo-activated probe for use with mass spectrometry. Using this methodology the location of a carbohydrate substrate was accurately mapped to the binding cleft of lysozyme, and in a more complex example, the interactions between a 100 kDa, multi-domain deubiquitinating enzyme, USP5 and a diubiquitin substrate were located to different functional domains. The much improved properties of this probe make carbene footprinting a viable method for rapid and accurate identification of protein binding sites utilizing benign, near-UV photoactivation.

  5. HIPPI: highly accurate protein family classification with ensembles of HMMs.

    PubMed

    Nguyen, Nam-Phuong; Nute, Michael; Mirarab, Siavash; Warnow, Tandy

    2016-11-11

    Given a new biological sequence, detecting membership in a known family is a basic step in many bioinformatics analyses, with applications to protein structure and function prediction and metagenomic taxon identification and abundance profiling, among others. Yet family identification of sequences that are distantly related to sequences in public databases or that are fragmentary remains one of the more difficult analytical problems in bioinformatics. We present a new technique for family identification called HIPPI (Hierarchical Profile Hidden Markov Models for Protein family Identification). HIPPI uses a novel technique to represent a multiple sequence alignment for a given protein family or superfamily by an ensemble of profile hidden Markov models computed using HMMER. An evaluation of HIPPI on the Pfam database shows that HIPPI has better overall precision and recall than blastp, HMMER, and pipelines based on HHsearch, and maintains good accuracy even for fragmentary query sequences and for protein families with low average pairwise sequence identity, both conditions where other methods degrade in accuracy. HIPPI provides accurate protein family identification and is robust to difficult model conditions. Our results, combined with observations from previous studies, show that ensembles of profile Hidden Markov models can better represent multiple sequence alignments than a single profile Hidden Markov model, and thus can improve downstream analyses for various bioinformatic tasks. Further research is needed to determine the best practices for building the ensemble of profile Hidden Markov models. HIPPI is available on GitHub at https://github.com/smirarab/sepp .

  6. The identification of complete domains within protein sequences using accurate E-values for semi-global alignment

    PubMed Central

    Kann, Maricel G.; Sheetlin, Sergey L.; Park, Yonil; Bryant, Stephen H.; Spouge, John L.

    2007-01-01

    The sequencing of complete genomes has created a pressing need for automated annotation of gene function. Because domains are the basic units of protein function and evolution, a gene can be annotated from a domain database by aligning domains to the corresponding protein sequence. Ideally, complete domains are aligned to protein subsequences, in a ‘semi-global alignment’. Local alignment, which aligns pieces of domains to subsequences, is common in high-throughput annotation applications, however. It is a mature technique, with the heuristics and accurate E-values required for screening large databases and evaluating the screening results. Hidden Markov models (HMMs) provide an alternative theoretical framework for semi-global alignment, but their use is limited because they lack heuristic acceleration and accurate E-values. Our new tool, GLOBAL, overcomes some limitations of previous semi-global HMMs: it has accurate E-values and the possibility of the heuristic acceleration required for high-throughput applications. Moreover, according to a standard of truth based on protein structure, two semi-global HMM alignment tools (GLOBAL and HMMer) had comparable performance in identifying complete domains, but distinctly outperformed two tools based on local alignment. When searching for complete protein domains, therefore, GLOBAL avoids disadvantages commonly associated with HMMs, yet maintains their superior retrieval performance. PMID:17596268

  7. An OGA-Resistant Probe Allows Specific Visualization and Accurate Identification of O-GlcNAc-Modified Proteins in Cells.

    PubMed

    Li, Jing; Wang, Jiajia; Wen, Liuqing; Zhu, He; Li, Shanshan; Huang, Kenneth; Jiang, Kuan; Li, Xu; Ma, Cheng; Qu, Jingyao; Parameswaran, Aishwarya; Song, Jing; Zhao, Wei; Wang, Peng George

    2016-11-18

    O-linked β-N-acetyl-glucosamine (O-GlcNAc) is an essential and ubiquitous post-translational modification present in nuclear and cytoplasmic proteins of multicellular eukaryotes. Metabolic chemical probes such as GlcNAc or GalNAc analogues bearing ketone or azide handles, in conjunction with bioorthogonal reactions, provide a powerful approach for detecting and identifying this modification. However, these chemical probes either enter multiple glycosylation pathways or have low labeling efficiency. Therefore, selective and potent probes are needed to assess this modification. We report here the development of a novel probe, 1,3,6-tri-O-acetyl-2-azidoacetamido-2,4-dideoxy-d-glucopyranose (Ac₃4dGlcNAz), that can be processed by the GalNAc salvage pathway and transferred by O-GlcNAc transferase (OGT) to O-GlcNAc proteins. Due to the absence of a hydroxyl group at C4, this probe is less incorporated into α/β 4-GlcNAc or GalNAc containing glycoconjugates. Furthermore, the O-4dGlcNAz modification was resistant to the hydrolysis of O-GlcNAcase (OGA), which greatly enhanced the efficiency of incorporation for O-GlcNAcylation. Combined with a click reaction, Ac₃4dGlcNAz allowed the selective visualization of O-GlcNAc in cells and accurate identification of O-GlcNAc-modified proteins with LC-MS/MS. This probe represents a more potent and selective tool in tracking, capturing, and identifying O-GlcNAc-modified proteins in cells and cell lysates.

  8. FragBag, an accurate representation of protein structure, retrieves structural neighbors from the entire PDB quickly and accurately.

    PubMed

    Budowski-Tal, Inbal; Nov, Yuval; Kolodny, Rachel

    2010-02-23

    Fast identification of protein structures that are similar to a specified query structure in the entire Protein Data Bank (PDB) is fundamental in structure and function prediction. We present FragBag, an ultrafast and accurate method for comparing protein structures. We describe a protein structure by the collection of its overlapping short contiguous backbone segments, and discretize this set using a library of fragments. Then, we succinctly represent the protein as a "bag-of-fragments" (a vector that counts the number of occurrences of each fragment) and measure the similarity between two structures by the similarity between their vectors. Our representation has two additional benefits: (i) it can be used to construct an inverted index, for implementing a fast structural search engine of the entire PDB, and (ii) one can specify a structure as a collection of substructures, without combining them into a single structure; this is valuable for structure prediction, when there are reliable predictions only of parts of the protein. We use receiver operating characteristic curve analysis to quantify the success of FragBag in identifying neighbor candidate sets in a dataset of over 2,900 structures. The gold standard is the set of neighbors found by six state-of-the-art structural aligners. Our best FragBag library finds more accurate candidate sets than the three other filter methods: the SGM, PRIDE, and a method by Zotenko et al. More interestingly, FragBag performs on a par with the computationally expensive, yet highly trusted structural aligners STRUCTAL and CE.
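
The bag-of-fragments idea in the record above can be sketched in a few lines. This sketch assumes each backbone segment has already been matched to its closest library fragment and is given as an integer index, and it uses cosine similarity as one plausible vector similarity; the record does not say which measure FragBag uses. All names and inputs here are hypothetical.

```python
import math
from collections import Counter


def bag_of_fragments(fragment_ids, library_size):
    """Count how often each library fragment occurs in one structure."""
    counts = Counter(fragment_ids)
    return [counts.get(i, 0) for i in range(library_size)]


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical fragment assignments for two structures, library of 4 fragments.
structure_a = [0, 2, 2, 3, 1, 2]
structure_b = [2, 2, 0, 3, 3, 1]
bag_a = bag_of_fragments(structure_a, 4)
bag_b = bag_of_fragments(structure_b, 4)
print(bag_a, bag_b, cosine_similarity(bag_a, bag_b))
```

Because each structure is reduced to a fixed-length count vector, comparing two structures costs time proportional to the library size rather than to a full structural alignment.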

  9. Accurate Identification of Cancerlectins through Hybrid Machine Learning Technology.

    PubMed

    Zhang, Jieru; Ju, Ying; Lu, Huijuan; Xuan, Ping; Zou, Quan

    2016-01-01

    Cancerlectins are cancer-related proteins that function as lectins. They have been identified through computational identification techniques, but these techniques have sometimes failed to identify proteins because of sequence diversity among the cancerlectins. Advanced machine learning identification methods, such as support vector machine and basic sequence features (n-gram), have also been used to identify cancerlectins. In this study, various protein fingerprint features and advanced classifiers, including ensemble learning techniques, were utilized to identify this group of proteins. We improved the prediction accuracy of the original feature extraction methods and classification algorithms by more than 10% on average. Our work provides a basis for the computational identification of cancerlectins and reveals the power of hybrid machine learning techniques in computational proteomics.

  10. [An accurate identification method for Chinese materia medica--systematic identification of Chinese materia medica].

    PubMed

    Wang, Xue-Yong; Liao, Cai-Li; Liu, Si-Qi; Liu, Chun-Sheng; Shao, Ai-Juan; Huang, Lu-Qi

    2013-05-01

    This paper puts forward a more accurate method for identifying Chinese materia medica (CMM), the systematic identification of Chinese materia medica (SICMM), which may resolve difficulties that arise when CMM is identified by ordinary traditional means. The concepts, mechanisms and methods of SICMM are systematically introduced, and its feasibility is demonstrated by experiments. The establishment of SICMM will address problems in CMM identification not only at the level of phenotypic characters such as morphology, microstructure and chemical constituents, but also in further discovering the evolution and classification of species, subspecies and populations of medicinal plants. The establishment of SICMM will advance the development of CMM identification and open a broader field of study.

  11. Covariation of Peptide Abundances Accurately Reflects Protein Concentration Differences

    PubMed Central

    Pirmoradian, Mohammad

    2017-01-01

    Most implementations of mass spectrometry-based proteomics involve enzymatic digestion of proteins, expanding the analysis to multiple proteolytic peptides for each protein. Currently, there is no consensus on how to summarize peptides' abundances to protein concentrations, and such efforts are complicated by the fact that error control normally is applied to the identification process, and does not directly control errors linking peptide abundance measures to protein concentration. Peptides resulting from suboptimal digestion or being partially modified are not representative of the protein concentration. Without a mechanism to remove such unrepresentative peptides, their abundance adversely impacts the estimation of their protein's concentration. Here, we present a relative quantification approach, Diffacto, that applies factor analysis to extract the covariation of peptides' abundances. The method enables a weighted geometrical average summarization and automatic elimination of incoherent peptides. We demonstrate, based on a set of controlled label-free experiments using standard mixtures of proteins, that the covariation structure extracted by the factor analysis accurately reflects protein concentrations. In the 1% peptide-spectrum match-level FDR data set, as many as 11% of the peptides have abundance differences incoherent with the other peptides attributed to the same protein. If not controlled, such contradictory peptide abundances have a severe impact on protein quantifications. When adding the quantities of each protein's three most abundant peptides, we note as many as 14% of the proteins being estimated as having a negative correlation with their actual concentration differences between samples. Diffacto reduced the amount of such obviously incorrectly quantified proteins to 1.6%. Furthermore, by analyzing clinical data sets from two breast cancer studies, our method revealed the persistent proteomic signatures linked to three subtypes of breast cancer.
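
The "weighted geometrical average summarization" named in the record above can be illustrated with a small sketch. In Diffacto the weights come from the factor analysis; here they are simply supplied by hand, and a weight of zero stands in for a peptide eliminated as incoherent. The function name, the weights and the abundances are all hypothetical.

```python
import math


def weighted_geometric_mean(abundances, weights):
    """Summarize positive peptide abundances as a weighted geometric mean.

    Peptides with zero weight are dropped, standing in for the
    elimination of incoherent peptides.
    """
    pairs = [(a, w) for a, w in zip(abundances, weights) if w > 0]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("at least one peptide needs a positive weight")
    # Average in log space, then transform back.
    log_mean = sum(w * math.log(a) for a, w in pairs) / total_weight
    return math.exp(log_mean)


# Hypothetical abundances of three peptides from one protein; the third
# is treated as incoherent and given zero weight.
print(weighted_geometric_mean([100.0, 400.0, 5.0], [1.0, 1.0, 0.0]))
```

With equal weights on the first two peptides the result is the square root of 100 × 400, about 200, and the outlying third peptide no longer drags the estimate down.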

  12. Identification of Extracellular Segments by Mass Spectrometry Improves Topology Prediction of Transmembrane Proteins.

    PubMed

    Langó, Tamás; Róna, Gergely; Hunyadi-Gulyás, Éva; Turiák, Lilla; Varga, Julia; Dobson, László; Várady, György; Drahos, László; Vértessy, Beáta G; Medzihradszky, Katalin F; Szakács, Gergely; Tusnády, Gábor E

    2017-02-13

    Transmembrane proteins play a crucial role in signaling, ion transport, nutrient uptake, as well as in maintaining the dynamic equilibrium between the internal and external environment of cells. Despite their important biological functions and abundance, less than 2% of all determined structures are transmembrane proteins. Given the persisting technical difficulties associated with high resolution structure determination of transmembrane proteins, additional methods, including computational and experimental techniques, remain vital in promoting our understanding of their topologies, 3D structures, functions and interactions. Here we report a method for the high-throughput determination of extracellular segments of transmembrane proteins based on the identification of surface labeled and biotin captured peptide fragments by LC/MS/MS. We show that reliable identification of extracellular protein segments increases the accuracy and reliability of existing topology prediction algorithms. Using the experimental topology data as constraints, our improved prediction tool provides accurate and reliable topology models for hundreds of human transmembrane proteins.

  13. Accurate mass measurements and their appropriate use for reliable analyte identification.

    PubMed

    Godfrey, A Ruth; Brenton, A Gareth

    2012-09-01

    Accurate mass instrumentation is becoming increasingly available to non-expert users. These data can be misused, particularly for analyte identification. Current best practice in assigning potential elemental formulae for reliable analyte identification has been described with modern informatic approaches to analyte elucidation, including chemometric characterisation, data processing and searching using facilities such as the Chemical Abstracts Service (CAS) Registry and Chemspider.

  14. Identification of Microorganisms by High Resolution Tandem Mass Spectrometry with Accurate Statistical Significance

    NASA Astrophysics Data System (ADS)

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Suffredini, Anthony F.; Sacks, David B.; Yu, Yi-Kuo

    2016-02-01

    Correct and rapid identification of microorganisms is the key to the success of many important applications in health and safety, including, but not limited to, infection treatment, food safety, and biodefense. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is challenging correct microbial identification because of the large number of choices present. To properly disentangle candidate microbes, one needs to go beyond apparent morphology or simple 'fingerprinting'; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptidome profiles of microbes to better separate them and by designing an analysis method that yields accurate statistical significance. Here, we present an analysis pipeline that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using MS/MS data of 81 samples, each composed of a single known microorganism, that the proposed pipeline can correctly identify microorganisms at least at the genus and species levels. We have also shown that the proposed pipeline computes accurate statistical significances, i.e., E-values for identified peptides and unified E-values for identified microorganisms. The proposed analysis pipeline has been implemented in MiCId, a freely available software for Microorganism Classification and Identification. MiCId is available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.

  15. Lung Sliding Identification Is Less Accurate in the Left Hemithorax.

    PubMed

    Piette, Eric; Daoust, Raoul; Lambert, Jean; Denault, André

    2017-02-01

    The aim of our study was to compare the accuracy of lung sliding identification for the left and right hemithoraxes, using prerecorded short US sequences, in a group of physicians with mixed clinical and US training. A total of 140 US sequences of a complete respiratory cycle were recorded in the operating room. Each sequence was divided in two, yielding 140 sequences of present lung sliding and 140 sequences of absent lung sliding. Of these 280 sequences, 40 were randomly repeated to assess intraobserver variability, for a total of 320 sequences. Descriptive data, the mean accuracy of each participant, as well as the rate of correct answers for each of the original 280 sequences were tabulated and compared for different subgroups of clinical and US training. A video with examples of present and absent lung sliding and a lung pulse was shown before testing. Two sessions were planned to facilitate the participation of 75 clinicians. In the first group, the rate of accurate lung sliding identification was lower in the left hemithorax than in the right (67.0% [interquartile range (IQR), 43.0-83.0] versus 80.0% [IQR, 57.0-95.0]; P < .001). In the second group, the rate of accurate lung sliding identification was also lower in the left hemithorax than in the right (76.3% [IQR, 42.9-90.9] versus 88.7% [IQR, 63.1-96.9]; P = .001). Mean accuracy rates were 67.5% (95% confidence interval, 65.7-69.4) in the first group and 73.1% (95% confidence interval, 70.7-75.5) in the second (P < .001). Lung sliding identification seems less accurate in the left hemithorax when using a short US examination. This study was done on recorded US sequences and should be repeated in a live clinical situation to confirm our results. © 2016 by the American Institute of Ultrasound in Medicine.

  16. [Progress in the spectral library based protein identification strategy].

    PubMed

    Yu, Derui; Ma, Jie; Xie, Zengyan; Bai, Mingze; Zhu, Yunping; Shu, Kunxian

    2018-04-25

    Mass spectrometry (MS) data have grown exponentially with the rapid development of MS-based proteomics, and developing fast, accurate and reproducible methods to identify peptides and proteins remains a great challenge. Spectral library searching has become a mature strategy for tandem mass spectra-based protein identification in proteomics: it searches experimental spectra against a collection of confidently identified, previously observed MS/MS spectra, and makes full use of peak abundances in the spectrum, peaks from non-canonical fragment ions, and other features. This review provides an overview of the implementation of the spectral library search strategy, covers its two key steps, spectral library construction and spectral library searching, comprehensively, and discusses the progress and challenges of the strategy.

  17. Identification of DNA-binding proteins using multi-features fusion and binary firefly optimization algorithm.

    PubMed

    Zhang, Jian; Gao, Bo; Chai, Haiting; Ma, Zhiqiang; Yang, Guifu

    2016-08-26

    DNA-binding proteins (DBPs) play fundamental roles in many biological processes. Therefore, the development of effective computational tools for identifying DBPs is becoming highly desirable. In this study, we proposed an accurate method for the prediction of DBPs. Firstly, we focused on the challenge of improving DBP prediction accuracy with information solely from the sequence. Secondly, we used multiple informative features to encode the protein. These features included evolutionary conservation profile, secondary structure motifs, and physicochemical properties. Thirdly, we introduced a novel improved Binary Firefly Algorithm (BFA) to remove redundant or noisy features as well as select optimal parameters for the classifier. The experimental results of our predictor on two benchmark datasets outperformed many state-of-the-art predictors, which revealed the effectiveness of our method. The promising prediction performance on a newly compiled independent testing dataset from PDB and a large-scale dataset from UniProt demonstrated the good generalization ability of our method. In addition, the BFA forged in this research would be of great potential in practical applications in optimization fields, especially in feature selection problems. A highly accurate method was proposed for the identification of DBPs. A user-friendly web-server named iDbP (identification of DNA-binding Proteins) was constructed and provided for academic use.

  18. Applications of graph theory in protein structure identification

    PubMed Central

    2011-01-01

    There is a growing interest in the identification of proteins on the proteome-wide scale. Among the various protein structure identification methods, graph-theoretic methods are particularly effective. Owing to their lower cost, higher effectiveness and other advantages, they have attracted growing attention from researchers. Specifically, graph-theoretic methods have been widely used in homology identification, side-chain cluster identification, peptide sequencing and so on. This paper reviews several methods for solving protein structure identification problems using graph theory. We mainly introduce classical methods and mathematical models including homology modeling based on clique finding, identification of side-chain clusters in protein structures upon graph spectrum, and de novo peptide sequencing via tandem mass spectrometry using the spectrum graph model. In addition, concluding remarks and future priorities of each method are given. PMID:22165974

  19. Accurate seismic phase identification and arrival time picking of glacial icequakes

    NASA Astrophysics Data System (ADS)

    Jones, G. A.; Doyle, S. H.; Dow, C.; Kulessa, B.; Hubbard, A.

    2010-12-01

    A catastrophic lake drainage event was monitored continuously using an array of six 4.5 Hz three-component geophones in the Russell Glacier catchment, Western Greenland. Many thousands of events and arrival time phases (e.g., P- or S-wave) were recorded, often with events occurring simultaneously but at different locations. In addition, different styles of seismic events were identified from 'classical' tectonic earthquakes to tremors usually observed in volcanic regions. The presence of such a diverse and large dataset provides insight into the complex system of lake drainage. One of the most fundamental steps in seismology is the accurate identification of a seismic event and its associated arrival times. However, the collection of such a large and complex dataset makes the manual identification of a seismic event and picking of the arrival time phases time consuming with variable results. To overcome the issues of consistency and manpower, a number of different methods have been developed including short-term and long-term averages, spectrograms, wavelets, polarisation analyses, higher order statistics and auto-regressive techniques. Here we propose an automated procedure which establishes the phase type and accurately determines the arrival times. The procedure combines a number of different automated methods to achieve this, and is applied to the recently acquired lake drainage data. Accurate identification of events and their arrival time phases are the first steps in gaining a greater understanding of the extent of the deformation and the mechanism of such drainage events. A good knowledge of the propagation pathway of lake drainage meltwater through a glacier will have significant consequences for interpretation of glacial and ice sheet dynamics.
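
Of the detection methods listed in the record above, the short-term/long-term average (STA/LTA) ratio is the simplest to sketch. The version below is a generic, unoptimized trigger on signal energy with trailing windows. It is not the authors' combined procedure, and the window lengths, threshold and synthetic trace are hypothetical choices for illustration.

```python
def sta_lta_ratio(samples, short_len, long_len):
    """STA/LTA energy ratio for every index at which the long window fits.

    Both windows are trailing windows that end at the same sample.
    """
    energy = [s * s for s in samples]
    ratios = []
    for end in range(long_len, len(energy) + 1):
        sta = sum(energy[end - short_len:end]) / short_len
        lta = sum(energy[end - long_len:end]) / long_len
        ratios.append(sta / lta if lta > 0 else 0.0)
    return ratios


def first_trigger(samples, short_len, long_len, threshold):
    """Index of the first sample whose STA/LTA ratio exceeds the threshold."""
    ratios = sta_lta_ratio(samples, short_len, long_len)
    for offset, ratio in enumerate(ratios):
        if ratio > threshold:
            # ratios[0] corresponds to the window ending at index long_len - 1.
            return offset + long_len - 1
    return None


# Synthetic trace: 20 samples of low-amplitude noise, then a short burst.
trace = [0.1, -0.1] * 10 + [2.0, -2.0, 2.0, -2.0]
print(first_trigger(trace, short_len=2, long_len=10, threshold=3.0))
```

The short window reacts quickly to a sudden rise in energy while the long window tracks the background level, so the ratio jumps at the onset of the burst.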

  20. A scalable and accurate method for classifying protein-ligand binding geometries using a MapReduce approach.

    PubMed

    Estrada, T; Zhang, B; Cicotti, P; Armen, R S; Taufer, M

    2012-07-01

    We present a scalable and accurate method for classifying protein-ligand binding geometries in molecular docking. Our method is a three-step process: the first step encodes the geometry of a three-dimensional (3D) ligand conformation into a single 3D point in the space; the second step builds an octree by assigning an octant identifier to every single point in the space under consideration; and the third step performs an octree-based clustering on the reduced conformation space and identifies the most dense octant. We adapt our method for MapReduce and implement it in Hadoop. The load-balancing, fault-tolerance, and scalability in MapReduce allow screening of very large conformation spaces not approachable with traditional clustering methods. We analyze results for docking trials for 23 protein-ligand complexes for HIV protease, 21 protein-ligand complexes for Trypsin, and 12 protein-ligand complexes for P38alpha kinase. We also analyze cross docking trials for 24 ligands, each docking into 24 protein conformations of the HIV protease, and receptor ensemble docking trials for 24 ligands, each docking in a pool of HIV protease receptors. Our method demonstrates significant improvement over energy-only scoring for the accurate identification of native ligand geometries in all these docking assessments. The advantages of our clustering approach make it attractive for complex applications in real-world drug design efforts. We demonstrate that our method is particularly useful for clustering docking results using a minimal ensemble of representative protein conformational states (receptor ensemble docking), which is now a common strategy to address protein flexibility in molecular docking. Copyright © 2012 Elsevier Ltd. All rights reserved.
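
The second and third steps described in the record above (assigning an octant identifier to each 3D point, then finding the densest octant) can be sketched on a single machine. This sketch assumes each ligand conformation has already been reduced to one 3D point, and it uses an in-memory counter where the authors use MapReduce on Hadoop. The function names, bounding box, depth and points are hypothetical.

```python
from collections import Counter


def octant_id(point, lower, upper, depth):
    """Return one octant digit (0-7) per octree level as a string."""
    lo = list(lower)
    hi = list(upper)
    digits = []
    for _ in range(depth):
        digit = 0
        for axis in range(3):
            mid = (lo[axis] + hi[axis]) / 2.0
            if point[axis] >= mid:
                digit |= 1 << axis  # upper half along this axis
                lo[axis] = mid
            else:
                hi[axis] = mid
        digits.append(str(digit))
    return "".join(digits)


def densest_octant(points, lower, upper, depth):
    """Return (octant identifier, count) for the most populated octant."""
    # Computing the identifier plays the role of the map step and the
    # per-identifier count plays the role of the reduce step.
    counts = Counter(octant_id(p, lower, upper, depth) for p in points)
    return counts.most_common(1)[0]


# Hypothetical encoded conformations inside the unit cube.
points = [
    (0.10, 0.10, 0.10),
    (0.20, 0.10, 0.20),
    (0.15, 0.20, 0.05),
    (0.90, 0.90, 0.90),
    (0.60, 0.10, 0.10),
]
print(densest_octant(points, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), depth=2))
```

Because each point's identifier depends only on that point, the identifiers can be computed independently and grouped by key, which is the property that makes the approach a natural fit for MapReduce.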

  1. A statistical method for assessing peptide identification confidence in accurate mass and time tag proteomics

    PubMed Central

    Stanley, Jeffrey R.; Adkins, Joshua N.; Slysz, Gordon W.; Monroe, Matthew E.; Purvine, Samuel O.; Karpievitch, Yuliya V.; Anderson, Gordon A.; Smith, Richard D.; Dabney, Alan R.

    2011-01-01

    Current algorithms for quantifying peptide identification confidence in the accurate mass and time (AMT) tag approach assume that the AMT tags themselves have been correctly identified. However, there is uncertainty in the identification of AMT tags, as this is based on matching LC-MS/MS fragmentation spectra to peptide sequences. In this paper, we incorporate confidence measures for the AMT tag identifications into the calculation of probabilities for correct matches to an AMT tag database, resulting in a more accurate overall measure of identification confidence for the AMT tag approach. The method is referred to as Statistical Tools for AMT tag Confidence (STAC). STAC additionally provides a Uniqueness Probability (UP) to help distinguish between multiple matches to an AMT tag and a method to calculate an overall false discovery rate (FDR). STAC is freely available for download as both a command line and a Windows graphical application. PMID:21692516

  2. IFPTarget: A Customized Virtual Target Identification Method Based on Protein-Ligand Interaction Fingerprinting Analyses.

    PubMed

    Li, Guo-Bo; Yu, Zhu-Jun; Liu, Sha; Huang, Lu-Yi; Yang, Ling-Ling; Lohans, Christopher T; Yang, Sheng-Yong

    2017-07-24

    Small-molecule target identification is an important and challenging task for chemical biology and drug discovery. Structure-based virtual target identification has been widely used, which infers and prioritizes potential protein targets for the molecule of interest (MOI) principally via a scoring function. However, current "universal" scoring functions may not always accurately identify targets to which the MOI binds from the retrieved target database, in part due to a lack of consideration of the important binding features for an individual target. Here, we present IFPTarget, a customized virtual target identification method, which uses an interaction fingerprinting (IFP) method for target-specific interaction analyses and a comprehensive index (Cvalue) for target ranking. Evaluation results indicate that the IFP method enables substantially improved binding pose prediction, and Cvalue has an excellent performance in target ranking for the test set. When applied to screen against our established target library that contains 11,863 protein structures covering 2842 unique targets, IFPTarget could retrieve known targets within the top-ranked list and identified new potential targets for chemically diverse drugs. IFPTarget prediction led to the identification of the metallo-β-lactamase VIM-2 as a target for quercetin as validated by enzymatic inhibition assays. This study provides a new in silico target identification tool and will aid future efforts to develop new target-customized methods for target identification.

  3. Specific identification of Bacillus anthracis strains

    NASA Astrophysics Data System (ADS)

    Krishnamurthy, Thaiya; Deshpande, Samir; Hewel, Johannes; Liu, Hongbin; Wick, Charles H.; Yates, John R., III

    2007-01-01

    Accurate identification of human pathogens is the initial vital step in treating civilian terrorism victims and military personnel afflicted in biological threat situations. We have applied a powerful multi-dimensional protein identification technology (MudPIT) along with newly generated software termed Profiler to identify the sequences of specific proteins observed for a few strains of Bacillus anthracis, a human pathogen. Profiler was created to initially screen the MudPIT data of B. anthracis strains and establish the observed proteins specific for its strains. A database was also generated using Profiler containing marker proteins of B. anthracis and its strains, which in turn could be used for detecting the organism and its corresponding strains in samples. Analysis of the unknowns by our methodology, combining MudPIT and Profiler, led to the accurate identification of the anthracis strains present in samples. Thus, a new approach for the identification of B. anthracis strains in unknown samples, based on the molecular mass and sequences of marker proteins, has been ascertained.

  4. Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry

    NASA Astrophysics Data System (ADS)

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y.; Drake, Steven K.; Gucek, Marjan; Sacks, David B.; Yu, Yi-Kuo

    2018-06-01

    Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.

  5. Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry.

    PubMed

    Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Sacks, David B; Yu, Yi-Kuo

    2018-06-05

    Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is complicating correct microbial identification even in a simple sample due to the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 MS/MS publicly available data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow MiCId, a freely available software for Microorganism Classification and Identification, is available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.

  6. Accurate in silico identification of protein succinylation sites using an iterative semi-supervised learning technique.

    PubMed

    Zhao, Xiaowei; Ning, Qiao; Chai, Haiting; Ma, Zhiqiang

    2015-06-07

    As a widespread type of protein post-translational modification (PTM), succinylation plays an important role in regulating protein conformation, function and physicochemical properties. Compared with labor-intensive and time-consuming experimental approaches, computational prediction of succinylation sites is highly desirable because it is convenient and fast. Numerous computational models have been developed to identify PTM sites through various types of two-class machine learning algorithms. These methods require both positive and negative samples for training. However, designating the negative samples of PTMs is difficult and, if not done properly, can dramatically affect the performance of computational models. In this work, we therefore implemented the first application of the positive samples only learning (PSoL) algorithm, a special class of semi-supervised machine learning that uses positive samples and unlabeled samples to train the model, to the succinylation site prediction problem. We also proposed a novel computational predictor of succinylation sites called SucPred (succinylation site predictor) that uses multiple feature encoding schemes. Promising results were obtained by the SucPred predictor with an accuracy of 88.65% using 5-fold cross validation on the training dataset and an accuracy of 84.40% on the independent testing dataset, which demonstrated that the positive samples only learning algorithm presented here was particularly useful for identification of protein succinylation sites. In addition, the positive samples only learning algorithm can easily be applied to build predictors for other types of PTM sites. A web server for predicting succinylation sites was developed and was freely accessible at http://59.73.198.144:8088/SucPred/. Copyright © 2015 Elsevier Ltd. All rights reserved.

  7. Protein Identification Using Top-Down Spectra

    PubMed Central

    Liu, Xiaowen; Sirotkin, Yakov; Shen, Yufeng; Anderson, Gordon; Tsai, Yihsuan S.; Ting, Ying S.; Goodlett, David R.; Smith, Richard D.; Bafna, Vineet; Pevzner, Pavel A.

    2012-01-01

    In the last two years, because of advances in protein separation and mass spectrometry, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for database search of top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. We also propose a method for evaluating statistical significance of top-down protein identifications and further benchmark various software tools on two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium. We demonstrate that MS-Align+ significantly increases the number of identified spectra as compared with MASCOT and OMSSA on both data sets. Although MS-Align+ and ProSightPC have similar performance on the Salmonella typhimurium data set, MS-Align+ outperforms ProSightPC on the (more complex) Saccharomyces cerevisiae data set. PMID:22027200

  8. Mechanism for accurate, protein-assisted DNA annealing by Deinococcus radiodurans DdrB

    PubMed Central

    Sugiman-Marangos, Seiji N.; Weiss, Yoni M.; Junop, Murray S.

    2016-01-01

    Accurate pairing of DNA strands is essential for repair of DNA double-strand breaks (DSBs). How cells achieve accurate annealing when large regions of single-strand DNA are unpaired has remained unclear despite many efforts focused on understanding proteins, which mediate this process. Here we report the crystal structure of a single-strand annealing protein [DdrB (DNA damage response B)] in complex with a partially annealed DNA intermediate to 2.2 Å. This structure and supporting biochemical data reveal a mechanism for accurate annealing involving DdrB-mediated proofreading of strand complementarity. DdrB promotes high-fidelity annealing by constraining specific bases from unauthorized association and only releases annealed duplex when bound strands are fully complementary. To our knowledge, this mechanism provides the first understanding for how cells achieve accurate, protein-assisted strand annealing under biological conditions that would otherwise favor misannealing. PMID:27044084

  9. Stable isotope, site-specific mass tagging for protein identification

    DOEpatents

    Chen, Xian

    2006-10-24

    Proteolytic peptide mass mapping as measured by mass spectrometry provides an important method for the identification of proteins, which are usually identified by matching the measured and calculated m/z values of the proteolytic peptides. A unique identification is, however, heavily dependent upon the mass accuracy and sequence coverage of the fragment ions generated by peptide ionization. The present invention describes a method for increasing the specificity, accuracy and efficiency of the assignments of particular proteolytic peptides and consequent protein identification, by the incorporation of selected amino acid residue(s) enriched with stable isotope(s) into the protein sequence without the need for ultrahigh instrumental accuracy. Selected amino acid(s) are labeled with 13C/15N/2H and incorporated into proteins in a sequence-specific manner during cell culturing. Each of these labeled amino acids carries a defined mass change encoded in its monoisotopic distribution pattern. Through their characteristic patterns, the peptides with mass tag(s) can then be readily distinguished from other peptides in mass spectra. The present method of identifying unique proteins can also be extended to protein complexes and will significantly increase data search specificity, efficiency and accuracy for protein identifications.
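    The matching of measured and calculated m/z values, and the defined mass change carried by a labeled residue, can be illustrated with a short calculation. The sketch below is an example under stated assumptions rather than the patented method: the residue masses are standard monoisotopic values, while the function names, the 0.01 m/z tolerance, and the choice of a lysine carrying six 13C atoms (a shift of about 6.0201 Da) are hypothetical.

```python
# Standard monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide for the free termini
PROTON = 1.007276   # mass of each charge carrier


def peptide_mz(seq, charge=1, label_shift=None):
    """Calculated m/z of a peptide; label_shift maps residue -> added mass."""
    label_shift = label_shift or {}
    mass = WATER + sum(MONO[aa] + label_shift.get(aa, 0.0) for aa in seq)
    return (mass + charge * PROTON) / charge


def match_peaks(measured, calculated, tol=0.01):
    """Return the measured m/z values that lie within tol of a calculated one."""
    return [m for m in measured if any(abs(m - c) <= tol for c in calculated)]


# Hypothetical label: lysine with six 13C atoms (6 x 1.0033548 Da).
HEAVY_LYS = {"K": 6.020129}
```

    Calling `peptide_mz(seq)` and `peptide_mz(seq, label_shift=HEAVY_LYS)` for the same sequence gives a pair of values separated by the label mass divided by the charge, which is the kind of characteristic spacing the abstract relies on to tell tagged peptides apart.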

  10. A systematic identification of species-specific protein succinylation sites using joint element features information.

    PubMed

    Hasan, Md Mehedi; Khatun, Mst Shamima; Mollah, Md Nurul Haque; Yong, Cao; Guo, Dianjing

    2017-01-01

    Lysine succinylation, an important type of protein posttranslational modification, plays significant roles in many cellular processes. Accurate identification of succinylation sites can facilitate our understanding about the molecular mechanism and potential roles of lysine succinylation. However, even in well-studied systems, a majority of the succinylation sites remain undetected because the traditional experimental approaches to succinylation site identification are often costly, time-consuming, and laborious. In silico approach, on the other hand, is potentially an alternative strategy to predict succinylation substrates. In this paper, a novel computational predictor SuccinSite2.0 was developed for predicting generic and species-specific protein succinylation sites. This predictor takes the composition of profile-based amino acid and orthogonal binary features, which were used to train a random forest classifier. We demonstrated that the proposed SuccinSite2.0 predictor outperformed other currently existing implementations on a complementarily independent dataset. Furthermore, the important features that make visible contributions to species-specific and cross-species-specific prediction of protein succinylation site were analyzed. The proposed predictor is anticipated to be a useful computational resource for lysine succinylation site prediction. The integrated species-specific online tool of SuccinSite2.0 is publicly accessible.

  11. Identification of Protein-Protein Interactions with Glutathione-S-Transferase (GST) Fusion Proteins.

    PubMed

    Einarson, Margret B; Pugacheva, Elena N; Orlinick, Jason R

    2007-08-01

    INTRODUCTION: Glutathione-S-transferase (GST) fusion proteins have had a wide range of applications since their introduction as tools for synthesis of recombinant proteins in bacteria. GST was originally selected as a fusion moiety because of several desirable properties. First and foremost, when expressed in bacteria alone, or as a fusion, GST is not sequestered in inclusion bodies (in contrast to previous fusion protein systems). Second, GST can be affinity-purified without denaturation because it binds to immobilized glutathione, which provides the basis for simple purification. Consequently, GST fusion proteins are routinely used for antibody generation and purification, protein-protein interaction studies, and biochemical analysis. This article describes the use of GST fusion proteins as probes for the identification of protein-protein interactions.

  12. Comprehensive Identification of Proteins from MALDI Imaging

    PubMed Central

    Maier, Stefan K.; Hahne, Hannes; Gholami, Amin Moghaddas; Balluff, Benjamin; Meding, Stephan; Schoene, Cédrik; Walch, Axel K.; Kuster, Bernhard

    2013-01-01

    Matrix-assisted laser desorption/ionization imaging mass spectrometry (MALDI IMS) is a powerful tool for the visualization of proteins in tissues and has demonstrated considerable diagnostic and prognostic value. One main challenge is that the molecular identity of such potential biomarkers mostly remains unknown. We introduce a generic method that removes this issue by systematically identifying the proteins embedded in the MALDI matrix using a combination of bottom-up and top-down proteomics. The analyses of ten human tissues led to the identification of 1400 abundant and soluble proteins constituting the set of proteins detectable by MALDI IMS, including >90% of all IMS biomarkers reported in the literature. Top-down analysis of the matrix proteome identified 124 mostly N- and C-terminally fragmented proteins, indicating considerable protein processing activity in tissues. All protein identification data from this study as well as the IMS literature has been deposited into MaTisse, a new publicly available database, which we anticipate will become a valuable resource for the IMS community. PMID:23782541

  13. Identification of secreted bacterial proteins by noncanonical amino acid tagging

    PubMed Central

    Mahdavi, Alborz; Szychowski, Janek; Ngo, John T.; Sweredoski, Michael J.; Graham, Robert L. J.; Hess, Sonja; Schneewind, Olaf; Mazmanian, Sarkis K.; Tirrell, David A.

    2014-01-01

    Pathogenic microbes have evolved complex secretion systems to deliver virulence factors into host cells. Identification of these factors is critical for understanding the infection process. We report a powerful and versatile approach to the selective labeling and identification of secreted pathogen proteins. Selective labeling of microbial proteins is accomplished via translational incorporation of azidonorleucine (Anl), a methionine surrogate that requires a mutant form of the methionyl-tRNA synthetase for activation. Secreted pathogen proteins containing Anl can be tagged by azide-alkyne cycloaddition and enriched by affinity purification. Application of the method to analysis of the type III secretion system of the human pathogen Yersinia enterocolitica enabled efficient identification of secreted proteins, identification of distinct secretion profiles for intracellular and extracellular bacteria, and determination of the order of substrate injection into host cells. This approach should be widely useful for the identification of virulence factors in microbial pathogens and the development of potential new targets for antimicrobial therapy. PMID:24347637

  14. Identification of a putative protein profile associating with tamoxifen therapy resistance in breast cancer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Umar, Arzu; Kang, Hyuk; Timmermans, A. M.

    2009-06-01

    Tamoxifen-resistance is a major cause of death in patients with recurrent breast cancer. Current clinical factors can correctly predict therapy response in only half of the treated patients. Identification of proteins that associate with tamoxifen-resistance is a first step towards better response prediction and tailored treatment of patients. In the present study we intended to identify putative protein biomarkers indicative of tamoxifen therapy-resistance in breast cancer, using nanoLC coupled with FTICR MS. Comparative proteome analysis was performed on ~5,500 pooled tumor cells (corresponding to ~550 ng protein lysate/analysis) obtained through laser capture microdissection (LCM) from two independently processed data sets (n=24 and n=27) containing both tamoxifen therapy-sensitive and therapy-resistant tumors. Peptides and proteins were identified by matching mass and elution time of newly acquired LC-MS features to information in previously generated accurate mass and time tag (AMT) reference databases.

  15. Accurate protein structure modeling using sparse NMR data and homologous structure information.

    PubMed

    Thompson, James M; Sgourakis, Nikolaos G; Liu, Gaohua; Rossi, Paolo; Tang, Yuefeng; Mills, Jeffrey L; Szyperski, Thomas; Montelione, Gaetano T; Baker, David

    2012-06-19

    While information from homologous structures plays a central role in X-ray structure determination by molecular replacement, such information is rarely used in NMR structure determination because it can be incorrect, both locally and globally, when evolutionary relationships are inferred incorrectly or there has been considerable evolutionary structural divergence. Here we describe a method that allows robust modeling of protein structures of up to 225 residues by combining (1)H(N), (13)C, and (15)N backbone and (13)Cβ chemical shift data, distance restraints derived from homologous structures, and a physically realistic all-atom energy function. Accurate models are distinguished from inaccurate models generated using incorrect sequence alignments by requiring that (i) the all-atom energies of models generated using the restraints are lower than models generated in unrestrained calculations and (ii) the low-energy structures converge to within 2.0 Å backbone rmsd over 75% of the protein. Benchmark calculations on known structures and blind targets show that the method can accurately model protein structures, even with very remote homology information, to a backbone rmsd of 1.2-1.9 Å relative to the conventional determined NMR ensembles and of 0.9-1.6 Å relative to X-ray structures for well-defined regions of the protein structures. This approach facilitates the accurate modeling of protein structures using backbone chemical shift data without need for side-chain resonance assignments and extensive analysis of NOESY cross-peak assignments.

  16. Automatic Poisson peak harvesting for high throughput protein identification.

    PubMed

    Breen, E J; Hopwood, F G; Williams, K L; Wilkins, M R

    2000-06-01

    High throughput identification of proteins by peptide mass fingerprinting requires an efficient means of picking peaks from mass spectra. Here, we report the development of a peak harvester to automatically pick monoisotopic peaks from spectra generated on matrix-assisted laser desorption/ionisation time of flight (MALDI-TOF) mass spectrometers. The peak harvester uses advanced mathematical morphology and watershed algorithms to first process spectra to stick representations. Subsequently, Poisson modelling is applied to determine which peak in an isotopically resolved group represents the monoisotopic mass of a peptide. We illustrate the features of the peak harvester with mass spectra of standard peptides, digests of gel-separated bovine serum albumin, and with Escherichia coli proteins prepared by two-dimensional polyacrylamide gel electrophoresis. In all cases, the peak harvester proved effective in its ability to pick similar monoisotopic peaks as an experienced human operator, and also proved effective in the identification of monoisotopic masses in cases where isotopic distributions of peptides were overlapping. The peak harvester can be operated in an interactive mode, or can be completely automated and linked through to peptide mass fingerprinting protein identification tools to achieve high throughput automated protein identification.
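    The Poisson modelling step can be illustrated with a small sketch. It is not the published peak harvester: it assumes the spectrum has already been reduced to a stick representation of one isotopically resolved group with roughly 1 Da spacing, it assumes a linear relation between peptide mass and the Poisson mean with coefficients chosen for illustration, and it does not handle overlapping distributions. All function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing k extra heavy isotopes."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def isotope_mean(mass):
    """Assumed linear mass-to-mean relation (illustrative coefficients)."""
    return max(1e-6, 0.000594 * mass - 0.03091)


def pick_monoisotopic(masses, intensities):
    """Return the index of the peak best explained as the monoisotopic peak.

    Each peak is tried as the start of a Poisson-shaped isotope pattern; the
    pattern is scaled to the data by least squares, and the candidate with
    the smallest squared residual over the whole group wins.
    """
    n = len(intensities)
    best_index, best_error = None, float("inf")
    for start in range(n):
        lam = isotope_mean(masses[start])
        expected = [0.0] * start + [poisson_pmf(k, lam) for k in range(n - start)]
        scale = (sum(e * o for e, o in zip(expected, intensities))
                 / sum(e * e for e in expected))
        error = sum((o - scale * e) ** 2 for e, o in zip(expected, intensities))
        if error < best_error:
            best_index, best_error = start, error
    return best_index
```

    For a peptide near 2000 Da the modelled pattern peaks at the second isotope, so a rule that simply takes the tallest stick would misassign the monoisotopic mass; fitting the whole pattern is one way to avoid that.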

  17. Method optimization for proteomic analysis of soybean leaf: Improvements in identification of new and low-abundance proteins

    PubMed Central

    Mesquita, Rosilene Oliveira; de Almeida Soares, Eduardo; de Barros, Everaldo Gonçalves; Loureiro, Marcelo Ehlers

    2012-01-01

    The most critical step in any proteomic study is protein extraction and sample preparation. Better solubilization increases the separation and resolution of gels, allowing identification of a higher number of proteins and more accurate quantitation of differences in gene expression. Despite the existence of published results for the optimization of proteomic analyses of soybean seeds, no comparable data are available for proteomic studies of soybean leaf tissue. In this work we have tested the effects of modification of a TCA-acetone method on the resolution of 2-DE gels of leaves and roots of soybean. Better focusing was obtained when both mercaptoethanol and dithiothreitol were used in the extraction buffer simultaneously. Increasing the number of washes of TCA precipitated protein with acetone, using a final wash with 80% ethanol and using sonication to resuspend the pellet increased the number of detected proteins as well as the resolution of the 2-DE gels. Using this approach we have constructed a soybean protein map. The major group of identified proteins corresponded to genes of unknown function. The second and third most abundant groups of proteins were composed of photosynthesis and metabolism related genes. The resulting protocol improved protein solubility and gel resolution allowing the identification of 122 soybean leaf proteins, 72 of which were not detected in other published soybean leaf 2-DE gel datasets, including a transcription factor and several signaling proteins. PMID:22802721

  18. Basophile: Accurate Fragment Charge State Prediction Improves Peptide Identification Rates

    DOE PAGES

    Wang, Dong; Dasari, Surendra; Chambers, Matthew C.; ...

    2013-03-07

    In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of all charges below that of the precursor ion. More accurate models, based on fragmentation simulation, are too computationally intensive for on-the-fly use in database search algorithms. We have created an ordinal-regression-based model called Basophile that takes fragment size and basic residue distribution into account when determining the charge retention during CID/higher-energy collision induced dissociation (HCD) of charged peptides. This model improves the accuracy of predictions by reducing the number of unnecessary fragments that are routinely predicted for highly-charged precursors. Basophile increased the identification rates by 26% (on average) over the Naive model, when analyzing triply-charged precursors from ion trap data. Basophile achieves simplicity and speed by solving the prediction problem with an ordinal regression equation, which can be incorporated into any database search software for shotgun proteomic identification.
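    The Naive model described above is simple enough to write out, which also shows why it predicts so many fragments for highly charged precursors. The sketch below is an illustration only: it enumerates b and y ions (an assumption, since the abstract does not name ion types), uses standard monoisotopic residue masses, and allows charge 1 for singly charged precursors so that the list is never empty. It does not reproduce Basophile's ordinal regression.

```python
# Standard monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def naive_fragments(seq, precursor_charge):
    """List (ion name, charge, m/z) for every bond and every allowed charge.

    Every peptide bond is cleaved, and each fragment is predicted at all
    charges below the precursor charge (at least 1, an assumption for
    singly charged precursors).
    """
    max_charge = max(1, precursor_charge - 1)
    ions = []
    for i in range(1, len(seq)):
        b_mass = sum(MONO[aa] for aa in seq[:i])          # N-terminal piece
        y_mass = sum(MONO[aa] for aa in seq[i:]) + WATER  # C-terminal piece
        for z in range(1, max_charge + 1):
            ions.append((f"b{i}", z, (b_mass + z * PROTON) / z))
            ions.append((f"y{len(seq) - i}", z, (y_mass + z * PROTON) / z))
    return ions
```

    For a peptide of n residues and a precursor of charge z >= 2, this yields 2(n - 1)(z - 1) predicted ions, so the list grows linearly with precursor charge regardless of where the basic residues sit. That is the overprediction a charge-retention model would aim to prune.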

  19. Seed storage proteins as a system for teaching protein identification by mass spectrometry in biochemistry laboratory.

    PubMed

    Wilson, Karl A; Tan-Wilson, Anna

    2013-01-01

    Mass spectrometry (MS) has become an important tool in studying biological systems. One application is the identification of proteins and peptides by the matching of peptide and peptide fragment masses to the sequences of proteins in protein sequence databases. Often prior protein separation of complex protein mixtures by 2D-PAGE is needed, requiring more time and expertise than instructors of large laboratory classes can devote. We have developed an experimental module for our Biochemistry Laboratory course that engages students in MS-based protein identification following protein separation by one-dimensional SDS-PAGE, a technique that is usually taught in this type of course. The module is based on soybean seed storage proteins, a relatively simple mixture of proteins present in high levels in the seed, allowing the identification of the main protein bands by MS/MS and in some cases, even by peptide mass fingerprinting. Students can identify their protein bands using software available on the Internet, and are challenged to deduce post-translational modifications that have occurred upon germination. A collection of mass spectral data and tutorials that can be used as a stand-alone computer-based laboratory module were also assembled. Copyright © 2013 International Union of Biochemistry and Molecular Biology, Inc.

  20. Identification of Modules in Protein-Protein Interaction Networks

    NASA Astrophysics Data System (ADS)

    Erten, Sinan; Koyutürk, Mehmet

    In biological systems, most processes are carried out through orchestration of multiple interacting molecules. These interactions are often abstracted using network models. A key feature of cellular networks is their modularity, which contributes significantly to the robustness, as well as adaptability of biological systems. Therefore, modularization of cellular networks is likely to be useful in obtaining insights into the working principles of cellular systems, as well as building tractable models of cellular organization and dynamics. A common, high-throughput source of data on molecular interactions is in the form of physical interactions between proteins, which are organized into protein-protein interaction (PPI) networks. This chapter provides an overview on identification and analysis of functional modules in PPI networks, which has been an active area of research in the last decade.

  1. General M13 phage display: M13 phage display in identification and characterization of protein-protein interactions.

    PubMed

    Hertveldt, Kirsten; Beliën, Tim; Volckaert, Guido

    2009-01-01

    In M13 phage display, proteins and peptides are exposed on one of the surface proteins of filamentous phage particles and become accessible to affinity enrichment against a bait of interest. We describe the construction of fragmented whole genome and gene fragment phage display libraries and interaction selection by panning. This strategy allows the identification and characterization of interacting proteins on a genomic scale by screening the fragmented "proteome" against protein baits. Gene fragment libraries allow a more in depth characterization of the protein-protein interaction site by identification of the protein region involved in the interaction.

  2. Improved method for rapid and accurate isolation and identification of Streptococcus mutans and Streptococcus sobrinus from human plaque samples.

    PubMed

    Villhauer, Alissa L; Lynch, David J; Drake, David R

    2017-08-01

    Mutans streptococci (MS), specifically Streptococcus mutans (SM) and Streptococcus sobrinus (SS), are bacterial species frequently targeted for investigation due to their role in the etiology of dental caries. Differentiation of S. mutans and S. sobrinus is an essential part of exploring the role of these organisms in disease progression and the impact of the presence of either/both on a subject's caries experience. Of vital importance to the study of these organisms is an identification protocol that allows us to distinguish between the two species in an easy, accurate, and timely manner. While conducting a 5-year birth cohort study in a Northern Plains American Indian tribe, the need for a more rapid procedure for isolating and identifying high volumes of MS was recognized. We report here on the development of an accurate and rapid method for MS identification. Accuracy, ease of use, and material and time requirements for morphological differentiation on selective agar, biochemical tests, and various combinations of PCR primers were compared. The final protocol included preliminary identification based on colony morphology followed by PCR confirmation of species identification using primers targeting regions of the glucosyltransferase (gtf) genes of SM and SS. This method of isolation and identification was found to be highly accurate, more rapid than the previous methodology used, and easily learned. It resulted in more efficient use of both time and material resources. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. Accurate prediction of protein-protein interactions by integrating potential evolutionary information embedded in PSSM profile and discriminative vector machine classifier.

    PubMed

    Li, Zheng-Wei; You, Zhu-Hong; Chen, Xing; Li, Li-Ping; Huang, De-Shuang; Yan, Gui-Ying; Nie, Ru; Huang, Yu-An

    2017-04-04

    Identification of protein-protein interactions (PPIs) is of critical importance for deciphering the underlying mechanisms of almost all biological processes of the cell and providing great insight into the study of human disease. Although much effort has been devoted to identifying PPIs from various organisms, existing high-throughput biological techniques are time-consuming, expensive, and have high false positive and false negative rates. Thus there is an urgent need to develop in silico methods to predict PPIs efficiently and accurately in this post genomic era. In this article, we report a novel computational model combining our newly developed discriminative vector machine classifier (DVM) and an improved Weber local descriptor (IWLD) for the prediction of PPIs. Two components, differential excitation and orientation, are exploited to build evolutionary features for each protein sequence. The main characteristic of the proposed method lies in introducing an effective feature descriptor, IWLD, which can capture highly discriminative evolutionary information from position-specific scoring matrices (PSSM) of protein data, and employing the powerful and robust DVM classifier. When applying the proposed method to Yeast and H. pylori data sets, we obtained excellent prediction accuracies as high as 96.52% and 91.80%, respectively, which are significantly better than the previous methods. Extensive experiments were then performed for predicting cross-species PPIs and the predictive results were also promising. To further validate the performance of the proposed method, we compared it with the state-of-the-art support vector machine (SVM) classifier on a Human data set. The experimental results obtained indicate that our method is highly effective for PPI prediction and can be taken as a supplementary tool for future proteomics research.

  4. Evaluation of protein spectra cluster analysis for Streptococcus spp. identification from various swine clinical samples.

    PubMed

    Matajira, Carlos E C; Moreno, Luisa Z; Gomes, Vasco T M; Silva, Ana Paula S; Mesquita, Renan E; Doto, Daniela S; Calderaro, Franco F; de Souza, Fernando N; Christ, Ana Paula G; Sato, Maria Inês Z; Moreno, Andrea M

    2017-03-01

    Traditional microbiological methods enable genus-level identification of Streptococcus spp. isolates. However, as the species of this genus show broad phenotypic variation, species-level identification or even differentiation within the genus is difficult. Herein we report the evaluation of protein spectra cluster analysis for the identification of Streptococcus species associated with disease in swine by means of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). A total of 250 S. suis-like isolates obtained from pigs with clinical signs of encephalitis, arthritis, pneumonia, metritis, and urinary or septicemic infection were studied. The isolates came from pigs in different Brazilian states from 2001 to 2014. The MALDI-TOF MS analysis identified 86% (215 of 250) as S. suis and 14% (35 of 250) as S. alactolyticus, S. dysgalactiae, S. gallinaceus, S. gallolyticus, S. gordonii, S. henryi, S. hyointestinalis, S. hyovaginalis, S. mitis, S. oralis, S. pluranimalium, and S. sanguinis. The MALDI-TOF MS identification was confirmed in 99.2% of the isolates by 16S rDNA sequencing, with MALDI-TOF MS misidentifying 2 S. pluranimalium as S. hyovaginalis. Isolates were also tested by a biochemical automated system that correctly identified all isolates of 8 of the 10 species in the database. Neither the isolates of the 3 species not in the database (S. gallinaceus, S. henryi, and S. hyovaginalis) nor the isolates of 2 species that were in the database (S. oralis and S. pluranimalium) could be identified. The topology of the protein spectra cluster analysis appears to sustain the species phylogenetic similarities, further supporting identification by MALDI-TOF MS examination as a rapid and accurate alternative to 16S rDNA sequencing.

  5. Identification of ubiquitin/ubiquitin-like protein modification from tandem mass spectra with various PTMs

    PubMed Central

    2011-01-01

    Background Various solutions have been introduced for the identification of post-translational modification (PTM) from tandem mass spectrometry (MS/MS) in the proteomics field but the identification of peptide modifiers, such as Ubiquitin (Ub) and ubiquitin-like proteins (Ubls), is still a challenge. The fragmentation of a peptide modifier produces complex shifted ion mass patterns in combination with other PTMs, which makes it difficult to identify and locate the PTMs on a protein sequence. Currently, most PTM identification methods either do not consider the complex fragmentation of peptide modifiers or deal with it separately from the other PTMs. Results We developed an advanced PTM identification method that inspects possible ion patterns of the most known peptide modifiers as well as other known biological and chemical PTMs to reach more comprehensive and accurate conclusions. The proposed method searches all detectable mass differences of measured peaks from their theoretical values and the mass differences within mass tolerance range are grouped as mass shift classes. The most probable locations of multiple PTMs including peptide modifiers can be determined by evaluating all possible scenarios generated by the combination of the qualified mass shift classes. The proposed method showed excellent performance in the test with simulated spectra having various PTMs including peptide modifiers and in the comparison with recently developed methods such as QuickMod and SUMmOn. In the analysis of HUPO Brain Proteome Project (BPP) datasets, the proposed method could find the ubiquitin modification sites that were not identified by other conventional methods. Conclusions This work presents a novel method for identifying both peptide modifiers that generate complex fragmentation patterns and PTMs that are not fragmented during the fragmentation process from tandem mass spectra. PMID:22373085
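
    As an illustration of the grouping step described in the Results above, the following Python sketch collects observed-minus-theoretical mass differences and groups those lying within a tolerance of each other into mass shift classes. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the gap-based grouping rule, and the default tolerance and max_shift values are all hypothetical.

```python
from typing import List, Sequence, Tuple


def mass_shift_classes(
    observed: Sequence[float],
    theoretical: Sequence[float],
    tolerance: float = 0.05,   # assumed grouping tolerance in Da (hypothetical)
    max_shift: float = 500.0,  # assumed largest shift considered, in Da (hypothetical)
) -> List[List[float]]:
    """Group observed-minus-theoretical mass differences into shift classes."""
    # All pairwise differences between measured peaks and theoretical values.
    diffs = sorted(
        o - t
        for o in observed
        for t in theoretical
        if abs(o - t) <= max_shift
    )
    classes: List[List[float]] = []
    for d in diffs:
        # Extend the current class while the gap to its last member is small;
        # otherwise start a new class.
        if classes and d - classes[-1][-1] <= tolerance:
            classes[-1].append(d)
        else:
            classes.append([d])
    return classes


def recurring_shifts(
    classes: List[List[float]], min_count: int = 2
) -> List[Tuple[float, int]]:
    """Return (mean shift, support) for classes seen at least min_count times."""
    return [(sum(c) / len(c), len(c)) for c in classes if len(c) >= min_count]


if __name__ == "__main__":
    theoretical = [100.0, 200.0, 300.0]   # toy theoretical fragment masses
    observed = [100.0, 214.02, 314.0]     # toy measured peaks
    for mean_shift, count in recurring_shifts(
        mass_shift_classes(observed, theoretical)
    ):
        print(f"shift of about {mean_shift:.2f} Da supported by {count} peak pairs")
```

    In this toy input more than one shift recurs, including a coincidental one; telling such candidates apart would be the job of the later scenario-evaluation step that the abstract describes, which this sketch does not attempt.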

  6. MM-ISMSA: An Ultrafast and Accurate Scoring Function for Protein-Protein Docking.

    PubMed

    Klett, Javier; Núñez-Salgado, Alfonso; Dos Santos, Helena G; Cortés-Cabrera, Álvaro; Perona, Almudena; Gil-Redondo, Rubén; Abia, David; Gago, Federico; Morreale, Antonio

    2012-09-11

    An ultrafast and accurate scoring function for protein-protein docking is presented. It includes (1) a molecular mechanics (MM) part based on a 12-6 Lennard-Jones potential; (2) an electrostatic component based on an implicit solvent model (ISM) with individual desolvation penalties for each partner in the protein-protein complex plus a hydrogen bonding term; and (3) a surface area (SA) contribution to account for the loss of water contacts upon protein-protein complex formation. The accuracy and performance of the scoring function, termed MM-ISMSA, have been assessed by (1) comparing the total binding energies, the electrostatic term, and its components (charge-charge and individual desolvation energies), as well as the per residue contributions, to results obtained with well-established methods such as APBSA or MM-PB(GB)SA for a set of 1242 decoy protein-protein complexes and (2) testing its ability to recognize the docking solution closest to the experimental structure as that providing the most favorable total binding energy. For this purpose, a test set consisting of 15 protein-protein complexes with known 3D structure mixed with 10 decoys for each complex was used. The correlation between the values afforded by MM-ISMSA and those from the other methods is quite remarkable (r² ∼ 0.9), and only 0.2-5.0 s (depending on the number of residues) are spent on a single calculation including an all vs all pairwise energy decomposition. On the other hand, MM-ISMSA correctly identifies the best docking solution as that closest to the experimental structure in 80% of the cases. Finally, MM-ISMSA can process molecular dynamics trajectories and reports the results as averaged values with their standard deviations. MM-ISMSA has been implemented as a plugin to the widely used molecular graphics program PyMOL, although it can also be executed in command-line mode. MM-ISMSA is distributed free of charge to nonprofit organizations.
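
    For reference, the 12-6 Lennard-Jones term named in component (1) has the textbook form E(r) = 4ε[(σ/r)^12 − (σ/r)^6], which is zero at r = σ and reaches its minimum of −ε at r = 2^(1/6)σ. The Python sketch below evaluates that form for a single atom pair. The abstract does not give the parameters or combination rules used by MM-ISMSA, so the function name and the example values of epsilon and sigma are illustrative assumptions only.

```python
def lennard_jones_12_6(r: float, epsilon: float, sigma: float) -> float:
    """Textbook 12-6 Lennard-Jones energy for one atom pair at distance r.

    epsilon is the well depth and sigma the distance at which the energy
    crosses zero; both are placeholders here, not MM-ISMSA parameters.
    """
    if r <= 0.0:
        raise ValueError("distance must be positive")
    sr6 = (sigma / r) ** 6
    # Repulsive r^-12 term minus attractive r^-6 term.
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


if __name__ == "__main__":
    eps, sig = 0.15, 3.4                  # hypothetical values for illustration
    r_min = 2.0 ** (1.0 / 6.0) * sig      # location of the energy minimum
    print("E(sigma) =", lennard_jones_12_6(sig, eps, sig))
    print("E(r_min) =", lennard_jones_12_6(r_min, eps, sig))
```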

  7. Choosing an Optimal Database for Protein Identification from Tandem Mass Spectrometry Data.

    PubMed

    Kumar, Dhirendra; Yadav, Amit Kumar; Dash, Debasis

    2017-01-01

    Database searching is the preferred method for protein identification from digital spectra of mass to charge ratios (m/z) detected for protein samples through mass spectrometers. The search database is one of the major influencing factors in discovering proteins present in the sample and thus in deriving biological conclusions. In most cases the choice of search database is arbitrary. Here we describe common search databases used in proteomic studies and their impact on the final list of identified proteins. We also elaborate upon factors like composition and size of the search database that can influence the protein identification process. In conclusion, we suggest that the choice of the database depends on the type of inferences to be derived from proteomics data. However, making additional efforts to build a compact and concise database for a targeted question should generally be rewarding in achieving confident protein identifications.

  8. Direct Maximization of Protein Identifications from Tandem Mass Spectra

    PubMed Central

    Spivak, Marina; Weston, Jason; Tomazela, Daniela; MacCoss, Michael J.; Noble, William Stafford

    2012-01-01

    The goal of many shotgun proteomics experiments is to determine the protein complement of a complex biological mixture. For many mixtures, most methodological approaches fall significantly short of this goal. Existing solutions to this problem typically subdivide the task into two stages: first identifying a collection of peptides with a low false discovery rate and then inferring from the peptides a corresponding set of proteins. In contrast, we formulate the protein identification problem as a single optimization problem, which we solve using machine learning methods. This approach is motivated by the observation that the peptide and protein level tasks are cooperative, and the solution to each can be improved by using information about the solution to the other. The resulting algorithm directly controls the relevant error rate, can incorporate a wide variety of evidence and, for complex samples, provides 18–34% more protein identifications than the current state of the art approaches. PMID:22052992

  9. Identification of Conserved Water Sites in Protein Structures for Drug Design.

    PubMed

    Jukič, Marko; Konc, Janez; Gobec, Stanislav; Janežič, Dušanka

    2017-12-26

    Identification of conserved waters in protein structures is a challenging task with applications in molecular docking and protein stability prediction. As an alternative to computationally demanding simulations of proteins in water, experimental cocrystallized waters in the Protein Data Bank (PDB) in combination with a local structure alignment algorithm can be used for reliable prediction of conserved water sites. We developed the ProBiS H2O approach based on the previously developed ProBiS algorithm, which enables identification of conserved water sites in proteins using experimental protein structures from the PDB or a set of custom protein structures available to the user. With a protein structure, a binding site, or an individual water molecule as a query, ProBiS H2O collects similar proteins from the PDB and performs local or binding site-specific superimpositions of the query structure with similar proteins using the ProBiS algorithm. It collects the experimental water molecules from the similar proteins and transposes them to the query protein. Transposed waters are clustered by their mutual proximity, which enables identification of discrete sites in the query protein with high water conservation. ProBiS H2O is a robust and fast new approach that uses existing experimental structural data to identify conserved water sites on the interfaces of protein complexes, for example protein-small molecule interfaces, and elsewhere on the protein structures. It has been successfully validated in several reported proteins in which conserved water molecules were found to play an important role in ligand binding with applications in drug design.
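
    The clustering of transposed waters by mutual proximity can be sketched as follows. This Python example uses single-linkage grouping with a fixed distance cutoff, which is an assumption made for illustration; the abstract does not specify the clustering algorithm or cutoff used by ProBiS H2O, and the function name, the 1.0 Å default, and the toy coordinates are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def cluster_waters(coords: Sequence[Point], cutoff: float = 1.0) -> List[List[int]]:
    """Group water positions whose chain of neighbours lies within cutoff.

    Returns clusters as lists of indices into coords, largest cluster first.
    The cutoff (in angstroms) is an assumed value, not taken from the source.
    """
    parent = list(range(len(coords)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of waters closer than the cutoff.
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                parent[find(i)] = find(j)

    clusters: Dict[int, List[int]] = {}
    for i in range(len(coords)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)


if __name__ == "__main__":
    # Toy transposed water oxygens: three near one site, one isolated.
    waters = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.9, 0.0, 0.0), (10.0, 10.0, 10.0)]
    for members in cluster_waters(waters):
        print(f"site with {len(members)} water(s): indices {members}")
```

    A large cluster then corresponds to a site occupied by water in many of the superimposed structures, which is one plausible way to read "high water conservation" in the abstract.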

  10. Seed Storage Proteins as a System for Teaching Protein Identification by Mass Spectrometry in Biochemistry Laboratory

    ERIC Educational Resources Information Center

    Wilson, Karl A.; Tan-Wilson, Anna

    2013-01-01

    Mass spectrometry (MS) has become an important tool in studying biological systems. One application is the identification of proteins and peptides by the matching of peptide and peptide fragment masses to the sequences of proteins in protein sequence databases. Often prior protein separation of complex protein mixtures by 2D-PAGE is needed,…

  11. Identification of Potent ACE Inhibitory Peptides from Wild Almond Proteins.

    PubMed

    Mirzapour, Mozhgan; Rezaei, Karamatollah; Sentandreu, Miguel Angel

    2017-10-01

    In this study, the production, fractionation, purification and identification of ACE (angiotensin-I-converting enzyme) inhibitory peptides from wild almond (Amygdalus scoparia) proteins were investigated. Wild almond proteins were hydrolyzed using 5 different enzymes (pepsin, trypsin, chymotrypsin, alcalase and flavourzyme) and assayed for their ACE inhibitory activities. The degree of ACE inhibiting activity obtained after hydrolysis was found to be in the following order: alcalase > chymotrypsin > trypsin/pepsin > flavourzyme. The hydrolysates obtained from alcalase (IC50 = 0.8 mg/mL) were fractionated by sequential ultrafiltration at 10 and 3 kDa cutoff values and the most active fraction (<3 kDa) was further separated using reversed phase high-performance liquid chromatography (RP-HPLC). Peptide sequence identifications were carried out on highly potential fractions obtained from RP-HPLC by means of liquid chromatography coupled to electrospray ionization and tandem mass spectrometry (LC-ESI-MS/MS). Sequencing of ACE inhibitory peptides present in the fraction 26 of RP-HPLC resulted in the identification of 3 peptide sequences (VVNE, VVTR, and VVGVD) not reported previously in the literature. Sequence identification of fractions 40 and 42 from RP-HPLC, which showed the highest ACE inhibitory activities (84.1% and 86.9%, respectively), resulted in the identification of more than 40 potential ACE inhibitory sequences. The results indicate that wild almond protein is a rich source of potential antihypertensive peptides and can be suggested for applications in functional foods and drinks with respect to hindrance and mitigation of hypertension after in vivo assessment. This study has shown the potential of wild almond proteins as good sources for producing ACE-inhibitory active peptides. According to this finding, peptides with higher ACE inhibitory activities could be released during the gastrointestinal digestion and contribute to the health-promoting

  12. Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing.

    PubMed

    Lim, Hansaim; Poleksic, Aleksandar; Yao, Yuan; Tong, Hanghang; He, Di; Zhuang, Luke; Meng, Patrick; Xie, Lei

    2016-10-01

    Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the one that we are proposing here or computationally too intensive, thereby limiting their capability for large-scale off-target identification. In addition, the performances of most machine learning based algorithms have been mainly evaluated to predict off-target interactions in the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in terms of detecting off-targets across gene families on a proteome scale. Here, we are presenting a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested in a reliable, extensive, and cross-gene family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable. It can screen a dataset of 200 thousand chemicals against 20 thousand proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidence. Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing, phenotypic screening, and

  13. Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing

    PubMed Central

    Poleksic, Aleksandar; Yao, Yuan; Tong, Hanghang; Meng, Patrick; Xie, Lei

    2016-01-01

    Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the one that we are proposing here or computationally too intensive, thereby limiting their capability for large-scale off-target identification. In addition, the performances of most machine learning based algorithms have been mainly evaluated to predict off-target interactions in the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in terms of detecting off-targets across gene families on a proteome scale. Here, we are presenting a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested in a reliable, extensive, and cross-gene family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable. It can screen a dataset of 200 thousand chemicals against 20 thousand proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidence. Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing, phenotypic screening, and

  14. Rapid identification of clinical mycobacterial isolates by protein profiling using matrix assisted laser desorption ionization-time of flight mass spectrometry.

    PubMed

    Panda, A; Kurapati, S; Samantaray, J C; Myneedu, V P; Verma, A; Srinivasan, A; Ahmad, H; Behera, D; Singh, U B

    2013-01-01

    The purpose of this study was to evaluate the identification of Mycobacterium tuberculosis which is often plagued with ambiguity. It is a time consuming process requiring 4-8 weeks after culture positivity, thereby delaying therapeutic intervention. For a successful treatment and disease management, timely diagnosis is imperative. We evaluated a rapid, proteomic based technique for identification of clinical mycobacterial isolates by protein profiling using matrix assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS). Freshly grown mycobacterial isolates were used. An acetonitrile/trifluoroacetic acid extraction procedure was carried out, following which cinnamic acid charged plates were subjected to identification by MALDI-TOF MS. A comparative analysis of 42 clinical mycobacterial isolates using the MALDI-TOF MS and conventional techniques was carried out. Among these, 97.61% were found to corroborate with the standard methods at genus level and 85.36% were accurate to the species level. One out of 42 was not in accord with the conventional assays because MALDI-TOF MS established it as Mycobacterium tuberculosis (log (score)>2.0) and conventional methods established it to be non-tuberculous Mycobacterium. MALDI-TOF MS was found to be an accurate, rapid, cost effective and robust system for identification of mycobacterial species. This innovative approach holds promise for early therapeutic intervention leading to better patient care.

  15. Identification of AOSC-binding proteins in neurons

    NASA Astrophysics Data System (ADS)

    Liu, Ming; Nie, Qin; Xin, Xianliang; Geng, Meiyu

    2008-11-01

    Acidic oligosaccharide sugar chain (AOSC), a D-mannuronic acid oligosaccharide derived from brown algae polysaccharide, has completed a Phase I clinical trial in China as an anti-Alzheimer’s Disease (AD) drug candidate. The identification of AOSC-binding protein(s) in neurons is very important for understanding its mechanism of action. To determine the binding protein(s) of AOSC in neurons mediating its anti-AD activities, confocal microscopy, affinity chromatography, and liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis were used. Confocal microscopy analysis shows that AOSC binds to SH-SY5Y cells in concentration-, time-, and temperature-dependent fashions. The AOSC binding proteins were purified by affinity chromatography and identified by LC-MS/MS analysis. The results showed that there are 349 proteins binding AOSC, including clathrin, adaptor protein-2 (AP-2) and amyloid precursor protein (APP). These results suggest that the binding/entrance of AOSC to neurons is probably responsible for anti-AD activities.

  16. How accurately do force fields represent protein side chain ensembles?

    PubMed

    Petrović, Dušan; Wang, Xue; Strodel, Birgit

    2018-05-23

    Although the protein backbone is the most fundamental part of the structure, the fine-tuning of side-chain conformations is important for protein function, for example, in protein-protein and protein-ligand interactions, and also in enzyme catalysis. While several benchmarks testing the performance of protein force fields for side chain properties have already been published, they often considered only a few force fields and were not tested against the same experimental observables; hence, they are not directly comparable. In this work, we explore the ability of twelve force fields, which are different flavors of AMBER, CHARMM, OPLS, or GROMOS, to reproduce average rotamer angles and rotamer populations obtained from extensive NMR studies of the ³J and residual dipolar coupling constants for two small proteins: ubiquitin and GB3. Based on a total of 196 μs sampling time, our results reveal that all force fields identify the correct side chain angles, while the AMBER and CHARMM force fields clearly outperform the OPLS and GROMOS force fields in estimating rotamer populations. The three best force fields for representing the protein side chain dynamics are AMBER 14SB, AMBER 99SB*-ILDN, and CHARMM36. Furthermore, we observe that the side chain ensembles of buried amino acid residues are generally more accurately represented than those of the surface exposed residues. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.

  17. iProphet: Multi-level Integrative Analysis of Shotgun Proteomic Data Improves Peptide and Protein Identification Rates and Error Estimates

    PubMed Central

    Shteynberg, David; Deutsch, Eric W.; Lam, Henry; Eng, Jimmy K.; Sun, Zhi; Tasman, Natalie; Mendoza, Luis; Moritz, Robert L.; Aebersold, Ruedi; Nesvizhskii, Alexey I.

    2011-01-01

    The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools Trans-Proteomics Pipeline. Applied in tandem with PeptideProphet, it provides more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomics Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomics Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms. The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments
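
    The link between posterior probabilities and false discovery rate estimates mentioned in this abstract can be illustrated with a standard relation: among identifications accepted at a probability threshold, the expected false discovery rate is the average of (1 − p) over the accepted set. The Python sketch below shows only that relation and is not a description of iProphet's models; the function name and the example probabilities are hypothetical.

```python
from typing import Sequence, Tuple


def fdr_from_posteriors(probs: Sequence[float], threshold: float) -> Tuple[float, int]:
    """Estimate the FDR among identifications with posterior >= threshold.

    Each accepted identification with posterior probability p contributes
    an expected (1 - p) false identifications. Returns (estimated FDR,
    number accepted); the FDR is reported as 0.0 when nothing is accepted.
    """
    accepted = [p for p in probs if p >= threshold]
    if not accepted:
        return 0.0, 0
    expected_false = sum(1.0 - p for p in accepted)
    return expected_false / len(accepted), len(accepted)


if __name__ == "__main__":
    # Made-up posterior probabilities, for illustration only.
    posteriors = [0.99, 0.95, 0.90, 0.50, 0.20]
    for cut in (0.9, 0.5):
        fdr, n = fdr_from_posteriors(posteriors, cut)
        print(f"threshold {cut}: {n} accepted, estimated FDR {fdr:.3f}")
```

    The estimate is only as good as the probabilities fed into it, which is why the abstract stresses the accuracy of the posterior probabilities.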

  18. Identification of PDC-109-like protein(s) in buffalo seminal plasma.

    PubMed

    Harshan, Hiron M; Sankar, Surya; Singh, L P; Singh, Manish Kumar; Sudharani, S; Ansari, M R; Singh, S K; Majumdar, A C; Joshi, P

    2009-10-01

    The FN-2 family of seminal plasma proteins represents the major protein fraction of bovine seminal plasma. These proteins also constitute the major seminal plasma protein fraction in horse, goat and bison seminal plasma and are present in pig, rat, mouse, hamster and human seminal plasma. BSP-A1 and BSP-A2, the predominant proteins of the FN-2 family, are collectively termed PDC-109. FN-2 proteins play an important role in fertilization, including sperm capacitation and formation of oviductal sperm reservoirs. Significantly, BSP proteins were also shown to have negative effects in the context of sperm storage. No conclusive evidence for the presence of buffalo seminal plasma protein(s) similar to PDC-109 exists. Studies with buffalo seminal plasma indicated that isolation and identification of PDC-109-like protein(s) from buffalo seminal plasma by conventional methods might be difficult. Thus, antibodies raised against PDC-109 isolated and purified from cattle seminal plasma were used for investigating the presence of PDC-109-like protein(s) in buffalo seminal plasma. Buffalo seminal plasma proteins were resolved on SDS-PAGE, blotted to nitrocellulose membranes and probed for the presence of PDC-109-like protein(s) using the PDC-109 antisera raised in rabbits. A distinct immunoreactive band well below the 20-kDa region indicated the presence of PDC-109-like protein(s) in buffalo seminal plasma.

  19. Systematic Errors in Peptide and Protein Identification and Quantification by Modified Peptides

    PubMed Central

    Bogdanow, Boris; Zauber, Henrik; Selbach, Matthias

    2016-01-01

    The principle of shotgun proteomics is to use peptide mass spectra in order to identify corresponding sequences in a protein database. The quality of peptide and protein identification and quantification critically depends on the sensitivity and specificity of this assignment process. Many peptides in proteomic samples carry biochemical modifications, and a large fraction of unassigned spectra arise from modified peptides. Spectra derived from modified peptides can erroneously be assigned to wrong amino acid sequences. However, the impact of this problem on proteomic data has not yet been investigated systematically. Here we use combinations of different database searches to show that modified peptides can be responsible for 20–50% of false positive identifications in deep proteomic data sets. These false positive hits are particularly problematic as they have significantly higher scores and higher intensities than other false positive matches. Furthermore, these wrong peptide assignments lead to hundreds of false protein identifications and systematic biases in protein quantification. We devise a “cleaned search” strategy to address this problem and show that this considerably improves the sensitivity and specificity of proteomic data. In summary, we show that modified peptides cause systematic errors in peptide and protein identification and quantification and should therefore be considered to further improve the quality of proteomic data annotation. PMID:27215553

  20. Identification of Abiotic Stress Protein Biomarkers by Proteomic Screening of Crop Cultivar Diversity

    PubMed Central

    Barkla, Bronwyn J.

    2016-01-01

    Modern-day agricultural practice is narrowing the genetic diversity in our food supply. This may compromise the ability to obtain high yield under extreme climatic conditions, threatening food security for a rapidly growing world population. To identify genetic diversity, tolerance mechanisms of cultivars, landraces and wild relatives of major crops can be identified and ultimately exploited for yield improvement. Quantitative proteomics allows for the identification of proteins that may contribute to tolerance mechanisms by directly comparing protein abundance under stress conditions between genotypes differing in their stress responses. In this review, a summary is provided of the data accumulated from quantitative proteomic comparisons of crop genotypes/cultivars which present different stress tolerance responses when exposed to various abiotic stress conditions, including drought, salinity, high/low temperature, nutrient deficiency and UV-B irradiation. This field of research aims to identify molecular features that can be developed as biomarkers for crop improvement; however, without accurate phenotyping, careful experimental design, statistical robustness and appropriate biomarker validation and verification, it will be challenging to deliver what is promised. PMID:28248236

  1. Identification of Abiotic Stress Protein Biomarkers by Proteomic Screening of Crop Cultivar Diversity.

    PubMed

    Barkla, Bronwyn J

    2016-09-08

    Modern-day agricultural practice is narrowing the genetic diversity in our food supply. This may compromise the ability to obtain high yield under extreme climatic conditions, threatening food security for a rapidly growing world population. To identify genetic diversity, tolerance mechanisms of cultivars, landraces and wild relatives of major crops can be identified and ultimately exploited for yield improvement. Quantitative proteomics allows for the identification of proteins that may contribute to tolerance mechanisms by directly comparing protein abundance under stress conditions between genotypes differing in their stress responses. In this review, a summary is provided of the data accumulated from quantitative proteomic comparisons of crop genotypes/cultivars which present different stress tolerance responses when exposed to various abiotic stress conditions, including drought, salinity, high/low temperature, nutrient deficiency and UV-B irradiation. This field of research aims to identify molecular features that can be developed as biomarkers for crop improvement; however, without accurate phenotyping, careful experimental design, statistical robustness and appropriate biomarker validation and verification, it will be challenging to deliver what is promised.

  2. Proteomic analysis of human aqueous humor using multidimensional protein identification technology

    PubMed Central

    Richardson, Matthew R.; Price, Marianne O.; Price, Francis W.; Pardo, Jennifer C.; Grandin, Juan C.; You, Jinsam; Wang, Mu

    2009-01-01

    Aqueous humor (AH) supports avascular tissues in the anterior segment of the eye, maintains intraocular pressure, and potentially influences the pathogenesis of ocular diseases. Nevertheless, the AH proteome is still poorly defined despite several previous efforts, which were hindered by interfering high abundance proteins, inadequate animal models, and limited proteomic technologies. To facilitate future investigations into AH function, the AH proteome was extensively characterized using an advanced proteomic approach. Samples from patients undergoing cataract surgery were pooled and depleted of interfering abundant proteins and thereby divided into two fractions: albumin-bound and albumin-depleted. Multidimensional Protein Identification Technology (MudPIT) was utilized for each fraction; this incorporates strong cation exchange chromatography to reduce sample complexity before reversed-phase liquid chromatography and tandem mass spectrometric analysis. Twelve proteins had multi-peptide, high confidence identifications in the albumin-bound fraction and 50 proteins had multi-peptide, high confidence identifications in the albumin-depleted fraction. Gene ontological analyses were performed to determine which cellular components and functions were enriched. Many proteins were previously identified in the AH and for several their potential role in the AH has been investigated; however, the majority of identified proteins were novel and only speculative roles can be suggested. The AH was abundant in anti-oxidant and immunoregulatory proteins as well as anti-angiogenic proteins, which may be involved in maintaining the avascular tissues. This is the first known report to extensively characterize and describe the human AH proteome and lays the foundation for future work regarding its function in homeostatic and pathologic states. PMID:20019884

  3. Final Progress Report: Isotope Identification Algorithm for Rapid and Accurate Determination of Radioisotopes Feasibility Study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rawool-Sullivan, Mohini; Bounds, John Alan; Brumby, Steven P.

    2012-04-30

    This is the final report of the project titled 'Isotope Identification Algorithm for Rapid and Accurate Determination of Radioisotopes,' PMIS project number LA10-HUMANID-PD03. It summarizes work performed over the FY10 time period. The goal of the work was to demonstrate principles of emulating a human analysis approach towards the data collected using radiation isotope identification devices (RIIDs). Human analysts begin analyzing a spectrum based on features in the spectrum - lines and shapes that are present in a given spectrum. The proposed work was to carry out a feasibility study that will pick out all gamma ray peaks and other features such as Compton edges, bremsstrahlung, presence/absence of shielding and presence of neutrons and escape peaks. Ultimately success of this feasibility study will allow us to collectively explain identified features and form a realistic scenario that produced a given spectrum in the future. We wanted to develop and demonstrate machine learning algorithms that will qualitatively enhance the automated identification capabilities of portable radiological sensors that are currently being used in the field.

  4. Fast and Accurate Multivariate Gaussian Modeling of Protein Families: Predicting Residue Contacts and Protein-Interaction Partners

    PubMed Central

    Feinauer, Christoph; Procaccini, Andrea; Zecchina, Riccardo; Weigt, Martin; Pagnani, Andrea

    2014-01-01

    In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully implemented into tertiary and quaternary protein structure prediction methods. However, due to the discrete nature of the underlying variable (amino-acids), exact inference requires exponential time in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparable or superior to that achieved by mean-field approximations to inference with discrete variables, as done by direct-coupling analysis. This is true for (i) the prediction of residue-residue contacts in proteins, and (ii) the identification of protein-protein interaction partners in bacterial signal transduction. An implementation of our multivariate Gaussian approach is available at the website http://areeweb.polito.it/ricerca/cmp/code. PMID:24663061
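
    The idea of replacing discrete amino-acid variables with continuous Gaussian ones can be illustrated with a toy sketch. It is not the authors' implementation: the two-letter alphabet, the ridge term used to make the covariance invertible, and all function names are assumptions chosen for illustration. Each column is one-hot encoded, the empirical covariance is inverted, and the negated off-diagonal blocks of the inverse are read as pairwise couplings.

```python
from itertools import combinations

def one_hot(seq, alphabet):
    # Encode each position with q-1 indicators; the last symbol is the reference state.
    states = alphabet[:-1]
    return [1.0 if ch == s else 0.0 for ch in seq for s in states]

def covariance(rows, ridge):
    # Empirical covariance plus a ridge term (an assumed stand-in for a proper prior).
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    cov = [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / n
            for b in range(d)] for a in range(d)]
    for k in range(d):
        cov[k][k] += ridge
    return cov

def invert(m):
    # Gauss-Jordan elimination with partial pivoting on [m | I].
    d = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(d)]
           for i, row in enumerate(m)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(d):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]

def coupling_scores(alignment, alphabet="AB", ridge=0.1):
    # Score each column pair by the Frobenius norm of its coupling block J_ij = -(C^-1)_ij.
    q1 = len(alphabet) - 1
    length = len(alignment[0])
    rows = [one_hot(s, alphabet) for s in alignment]
    precision = invert(covariance(rows, ridge))
    scores = {}
    for i, j in combinations(range(length), 2):
        block = [-precision[i * q1 + a][j * q1 + b]
                 for a in range(q1) for b in range(q1)]
        scores[(i, j)] = sum(v * v for v in block) ** 0.5
    return scores

# Toy alignment: columns 0 and 1 co-vary perfectly, column 2 is independent.
alignment = ["AAA", "AAB", "BBA", "BBB"]
scores = coupling_scores(alignment)
```

    In this toy alignment the co-varying pair (0, 1) receives a larger score than the pairs involving the independent column. A realistic analysis would also need sequence reweighting and a full 20-letter alphabet, which this sketch omits.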

  5. Isolation and identification of peanut leaf proteins regulated by water stress.

    PubMed

    Akkasaeng, Chutipong; Tantisuwichwong, Napaporn; Chairam, Issariya; Prakrongrak, Narumon; Jogloy, Sanun; Pathanothai, Aran

    2007-05-15

    Water deficits trigger signaling cascades leading to modulation of protein expression in plant tissues. Identification of peanut leaf proteins regulated by water stress provides some insight into the cellular and molecular response of peanut plants to drought stress. Peanut variety Khon Kaen 4, a water-stress sensitive variety, was grown in a growth chamber under a controlled environment. Water stress was imposed on day 30 after seedling emergence by withholding water from peanut plants for 6 days, as compared to plants adequately supplied with water. Total proteins were prepared from a leaflet of a fully expanded leaf on the main stem. Proteins were separated in duplicated gels using two-dimensional gel electrophoresis and visualized by silver nitrate staining. Image analysis was performed using ImageMaster 2D Platinum 5.0 to determine proteins regulated by water stress. Molecular mass and isoelectric point of each regulated protein were used in database queries for protein identification. One protein was induced under water stress and the homologous protein was identified as Serine/threonine-protein phosphatase PP 1. Five proteins were down-regulated by water deficit. The homologous proteins were chaperone protein DNAJ, auxin-responsive protein IAA29, peroxidase 43, caffeoyl-CoA O-methyltransferase and SNF1-related protein kinase regulatory subunit beta-2. Down-regulated proteins may be associated with sensitivity of the peanut variety to water stress.

  6. Experimental Methods for Protein Interaction Identification and Characterization

    NASA Astrophysics Data System (ADS)

    Uetz, Peter; Titz, Björn; Cagney, Gerard

    There are dozens of methods for the detection of protein-protein interactions but they fall into a few broad categories. Fragment complementation assays such as the yeast two-hybrid (Y2H) system are based on split proteins that are functionally reconstituted by fusions of interacting proteins. Biophysical methods include structure determination and mass spectrometric (MS) identification of proteins in complexes. Biochemical methods include methods such as far western blotting and peptide arrays. Only the Y2H and protein complex purification combined with MS have been used on a larger scale. Due to the lack of data it is still difficult to compare these methods with respect to their efficiency and error rates. Current data does not favor any particular method and thus multiple experimental approaches are necessary to maximally cover the interactome of any target cell or organism.

  7. A multi-objective optimization approach accurately resolves protein domain architectures

    PubMed Central

    Bernardes, J.S.; Vieira, F.R.J.; Zaverucha, G.; Carbone, A.

    2016-01-01

    Motivation: Given a protein sequence and a number of potential domains matching it, what are the domain content and the most likely domain architecture for the sequence? This problem is of fundamental importance in protein annotation, constituting one of the main steps of all predictive annotation strategies. On the other hand, when potential domains are several and in conflict because of overlapping domain boundaries, finding a solution for the problem might become difficult. An accurate prediction of the domain architecture of a multi-domain protein provides important information for function prediction, comparative genomics and molecular evolution. Results: We developed DAMA (Domain Annotation by a Multi-objective Approach), a novel approach that identifies architectures through a multi-objective optimization algorithm combining scores of domain matches, previously observed multi-domain co-occurrence and domain overlapping. DAMA has been validated on a known benchmark dataset based on CATH structural domain assignments and on the set of Plasmodium falciparum proteins. When compared with existing tools on both datasets, it outperforms all of them. Availability and implementation: DAMA software is implemented in C++ and the source code can be found at http://www.lcqb.upmc.fr/DAMA. Contact: juliana.silva_bernardes@upmc.fr or alessandra.carbone@lip6.fr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26458889

  8. Strategies for the enrichment and identification of basic proteins in proteome projects.

    PubMed

    Bae, Soo-Han; Harris, Andrew G; Hains, Peter G; Chen, Hong; Garfin, David E; Hazell, Stuart L; Paik, Young-Ki; Walsh, Bradley J; Cordwell, Stuart J

    2003-05-01

    Two-dimensional gel electrophoresis (2-DE) is currently the method of choice for separating complex mixtures of proteins for visual comparison in proteome analysis. This technology, however, is biased against certain classes of proteins including low abundance and hydrophobic proteins. Proteins with extremely alkaline isoelectric points (pI) are often very poorly represented using 2-DE technology, even when complex mixtures are separated using commercially available pH 6-11 or pH 7-10 immobilized pH gradients. The genome of the human gut pathogen, Helicobacter pylori, is dominated by genes encoding basic proteins, and is therefore a useful model for examining methodology suitable for separating such proteins. H. pylori proteins were separated on pH 6-11 and novel pH 9-12 immobilized pH gradients and 65 protein spots were subjected to matrix-assisted laser desorption/ionization-time of flight mass spectrometry, leading to the identification of 49 unique proteins. No proteins were characterized with a theoretical pI of greater than 10.23. A second approach to examine extremely alkaline proteins (pI > 9.0) utilized a prefractionation isoelectric focusing. Proteins were separated into two fractions using Gradiflow technology, and the extremely basic fraction subjected to both sodium dodecyl sulphate-polyacrylamide gel electrophoresis and liquid chromatography (LC) - tandem mass spectrometry post-tryptic digest, allowing the identification of 17 and 13 proteins, respectively. Gradiflow separations were highly specific for proteins with pI > 9.0, however, a single LC separation only allowed the identification of peptides from highly abundant proteins. These methods and those encompassing multiple LC 'dimensions' may be a useful complement to 2-DE for 'near-to-total' proteome coverage in the alkaline pH range.

  9. Automated protein identification by the combination of MALDI MS and MS/MS spectra from different instruments.

    PubMed

    Levander, Fredrik; James, Peter

    2005-01-01

    The identification of proteins separated on two-dimensional gels is most commonly performed by trypsin digestion and subsequent matrix-assisted laser desorption ionization (MALDI) with time-of-flight (TOF). Recently, atmospheric pressure (AP) MALDI coupled to an ion trap (IT) has emerged as a convenient method to obtain tandem mass spectra (MS/MS) from samples on MALDI target plates. In the present work, we investigated the feasibility of using the two methodologies in line as a standard method for protein identification. In this setup, the high mass accuracy MALDI-TOF spectra are used to calibrate the peptide precursor masses in the lower mass accuracy AP-MALDI-IT MS/MS spectra. Several software tools were developed to automate the analysis process. Two sets of MALDI samples, consisting of 142 and 421 gel spots, respectively, were analyzed in a highly automated manner. In the first set, the protein identification rate increased from 61% for MALDI-TOF only to 85% for MALDI-TOF combined with AP-MALDI-IT. In the second data set the increase in protein identification rate was from 44% to 58%. AP-MALDI-IT MS/MS spectra were in general less effective than the MALDI-TOF spectra for protein identification, but the combination of the two methods clearly enhanced the confidence in protein identification.
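
    The calibration step described above, in which high-accuracy MALDI-TOF masses correct the lower-accuracy precursor masses, might be sketched as follows. This is only an illustration of the general idea, not the authors' software: the function name, the 0.5 Da matching tolerance and the example masses are all assumptions.

```python
import bisect

def calibrate_precursors(tof_masses, precursor_masses, tolerance=0.5):
    """Replace each precursor mass by the nearest TOF peak within `tolerance` Da.

    Precursors with no TOF peak inside the tolerance are left unchanged.
    """
    tof = sorted(tof_masses)
    calibrated = []
    for mass in precursor_masses:
        pos = bisect.bisect_left(tof, mass)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = tof[max(pos - 1, 0):pos + 1]
        best = min(candidates, key=lambda t: abs(t - mass), default=None)
        if best is not None and abs(best - mass) <= tolerance:
            calibrated.append(best)
        else:
            calibrated.append(mass)
    return calibrated

# Hypothetical peak lists for one gel spot.
tof_peaks = [1045.564, 1296.685, 2211.104]
ion_trap_precursors = [1045.9, 1500.2, 2210.8]
calibrated = calibrate_precursors(tof_peaks, ion_trap_precursors)
```

    Here the first and third precursors are snapped to their TOF counterparts, while the second has no TOF peak nearby and keeps its measured value.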

  10. Identification of cell wall proteins in the flax (Linum usitatissimum) stem.

    PubMed

    Day, Arnaud; Fénart, Stéphane; Neutelings, Godfrey; Hawkins, Simon; Rolando, Christian; Tokarski, Caroline

    2013-03-01

    Sequential salt (CaCl2, LiCl) extractions were used to obtain fractions enriched in cell wall proteins (CWPs) from the stem of 60-day-old flax (Linum usitatissimum) plants. High-resolution FT-ICR MS analysis and the use of recently published genomic data allowed the identification of 11 912 peptides corresponding to a total of 1418 different proteins. Subcellular localization using TargetP, Predotar, and WoLF PSORT led to the identification of 152 putative flax CWPs that were classified into nine different functional classes previously established for Arabidopsis thaliana. Examination of different functional classes revealed the presence of a number of proteins known to be involved in, or potentially involved in cell-wall metabolism in plants. The flax stem cell wall proteome was also compared with transcriptomic data previously obtained on comparable samples. This study represents a major contribution to the identification of CWPs in flax and will lead to a better understanding of cell wall biology in this species. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Accurate Prediction of Contact Numbers for Multi-Spanning Helical Membrane Proteins

    PubMed Central

    Li, Bian; Mendenhall, Jeffrey; Nguyen, Elizabeth Dong; Weiner, Brian E.; Fischer, Axel W.; Meiler, Jens

    2017-01-01

    Prediction of the three-dimensional (3D) structures of proteins by computational methods is acknowledged as an unsolved problem. Accurate prediction of important structural characteristics such as contact number is expected to accelerate the otherwise slow progress being made in the prediction of 3D structure of proteins. Here, we present a dropout neural network-based method, TMH-Expo, for predicting the contact number of transmembrane helix (TMH) residues from sequence. Neuronal dropout is a strategy where certain neurons of the network are excluded from back-propagation to prevent co-adaptation of hidden-layer neurons. By using neuronal dropout, overfitting was significantly reduced and performance was noticeably improved. For multi-spanning helical membrane proteins, TMH-Expo achieved a remarkable Pearson correlation coefficient of 0.69 between predicted and experimental values and a mean absolute error of only 1.68. In addition, among those membrane protein–membrane protein interface residues, 76.8% were correctly predicted. Mapping of predicted contact numbers onto structures indicates that contact numbers predicted by TMH-Expo reflect the exposure patterns of TMHs and reveal membrane protein–membrane protein interfaces, reinforcing the potential of predicted contact numbers to be used as restraints for 3D structure prediction and protein–protein docking. TMH-Expo can be accessed via a Web server at www.meilerlab.org. PMID:26804342
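
    The dropout strategy mentioned above can be shown in a few lines. This is a generic sketch of dropout on one hidden layer, not the TMH-Expo code: the function name, the 0.5 rate and the rescaling of surviving units ("inverted" dropout) are assumptions. During training the same mask would also multiply the gradients, so dropped neurons receive no update for that step.

```python
import random

def dropout(activations, rate, rng):
    """Zero each activation with probability `rate` and rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    # mask[i] is 1.0 for a kept neuron and 0.0 for a dropped one.
    mask = [1.0 if rng.random() < keep else 0.0 for _ in activations]
    dropped = [a * m / keep for a, m in zip(activations, mask)]
    return dropped, mask

rng = random.Random(0)
hidden = [0.2, 1.5, 0.7, 0.9]
train_out, mask = dropout(hidden, rate=0.5, rng=rng)  # training-time pass
eval_out, _ = dropout(hidden, rate=0.0, rng=rng)      # rate 0 leaves values untouched
```

    Rescaling by 1 / (1 - rate) keeps the expected activation unchanged, so no adjustment is needed when dropout is switched off for prediction.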

  12. SIFTER search: a web server for accurate phylogeny-based protein function prediction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sahraeian, Sayed M.; Luo, Kevin R.; Brenner, Steven E.

    We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. Lastly, the SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.

  13. SIFTER search: a web server for accurate phylogeny-based protein function prediction

    DOE PAGES

    Sahraeian, Sayed M.; Luo, Kevin R.; Brenner, Steven E.

    2015-05-15

    We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. Lastly, the SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.

  14. Proteomic identification of erythrocyte membrane protein deficiency in hereditary spherocytosis.

    PubMed

    Peker, Selen; Akar, Nejat; Demiralp, Duygu Ozel

    2012-03-01

    Hereditary spherocytosis (HS) is the most common congenital hemolytic anemia in Caucasians, with an estimated prevalence ranging from 1:2000 to 1:5000. A molecular defect in one of the erythrocyte (RBC) membrane proteins underlying HS, such as spectrin-α, spectrin-β, ankyrin, band 3 or protein 4.2, leads to membrane destabilization and vesiculation and may change the RBCs into denser and more rigid cells (spherocytes), which are removed by the spleen, leading to the development of hemolytic anemia. HS is classified as mild, moderate or severe according to the degree of the hemolytic anemia and the associated symptoms. Two-dimensional gel electrophoresis (2-DE) is a potentially valuable method for studying heritable disorders such as HS that involve membrane proteins. This technique, which separates proteins by two biophysically unrelated parameters, molecular weight and charge, is a good option in clinical proteomics because of its ability to separate complex mixtures and to display post-translational modifications and changes after phosphorylation. In this study, we have used contemporary methods with some modifications for the solubilisation, separation and identification of erythrocyte membrane proteins in normal and in HS RBCs. Expression differences in spectrin alpha and beta chains, ankyrin and band 3 proteins were found with PDQuest software 8.0.1, and peptide mass fingerprinting (PMF) analysis was performed to identify the proteins.

  15. A water-soluble conjugated polymer for protein identification and denaturation detection.

    PubMed

    Xu, Qingling; Wu, Chunxian; Zhu, Chunlei; Duan, Xinrui; Liu, Libing; Han, Yuchun; Wang, Yilin; Wang, Shu

    2010-12-03

    Rapid and sensitive methods to detect proteins and protein denaturation have become increasingly necessary in the fields of proteomics, medical diagnostics, and biology. In this paper, we report the synthesis of a new cationic water-soluble conjugated polymer that contains fluorene and diene moieties in the backbone (PFDE) for protein identification by sensing an array of PFDE solutions at different ionic strengths using the linear discriminant analysis technique (LDA). The PFDE can form complexes with proteins by electrostatic and/or hydrophobic interactions and exhibits different fluorescence responses. Three main factors contribute to the fluorescence response of PFDE, namely, the net charge density on the protein surface, the hydrophobic nature of the protein, and the metalloprotein characteristics. The denaturation of proteins can also be detected using PFDE as a fluorescent probe. The interactions between PFDE and proteins were also studied by dynamic light scattering (DLS) and isothermal titration microcalorimetry (ITC) techniques. In contrast to other methods based on conjugated polymers, the synthesis of a series of quencher or dye-labeled acceptors or protein substrates has been avoided in our method, which significantly reduces the cost and the synthetic complexity. Our method shows promise for protein identification and denaturation detection in a simple, fast, and label-free manner based on non-specific interaction-induced perturbation of PFDE fluorescence response.

  16. Mass spectrometry and animal science: protein identification strategies and particularities of farm animal species.

    PubMed

    Soares, Renata; Franco, Catarina; Pires, Elisabete; Ventosa, Miguel; Palhinhas, Rui; Koci, Kamila; Martinho de Almeida, André; Varela Coelho, Ana

    2012-07-19

    Proteomic approaches are gaining increasing importance in the context of all fields of animal and veterinary sciences, including physiology, productive characterization, and disease/parasite tolerance, among others. Proteomic studies mainly aim at the proteome characterization of a certain organ, tissue, cell type or organism, either in a specific condition or comparing protein differential expression within two or more selected situations. Due to the high complexity of samples, usually total protein extracts, proteomics relies heavily on separation procedures, with 2D-electrophoresis and HPLC being the most common, as well as on protein identification using mass spectrometry (MS) based methodologies. Despite the increasing importance of MS in the context of animal and veterinary science studies, the usefulness of such tools is still poorly perceived by the animal science community. This is primarily due to the limited knowledge of mass spectrometry among animal scientists. Additionally, confidence and success in protein identification are hindered by the lack of information in public databases for most farm animal species and their pathogens, with the exception of cattle (Bos taurus), pig (Sus scrofa) and chicken (Gallus gallus). In this article, we will briefly summarize the main methodologies available for protein identification using mass spectrometry, providing a case study of specific applications in the field of animal science. We will also address the difficulties inherent to protein identification using MS, with particular reference to experiments using animal species poorly described in public databases. Additionally, we will suggest strategies to increase the rate of successful identifications when working with farm animal species. Copyright © 2012 Elsevier B.V. All rights reserved.

  17. Sad people are more accurate at expression identification with a smaller own-ethnicity bias than happy people.

    PubMed

    Hills, Peter J; Hill, Dominic M

    2017-07-12

    Sad individuals perform more accurately at face identity recognition (Hills, Werno, & Lewis, 2011), possibly because they scan more of the face during encoding. During expression identification tasks, sad individuals do not fixate on the eyes as much as happier individuals (Wu, Pu, Allen, & Pauli, 2012). Fixating on features other than the eyes leads to a reduced own-ethnicity bias (Hills & Lewis, 2006). This background indicates that sad individuals would not view the eyes as much as happy individuals and this would result in improved expression recognition and a reduced own-ethnicity bias. This prediction was tested using an expression identification task, with eye tracking. We demonstrate that sad-induced participants show better expression recognition and a smaller own-ethnicity bias than happy-induced participants due to scanning more facial features. We conclude that mood affects eye movements and face encoding by causing a wider sampling strategy and deeper encoding of facial features diagnostic for expression identification.

  18. Accurate prediction of interfacial residues in two-domain proteins using evolutionary information: implications for three-dimensional modeling.

    PubMed

    Bhaskara, Ramachandra M; Padhi, Amrita; Srinivasan, Narayanaswamy

    2014-07-01

    With the preponderance of multidomain proteins in eukaryotic genomes, it is essential to recognize the constituent domains and their functions. Often function involves communications across the domain interfaces, and the knowledge of the interacting sites is essential to our understanding of the structure-function relationship. Using evolutionary information extracted from homologous domains in at least two diverse domain architectures (single and multidomain), we predict the interface residues corresponding to domains from the two-domain proteins. We also use information from the three-dimensional structures of individual domains of two-domain proteins to train a naïve Bayes classifier model to predict the interfacial residues. Our predictions are highly accurate (∼85%) and specific (∼95%) to the domain-domain interfaces. This method is specific to multidomain proteins which contain domains in more than one protein architectural context. Using predicted residues to constrain domain-domain interaction, rigid-body docking was able to provide us with accurate full-length protein structures with correct orientation of domains. We believe that these results can be of considerable interest toward rational protein and interaction design, apart from providing us with valuable information on the nature of interactions. © 2013 Wiley Periodicals, Inc.

  19. Neural network and SVM classifiers accurately predict lipid binding proteins, irrespective of sequence homology.

    PubMed

    Bakhtiarizadeh, Mohammad Reza; Moradi-Shahrbabak, Mohammad; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2014-09-07

    Due to the central roles of lipid binding proteins (LBPs) in many biological processes, sequence based identification of LBPs is of great interest. The major challenge is that LBPs are diverse in sequence, structure, and function, which results in low accuracy of sequence homology based methods. Therefore, there is a need for developing alternative functional prediction methods irrespective of sequence similarity. To distinguish LBPs from non-LBPs, the performances of support vector machine (SVM) and neural network were compared in this study. Comprehensive protein features and various techniques were employed to create datasets. Five-fold cross-validation (CV) and independent evaluation (IE) tests were used to assess the validity of the two methods. The results indicated that SVM outperforms neural network. SVM achieved 89.28% (CV) and 89.55% (IE) overall accuracy in identification of LBPs from non-LBPs and 92.06% (CV) and 92.90% (IE) (on average) for classification of different LBP classes. Increasing the number and the range of extracted protein features as well as optimization of the SVM parameters significantly increased the efficiency of LBP class prediction in comparison to the only previous report in this field. Altogether, the results showed that the SVM algorithm can be run on broad, computationally calculated protein features and offers a promising tool in detection of LBP classes. The proposed approach has the potential to integrate and improve the common sequence alignment based methods. Copyright © 2014 Elsevier Ltd. All rights reserved.
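
    The five-fold cross-validation protocol used for evaluation above can be sketched as follows. The fold construction is the standard technique; the nearest-centroid classifier is only a simple stand-in for the SVM and neural network of the study, and the toy feature vectors are invented for illustration.

```python
import random

def k_fold_indices(n, k, seed=0):
    # Shuffle sample indices once, then deal them into k disjoint folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

def fit_centroids(X, y):
    # Stand-in classifier: the mean feature vector of each class.
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for d, v in enumerate(features):
            acc[d] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    # Assign the class whose centroid is closest in squared Euclidean distance.
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def cross_validate(X, y, k=5):
    folds = k_fold_indices(len(X), k)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return folds, sum(accuracies) / k

# Toy data: two well-separated classes of ten samples each.
X = [[0.1 * i, 0.0] for i in range(10)] + [[5.0 + 0.1 * i, 5.0] for i in range(10)]
y = [0] * 10 + [1] * 10
folds, mean_accuracy = cross_validate(X, y, k=5)
```

    Each sample is held out exactly once, so the mean accuracy over the folds estimates performance on unseen data. An independent evaluation, as in the abstract, would additionally score the final model on a separate set never used in any fold.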

  20. Pooled protein immunization for identification of cell surface antigens in Streptococcus sanguinis.

    PubMed

    Ge, Xiuchun; Kitten, Todd; Munro, Cindy L; Conrad, Daniel H; Xu, Ping

    2010-07-26

    Available bacterial genomes provide opportunities for screening vaccines by reverse vaccinology. Efficient identification of surface antigens is required to reduce time and animal cost in this technology. We developed an approach to identify surface antigens rapidly in Streptococcus sanguinis, a common infective endocarditis causative species. We applied bioinformatics for antigen prediction and pooled antigens for immunization. Forty-seven surface-exposed proteins including 28 lipoproteins and 19 cell wall-anchored proteins were chosen based on computer algorithms and comparative genomic analyses. Eight proteins among these candidates and 2 other proteins were pooled together to immunize rabbits. The antiserum reacted strongly with each protein and with S. sanguinis whole cells. Affinity chromatography was used to purify the antibodies to 9 of the antigen pool components. Competitive ELISA and FACS results indicated that these 9 proteins were exposed on S. sanguinis cell surfaces. The purified antibodies had demonstrable opsonic activity. The results indicate that immunization with pooled proteins, in combination with affinity purification, and comprehensive immunological assays may facilitate cell surface antigen identification to combat infectious diseases.

  1. Pooled Protein Immunization for Identification of Cell Surface Antigens in Streptococcus sanguinis

    PubMed Central

    Ge, Xiuchun; Kitten, Todd; Munro, Cindy L.; Conrad, Daniel H.; Xu, Ping

    2010-01-01

    Background: Available bacterial genomes provide opportunities for screening vaccines by reverse vaccinology. Efficient identification of surface antigens is required to reduce time and animal cost in this technology. We developed an approach to identify surface antigens rapidly in Streptococcus sanguinis, a common infective endocarditis causative species. Methods and Findings: We applied bioinformatics for antigen prediction and pooled antigens for immunization. Forty-seven surface-exposed proteins including 28 lipoproteins and 19 cell wall-anchored proteins were chosen based on computer algorithms and comparative genomic analyses. Eight proteins among these candidates and 2 other proteins were pooled together to immunize rabbits. The antiserum reacted strongly with each protein and with S. sanguinis whole cells. Affinity chromatography was used to purify the antibodies to 9 of the antigen pool components. Competitive ELISA and FACS results indicated that these 9 proteins were exposed on S. sanguinis cell surfaces. The purified antibodies had demonstrable opsonic activity. Conclusions: The results indicate that immunization with pooled proteins, in combination with affinity purification, and comprehensive immunological assays may facilitate cell surface antigen identification to combat infectious diseases. PMID:20668678

  2. Resolution and identification of the protein components of the photosystem II antenna system of higher plants by reversed-phase liquid chromatography with electrospray-mass spectrometric detection.

    PubMed

    Corradini, D; Huber, C G; Timperio, A M; Zolla, L

    2000-07-21

    Reversed-phase liquid chromatography (RPLC) was interfaced to mass spectrometry (MS) with an electrospray ionization (ESI) source for the separation and accurate molecular mass determination of the individual intrinsic membrane proteins that comprise the photosystem II (PS II) major light-harvesting complex (LHC II) and minor (CP24, CP26 and CP29) antenna system, whose molecular masses range between 22,000 and 29,000. PS II is a supramolecular complex intrinsic to the thylakoid membrane, which plays an important role in photosynthesis by capturing solar energy and transferring it to photochemical reaction centers where energy conversion occurs. The protein components of the PS II major and minor antenna systems were extracted from spinach thylakoid membranes and separated using a butyl-silica column eluted by an acetonitrile gradient in 0.05% (v/v) aqueous trifluoroacetic acid. On-line electrospray MS allowed accurate molecular mass determination and identification of the protein components of the PS II major and minor antenna systems. The proposed RPLC-ESI-MS method holds several advantages over sodium dodecyl sulfate-polyacrylamide gel electrophoresis, the conventional technique for studying membrane proteins, including better protein separation, mass accuracy, speed and efficiency.

  3. Performance of VITEK mass spectrometry V3.0 for rapid identification of clinical Aspergillus fumigatus in different culture conditions based on ribosomal proteins

    PubMed Central

    Zhou, Longrong; Chen, Yongquan; Xu, Yuanhong

    2017-01-01

    Fast and accurate discrimination of Aspergillus fumigatus is significant, since misidentification may lead to inappropriate clinical therapy. This study assessed VITEK mass spectrometry (MS) V3.0 for A. fumigatus identification using extracted fungal ribosomal proteins. A total of 52 isolates preliminarily identified as A. fumigatus by traditional morphological methods were inoculated in three different culture media and cultured at two different temperatures. The specific spectral fingerprints of different culture time points (48, 72, 96, and 120 h) were obtained. Of all strains, 88.5% (46/52) were discriminated as A. fumigatus, while the remaining 11.5% (6/52) produced results inconsistent with morphological analysis. Molecular sequencing, as a reference method for species identification, was used to validate the morphological analysis and matrix-assisted laser desorption/ionization time of flight MS. Chi-square tests (χ2 test, P=0.05) demonstrated that the culture medium and incubation temperature had no effects on identification accuracy; however, identification accuracy of the strains in the 48-h group was lower than that in other groups. In addition, we found that ribosomal proteins extracted from A. fumigatus can be stored in different environments for at least 1 week, with their profiles remaining stable and strain identification results showing no change. This is beneficial for medical institutions with no mass spectrometer at hand. Overall, this study showed the powerful ability of VITEK MS V 3.0 in identifying A. fumigatus. PMID:29263685

  4. Performance of VITEK mass spectrometry V3.0 for rapid identification of clinical Aspergillus fumigatus in different culture conditions based on ribosomal proteins.

    PubMed

    Zhou, Longrong; Chen, Yongquan; Xu, Yuanhong

    2017-01-01

    Fast and accurate discrimination of Aspergillus fumigatus is significant, since misidentification may lead to inappropriate clinical therapy. This study assessed VITEK mass spectrometry (MS) V3.0 for A. fumigatus identification using extracted fungal ribosomal proteins. A total of 52 isolates preliminarily identified as A. fumigatus by traditional morphological methods were inoculated in three different culture media and cultured at two different temperatures. The specific spectral fingerprints of different culture time points (48, 72, 96, and 120 h) were obtained. Of all strains, 88.5% (46/52) were discriminated as A. fumigatus, while the remaining 11.5% (6/52) produced results inconsistent with morphological analysis. Molecular sequencing, as a reference method for species identification, was used to validate the morphological analysis and matrix-assisted laser desorption/ionization time of flight MS. Chi-square tests (χ2 test, P=0.05) demonstrated that the culture medium and incubation temperature had no effects on identification accuracy; however, identification accuracy of the strains in the 48-h group was lower than that in other groups. In addition, we found that ribosomal proteins extracted from A. fumigatus can be stored in different environments for at least 1 week, with their profiles remaining stable and strain identification results showing no change. This is beneficial for medical institutions with no mass spectrometer at hand. Overall, this study showed the powerful ability of VITEK MS V3.0 in identifying A. fumigatus.

  5. A combinatorial perspective of the protein inference problem.

    PubMed

    Yang, Chao; He, Zengyou; Yu, Weichuan

    2013-01-01

    In a shotgun proteomics experiment, proteins are the most biologically meaningful output. The success of proteomics studies depends on the ability to accurately and efficiently identify proteins. Many methods have been proposed to facilitate the identification of proteins from peptide identification results. However, the relationship between protein identification and peptide identification has not been thoroughly explained before. In this paper, we devote ourselves to a combinatorial perspective of the protein inference problem. We employ combinatorial mathematics to calculate the conditional protein probabilities (protein probability means the probability that a protein is correctly identified) under three assumptions, which lead to a lower bound, an upper bound, and an empirical estimation of protein probabilities, respectively. The combinatorial perspective enables us to obtain an analytical expression for protein inference. Our method achieves comparable results with ProteinProphet in a more efficient manner in experiments on two data sets of standard protein mixtures and two data sets of real samples. Based on our model, we study the impact of unique peptides and degenerate peptides (degenerate peptides are peptides shared by at least two proteins) on protein probabilities. Meanwhile, we also study the relationship between our model and ProteinProphet. We name our program ProteinInfer. Its Java source code, our supplementary document and experimental results are available at: http://bioinformatics.ust.hk/proteininfer.
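
    To make the notion of a protein probability built from peptide evidence concrete, here is a minimal sketch. It is not the ProteinInfer expression, which the abstract does not give; it assumes the common simplification that peptide identifications are independent, so that a protein is "correctly identified" if at least one of its peptides is. The function names and the even-split treatment of degenerate peptides are hypothetical, and `math.prod` requires Python 3.8 or later.

```python
from math import prod

def protein_probability(peptide_probs):
    """P(at least one supporting peptide is correct), assuming independence."""
    return 1.0 - prod(1.0 - p for p in peptide_probs)

def share_degenerate(peptide_prob, n_proteins):
    """Hypothetical down-weighting: split a shared (degenerate) peptide's
    probability evenly among the proteins it maps to."""
    return peptide_prob / n_proteins

# Two unique peptides plus one degenerate peptide shared by two proteins.
unique = [0.9, 0.8]
shared = share_degenerate(0.6, n_proteins=2)

p_unique_only = protein_probability(unique)             # 1 - 0.1 * 0.2
p_with_shared = protein_probability(unique + [shared])  # 1 - 0.1 * 0.2 * 0.7
```

    Under these assumptions the shared peptide raises the protein probability only slightly, which illustrates why unique and degenerate peptides are worth treating separately.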

  6. Protein social behavior makes a stronger signal for partner identification than surface geometry

    PubMed Central

    Laine, Elodie

    2016-01-01

    Cells are interactive living systems where protein movements, interactions and regulation are substantially free from centralized management. How protein physico‐chemical and geometrical properties determine who interacts with whom remains far from fully understood. We show that characterizing how a protein behaves with many potential interactors in a complete cross‐docking study leads to a sharp identification of its cellular/true/native partner(s). We define a sociability index, or S‐index, reflecting whether or not a protein likes to pair with other proteins. Formally, we propose a suitable normalization function that accounts for protein sociability and we combine it with a simple interface‐based (ranking) score to discriminate partners from non‐interactors. We show that sociability is an important factor and that the normalization makes it possible to reach a much higher discriminative power than shape complementarity docking scores. The social effect is also observed with more sophisticated docking algorithms. Docking conformations are evaluated using experimental binding sites. The latter approximate, in the best possible way, binding site predictions, which have reached high accuracy in recent years. This makes our analysis helpful for a global understanding of partner identification and for suggesting discriminating strategies. These results contradict previous findings claiming that the partner identification problem is solvable solely with geometrical docking. Proteins 2016; 85:137–154. © 2016 Wiley Periodicals, Inc. PMID:27802579

  7. A Protein Preparation Method for the High-throughput Identification of Proteins Interacting with a Nuclear Cofactor Using LC-MS/MS Analysis.

    PubMed

    Tsuchiya, Megumi; Karim, M Rezaul; Matsumoto, Taro; Ogawa, Hidesato; Taniguchi, Hiroaki

    2017-01-24

    Transcriptional coregulators are vital to the efficient transcriptional regulation of nuclear chromatin structure. Coregulators play a variety of roles in regulating transcription. These include the direct interaction with transcription factors, the covalent modification of histones and other proteins, and the occasional chromatin conformation alteration. Accordingly, establishing relatively quick methods for identifying proteins that interact within this network is crucial to enhancing our understanding of the underlying regulatory mechanisms. LC-MS/MS-mediated protein binding partner identification is a validated technique used to analyze protein-protein interactions. By immunoprecipitating a previously-identified member of a protein complex with an antibody (occasionally with an antibody for a tagged protein), it is possible to identify its unknown protein interactions via mass spectrometry analysis. Here, we present a method of protein preparation for the LC-MS/MS-mediated high-throughput identification of protein interactions involving nuclear cofactors and their binding partners. This method allows for a better understanding of the transcriptional regulatory mechanisms of the targeted nuclear factors.

  8. PROTEOMIC IDENTIFICATION OF CARBONYLATED PROTEINS AND THEIR OXIDATION SITES

    PubMed Central

    Madian, Ashraf G.; Regnier, Fred E.

    2011-01-01

    Excessive oxidative stress leaves a protein carbonylation fingerprint in biological systems. Carbonylation is an irreversible post-translational modification (PTM) that often leads to the loss of protein function and can be a component of multiple diseases. Protein carbonyl groups can be generated directly (by amino acid oxidation and the α-amidation pathway) or indirectly by forming adducts with lipid peroxidation products or glycation and advanced glycation end-products. Studies of oxidative stress are complicated by the low concentration of oxidation products and the wide array of routes by which proteins are carbonylated. The development of new selection and enrichment techniques coupled with advances in mass spectrometry is allowing identification of hundreds of new carbonylated protein products from a broad range of proteins located at many sites in biological systems. The focus of this review is on the use of proteomics tools and methods to identify oxidized proteins along with specific sites of oxidative damage and the consequences of protein oxidation. PMID:20521848

  9. Protein social behavior makes a stronger signal for partner identification than surface geometry.

    PubMed

    Laine, Elodie; Carbone, Alessandra

    2017-01-01

    Cells are interactive living systems where protein movements, interactions and regulation are substantially free from centralized management. How protein physico-chemical and geometrical properties determine who interacts with whom remains far from fully understood. We show that characterizing how a protein behaves with many potential interactors in a complete cross-docking study leads to a sharp identification of its cellular/true/native partner(s). We define a sociability index, or S-index, reflecting whether or not a protein likes to pair with other proteins. Formally, we propose a suitable normalization function that accounts for protein sociability and we combine it with a simple interface-based (ranking) score to discriminate partners from non-interactors. We show that sociability is an important factor and that the normalization makes it possible to reach a much higher discriminative power than shape complementarity docking scores. The social effect is also observed with more sophisticated docking algorithms. Docking conformations are evaluated using experimental binding sites. The latter approximate, in the best possible way, binding site predictions, which have reached high accuracy in recent years. This makes our analysis helpful for a global understanding of partner identification and for suggesting discriminating strategies. These results contradict previous findings claiming that the partner identification problem is solvable solely with geometrical docking. Proteins 2016; 85:137-154. © 2016 The Authors. Proteins: Structure, Function, and Bioinformatics published by Wiley Periodicals, Inc.

  10. Identification of divergent protein domains by combining HMM-HMM comparisons and co-occurrence detection.

    PubMed

    Ghouila, Amel; Florent, Isabelle; Guerfali, Fatma Zahra; Terrapon, Nicolas; Laouini, Dhafer; Yahia, Sadok Ben; Gascuel, Olivier; Bréhélin, Laurent

    2014-01-01

    Identification of protein domains is a key step for understanding protein function. Hidden Markov Models (HMMs) have proved to be a powerful tool for this task. The Pfam database notably provides a large collection of HMMs which are widely used for the annotation of proteins in sequenced organisms. This is done via sequence/HMM comparisons. However, this approach may lack sensitivity when searching for domains in divergent species. Recently, methods for HMM/HMM comparisons have been proposed and proved to be more sensitive than sequence/HMM approaches in certain cases. However, these approaches are usually not used for protein domain discovery at a genome scale, and the benefit that could be expected from their utilization for this problem has not been investigated. Using proteins of P. falciparum and L. major as examples, we investigate the extent to which HMM/HMM comparisons can identify new domain occurrences not already identified by sequence/HMM approaches. We show that although HMM/HMM comparisons are much more sensitive than sequence/HMM comparisons, they are not sufficiently accurate to be used as a standalone complement of sequence/HMM approaches at the genome scale. Hence, we propose to use domain co-occurrence--the general domain tendency to preferentially appear along with some favorite domains in the proteins--to improve the accuracy of the approach. We show that the combination of HMM/HMM comparisons and co-occurrence domain detection boosts protein annotations. At an estimated False Discovery Rate of 5%, it revealed 901 and 1098 new domains in Plasmodium and Leishmania proteins, respectively. Manual inspection of part of these predictions shows that it contains several domain families that were missing in the two organisms. All new domain occurrences have been integrated in the EuPathDomains database, along with the GO annotations that can be deduced.

  11. Identification of Divergent Protein Domains by Combining HMM-HMM Comparisons and Co-Occurrence Detection

    PubMed Central

    Ghouila, Amel; Florent, Isabelle; Guerfali, Fatma Zahra; Terrapon, Nicolas; Laouini, Dhafer; Yahia, Sadok Ben; Gascuel, Olivier; Bréhélin, Laurent

    2014-01-01

    Identification of protein domains is a key step for understanding protein function. Hidden Markov Models (HMMs) have proved to be a powerful tool for this task. The Pfam database notably provides a large collection of HMMs which are widely used for the annotation of proteins in sequenced organisms. This is done via sequence/HMM comparisons. However, this approach may lack sensitivity when searching for domains in divergent species. Recently, methods for HMM/HMM comparisons have been proposed and proved to be more sensitive than sequence/HMM approaches in certain cases. However, these approaches are usually not used for protein domain discovery at a genome scale, and the benefit that could be expected from their utilization for this problem has not been investigated. Using proteins of P. falciparum and L. major as examples, we investigate the extent to which HMM/HMM comparisons can identify new domain occurrences not already identified by sequence/HMM approaches. We show that although HMM/HMM comparisons are much more sensitive than sequence/HMM comparisons, they are not sufficiently accurate to be used as a standalone complement of sequence/HMM approaches at the genome scale. Hence, we propose to use domain co-occurrence — the general domain tendency to preferentially appear along with some favorite domains in the proteins — to improve the accuracy of the approach. We show that the combination of HMM/HMM comparisons and co-occurrence domain detection boosts protein annotations. At an estimated False Discovery Rate of 5%, it revealed 901 and 1098 new domains in Plasmodium and Leishmania proteins, respectively. Manual inspection of part of these predictions shows that it contains several domain families that were missing in the two organisms. All new domain occurrences have been integrated in the EuPathDomains database, along with the GO annotations that can be deduced. PMID:24901648

  12. Recombinant blood group proteins for use in antibody screening and identification tests.

    PubMed

    Seltsam, Axel; Blasczyk, Rainer

    2009-11-01

    The present review elucidates the potentials of recombinant blood group proteins (BGPs) for red blood cell (RBC) antibody detection and identification in pretransfusion testing and the achievements in this field so far. Many BGPs have been eukaryotically and prokaryotically expressed in sufficient quantity and quality for RBC antibody testing. Recombinant BGPs can be incorporated in soluble protein reagents or solid-phase assays such as ELISA, color-coded microsphere and protein microarray chip-based techniques. Because novel recombinant protein-based assays use single antigens, a positive reaction of a serum with the recombinant protein directly indicates the presence and specificity of the target antibody. Inversely, conventional RBC-based assays use panels of human RBCs carrying a huge number of blood group antigens at the same time and require negative reactions of samples with antigen-negative cells for indirect determination of antibody specificity. Because of their capacity for single-step, direct RBC antibody determination, recombinant protein-based assays may greatly facilitate and accelerate the identification of common and rare RBC antibodies.

  13. High-throughput identification of proteins with AMPylation using self-assembled human protein (NAPPA) microarrays.

    PubMed

    Yu, Xiaobo; LaBaer, Joshua

    2015-05-01

    AMPylation (adenylylation) has been recognized as an important post-translational modification that is used by pathogens to regulate host cellular proteins and their associated signaling pathways. AMPylation has potential functions in various cellular processes, and it is widely conserved across both prokaryotes and eukaryotes. However, despite the identification of many AMPylators, relatively few candidate substrates of AMPylation are known. This is changing with the recent development of a robust and reliable method for identifying new substrates using protein microarrays, which can markedly expand the list of potential substrates. Here we describe procedures for detecting AMPylated and auto-AMPylated proteins in a sensitive, high-throughput and nonradioactive manner. The approach uses high-density protein microarrays fabricated using nucleic acid programmable protein array (NAPPA) technology, which enables the highly successful display of fresh recombinant human proteins in situ. The modification of target proteins is determined via copper-catalyzed azide-alkyne cycloaddition (CuAAC). The assay can be accomplished within 11 h.

  14. Identification of ATM Protein Kinase Phosphorylation Sites by Mass Spectrometry.

    PubMed

    Graham, Mark E; Lavin, Martin F; Kozlov, Sergei V

    2017-01-01

    ATM (ataxia-telangiectasia mutated) protein kinase is a key regulator of cellular responses to DNA damage and oxidative stress. DNA damage triggers a complex cascade of signaling events leading to numerous posttranslational modifications on a multitude of proteins. Understanding the regulation of ATM kinase is therefore critical not only for understanding the human genetic disorder ataxia-telangiectasia and potential treatment strategies, but also for deciphering physiological responses of cells to stress. These responses play an important role in carcinogenesis, neurodegeneration, and aging. We focus here on the identification of DNA damage inducible ATM phosphorylation sites to understand the importance of autophosphorylation in the mechanism of ATM kinase activation. We demonstrate the utility of using immunoprecipitated ATM in a quantitative LC-MS/MS workflow with stable isotope dimethyl labeling of ATM peptides for identification of phosphorylation sites.

  15. Verification of Ribosomal Proteins of Aspergillus fumigatus for Use as Biomarkers in MALDI-TOF MS Identification.

    PubMed

    Nakamura, Sayaka; Sato, Hiroaki; Tanaka, Reiko; Yaguchi, Takashi

    2016-01-01

    We have previously proposed a rapid identification method for bacterial strains based on the profiles of their ribosomal subunit proteins (RSPs), observed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). This method can perform phylogenetic characterization based on the mass of housekeeping RSP biomarkers, ideally calculated from amino acid sequence information registered in public protein databases. With the aim of extending its field of application to medical mycology, this study investigates the actual state of information of RSPs of eukaryotic fungi registered in public protein databases through the characterization of ribosomal protein fractions extracted from genome-sequenced Aspergillus fumigatus strains Af293 and A1163 as a model. In this process, we have found that the public protein databases harbor problems. The RSP names are in confusion, so we have provisionally unified them using the yeast naming system. The most serious problem is that many incorrect sequences are registered in the public protein databases. Surprisingly, more than half of the sequences are incorrect, due chiefly to mis-annotation of exon/intron structures. These errors could be corrected by a combination of in silico inspection by sequence homology analysis and MALDI-TOF MS measurements. We were also able to confirm conserved post-translational modifications in eleven RSPs. After these verifications, the masses of 31 expressed RSPs under 20,000 Da could be accurately confirmed. These RSPs have a potential to be useful biomarkers for identifying clinical isolates of A. fumigatus.

  16. Accurate identification of layer number for few-layer WS2 and WSe2 via spectroscopic study.

    PubMed

    Li, Yuanzheng; Li, Xinshu; Yu, Tong; Yang, Guochun; Chen, Heyu; Zhang, Cen; Feng, Qiushi; Ma, Jiangang; Liu, Weizhen; Xu, Haiyang; Liu, Yichun; Liu, Xinfeng

    2018-03-23

    Transition metal dichalcogenides (TMDs) with a typical layered structure are highly sensitive to their layer number in optical and electronic properties. Seeking a simple and effective method for layer number identification is very important to low-dimensional TMD samples. Herein, a rapid and accurate layer number identification of few-layer WS2 and WSe2 is proposed via locking their photoluminescence (PL) peak-positions. As the layer number of WS2/WSe2 increases, it is found that indirect transition emission is more thickness-sensitive than direct transition emission, and the PL peak-position differences between the indirect and direct transitions can be regarded as fingerprints to identify their layer number. Theoretical calculation confirms that the notable thickness-sensitivity of indirect transition derives from the variations of electron density of states of W atom d-orbitals and chalcogen atom p-orbitals. Besides, the PL peak-position differences between the indirect and direct transitions are almost independent of different insulating substrates. This work not only proposes a new method for layer number identification via PL studies, but also provides a valuable insight into the thickness-dependent optical and electronic properties of W-based TMDs.

  17. Identification of novel lysosomal matrix proteins by proteome analysis.

    PubMed

    Kollmann, Katrin; Mutenda, Kudzai E; Balleininger, Martina; Eckermann, Ellen; von Figura, Kurt; Schmidt, Bernhard; Lübke, Torben

    2005-10-01

    The lysosomal matrix is estimated to contain about 50 different proteins. Most of the matrix proteins are acid hydrolases that depend on mannose 6-phosphate receptors (MPR) for targeting to lysosomes. Here, we describe a comprehensive proteome analysis of MPR-binding proteins from mouse. Mouse embryonic fibroblasts defective in both MPR (MPR 46-/- and MPR 300-/-) are known to secrete the lysosomal matrix proteins. Secretions of these cells were affinity purified using an affinity matrix derivatized with MPR46 and MPR300. In the protein fraction bound to the affinity matrix and eluted with mannose 6-phosphate, 34 known lysosomal matrix proteins, 4 candidate proteins of the lysosomal matrix and 4 non-lysosomal contaminants were identified by mass spectrometry after separation by two-dimensional gel electrophoresis or by multidimensional protein identification technology. For 3 of the candidate proteins, mammalian ependymin-related protein-2 (MERP-2), retinoid-inducible serine carboxypeptidase (RISC) and the hypothetical 66.3-kDa protein we could verify that C-terminally tagged forms bound in an M6P-dependent manner to an MPR-affinity matrix and were internalized via MPR-mediated endocytosis. Hence these 3 proteins are likely to represent hitherto unrecognized lysosomal matrix proteins.

  18. ProteinInferencer: Confident protein identification and multiple experiment comparison for large scale proteomics projects.

    PubMed

    Zhang, Yaoyang; Xu, Tao; Shan, Bing; Hart, Jonathan; Aslanian, Aaron; Han, Xuemei; Zong, Nobel; Li, Haomin; Choi, Howard; Wang, Dong; Acharya, Lipi; Du, Lisa; Vogt, Peter K; Ping, Peipei; Yates, John R

    2015-11-03

    Shotgun proteomics generates valuable information from large-scale and target protein characterizations, including protein expression, protein quantification, protein post-translational modifications (PTMs), protein localization, and protein-protein interactions. Typically, peptides derived from proteolytic digestion, rather than intact proteins, are analyzed by mass spectrometers because peptides are more readily separated, ionized and fragmented. The amino acid sequences of peptides can be interpreted by matching the observed tandem mass spectra to theoretical spectra derived from a protein sequence database. Identified peptides serve as surrogates for their proteins and are often used to establish what proteins were present in the original mixture and to quantify protein abundance. Two major issues exist for assigning peptides to their originating protein. The first issue is maintaining a desired false discovery rate (FDR) when comparing or combining multiple large datasets generated by shotgun analysis and the second issue is properly assigning peptides to proteins when homologous proteins are present in the database. Herein we demonstrate a new computational tool, ProteinInferencer, which can be used for protein inference with both small- or large-scale data sets to produce a well-controlled protein FDR. In addition, ProteinInferencer introduces confidence scoring for individual proteins, which makes protein identifications evaluable. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015. Published by Elsevier B.V.
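
    The idea of holding protein identifications to a desired FDR can be illustrated with a short sketch. The abstract does not say how ProteinInferencer estimates FDR; the code below assumes a target-decoy estimate (decoy hits divided by target hits above a score cutoff), which is a widely used approach but is an assumption here. The function name, the input layout of `(score, is_decoy)` pairs, and the example scores are all hypothetical, and tied scores are not given special handling.

```python
def decoy_fdr_threshold(scored, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR (decoys / targets
    at or above the cutoff) stays within max_fdr, or None if none exists.

    scored: iterable of (score, is_decoy) pairs, higher score = better.
    """
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)
    targets = decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the accepted list while the estimate is acceptable.
        if targets and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff

# Made-up protein scores; True marks a decoy entry.
proteins = [(9.1, False), (8.7, False), (7.9, False), (7.2, True),
            (6.8, False), (6.1, False), (5.5, True)]

cutoff_25 = decoy_fdr_threshold(proteins, max_fdr=0.25)
cutoff_0 = decoy_fdr_threshold(proteins, max_fdr=0.0)
```

    With these toy scores, a 25% tolerance admits everything down to 6.1 (one decoy among five targets), while a zero tolerance stops at 7.9, just above the first decoy.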

  19. Real-Time and Accurate Identification of Single Oligonucleotide Photoisomers via an Aerolysin Nanopore.

    PubMed

    Hu, Zheng-Li; Li, Zi-Yuan; Ying, Yi-Lun; Zhang, Junji; Cao, Chan; Long, Yi-Tao; Tian, He

    2018-04-03

    Identification of the configuration for the photoresponsive oligonucleotide plays an important role in the ingenious design of DNA nanomolecules and nanodevices. Due to the limited resolution and sensitivity of present methods, it remains a challenge to determine the accurate configuration of photoresponsive oligonucleotides, much less a precise description of their photoconversion process. Here, we used an aerolysin (AeL) nanopore-based confined space for real-time determination and quantification of the absolute cis/trans configuration of each azobenzene-modified oligonucleotide (Azo-ODN) with a single molecule resolution. The two completely separated current distributions with narrow peak widths at half height (<0.62 pA) are assigned to cis/trans-Azo-ODN isomers, respectively. Due to the high current sensitivity, each isomer of Azo-ODN could be undoubtedly identified, which gives the accurate photostationary conversion values of 82.7% for trans-to-cis under UV irradiation and 82.5% for cis-to-trans under vis irradiation. Further real-time kinetic evaluation reveals that the photoresponsive rate constants of Azo-ODN from trans-to-cis and cis-to-trans are 0.43 and 0.20 min⁻¹, respectively. This study will promote the sophisticated design of photoresponsive ODN to achieve an efficient and applicable photocontrollable process.
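
    The reported rate constants can be turned into more intuitive time scales with a small worked calculation. The sketch assumes simple first-order kinetics, for which the half-life is ln(2)/k; the abstract gives only the rate constants, so the kinetic model, the function names, and the constant names below are assumptions made for illustration.

```python
import math

def half_life(rate_constant_per_min):
    """Half-life in minutes for a first-order process with rate constant k."""
    return math.log(2) / rate_constant_per_min

def fraction_remaining(rate_constant_per_min, minutes):
    """Fraction of the starting isomer left after `minutes` (first-order decay)."""
    return math.exp(-rate_constant_per_min * minutes)

# Rate constants quoted in the abstract, in min^-1.
K_TRANS_TO_CIS = 0.43
K_CIS_TO_TRANS = 0.20

t_half_tc = half_life(K_TRANS_TO_CIS)   # roughly 1.6 min
t_half_ct = half_life(K_CIS_TO_TRANS)   # roughly 3.5 min
```

    Under this assumption the trans-to-cis conversion is about twice as fast as the reverse process, consistent with the ratio of the two quoted rate constants.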

  20. Rapid and accurate bacterial identification in probiotics and yoghurts by MALDI-TOF mass spectrometry.

    PubMed

    Angelakis, Emmanouil; Million, Matthieu; Henry, Mireille; Raoult, Didier

    2011-10-01

    Probiotic food is manufactured by adding probiotic strains simultaneously with starter cultures in fermentation tanks. Here, we investigate the accuracy and feasibility of matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS) for bacterial identification at the species level in probiotic food and yoghurts. Probiotic food and yoghurts were cultured in Columbia and Lactobacillus specific agar and tested by quantitative real-time PCR (qPCR) for the detection and quantification of Lactobacillus sp. Bacterial identification was performed by MALDI-TOF analysis and by amplification and sequencing of tuf and 16S rDNA genes. We tested 13 probiotic food and yoghurts and we identified by qPCR that they presented 10⁶ to 10⁷ copies of Lactobacillus spp. DNA/g. All products contained very large numbers of living bacteria varying from 10⁶ to 10⁹ colony forming units/g. These bacteria were identified as Lactobacillus casei, Lactococcus lactis, Bifidobacterium animalis, Lactobacillus delbrueckii, and Streptococcus thermophilus. MALDI-TOF MS presented 92% specificity compared to the molecular assays. In one product we found L. lactis, instead of Bifidus spp. which was mentioned on the label and for another L. delbrueckii and S. thermophilus instead of Bifidus spp. MALDI-TOF MS allows a rapid and accurate bacterial identification at the species level in probiotic food and yoghurts. Although the safety and functionality of probiotics are species and strain dependent, we found a discrepancy between the bacterial strain announced on the label and the strain identified. Practical Application: MALDI-TOF MS is rapid and specific for the identification of bacteria in probiotic food and yoghurts. Although the safety and functionality of probiotics are species and strain dependent, we found a discrepancy between the bacterial strain announced on the label and the strain identified. © 2011 Institute of Food Technologists®

  1. Modified filter-aided sample preparation (FASP) method increases peptide and protein identifications for shotgun proteomics.

    PubMed

    Ni, Mao-Wei; Wang, Lu; Chen, Wei; Mou, Han-Zhou; Zhou, Jie; Zheng, Zhi-Guo

    2017-01-30

    Mass spectrometry (MS)-based protein identification depends mainly on protein extraction and digestion. Although sodium dodecyl sulfate (SDS) can preclude enzymatic digestion and interfere with MS analysis, it is still the most widely used surfactant in these steps. To overcome these disadvantages, a SDS-compatible proteomic technique for SDS removal prior to MS-based analyses was developed, namely filter-aided sample preparation (FASP). Herein, based on the effectiveness of sodium deoxycholate and a detergent removal spin column, we developed a modified FASP (mFASP) method and compared its overall performance, total number of peptides and proteins identified for shotgun proteomic experiments with that of the FASP method. Identification of 4570 ± 392 and 9139 ± 317 peptides and description of 862 ± 46 and 1377 ± 33 protein groups with two or more peptides from the ovarian cancer cell line A2780 was accomplished by FASP and mFASP methods, respectively. The mFASP method (21.2 ± 0.2%) had higher average peptide to protein coverage than FASP method (13.2 ± 0.5%). More hydrophobic peptides were identified by mFASP than by FASP, as indicated by the GRAVY score distribution. The reported method enables reliable and efficient identification of proteins and peptides in whole-cell extracts containing SDS. The new approach allows for higher throughput (the simultaneous identification of more proteins), a more comprehensive investigation of proteins, and potentially the discovery of new biomarkers. Copyright © 2016 John Wiley & Sons, Ltd.

  2. SITEHOUND-web: a server for ligand binding site identification in protein structures.

    PubMed

    Hernandez, Marylens; Ghersi, Dario; Sanchez, Roberto

    2009-07-01

    SITEHOUND-web (http://sitehound.sanchezlab.org) is a binding-site identification server powered by the SITEHOUND program. Given a protein structure in PDB format, SITEHOUND-web will identify regions of the protein characterized by favorable interactions with a probe molecule. These regions correspond to putative ligand binding sites. Depending on the probe used in the calculation, sites with preference for different ligands will be identified. Currently, a carbon probe for identification of binding sites for drug-like molecules, and a phosphate probe for phosphorylated ligands (ATP, phosphopeptides, etc.) have been implemented. SITEHOUND-web will display the results in HTML pages including an interactive 3D representation of the protein structure and the putative sites using the Jmol java applet. Various downloadable data files are also provided for offline data analysis.

  3. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model.

    PubMed

    Wang, Sheng; Sun, Siqi; Li, Zhen; Zhang, Renyu; Xu, Jinbo

    2017-01-01

    Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction. This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. 
Our contact-assisted models also have

  4. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model

    PubMed Central

    Li, Zhen; Zhang, Renyu

    2017-01-01

    Motivation: Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction. Method: This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Results: Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. 
Our contact

  5. Identification of immunodominant proteins of the microalgae Prototheca by proteomic analysis

    PubMed Central

    Irrgang, A.; Weise, C.; Murugaiyan, J.; Roesler, U.

    2014-01-01

    Prototheca zopfii associated with bovine mastitis and human protothecosis exists as two genotypes, of which genotype 1 is considered as non-infectious and genotype 2 as infectious. The mechanism of infection has not yet been described. The present study was aimed to identify genotype 2-specific immunodominant proteins. Prototheca proteins were separated using two-dimensional gel electrophoresis. Subsequent western blotting with rabbit hyperimmune serum revealed 28 protein spots. Matrix-assisted laser desorption ionization time-of-flight mass spectrometry analysis resulted in the identification of 15 proteins including malate dehydrogenase, elongation factor 1-alpha, heat shock protein 70, and 14-3-3 protein, which were previously described as immunogenic proteins of other eukaryotic pathogens. PMID:25755891

  6. Advances in identification and validation of protein targets of natural products without chemical modification.

    PubMed

    Chang, J; Kim, Y; Kwon, H J

    2016-05-04

    Covering: up to February 2016. Identification of the target proteins of natural products is pivotal to understanding the mechanisms of action to develop natural products for use as molecular probes and potential therapeutic drugs. Affinity chromatography of immobilized natural products has been conventionally used to identify target proteins, and has yielded good results. However, this method has limitations, in that labeling or tagging for immobilization and affinity purification often result in reduced or altered activity of the natural product. New strategies have recently been developed and applied to identify the target proteins of natural products and synthetic small molecules without chemical modification of the natural product. These direct and indirect methods for target identification of label-free natural products include drug affinity responsive target stability (DARTS), stability of proteins from rates of oxidation (SPROX), cellular thermal shift assay (CETSA), thermal proteome profiling (TPP), and bioinformatics-based analysis of connectivity. This review focuses on and reports case studies of the latest advances in target protein identification methods for label-free natural products. The integration of newly developed technologies will provide new insights and highlight the value of natural products for use as biological probes and new drug candidates.

  7. An unexpected way forward: towards a more accurate and rigorous protein-protein binding affinity scoring function by eliminating terms from an already simple scoring function.

    PubMed

    Swanson, Jon; Audie, Joseph

    2018-01-01

    A fundamental and unsolved problem in biophysical chemistry is the development of a computationally simple, physically intuitive, and generally applicable method for accurately predicting and physically explaining protein-protein binding affinities from protein-protein interaction (PPI) complex coordinates. Here, we propose that the simplification of a previously described six-term PPI scoring function to a four term function results in a simple expression of all physically and statistically meaningful terms that can be used to accurately predict and explain binding affinities for a well-defined subset of PPIs that are characterized by (1) crystallographic coordinates, (2) rigid-body association, (3) normal interface size, and hydrophobicity and hydrophilicity, and (4) high quality experimental binding affinity measurements. We further propose that the four-term scoring function could be regarded as a core expression for future development into a more general PPI scoring function. Our work has clear implications for PPI modeling and structure-based drug design.

  8. Targeted nanodiamonds for identification of subcellular protein assemblies in mammalian cells

    PubMed Central

    Lake, Michael P.; Bouchard, Louis-S.

    2017-01-01

    Transmission electron microscopy (TEM) can be used to successfully determine the structures of proteins. However, such studies are typically done ex situ after extraction of the protein from the cellular environment. Here we describe an application for nanodiamonds as targeted intensity contrast labels in biological TEM, using the nuclear pore complex (NPC) as a model macroassembly. We demonstrate that delivery of antibody-conjugated nanodiamonds to live mammalian cells using maltotriose-conjugated polypropylenimine dendrimers results in efficient localization of nanodiamonds to the intended cellular target. We further identify signatures of nanodiamonds under TEM that allow for unambiguous identification of individual nanodiamonds from a resin-embedded, OsO4-stained environment. This is the first demonstration of nanodiamonds as labels for nanoscale TEM-based identification of subcellular protein assemblies. These results, combined with the unique fluorescence properties and biocompatibility of nanodiamonds, represent an important step toward the use of nanodiamonds as markers for correlated optical/electron bioimaging. PMID:28636640

  9. Identification of proteins from tuberculin purified protein derivative (PPD) by LC-MS/MS.

    PubMed

    Borsuk, Sibele; Newcombe, Jane; Mendum, Tom A; Dellagostin, Odir A; McFadden, Johnjoe

    2009-11-01

    The tuberculin purified protein derivative (PPD) is a widely used diagnostic antigen for tuberculosis; however, it is poorly defined. Most mycobacterial proteins are extensively denatured by the procedure employed in its preparation, which explains previous difficulties in identifying constituents from PPD to characterize their behaviour in B- and T-cell reactions. Here we describe a proteomics-based characterization of PPD from several different sources by LC-MS/MS, which combines the solute separation power of HPLC with the detection power of a mass spectrometer. The technique is able to identify proteins from complex mixtures of peptide fragments. A total of 171 different proteins were identified among the four PPD samples (two bovine PPD and two avium PPD) from Brazil and the UK. The majority of the proteins were cytoplasmic (77.9%) and involved in intermediary metabolism and respiration (24.25%) but there was a preponderance of proteins involved in lipid metabolism. We identified a group of 21 proteins that are present in both bovine PPD but were not detected in avium PPD preparation. In addition, four proteins found in bovine PPD are absent in Mycobacterium bovis BCG vaccine strain. This study provides a better understanding of the tuberculin PPD components leading to the identification of additional antigens useful as reagents for specific diagnosis of tuberculosis.

  10. An accurate and computationally efficient algorithm for ground peak identification in large footprint waveform LiDAR data

    NASA Astrophysics Data System (ADS)

    Zhuang, Wei; Mountrakis, Giorgos

    2014-09-01

    Large footprint waveform LiDAR sensors have been widely used for numerous airborne studies. Ground peak identification in a large footprint waveform is a significant bottleneck in exploring full usage of the waveform datasets. In the current study, an accurate and computationally efficient algorithm was developed for ground peak identification, called Filtering and Clustering Algorithm (FICA). The method was evaluated on Land, Vegetation, and Ice Sensor (LVIS) waveform datasets acquired over Central NY. FICA incorporates a set of multi-scale second derivative filters and a k-means clustering algorithm in order to avoid detecting false ground peaks. FICA was tested in five different land cover types (deciduous trees, coniferous trees, shrub, grass and developed area) and showed more accurate results when compared to existing algorithms. More specifically, compared with Gaussian decomposition, the RMSE ground peak identification by FICA was 2.82 m (5.29 m for GD) in deciduous plots, 3.25 m (4.57 m for GD) in coniferous plots, 2.63 m (2.83 m for GD) in shrub plots, 0.82 m (0.93 m for GD) in grass plots, and 0.70 m (0.51 m for GD) in plots of developed areas. FICA performance was also relatively consistent under various slope and canopy coverage (CC) conditions. In addition, FICA showed better computational efficiency compared to existing methods. FICA's major computational and accuracy advantage is a result of the adopted multi-scale signal processing procedures that concentrate on local portions of the signal as opposed to the Gaussian decomposition that uses a curve-fitting strategy applied in the entire signal. The FICA algorithm is a good candidate for large-scale implementation on future space-borne waveform LiDAR sensors.

  11. Leptospiral outer membrane protein microarray, a novel approach to identification of host ligand-binding proteins.

    PubMed

    Pinne, Marija; Matsunaga, James; Haake, David A

    2012-11-01

    Leptospirosis is a zoonosis with worldwide distribution caused by pathogenic spirochetes belonging to the genus Leptospira. The leptospiral life cycle involves transmission via freshwater and colonization of the renal tubules of their reservoir hosts. Infection requires adherence to cell surfaces and extracellular matrix components of host tissues. These host-pathogen interactions involve outer membrane proteins (OMPs) expressed on the bacterial surface. In this study, we developed a Leptospira interrogans serovar Copenhageni strain Fiocruz L1-130 OMP microarray containing all predicted lipoproteins and transmembrane OMPs. A total of 401 leptospiral genes or their fragments were transcribed and translated in vitro and printed on nitrocellulose-coated glass slides. We investigated the potential of this protein microarray to screen for interactions between leptospiral OMPs and fibronectin (Fn). This approach resulted in the identification of the recently described fibronectin-binding protein, LIC10258 (MFn8, Lsa66), and 14 novel Fn-binding proteins, denoted Microarray Fn-binding proteins (MFns). We confirmed Fn binding of purified recombinant LIC11612 (MFn1), LIC10714 (MFn2), LIC11051 (MFn6), LIC11436 (MFn7), LIC10258 (MFn8, Lsa66), and LIC10537 (MFn9) by far-Western blot assays. Moreover, we obtained specific antibodies to MFn1, MFn7, MFn8 (Lsa66), and MFn9 and demonstrated that MFn1, MFn7, and MFn9 are expressed and surface exposed under in vitro growth conditions. Further, we demonstrated that MFn1, MFn4 (LIC12631, Sph2), and MFn7 enable leptospires to bind fibronectin when expressed in the saprophyte, Leptospira biflexa. Protein microarrays are valuable tools for high-throughput identification of novel host ligand-binding proteins that have the potential to play key roles in the virulence mechanisms of pathogens.

  12. A standardized framing for reporting protein identifications in mzIdentML 1.2

    PubMed Central

    Seymour, Sean L.; Farrah, Terry; Binz, Pierre-Alain; Chalkley, Robert J.; Cottrell, John S.; Searle, Brian C.; Tabb, David L.; Vizcaíno, Juan Antonio; Prieto, Gorka; Uszkoreit, Julian; Eisenacher, Martin; Martínez-Bartolomé, Salvador; Ghali, Fawaz; Jones, Andrew R.

    2015-01-01

    Inferring which protein species have been detected in bottom-up proteomics experiments has been a challenging problem for which solutions have been maturing over the past decade. While many inference approaches now function well in isolation, comparing and reconciling the results generated across different tools remains difficult. It presently stands as one of the greatest barriers in collaborative efforts such as the Human Proteome Project and public repositories like the PRoteomics IDEntifications (PRIDE) database. Here we present a framework for reporting protein identifications that seeks to improve capabilities for comparing results generated by different inference tools. This framework standardizes the terminology for describing protein identification results, associated with the HUPO-Proteomics Standards Initiative (PSI) mzIdentML standard, while still allowing for differing methodologies to reach that final state. It is proposed that developers of software for reporting identification results will adopt this terminology in their outputs. While the new terminology does not require any changes to the core mzIdentML model, it represents a significant change in practice, and, as such, the rules will be released via a new version of the mzIdentML specification (version 1.2) so that consumers of files are able to determine whether the new guidelines have been adopted by export software. PMID:25092112

  13. DNA barcode data accurately assign higher spider taxa

    PubMed Central

    Coddington, Jonathan A.; Agnarsson, Ingi; Cheng, Ren-Chung; Čandek, Klemen; Driskell, Amy; Frick, Holger; Gregorič, Matjaž; Kostanjšek, Rok; Kropf, Christian; Kweskin, Matthew; Lokovšek, Tjaša; Pipan, Miha; Vidergar, Nina

    2016-01-01

    The use of unique DNA sequences as a method for taxonomic identification is no longer fundamentally controversial, even though debate continues on the best markers, methods, and technology to use. Although both existing databanks such as GenBank and BOLD, as well as reference taxonomies, are imperfect, in best case scenarios “barcodes” (whether single or multiple, organelle or nuclear, loci) clearly are an increasingly fast and inexpensive method of identification, especially as compared to manual identification of unknowns by increasingly rare expert taxonomists. Because most species on Earth are undescribed, a complete reference database at the species level is impractical in the near term. The question therefore arises whether unidentified species can, using DNA barcodes, be accurately assigned to more inclusive groups such as genera and families—taxonomic ranks of putatively monophyletic groups for which the global inventory is more complete and stable. We used a carefully chosen test library of CO1 sequences from 49 families, 313 genera, and 816 species of spiders to assess the accuracy of genus and family-level assignment. We used BLAST queries of each sequence against the entire library and got the top ten hits. The percent sequence identity was reported from these hits (PIdent, range 75–100%). Accurate assignment of higher taxa (PIdent above which errors totaled less than 5%) occurred for genera at PIdent values >95 and families at PIdent values ≥ 91, suggesting these as heuristic thresholds for accurate generic and familial identifications in spiders. Accuracy of identification increases with numbers of species/genus and genera/family in the library; above five genera per family and fifteen species per genus all higher taxon assignments were correct. We propose that using percent sequence identity between conventional barcode sequences may be a feasible and reasonably accurate method to identify animals to family/genus. However, the quality of

  14. Automated selected reaction monitoring software for accurate label-free protein quantification.

    PubMed

    Teleman, Johan; Karlsson, Christofer; Waldemarson, Sofia; Hansson, Karin; James, Peter; Malmström, Johan; Levander, Fredrik

    2012-07-06

    Selected reaction monitoring (SRM) is a mass spectrometry method with documented ability to quantify proteins accurately and reproducibly using labeled reference peptides. However, the use of labeled reference peptides becomes impractical if large numbers of peptides are targeted and when high flexibility is desired when selecting peptides. We have developed a label-free quantitative SRM workflow that relies on a new automated algorithm, Anubis, for accurate peak detection. Anubis efficiently removes interfering signals from contaminating peptides to estimate the true signal of the targeted peptides. We evaluated the algorithm on a published multisite data set and achieved results in line with manual data analysis. In complex peptide mixtures from whole proteome digests of Streptococcus pyogenes we achieved a technical variability across the entire proteome abundance range of 6.5-19.2%, which was considerably below the total variation across biological samples. Our results show that the label-free SRM workflow with automated data analysis is feasible for large-scale biological studies, opening up new possibilities for quantitative proteomics and systems biology.

  15. Verification of Ribosomal Proteins of Aspergillus fumigatus for Use as Biomarkers in MALDI-TOF MS Identification

    PubMed Central

    Nakamura, Sayaka; Sato, Hiroaki; Tanaka, Reiko; Yaguchi, Takashi

    2016-01-01

    We have previously proposed a rapid identification method for bacterial strains based on the profiles of their ribosomal subunit proteins (RSPs), observed using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). This method can perform phylogenetic characterization based on the mass of housekeeping RSP biomarkers, ideally calculated from amino acid sequence information registered in public protein databases. With the aim of extending its field of application to medical mycology, this study investigates the actual state of information of RSPs of eukaryotic fungi registered in public protein databases through the characterization of ribosomal protein fractions extracted from genome-sequenced Aspergillus fumigatus strains Af293 and A1163 as a model. In this process, we have found that the public protein databases harbor problems. The RSP names are in confusion, so we have provisionally unified them using the yeast naming system. The most serious problem is that many incorrect sequences are registered in the public protein databases. Surprisingly, more than half of the sequences are incorrect, due chiefly to mis-annotation of exon/intron structures. These errors could be corrected by a combination of in silico inspection by sequence homology analysis and MALDI-TOF MS measurements. We were also able to confirm conserved post-translational modifications in eleven RSPs. After these verifications, the masses of 31 expressed RSPs under 20,000 Da could be accurately confirmed. These RSPs have a potential to be useful biomarkers for identifying clinical isolates of A. fumigatus. PMID:27843740

  16. A novel method for accurate needle-tip identification in trans-rectal ultrasound-based high-dose-rate prostate brachytherapy.

    PubMed

    Zheng, Dandan; Todor, Dorin A

    2011-01-01

    In real-time trans-rectal ultrasound (TRUS)-based high-dose-rate prostate brachytherapy, the accurate identification of needle-tip position is critical for treatment planning and delivery. Currently, needle-tip identification on ultrasound images can be subject to large uncertainty and errors because of ultrasound image quality and imaging artifacts. To address this problem, we developed a method based on physical measurements with simple and practical implementation to improve the accuracy and robustness of needle-tip identification. Our method uses measurements of the residual needle length and an off-line pre-established coordinate transformation factor, to calculate the needle-tip position on the TRUS images. The transformation factor was established through a one-time systematic set of measurements of the probe and template holder positions, applicable to all patients. To compare the accuracy and robustness of the proposed method and the conventional method (ultrasound detection), based on the gold-standard X-ray fluoroscopy, extensive measurements were conducted in water and gel phantoms. In water phantom, our method showed an average tip-detection accuracy of 0.7 mm compared with 1.6 mm of the conventional method. In gel phantom (more realistic and tissue-like), our method maintained its level of accuracy while the uncertainty of the conventional method was 3.4 mm on average with maximum values of over 10 mm because of imaging artifacts. A novel method based on simple physical measurements was developed to accurately detect the needle-tip position for TRUS-based high-dose-rate prostate brachytherapy. The method demonstrated much improved accuracy and robustness over the conventional method. Copyright © 2011 American Brachytherapy Society. Published by Elsevier Inc. All rights reserved.

  17. Rapid and accurate prediction and scoring of water molecules in protein binding sites.

    PubMed

    Ross, Gregory A; Morris, Garrett M; Biggin, Philip C

    2012-01-01

    Water plays a critical role in ligand-protein interactions. However, it is still challenging to predict accurately not only where water molecules prefer to bind, but also which of those water molecules might be displaceable. The latter is often seen as a route to optimizing affinity of potential drug candidates. Using a protocol we call WaterDock, we show that the freely available AutoDock Vina tool can be used to predict accurately the binding sites of water molecules. WaterDock was validated using data from X-ray crystallography, neutron diffraction and molecular dynamics simulations and correctly predicted 97% of the water molecules in the test set. In addition, we combined data-mining, heuristic and machine learning techniques to develop probabilistic water molecule classifiers. When applied to WaterDock predictions in the Astex Diverse Set of protein ligand complexes, we could identify whether a water molecule was conserved or displaced to an accuracy of 75%. A second model predicted whether water molecules were displaced by polar groups or by non-polar groups to an accuracy of 80%. These results should prove useful for anyone wishing to undertake rational design of new compounds where the displacement of water molecules is being considered as a route to improved affinity.

  18. Accurate prediction of protein–protein interactions from sequence alignments using a Bayesian method

    PubMed Central

    Burger, Lukas; van Nimwegen, Erik

    2008-01-01

    Accurate and large-scale prediction of protein–protein interactions directly from amino-acid sequences is one of the great challenges in computational biology. Here we present a new Bayesian network method that predicts interaction partners using only multiple alignments of amino-acid sequences of interacting protein domains, without tunable parameters, and without the need for any training examples. We first apply the method to bacterial two-component systems and comprehensively reconstruct two-component signaling networks across all sequenced bacteria. Comparisons of our predictions with known interactions show that our method infers interaction partners genome-wide with high accuracy. To demonstrate the general applicability of our method we show that it also accurately predicts interaction partners in a recent dataset of polyketide synthases. Analysis of the predicted genome-wide two-component signaling networks shows that cognates (interacting kinase/regulator pairs, which lie adjacent on the genome) and orphans (which lie isolated) form two relatively independent components of the signaling network in each genome. In addition, while most genes are predicted to have only a small number of interaction partners, we find that 10% of orphans form a separate class of ‘hub' nodes that distribute and integrate signals to and from up to tens of different interaction partners. PMID:18277381

  19. A rapid identification system for metallothionein proteins using expert system

    PubMed Central

    Praveen, Bhoopathi; Vincent, Savariar; Murty, Upadhyayula Suryanarayana; Krishna, Amirapu Radha; Jamil, Kaiser

    2005-01-01

    Metallothioneins (MT) are low molecular weight proteins mostly rich in cysteine residues with high metal content. Generally, MT proteins are responsible for regulating the intracellular supply of biologically essential metal ions and they protect cells from the deleterious effects of non-essential polarizable transition and post-transition metal ions. Due to their biological importance, proper characterization of MT is necessary. Here we describe a computer program (ID3 algorithm, a part of Artificial Intelligence) developed using available data for the rapid identification of MT. Tissue samples contain several low molecular weight proteins with different physical, chemical and biological characteristics. The described software solution proposes to categorize MT proteins without aromatic amino acids and high metal content. The proposed solution can be expanded to other types of proteins with specific known characteristics. PMID:17597844

  20. Ensemble MD simulations restrained via crystallographic data: Accurate structure leads to accurate dynamics

    PubMed Central

    Xue, Yi; Skrynnikov, Nikolai R

    2014-01-01

    Currently, the best existing molecular dynamics (MD) force fields cannot accurately reproduce the global free-energy minimum which realizes the experimental protein structure. As a result, long MD trajectories tend to drift away from the starting coordinates (e.g., crystallographic structures). To address this problem, we have devised a new simulation strategy aimed at protein crystals. An MD simulation of protein crystal is essentially an ensemble simulation involving multiple protein molecules in a crystal unit cell (or a block of unit cells). To ensure that average protein coordinates remain correct during the simulation, we introduced crystallography-based restraints into the MD protocol. Because these restraints are aimed at the ensemble-average structure, they have only minimal impact on conformational dynamics of the individual protein molecules. So long as the average structure remains reasonable, the proteins move in a native-like fashion as dictated by the original force field. To validate this approach, we have used the data from solid-state NMR spectroscopy, which is the orthogonal experimental technique uniquely sensitive to protein local dynamics. The new method has been tested on the well-established model protein, ubiquitin. The ensemble-restrained MD simulations produced lower crystallographic R factors than conventional simulations; they also led to more accurate predictions for crystallographic temperature factors, solid-state chemical shifts, and backbone order parameters. The predictions for 15N R1 relaxation rates are at least as accurate as those obtained from conventional simulations. Taken together, these results suggest that the presented trajectories may be among the most realistic protein MD simulations ever reported. In this context, the ensemble restraints based on high-resolution crystallographic data can be viewed as protein-specific empirical corrections to the standard force fields. PMID:24452989

  1. Identification of Major Outer Surface Proteins of Streptococcus agalactiae

    PubMed Central

    Hughes, Martin J. G.; Moore, Joanne C.; Lane, Jonathan D.; Wilson, Rebecca; Pribul, Philippa K.; Younes, Zabin N.; Dobson, Richard J.; Everest, Paul; Reason, Andrew J.; Redfern, Joanne M.; Greer, Fiona M.; Paxton, Thanai; Panico, Maria; Morris, Howard R.; Feldman, Robert G.; Santangelo, Joseph D.

    2002-01-01

    To identify the major outer surface proteins of Streptococcus agalactiae (group B streptococcus), a proteomic analysis was undertaken. An extract of the outer surface proteins was separated by two-dimensional electrophoresis. The visualized spots were identified through a combination of peptide sequencing and reverse genetic methodologies. Of the 30 major spots identified as S. agalactiae specific, 27 have been identified. Six of these proteins, previously unidentified in S. agalactiae, were sequenced and cloned. These were ornithine carbamoyltransferase, phosphoglycerate kinase, nonphosphorylating glyceraldehyde-3-phosphate dehydrogenase, purine nucleoside phosphorylase, enolase, and glucose-6-phosphate isomerase. Using a gram-positive expression system, we have overexpressed two of these proteins in an in vitro system. These recombinant, purified proteins were used to raise antisera. The identification of these proteins as residing on the outer surface was confirmed by the ability of the antisera to react against whole, live bacteria. Further, in a neonatal-animal model system, we demonstrate that some of these sera are protective against lethal doses of bacteria. These studies demonstrate the successful application of proteomics as a technique for identifying vaccine candidates. PMID:11854208

  2. Identification of "Known Unknowns" Utilizing Accurate Mass Data and ChemSpider

    NASA Astrophysics Data System (ADS)

    Little, James L.; Williams, Antony J.; Pshenichnov, Alexey; Tkachenko, Valery

    2012-01-01

    In many cases, an unknown to an investigator is actually known in the chemical literature, a reference database, or an internet resource. We refer to these types of compounds as "known unknowns." ChemSpider is a very valuable internet database of known compounds useful in the identification of these types of compounds in commercial, environmental, forensic, and natural product samples. The database contains over 26 million entries from hundreds of data sources and is provided as a free resource to the community. Accurate mass mass spectrometry data is used to query the database by either elemental composition or a monoisotopic mass. Searching by elemental composition is the preferred approach. However, it is often difficult to determine a unique elemental composition for compounds with molecular weights greater than 600 Da. In these cases, searching by the monoisotopic mass is advantageous. In either case, the search results are refined by sorting the number of references associated with each compound in descending order. This raises the most useful candidates to the top of the list for further evaluation. These approaches were shown to be successful in identifying "known unknowns" noted in our laboratory and for compounds of interest to others.
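
    The two search modes and the reference-count sort described above can be illustrated with a small sketch. It does not call ChemSpider; the candidate records, their names, and their reference counts are hypothetical, and the helper names `monoisotopic_mass` and `rank_candidates` and the 5 ppm tolerance are assumptions made for illustration only.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207117,
}

def monoisotopic_mass(formula):
    """Sum isotope masses for a simple formula such as 'C8H10N4O2'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass

def rank_candidates(candidates, query_mass, tol_ppm=5.0):
    """Keep candidates within tol_ppm of query_mass, most-referenced first."""
    hits = [
        c for c in candidates
        if abs(monoisotopic_mass(c["formula"]) - query_mass) / query_mass * 1e6 <= tol_ppm
    ]
    return sorted(hits, key=lambda c: c["references"], reverse=True)

# Hypothetical in-memory records standing in for database entries.
candidates = [
    {"name": "compound_a", "formula": "C8H10N4O2", "references": 12},
    {"name": "compound_b", "formula": "C8H10N4O2", "references": 250},
    {"name": "compound_c", "formula": "C9H14N2O3", "references": 900},
]
ranked = rank_candidates(candidates, query_mass=194.0804)
```

    Here the mass filter removes compound_c despite its high reference count, and the sort then places compound_b ahead of compound_a, which mirrors the idea of raising the most-cited candidates to the top of the list.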

  3. Rapid calculation of accurate atomic charges for proteins via the electronegativity equalization method.

    PubMed

    Ionescu, Crina-Maria; Geidl, Stanislav; Svobodová Vařeková, Radka; Koča, Jaroslav

    2013-10-28

    We focused on the parametrization and evaluation of empirical models for fast and accurate calculation of conformationally dependent atomic charges in proteins. The models were based on the electronegativity equalization method (EEM), and the parametrization procedure was tailored to proteins. We used large protein fragments as reference structures and fitted the EEM model parameters using atomic charges computed by three population analyses (Mulliken, Natural, iterative Hirshfeld), at the Hartree-Fock level with two basis sets (6-31G*, 6-31G**) and in two environments (gas phase, implicit solvation). We parametrized and successfully validated 24 EEM models. When tested on insulin and ubiquitin, all models reproduced quantum mechanics level charges well and were consistent with respect to population analysis and basis set. Specifically, the models showed on average a correlation of 0.961, RMSD 0.097 e, and average absolute error per atom 0.072 e. The EEM models can be used with the freely available EEM implementation EEM_SOLVER.

  4. Rapid and accurate identification of Mycobacterium tuberculosis complex and common non-tuberculous mycobacteria by multiplex real-time PCR targeting different housekeeping genes.

    PubMed

    Nasr Esfahani, Bahram; Rezaei Yazdi, Hadi; Moghim, Sharareh; Ghasemian Safaei, Hajieh; Zarkesh Esfahani, Hamid

    2012-11-01

    Rapid and accurate identification of mycobacterial isolates from primary culture is important for timely and appropriate antibiotic therapy. Conventional methods for identification of Mycobacterium species based on biochemical tests need several weeks and may remain inconclusive. In this study, a novel multiplex real-time PCR was developed for rapid identification of the Mycobacterium genus, Mycobacterium tuberculosis complex (MTC) and the most common non-tuberculosis mycobacteria species, including M. abscessus, M. fortuitum, M. avium complex, M. kansasii, and M. gordonae, in three reaction tubes but under the same PCR conditions. Genetic targets for primer design included the 16S rDNA gene, the dnaJ gene, the gyrB gene and the internal transcribed spacer (ITS). Multiplex real-time PCR was set up with reference Mycobacterium strains and was subsequently tested with 66 clinical isolates. Results of multiplex real-time PCR were analyzed with melting curves, and the melting temperatures (Tm) of the Mycobacterium genus, MTC, and each of the non-tuberculosis Mycobacterium species were determined. Multiplex real-time PCR results were compared with amplification and sequencing of the 16S-23S rDNA ITS for identification of Mycobacterium species. Sensitivity and specificity of the designed primers were each 100% for MTC, M. abscessus, M. fortuitum, M. avium complex, M. kansasii, and M. gordonae. Sensitivity and specificity of the designed primer for the genus Mycobacterium were 96% and 100%, respectively. According to the obtained results, we conclude that this multiplex real-time PCR with melting curve analysis and these novel primers can be used for rapid and accurate identification of the genus Mycobacterium, MTC, and the most common non-tuberculosis Mycobacterium species.

  5. Identification and accurate quantification of structurally related peptide impurities in synthetic human C-peptide by liquid chromatography-high resolution mass spectrometry.

    PubMed

    Li, Ming; Josephs, Ralf D; Daireaux, Adeline; Choteau, Tiphaine; Westwood, Steven; Wielgosz, Robert I; Li, Hongmei

    2018-06-04

    Peptides are an increasingly important group of biomarkers and pharmaceuticals. The accurate purity characterization of peptide calibrators is critical for the development of reference measurement systems for laboratory medicine and quality control of pharmaceuticals. The peptides used for these purposes are increasingly produced through peptide synthesis. Various approaches (for example mass balance, amino acid analysis, qNMR, and nitrogen determination) can be applied to accurately value assign the purity of peptide calibrators. However, all purity assessment approaches require a correction for structurally related peptide impurities in order to avoid biases. Liquid chromatography coupled to high resolution mass spectrometry (LC-hrMS) has become the key technique for the identification and accurate quantification of structurally related peptide impurities in intact peptide calibrator materials. In this study, LC-hrMS-based methods were developed and validated in-house for the identification and quantification of structurally related peptide impurities in a synthetic human C-peptide (hCP) material, which served as a study material for an international comparison looking at the competencies of laboratories to perform peptide purity mass fraction assignments. More than 65 impurities were identified, confirmed, and accurately quantified by using LC-hrMS. The total mass fraction of all structurally related peptide impurities in the hCP study material was estimated to be 83.3 mg/g with an associated expanded uncertainty of 3.0 mg/g (k = 2). The calibration hierarchy concept used for the quantification of individual impurities is described in detail.

  6. Mass spectrometry-based protein identification by integrating de novo sequencing with database searching.

    PubMed

    Wang, Penghao; Wilson, Susan R

    2013-01-01

    Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach firstly infers partial peptide sequences, known as tags, directly from tandem spectra through de novo sequencing, and then puts these sequences into a database search to see if a close peptide match can be found. However the current implementation of this integrative approach has several limitations. Firstly, simplistic de novo sequencing is applied and only very short sequence tags are used. Secondly, most integrative methods apply an algorithm similar to BLAST to search for exact sequence matches and do not accommodate sequence errors well. Thirdly, by applying these methods the integrated de novo sequencing makes a limited contribution to the scoring model which is still largely based on database searching. We have developed a new integrative protein identification method which can integrate de novo sequencing more efficiently into database searching. Evaluated on large real datasets, our method outperforms popular identification methods.

  7. Towards accurate modeling of noncovalent interactions for protein rigidity analysis.

    PubMed

    Fox, Naomi; Streinu, Ileana

    2013-01-01

    Protein rigidity analysis is an efficient computational method for extracting flexibility information from static, X-ray crystallography protein data. Atoms and bonds are modeled as a mechanical structure and analyzed with a fast graph-based algorithm, producing a decomposition of the flexible molecule into interconnected rigid clusters. The result depends critically on noncovalent atomic interactions, primarily on how hydrogen bonds and hydrophobic interactions are computed and modeled. Ongoing research points to the stringent need for benchmarking rigidity analysis software systems, towards the goal of increasing their accuracy and validating their results, both against each other and against biologically relevant (functional) parameters. We propose two new methods for modeling hydrogen bonds and hydrophobic interactions that more accurately reflect a mechanical model, without being computationally more intensive. We evaluate them using a novel scoring method, based on the B-cubed score from the information retrieval literature, which measures how well two cluster decompositions match. To evaluate the modeling accuracy of KINARI, our pebble-game rigidity analysis system, we use a benchmark data set of 20 proteins, each with multiple distinct conformations deposited in the Protein Data Bank. Cluster decompositions for them were previously determined with the RigidFinder method from Gerstein's lab and validated against experimental data. When KINARI's default tuning parameters are used, an improvement of the B-cubed score over a crude baseline is observed in 30% of this data. With our new modeling options, improvements were observed in over 70% of the proteins in this data set. We investigate the sensitivity of the cluster decomposition score with case studies on pyruvate phosphate dikinase and calmodulin. To substantially improve the accuracy of protein rigidity analysis systems, thorough benchmarking must be performed on all current systems and future

  8. Towards accurate modeling of noncovalent interactions for protein rigidity analysis

    PubMed Central

    2013-01-01

    Background Protein rigidity analysis is an efficient computational method for extracting flexibility information from static, X-ray crystallography protein data. Atoms and bonds are modeled as a mechanical structure and analyzed with a fast graph-based algorithm, producing a decomposition of the flexible molecule into interconnected rigid clusters. The result depends critically on noncovalent atomic interactions, primarily on how hydrogen bonds and hydrophobic interactions are computed and modeled. Ongoing research points to the stringent need for benchmarking rigidity analysis software systems, towards the goal of increasing their accuracy and validating their results, both against each other and against biologically relevant (functional) parameters. We propose two new methods for modeling hydrogen bonds and hydrophobic interactions that more accurately reflect a mechanical model, without being computationally more intensive. We evaluate them using a novel scoring method, based on the B-cubed score from the information retrieval literature, which measures how well two cluster decompositions match. Results To evaluate the modeling accuracy of KINARI, our pebble-game rigidity analysis system, we use a benchmark data set of 20 proteins, each with multiple distinct conformations deposited in the Protein Data Bank. Cluster decompositions for them were previously determined with the RigidFinder method from Gerstein's lab and validated against experimental data. When KINARI's default tuning parameters are used, an improvement of the B-cubed score over a crude baseline is observed in 30% of this data. With our new modeling options, improvements were observed in over 70% of the proteins in this data set. We investigate the sensitivity of the cluster decomposition score with case studies on pyruvate phosphate dikinase and calmodulin. Conclusion To substantially improve the accuracy of protein rigidity analysis systems, thorough benchmarking must be performed on all
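
    The idea of scoring how well two cluster decompositions match can be sketched with the textbook B-cubed precision and recall. This is not necessarily the variant the authors built on top of it; the sketch assumes each atom belongs to exactly one cluster, and the function name `b_cubed` and the toy labels are illustrative.

```python
def b_cubed(predicted, reference):
    """Return (precision, recall, F1) for two clusterings of the same items.

    Each clustering maps an item (e.g. an atom index) to a cluster label.
    """
    items = list(predicted)
    pred_members = {}
    ref_members = {}
    for item in items:
        pred_members.setdefault(predicted[item], set()).add(item)
        ref_members.setdefault(reference[item], set()).add(item)

    precision = recall = 0.0
    for item in items:
        same_pred = pred_members[predicted[item]]
        same_ref = ref_members[reference[item]]
        overlap = len(same_pred & same_ref)
        precision += overlap / len(same_pred)  # purity of the item's predicted cluster
        recall += overlap / len(same_ref)      # how much of its true cluster was kept together
    precision /= len(items)
    recall /= len(items)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy example: six atoms in two reference rigid clusters; the prediction splits one of them.
reference = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
predicted = {0: "x", 1: "x", 2: "y", 3: "z", 4: "z", 5: "z"}
p, r, f = b_cubed(predicted, reference)
```

    Splitting a reference cluster leaves precision at 1 but lowers recall, so over-fragmented decompositions are penalized even when every predicted cluster is internally pure.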

  9. Discovering functional interdependence relationship in PPI networks for protein complex identification.

    PubMed

    Lam, Winnie W M; Chan, Keith C C

    2012-04-01

    Protein molecules interact with each other in protein complexes to perform many vital functions, and different computational techniques have been developed to identify protein complexes in protein-protein interaction (PPI) networks. These techniques are developed to search for subgraphs of high connectivity in PPI networks under the assumption that the proteins in a protein complex are highly interconnected. While these techniques have been shown to be quite effective, it is also possible that the matching rate between the protein complexes they discover and those previously determined experimentally may be relatively low and the "false-alarm" rate relatively high. This is especially the case when the assumption that proteins in a complex are highly interconnected does not hold. To increase the matching rate and reduce the false-alarm rate, we have developed a technique that can work effectively without having to make this assumption. The technique, called protein complex identification by discovering functional interdependence (PCIFI), searches for protein complexes in PPI networks by taking into consideration both the functional interdependence relationship between protein molecules and the topology of the network. PCIFI works in several steps. The first step is to construct a multiple-function protein network graph by labeling each vertex with one or more of the molecular functions it performs. The second step is to filter out protein interactions between protein pairs that are not functionally interdependent in the statistical sense. The third step is to make use of an information-theoretic measure to determine the strength of the functional interdependence between all remaining interacting protein pairs. Finally, the last step is to try to form protein complexes based on the measure of the strength of functional interdependence and the connectivity between proteins. For performance evaluation

  10. Accurate pan-specific prediction of peptide-MHC class II binding affinity with improved binding core identification.

    PubMed

    Andreatta, Massimo; Karosiene, Edita; Rasmussen, Michael; Stryhn, Anette; Buus, Søren; Nielsen, Morten

    2015-11-01

    A key event in the generation of a cellular response against malicious organisms through the endocytic pathway is binding of peptidic antigens by major histocompatibility complex class II (MHC class II) molecules. The bound peptide is then presented on the cell surface where it can be recognized by T helper lymphocytes. NetMHCIIpan is a state-of-the-art method for the quantitative prediction of peptide binding to any human or mouse MHC class II molecule of known sequence. In this paper, we describe an updated version of the method with improved peptide binding register identification. Binding register prediction is concerned with determining the minimal core region of nine residues directly in contact with the MHC binding cleft, a crucial piece of information both for the identification and design of CD4(+) T cell antigens. When applied to a set of 51 crystal structures of peptide-MHC complexes with known binding registers, the new method NetMHCIIpan-3.1 significantly outperformed the earlier 3.0 version. We illustrate the impact of accurate binding core identification for the interpretation of T cell cross-reactivity using tetramer double staining with a CMV epitope and its variants mapped to the epitope binding core. NetMHCIIpan is publicly available at http://www.cbs.dtu.dk/services/NetMHCIIpan-3.1 .

  11. Free fatty acid particles in protein formulations, part 1: microspectroscopic identification.

    PubMed

    Cao, Xiaolin; Fesinmeyer, R Matthew; Pierini, Christopher J; Siska, Christine C; Litowski, Jennifer R; Brych, Stephen; Wen, Zai-Qing; Kleemann, Gerd R

    2015-02-01

    We report, for the first time, the identification of fatty acid particles in formulations containing the surfactant polysorbate 20. These fatty acid particles were observed in multiple mAb formulations during their expected shelf life under recommended storage conditions. The fatty acid particles were granular or sand-like in morphology and were several microns in size. They could be identified by distinct IR bands, with additional confirmation from energy-dispersive X-ray spectroscopy analysis. The particles were readily distinguishable from protein particles by these methods. In addition, particles containing a mixture of protein and fatty acids were also identified, suggesting that the particulation pathways for the two particle types may not be distinct. The techniques and observations described will be useful for the correct identification of proteinaceous versus nonproteinaceous particles in pharmaceutical products. © 2014 Wiley Periodicals, Inc. and the American Pharmacists Association.

  12. On the nature of cavities on protein surfaces: application to the identification of drug-binding sites.

    PubMed

    Nayal, Murad; Honig, Barry

    2006-06-01

    In this article we introduce a new method for the identification and the accurate characterization of protein surface cavities. The method is encoded in the program SCREEN (Surface Cavity REcognition and EvaluatioN). As a first test of the utility of our approach we used SCREEN to locate and analyze the surface cavities of a nonredundant set of 99 proteins cocrystallized with drugs. We find that this set of proteins has on average about 14 distinct cavities per protein. In all cases, a drug is bound at one (and sometimes more than one) of these cavities. Using cavity size alone as a criterion for predicting drug-binding sites yields a high balanced error rate of 15.7%, with only 71.7% coverage. Here we characterize each surface cavity by computing a comprehensive set of 408 physicochemical, structural, and geometric attributes. By applying modern machine learning techniques (Random Forests) we were able to develop a classifier that can identify drug-binding cavities with a balanced error rate of 7.2% and coverage of 88.9%. Only 18 of the 408 cavity attributes had a statistically significant role in the prediction. Of these 18 important attributes, almost all involved size and shape rather than physicochemical properties of the surface cavity. The implications of these results are discussed. A SCREEN Web server is available at http://interface.bioc.columbia.edu/screen. 2006 Wiley-Liss, Inc.
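
    For readers unfamiliar with the metric, the balanced error rate quoted above is usually defined as the mean of the error rates on the two classes, which keeps a classifier from looking good simply by predicting the majority class. The sketch below uses that usual definition; whether the authors computed it in exactly this way is an assumption, and the counts are hypothetical.

```python
def balanced_error_rate(tp, fn, tn, fp):
    """Mean of the miss rate on positives and the false-alarm rate on negatives."""
    miss_rate = fn / (tp + fn)
    false_alarm_rate = fp / (tn + fp)
    return 0.5 * (miss_rate + false_alarm_rate)

# Hypothetical counts for an imbalanced problem: few drug-binding cavities, many others.
ber = balanced_error_rate(tp=90, fn=10, tn=1200, fp=100)

# Plain error rate on the same counts, dominated by the large negative class.
plain_error = (10 + 100) / (90 + 10 + 1200 + 100)
```

    With roughly 14 cavities per protein and typically one bound drug, negatives far outnumber positives, which is the situation in which a balanced measure is more informative than raw accuracy.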

  13. A Graph-Centric Approach for Metagenome-Guided Peptide and Protein Identification in Metaproteomics

    PubMed Central

    Tang, Haixu; Li, Sujun; Ye, Yuzhen

    2016-01-01

    Metaproteomic studies adopt the common bottom-up proteomics approach to investigate the protein composition and the dynamics of protein expression in microbial communities. When matched metagenomic and/or metatranscriptomic data of the microbial communities are available, metaproteomic data analyses often employ a metagenome-guided approach, in which complete or fragmental protein-coding genes are first directly predicted from metagenomic (and/or metatranscriptomic) sequences or from their assemblies, and the resulting protein sequences are then used as the reference database for peptide/protein identification from MS/MS spectra. This approach is often limited because protein coding genes predicted from metagenomes are incomplete and fragmental. In this paper, we present a graph-centric approach to improving metagenome-guided peptide and protein identification in metaproteomics. Our method exploits the de Bruijn graph structure reported by metagenome assembly algorithms to generate a comprehensive database of protein sequences encoded in the community. We tested our method using several public metaproteomic datasets with matched metagenomic and metatranscriptomic sequencing data acquired from complex microbial communities in a biological wastewater treatment plant. The results showed that many more peptides and proteins can be identified when assembly graphs were utilized, improving the characterization of the proteins expressed in the microbial communities. The additional proteins we identified contribute to the characterization of important pathways such as those involved in degradation of chemical hazards. Our tools are released as open-source software on github at https://github.com/COL-IU/Graph2Pro. PMID:27918579
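
    To make the role of the de Bruijn graph concrete, the following toy sketch builds such a graph from short reads and enumerates the sequences spelled by paths through a branch point. It is not the Graph2Pro algorithm; the function names, the reads, and the choice of k are all made up for illustration.

```python
from collections import defaultdict

def de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def spelled_paths(graph, node, n_nodes):
    """Return every sequence spelled by a walk of n_nodes nodes starting at node."""
    if n_nodes == 1:
        return [node]
    sequences = []
    for successor in sorted(graph.get(node, ())):
        for tail in spelled_paths(graph, successor, n_nodes - 1):
            # Consecutive nodes overlap by k-2 characters, so only the first
            # character of the current node is new.
            sequences.append(node[0] + tail)
    return sequences

# Two toy reads that share a stretch and then diverge at the last base.
graph = de_bruijn(["ATGGCGT", "GGCGA"], k=4)
variants = spelled_paths(graph, "ATG", n_nodes=5)
```

    A linear contig would have to stop or pick one branch at the node GCG, whereas walking the graph keeps both variants, which is the intuition behind drawing protein sequences from the assembly graph rather than from contigs alone.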

  14. Fast tandem mass spectra-based protein identification regardless of the number of spectra or potential modifications examined.

    PubMed

    Falkner, Jayson; Andrews, Philip

    2005-05-15

    Comparing tandem mass spectra (MSMS) against a known dataset of protein sequences is a common method for identifying unknown proteins; however, the processing of MSMS by current software often limits certain applications, including comprehensive coverage of post-translational modifications, non-specific searches and real-time searches to allow result-dependent instrument control. This problem deserves attention as new mass spectrometers provide the ability for higher throughput and as known protein datasets rapidly grow in size. New software algorithms need to be devised in order to address the performance issues of conventional MSMS protein dataset-based protein identification. This paper describes a novel algorithm based on converting a collection of monoisotopic, centroided spectra to a new data structure, named 'peptide finite state machine' (PFSM), which may be used to rapidly search a known dataset of protein sequences, regardless of the number of spectra searched or the number of potential modifications examined. The algorithm is verified using a set of commercially available tryptic digest protein standards analyzed using an ABI 4700 MALDI TOFTOF mass spectrometer, and a free, open source PFSM implementation. It is illustrated that a PFSM can accurately search large collections of spectra against large datasets of protein sequences (e.g. NCBI nr) using a regular desktop PC; however, this paper only details the method for identifying peptide and subsequently protein candidates from a dataset of known protein sequences. The concept of using a PFSM as a peptide pre-screening technique for MSMS-based search engines is validated by using PFSM with Mascot and XTandem. Complete source code, documentation and examples for the reference PFSM implementation are freely available at the Proteome Commons, http://www.proteomecommons.org and source code may be used both commercially and non-commercially as long as the original authors are credited for their work.

  15. MP3: a software tool for the prediction of pathogenic proteins in genomic and metagenomic data.

    PubMed

    Gupta, Ankit; Kapil, Rohan; Dhakan, Darshan B; Sharma, Vineet K

    2014-01-01

    The identification of virulent proteins in any de-novo sequenced genome is useful in estimating its pathogenic ability and understanding the mechanism of pathogenesis. Similarly, the identification of such proteins could be valuable in comparing the metagenome of healthy and diseased individuals and estimating the proportion of pathogenic species. However, the common challenge in both the above tasks is the identification of virulent proteins since a significant proportion of genomic and metagenomic proteins are novel and yet unannotated. The currently available tools which carry out the identification of virulent proteins provide limited accuracy and cannot be used on large datasets. Therefore, we have developed an MP3 standalone tool and web server for the prediction of pathogenic proteins in both genomic and metagenomic datasets. MP3 is developed using an integrated Support Vector Machine (SVM) and Hidden Markov Model (HMM) approach to carry out highly fast, sensitive and accurate prediction of pathogenic proteins. It displayed Sensitivity, Specificity, MCC and accuracy values of 92%, 100%, 0.92 and 96%, respectively, on blind dataset constructed using complete proteins. On the two metagenomic blind datasets (Blind A: 51-100 amino acids and Blind B: 30-50 amino acids), it displayed Sensitivity, Specificity, MCC and accuracy values of 82.39%, 97.86%, 0.80 and 89.32% for Blind A and 71.60%, 94.48%, 0.67 and 81.86% for Blind B, respectively. In addition, the performance of MP3 was validated on selected bacterial genomic and real metagenomic datasets. To our knowledge, MP3 is the only program that specializes in fast and accurate identification of partial pathogenic proteins predicted from short (100-150 bp) metagenomic reads and also performs exceptionally well on complete protein sequences. MP3 is publicly available at http://metagenomics.iiserb.ac.in/mp3/index.php.
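
    The four figures reported for the complete-protein blind dataset are related through the confusion matrix, as the short sketch below shows. The counts are an assumption: a balanced set of 100 positives and 100 negatives chosen only because it reproduces the quoted rates; the abstract does not state the actual dataset sizes.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return sensitivity, specificity, accuracy, mcc

# Assumed counts (not from the paper): 100 pathogenic and 100 non-pathogenic proteins.
sens, spec, acc, mcc = classification_metrics(tp=92, fn=8, tn=100, fp=0)
```

    Under this assumed class balance the function gives 92% sensitivity, 100% specificity, 96% accuracy and an MCC that rounds to 0.92, consistent with the values in the abstract.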

  16. MP3: A Software Tool for the Prediction of Pathogenic Proteins in Genomic and Metagenomic Data

    PubMed Central

    Gupta, Ankit; Kapil, Rohan; Dhakan, Darshan B.; Sharma, Vineet K.

    2014-01-01

    The identification of virulent proteins in any de-novo sequenced genome is useful in estimating its pathogenic ability and understanding the mechanism of pathogenesis. Similarly, the identification of such proteins could be valuable in comparing the metagenome of healthy and diseased individuals and estimating the proportion of pathogenic species. However, the common challenge in both the above tasks is the identification of virulent proteins since a significant proportion of genomic and metagenomic proteins are novel and yet unannotated. The currently available tools which carry out the identification of virulent proteins provide limited accuracy and cannot be used on large datasets. Therefore, we have developed an MP3 standalone tool and web server for the prediction of pathogenic proteins in both genomic and metagenomic datasets. MP3 is developed using an integrated Support Vector Machine (SVM) and Hidden Markov Model (HMM) approach to carry out highly fast, sensitive and accurate prediction of pathogenic proteins. It displayed Sensitivity, Specificity, MCC and accuracy values of 92%, 100%, 0.92 and 96%, respectively, on blind dataset constructed using complete proteins. On the two metagenomic blind datasets (Blind A: 51–100 amino acids and Blind B: 30–50 amino acids), it displayed Sensitivity, Specificity, MCC and accuracy values of 82.39%, 97.86%, 0.80 and 89.32% for Blind A and 71.60%, 94.48%, 0.67 and 81.86% for Blind B, respectively. In addition, the performance of MP3 was validated on selected bacterial genomic and real metagenomic datasets. To our knowledge, MP3 is the only program that specializes in fast and accurate identification of partial pathogenic proteins predicted from short (100–150 bp) metagenomic reads and also performs exceptionally well on complete protein sequences. MP3 is publicly available at http://metagenomics.iiserb.ac.in/mp3/index.php. PMID:24736651

  17. Advances in the Study of Aptamer-Protein Target Identification Using the Chromatographic Approach.

    PubMed

    Drabik, Anna; Ner-Kluza, Joanna; Mielczarek, Przemyslaw; Civit, Laia; Mayer, Günter; Silberring, Jerzy

    2018-06-01

    Ever since the development of the process known as the systematic evolution of ligands by exponential enrichment (SELEX), aptamers have been widely used in a variety of studies, including the exploration of new diagnostic tools and the discovery of new treatment methods. Aptamers' ability to bind to proteins with high affinity and specificity, often compared to that of antibodies, enables the search for potential cancer biomarkers and helps us understand the mechanisms of carcinogenesis. The blind spot of those investigations is usually the difficulty in the selective extraction of targets attached to the aptamer. There are many studies describing the cell SELEX for the prime choice of aptamers toward living cancer cells or even whole tumors in the animal models. However, a dilemma arises when a large number of proteins are being identified as potential targets, which is often the case. In this article, we present a new analytical approach designed to selectively target proteins bound to aptamers. During studies, we have focused on the unambiguous identification of the molecular targets of aptamers characterized by high specificity to the prostate cancer cells. We have compared four assay approaches using electrophoretic and chromatographic methods for "fishing out" aptamer protein targets followed by mass spectrometry identification. We have established a new methodology, based on the fluorescent-tagged oligonucleotides commonly used for flow-cytometry experiments or as optic aptasensors, that allowed the detection of specific aptamer-protein interactions by mass spectrometry. The use of atto488-labeled aptamers for the tracking of the formation of specific aptamer-target complexes provides the possibility of studying putative protein counterparts without needing to apply enrichment techniques. 
Significantly, changes in the hydrophobic properties of atto488-labeled aptamer-protein complexes facilitate their separation by reverse-phase chromatography combined with

  18. Yeast Inner-Subunit PA-NZ-1 Labeling Strategy for Accurate Subunit Identification in a Macromolecular Complex through Cryo-EM Analysis.

    PubMed

    Wang, Huping; Han, Wenyu; Takagi, Junichi; Cong, Yao

    2018-05-11

    Cryo-electron microscopy (cryo-EM) has been established as one of the central tools in the structural study of macromolecular complexes. Although intermediate- or low-resolution structural information through negative staining or cryo-EM analysis remains highly valuable, we lack general and efficient ways to achieve unambiguous subunit identification in these applications. Here, we took advantage of the extremely high affinity between a dodecapeptide "PA" tag and the NZ-1 antibody Fab fragment to develop an efficient "yeast inner-subunit PA-NZ-1 labeling" strategy that when combined with cryo-EM could precisely identify subunits in macromolecular complexes. Using this strategy combined with cryo-EM 3D reconstruction, we were able to visualize the characteristic NZ-1 Fab density attached to the PA tag inserted into a surface-exposed loop in the middle of the sequence of CCT6 subunit present in the Saccharomyces cerevisiae group II chaperonin TRiC/CCT. This procedure facilitated the unambiguous localization of CCT6 in the TRiC complex. The PA tag was designed to contain only 12 amino acids and a tight turn configuration; when inserted into a loop, it usually has a high chance of maintaining the epitope structure and low likelihood of perturbing the native structure and function of the target protein compared to other tagging systems. We also found that the association between PA and NZ-1 can sustain the cryo freezing conditions, resulting in very high occupancy of the Fab in the final cryo-EM images. Our study demonstrated the robustness of this strategy combined with cryo-EM in efficient and accurate subunit identification in challenging multi-component complexes. Copyright © 2018 Elsevier Ltd. All rights reserved.

  19. Identification of Small RNA-Protein Partners in Plant Symbiotic Bacteria.

    PubMed

    Robledo, Marta; Matia-González, Ana M; García-Tomsig, Natalia I; Jiménez-Zurdo, José I

    2018-01-01

    The identification of the protein partners of bacterial small noncoding RNAs (sRNAs) is essential to understand the mechanistic principles and functions of riboregulation in prokaryotic cells. Here, we describe an optimized affinity chromatography protocol that enables purification of in vivo formed sRNA-protein complexes in Sinorhizobium meliloti, a genetically tractable nitrogen-fixing plant symbiotic bacterium. The procedure requires the tagging of the desired sRNA with the MS2 aptamer, which is affinity-captured by the MS2-MBP protein conjugated to an amylose resin. As proof of principle, we show recovery of the RNA chaperone Hfq associated with the strictly Hfq-dependent AbcR2 trans-sRNA. This method can be applied to the investigation of sRNA-protein interactions in a broad range of genetically tractable α-proteobacteria.

  20. Identification of novel direct protein-protein interactions by irradiating living cells with femtosecond UV laser pulses.

    PubMed

    Itri, Francesco; Monti, Daria Maria; Chino, Marco; Vinciguerra, Roberto; Altucci, Carlo; Lombardi, Angela; Piccoli, Renata; Birolo, Leila; Arciello, Angela

    2017-10-07

    The identification of protein-protein interaction networks in living cells is becoming increasingly fundamental to elucidating the main biological processes and to understanding the molecular bases of disease on a system-wide level. We recently described a method (LUCK, Laser UV Cross-linKing) to cross-link interacting protein surfaces in living cells by UV laser irradiation. Using this innovative methodology, which does not require any protein modification or cell engineering, we demonstrate here that, upon UV laser irradiation of HeLa cells, a direct interaction between GAPDH and alpha-enolase was "frozen" by a cross-linking event. We validated the occurrence of this direct interaction by co-immunoprecipitation and Immuno-FRET analyses. This represents a proof of principle of the capability of LUCK to reveal direct protein interactions in their physiological environment. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. Protein markers for identification of Yersinia pestis and their variation related to culture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wunschel, David S.; Engelmann, Heather E.; Victry, Kristin D.

    2013-12-11

    The detection of high consequence pathogens, such as Yersinia pestis, is well established in biodefense laboratories for bioterror situations. Laboratory protocols are well established using specified culture media and a growth temperature of 37 °C for expression of specific antigens. Direct detection of Y. pestis protein markers, without prior culture, depends on their expression. Unfortunately protein expression can be impacted by the culture medium which cannot be predicted ahead of time. Furthermore, higher biomass yields are obtained at the optimal growth temperature (i.e. 28 °C–30 °C) and therefore are more likely to be used for bulk production. Analysis of Y. pestis grown on several types of media at 30 °C showed that several protein markers were found to be differentially detected in different media. Analysis of the identified proteins against a comprehensive database provided an additional level of organism identification. Peptides corresponding to variable regions of some proteins could separate large groups of strains and aid in organism identification. This work illustrates the need to understand variability of protein expression for detection targets. The potential for relating expression changes of known proteins to specific media factors, even in nutrient rich and chemically complex culture medium, may provide the opportunity to draw forensic information from protein profiles.

  2. Identification of Trypanosome Proteins in Plasma from African Sleeping Sickness Patients Infected with T. b. rhodesiense

    PubMed Central

    Enyaru, John C.; Carr, Steven A.; Pearson, Terry W.

    2013-01-01

    Control of human African sleeping sickness, caused by subspecies of the protozoan parasite Trypanosoma brucei, is based on preventing transmission by elimination of the tsetse vector and by active diagnostic screening and treatment of infected patients. To identify trypanosome proteins that have potential as biomarkers for detection and monitoring of African sleeping sickness, we have used a "deep-mining" proteomics approach to identify trypanosome proteins in human plasma. Abundant human plasma proteins were removed by immunodepletion. Depleted plasma samples were then digested to peptides with trypsin, fractionated by basic reversed phase and each fraction analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). This sample processing and analysis method enabled identification of low levels of trypanosome proteins in pooled plasma from late stage sleeping sickness patients infected with Trypanosoma brucei rhodesiense. A total of 254 trypanosome proteins were confidently identified. Many of the parasite proteins identified were of unknown function, although metabolic enzymes, chaperones, proteases and ubiquitin-related/acting proteins were found. This approach to the identification of conserved, soluble trypanosome proteins in human plasma offers a possible route to improved disease diagnosis and monitoring, since these molecules are potential biomarkers for the development of a new generation of antigen-detection assays. The combined immuno-depletion/mass spectrometric approach can be applied to a variety of infectious diseases for unbiased biomarker identification. PMID:23951171

  3. Identification of Trypanosome proteins in plasma from African sleeping sickness patients infected with T. b. rhodesiense.

    PubMed

    Eyford, Brett A; Ahmad, Rushdy; Enyaru, John C; Carr, Steven A; Pearson, Terry W

    2013-01-01

    Control of human African sleeping sickness, caused by subspecies of the protozoan parasite Trypanosoma brucei, is based on preventing transmission by elimination of the tsetse vector and by active diagnostic screening and treatment of infected patients. To identify trypanosome proteins that have potential as biomarkers for detection and monitoring of African sleeping sickness, we have used a "deep-mining" proteomics approach to identify trypanosome proteins in human plasma. Abundant human plasma proteins were removed by immunodepletion. Depleted plasma samples were then digested to peptides with trypsin, fractionated by basic reversed phase and each fraction analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). This sample processing and analysis method enabled identification of low levels of trypanosome proteins in pooled plasma from late stage sleeping sickness patients infected with Trypanosoma brucei rhodesiense. A total of 254 trypanosome proteins were confidently identified. Many of the parasite proteins identified were of unknown function, although metabolic enzymes, chaperones, proteases and ubiquitin-related/acting proteins were found. This approach to the identification of conserved, soluble trypanosome proteins in human plasma offers a possible route to improved disease diagnosis and monitoring, since these molecules are potential biomarkers for the development of a new generation of antigen-detection assays. The combined immuno-depletion/mass spectrometric approach can be applied to a variety of infectious diseases for unbiased biomarker identification.

  4. Enhancing Membrane Protein Identification Using a Simplified Centrifugation and Detergent-Based Membrane Extraction Approach.

    PubMed

    Zhou, Yanting; Gao, Jing; Zhu, Hongwen; Xu, Jingjing; He, Han; Gu, Lei; Wang, Hui; Chen, Jie; Ma, Danjun; Zhou, Hu; Zheng, Jing

    2018-02-20

    Membrane proteins may act as transporters, receptors, enzymes, and adhesion-anchors, accounting for nearly 70% of pharmaceutical drug targets. Difficulties in efficient enrichment, extraction, and solubilization still exist because of their relatively low abundance and poor solubility. A simplified membrane protein extraction approach, with the advantages of user-friendly sample processing, good repeatability and significant effectiveness, was developed in the current research to enhance the enrichment and identification of membrane proteins. This approach, combining centrifugation and detergent with LC-MS/MS, identified higher proportions of membrane proteins, integral proteins and transmembrane proteins in the membrane fraction (76.6%, 48.1%, and 40.6%) than in total cell lysate (41.6%, 16.4%, and 13.5%), respectively. Moreover, our method tended to capture membrane proteins with a high degree of hydrophobicity and a high number of transmembrane domains: 486 out of 2106 (23.0%) had GRAVY > 0 in the membrane fraction, and 488 out of 2106 (23.1%) had TMs ≥ 2. It also improved the identification of membrane proteins, as more than 60.6% of the membrane proteins commonly identified in two cell samples were better identified in the membrane fraction, with higher sequence coverage. Data are available via ProteomeXchange with identifier PXD008456.
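
    GRAVY (grand average of hydropathy) is conventionally the mean Kyte-Doolittle hydropathy value over all residues of a sequence, so GRAVY > 0 marks a net hydrophobic protein. A minimal sketch, assuming the standard Kyte-Doolittle scale; the function name and the example sequences are illustrative and not taken from the study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean Kyte-Doolittle hydropathy of a protein sequence (one-letter codes)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Illustrative sequences only: a hydrophobic stretch and a charged stretch.
hydrophobic = gravy("LIVALLFA")
charged = gravy("KDERKDER")
print(f"hydrophobic stretch: {hydrophobic:.2f}, GRAVY > 0: {hydrophobic > 0}")
print(f"charged stretch: {charged:.2f}, GRAVY > 0: {charged > 0}")
```

    Applied to every identified protein, a filter of this kind would give the count of proteins with GRAVY > 0 that the abstract reports for the membrane fraction.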

  5. Identification of novel phosphatidic acid-binding proteins in the rat brain.

    PubMed

    Park, ChiHu; Kang, Du-Seock; Shin, Geon-Hoon; Seo, Jeongkon; Kim, Hyein; Suh, Pann-Ghill; Bae, Chang-Dae; Shin, Joo-Ho

    2015-05-19

    Phosphatidic acid (PA) is an abundant negatively-charged phospholipid and has long been considered to be an important signaling molecule in diverse cellular events. Thus, the identification of proteins that specifically interact with PA is of considerable interest to understand the regulatory roles of PA. Herein, lipid-affinity purification and mass spectrometric analysis reveal 43 proteins, 19 known and 24 novel, as PA-binding proteins. A lipid-protein overlay assay confirmed that GDI1, PACSIN1, and DPYSL2 interact not only with PA but also with other phospholipids. These results might be helpful for deciphering the functional effect of PA in the brain. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  6. Identification of a putative protein profile associated with tamoxifen therapy resistance in breast cancer.

    PubMed

    Umar, Arzu; Kang, Hyuk; Timmermans, Annemieke M; Look, Maxime P; Meijer-van Gelder, Marion E; den Bakker, Michael A; Jaitly, Navdeep; Martens, John W M; Luider, Theo M; Foekens, John A; Pasa-Tolić, Ljiljana

    2009-06-01

    Tamoxifen resistance is a major cause of death in patients with recurrent breast cancer. Current clinical factors can correctly predict therapy response in only half of the treated patients. Identification of proteins that are associated with tamoxifen resistance is a first step toward better response prediction and tailored treatment of patients. In the present study we intended to identify putative protein biomarkers indicative of tamoxifen therapy resistance in breast cancer using nano-LC coupled with FTICR MS. Comparative proteome analysis was performed on approximately 5,500 pooled tumor cells (corresponding to approximately 550 ng of protein lysate/analysis) obtained through laser capture microdissection (LCM) from two independently processed data sets (n = 24 and n = 27) containing both tamoxifen therapy-sensitive and therapy-resistant tumors. Peptides and proteins were identified by matching mass and elution time of newly acquired LC-MS features to information in previously generated accurate mass and time tag reference databases. A total of 17,263 unique peptides were identified that corresponded to 2,556 non-redundant proteins identified with ≥ 2 peptides. 1,713 overlapping proteins between the two data sets were used for further analysis. Comparative proteome analysis revealed 100 putatively differentially abundant proteins between tamoxifen-sensitive and tamoxifen-resistant tumors. The presence and relative abundance for 47 differentially abundant proteins were verified by targeted nano-LC-MS/MS in a selection of unpooled, non-microdissected discovery set tumor tissue extracts. ENPP1, EIF3E, and GNB4 were significantly associated with progression-free survival upon tamoxifen treatment for recurrent disease. Differential abundance of our top discriminating protein, extracellular matrix metalloproteinase inducer, was validated by tissue microarray in an independent patient cohort (n = 156).
Extracellular matrix metalloproteinase inducer levels were

  7. Differential identification of Candida species and other yeasts by analysis of (³⁵S)methionine-labeled polypeptide profiles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shen, H.D.; Choo, K.B.; Tsai, W.C.

    1988-12-01

    This paper describes a scheme for differential identification of Candida species and other yeasts based on autoradiographic analysis of protein profiles of (³⁵S)methionine-labeled cellular proteins separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Using ATCC strains as references, protein profile analysis showed that different Candida and other yeast species produced distinctively different patterns. Good agreement in results obtained with this approach and with other conventional systems was observed. Being accurate and reproducible, this approach provides a basis for the development of an alternative method for the identification of yeasts isolated from clinical specimens.

  8. Identification of shed proteins from Chinese hamster ovary cells: Application of statistical confidence using human and mouse protein databases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahram, Mamoun; Strittmatter, Eric F.; Monroe, Matthew E.

    The shedding process releases ligands, receptors, and other proteins from the surface of the cell and is a mechanism whereby cells communicate. Even though altered regulation of this process has been implicated in several diseases, global approaches to evaluate shed proteins have not been developed. A goal of this study was to identify global changes in shed proteins in media taken from cells exposed to low doses of radiation in an effort to develop a fundamental understanding of the bystander response. CHO cells were chosen for this study because they have been widely used for radiation studies and since they have been reported to respond to radiation by releasing factors into the media that cause genomic instability and cytotoxicity in unexposed cells, i.e., a bystander effect. Media samples taken from irradiated cells were evaluated using a combination of tandem- and FTICR-mass spectrometry analysis. Since the hamster genome has not been sequenced, mass spectrometry data were searched against the mouse and human protein databases. Nearly 150 proteins that were identified by tandem mass spectrometry were confirmed by FTICR. When both types of mass spectrometry data were evaluated with a new confidence scoring tool, which is based on discriminant analyses, about 500 proteins were identified. Approximately 20% of these identifications were either integral membrane proteins or membrane associated proteins, suggesting that they were derived from the cell surface, hence were likely shed. However, estimates of quantitative changes, based on two independent mass spectrometry approaches, did not identify any protein abundance changes attributable to the bystander effect. Results from this study demonstrate the feasibility of global evaluation of shed proteins using mass spectrometry in conjunction with cross-species protein databases and that significant improvement in peptide/protein identifications is provided by the confidence scoring tool.

  9. Identification of Host Proteins Associated with Retroviral Vector Particles by Proteomic Analysis of Highly Purified Vector Preparations

    PubMed Central

    Segura, María Mercedes; Garnier, Alain; Di Falco, Marcos Rafael; Whissell, Gavin; Meneses-Acosta, Angélica; Arcand, Normand; Kamen, Amine

    2008-01-01

    The Moloney murine leukemia virus (MMLV) belongs to the Retroviridae family of enveloped viruses, which is known to acquire minute amounts of host cellular proteins both on the surface and inside the virion. Despite the extensive use of retroviral vectors in experimental and clinical applications, the repertoire of host proteins incorporated into MMLV vector particles remains unexplored. We report here the identification of host proteins from highly purified retroviral vector preparations obtained by rate-zonal ultracentrifugation. Viral proteins were fractionated by one-dimensional sodium dodecyl sulfate-polyacrylamide gel electrophoresis, in-gel tryptic digested, and subjected to liquid chromatography/tandem mass spectrometry analysis. Immunogold electron microscopy studies confirmed the presence of several host membrane proteins exposed at the vector surface. These studies led to the identification of 27 host proteins on MMLV vector particles derived from 293 HEK cells, including 5 proteins previously described as part of wild-type MMLV. Nineteen host proteins identified corresponded to intracellular proteins. A total of eight host membrane proteins were identified, including cell adhesion proteins integrin β1 (fibronectin receptor subunit beta) and HMFG-E8, tetraspanins CD81 and CD9, and late endosomal markers CD63 and Lamp-2. Identification of membrane proteins on the retroviral surface is particularly attractive, since they can serve as anchoring sites for the insertion of tags for targeting or purification purposes. The implications of our findings for retrovirus-mediated gene therapy are discussed. PMID:18032515

  10. Improving automatic peptide mass fingerprint protein identification by combining many peak sets.

    PubMed

    Rögnvaldsson, Thorsteinn; Häkkinen, Jari; Lindberg, Claes; Marko-Varga, György; Potthast, Frank; Samuelsson, Jim

    2004-08-05

    An automated peak picking strategy is presented where several peak sets with different signal-to-noise levels are combined to form a more reliable statement on the protein identity. The strategy is compared against both manual peak picking and industry standard automated peak picking on a set of mass spectra obtained after tryptic in gel digestion of 2D-gel samples from human fetal fibroblasts. The set of spectra contain samples ranging from strong to weak spectra, and the proposed multiple-scale method is shown to be much better on weak spectra than the industry standard method and a human operator, and equal in performance to these on strong and medium strong spectra. It is also demonstrated that peak sets selected by a human operator display a considerable variability and that it is impossible to speak of a single "true" peak set for a given spectrum. The described multiple-scale strategy both avoids time-consuming parameter tuning and exceeds the human operator in protein identification efficiency. The strategy therefore promises reliable automated user-independent protein identification using peptide mass fingerprints.

  11. Accurate optimization of amino acid form factors for computing small-angle X-ray scattering intensity of atomistic protein structures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tong, Dudu; Yang, Sichun; Lu, Lanyuan

    2016-06-20

    Structure modelling via small-angle X-ray scattering (SAXS) data generally requires intensive computations of scattering intensity from any given biomolecular structure, where the accurate evaluation of SAXS profiles using coarse-grained (CG) methods is vital to improve computational efficiency. To date, most CG SAXS computing methods have been based on a single-bead-per-residue approximation but have neglected structural correlations between amino acids. To improve the accuracy of scattering calculations, accurate CG form factors of amino acids are now derived using a rigorous optimization strategy, termed electron-density matching (EDM), to best fit electron-density distributions of protein structures. This EDM method is compared with and tested against other CG SAXS computing methods, and the resulting CG SAXS profiles from EDM agree better with all-atom theoretical SAXS data. By including the protein hydration shell represented by explicit CG water molecules and the correction of protein excluded volume, the developed CG form factors also reproduce the selected experimental SAXS profiles with very small deviations. Taken together, these EDM-derived CG form factors present an accurate and efficient computational approach for SAXS computing, especially when higher molecular details (represented by the q range of the SAXS data) become necessary for effective structure modelling.

  12. Rapid and Accurate Identification of Animal Species in Natural Leather Goods by Liquid Chromatography/Mass Spectrometry

    PubMed Central

    Izuchi, Yukari; Takashima, Tsuneo; Hatano, Naoya

    2016-01-01

    The demand for leather goods has grown globally in recent years. Industry revenue is forecast to reach $91.2 billion by 2018. There is an ongoing labelling problem in the leather items market, in that it is currently impossible to identify the species that a given piece of leather is derived from. To address this issue, we developed a rapid and simple method for the specific identification of leather derived from cattle, horses, pigs, sheep, goats, and deer by analysing peptides produced by the trypsin-digestion of proteins contained in leather goods using liquid chromatography/mass spectrometry. We determined species-specific amino acid sequences by liquid chromatography/tandem mass spectrometry analysis using the Mascot software program and demonstrated that collagen α-1(I), collagen α-2(I), and collagen α-1(III) from the dermal layer of the skin are particularly useful in species identification. PMID:27313979

  13. Method for identification of rigid domains and hinge residues in proteins based on exhaustive enumeration.

    PubMed

    Sim, Jaehyun; Sim, Jun; Park, Eunsung; Lee, Julian

    2015-06-01

    Many proteins undergo large-scale motions where relatively rigid domains move against each other. The identification of rigid domains, as well as the hinge residues important for their relative movements, is important for various applications including flexible docking simulations. In this work, we develop a method for protein rigid domain identification based on an exhaustive enumeration of maximal rigid domains, the rigid domains not fully contained within other domains. The computation is performed by mapping the problem to that of finding maximal cliques in a graph. A minimal set of rigid domains are then selected, which cover most of the protein with minimal overlap. In contrast to the results of existing methods that partition a protein into non-overlapping domains using approximate algorithms, the rigid domains obtained from exact enumeration naturally contain overlapping regions, which correspond to the hinges of the inter-domain bending motion. The performance of the algorithm is demonstrated on several proteins. © 2015 Wiley Periodicals, Inc.
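
    The mapping to maximal cliques can be sketched as follows. The abstract does not say how the graph is built; this sketch assumes two conformations of the same chain and joins two residues by an edge when their distance is preserved within a tolerance `tol` (a hypothetical criterion and parameter). Maximal cliques are then enumerated with the Bron-Kerbosch algorithm, which may differ from the authors' implementation.

```python
import math
from itertools import combinations


def rigid_pair_graph(conf_a, conf_b, tol=0.1):
    """Adjacency sets: residues i, j are linked if |d_a(i,j) - d_b(i,j)| <= tol."""
    adj = {i: set() for i in range(len(conf_a))}
    for i, j in combinations(range(len(conf_a)), 2):
        d_a = math.dist(conf_a[i], conf_a[j])
        d_b = math.dist(conf_b[i], conf_b[j])
        if abs(d_a - d_b) <= tol:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))  # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Toy chain of five residues: residues 3 and 4 swing by 90 degrees about residue 2.
conf_a = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
conf_b = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
cliques = maximal_cliques(rigid_pair_graph(conf_a, conf_b))
print(cliques)
```

    On this toy input the two cliques are [0, 1, 2] and [2, 3, 4]; the shared residue 2 plays the role of the hinge, which illustrates how overlapping maximal rigid domains can point to hinge residues.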

  14. A Comprehensive Strategy to Construct In-house Database for Accurate and Batch Identification of Small Molecular Metabolites.

    PubMed

    Zhao, Xinjie; Zeng, Zhongda; Chen, Aiming; Lu, Xin; Zhao, Chunxia; Hu, Chunxiu; Zhou, Lina; Liu, Xinyu; Wang, Xiaolin; Hou, Xiaoli; Ye, Yaorui; Xu, Guowang

    2018-05-29

    Identification of metabolites is an essential step in metabolomics studies for interpreting the regulatory mechanisms of pathological and physiological processes. However, it remains a major challenge in LC-MSn-based studies because of the complexity of mass spectrometry, the chemical diversity of metabolites, and the lack of standards databases. In this work, a comprehensive strategy is developed for accurate and batch metabolite identification in non-targeted metabolomics studies. First, a well-defined procedure was applied to generate reliable and standard LC-MS2 data, including tR, MS1 and MS2 information, under a standard operational procedure (SOP). An in-house database including about 2000 metabolites was constructed and used to identify the metabolites in non-targeted metabolic profiling by retention time calibration using internal standards, precursor ion alignment and ion fusion, auto-MS2 information extraction and selection, and database batch searching and scoring. As an application example, a pooled serum sample was analyzed to demonstrate the strategy, and 202 metabolites were identified in the positive ion mode. The results show that our strategy is useful for LC-MSn-based non-targeted metabolomics studies.

  15. Rapid identification and typing of Yersinia pestis and other Yersinia species by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry

    PubMed Central

    2010-01-01

    Background: Accurate identification is necessary to discriminate harmless environmental Yersinia species from the food-borne pathogens Yersinia enterocolitica and Yersinia pseudotuberculosis and from the group A bioterrorism plague agent Yersinia pestis. In order to circumvent the limitations of current phenotypic and PCR-based identification methods, we aimed to assess the usefulness of matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) protein profiling for accurate and rapid identification of Yersinia species. As a first step, we built a database of 39 different Yersinia strains representing 12 different Yersinia species, including 13 Y. pestis isolates representative of the Antiqua, Medievalis and Orientalis biotypes. The organisms were deposited on the MALDI-TOF plate after appropriate ethanol-based inactivation, and a protein profile was obtained within 6 minutes for each of the Yersinia species. Results: When compared with a 3,025-profile database, every Yersinia species yielded a unique protein profile and was unambiguously identified. In the second step of analysis, environmental and clinical isolates of Y. pestis (n = 2) and Y. enterocolitica (n = 11) were compared to the database and correctly identified. In particular, Y. pestis was unambiguously identified at the species level, and MALDI-TOF was able to successfully differentiate the three biotypes. Conclusion: These data indicate that MALDI-TOF can be used as a rapid and accurate first-line method for the identification of Yersinia isolates. PMID:21073689

  16. Rapid identification and typing of Yersinia pestis and other Yersinia species by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry.

    PubMed

    Ayyadurai, Saravanan; Flaudrops, Christophe; Raoult, Didier; Drancourt, Michel

    2010-11-12

    Accurate identification is necessary to discriminate harmless environmental Yersinia species from the food-borne pathogens Yersinia enterocolitica and Yersinia pseudotuberculosis and from the group A bioterrorism plague agent Yersinia pestis. In order to circumvent the limitations of current phenotypic and PCR-based identification methods, we aimed to assess the usefulness of matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) protein profiling for accurate and rapid identification of Yersinia species. As a first step, we built a database of 39 different Yersinia strains representing 12 different Yersinia species, including 13 Y. pestis isolates representative of the Antiqua, Medievalis and Orientalis biotypes. The organisms were deposited on the MALDI-TOF plate after appropriate ethanol-based inactivation, and a protein profile was obtained within 6 minutes for each of the Yersinia species. When compared with a 3,025-profile database, every Yersinia species yielded a unique protein profile and was unambiguously identified. In the second step of analysis, environmental and clinical isolates of Y. pestis (n = 2) and Y. enterocolitica (n = 11) were compared to the database and correctly identified. In particular, Y. pestis was unambiguously identified at the species level, and MALDI-TOF was able to successfully differentiate the three biotypes. These data indicate that MALDI-TOF can be used as a rapid and accurate first-line method for the identification of Yersinia isolates.

  17. Identification and Characterization of Arabidopsis Seed Coat Mucilage Proteins.

    PubMed

    Tsai, Allen Yi-Lun; Kunieda, Tadashi; Rogalski, Jason; Foster, Leonard J; Ellis, Brian E; Haughn, George W

    2017-02-01

    Plant cell wall proteins are important regulators of cell wall architecture and function. However, because cell wall proteins are difficult to extract and analyze, they are generally poorly understood. Here, we describe the identification and characterization of proteins integral to the Arabidopsis (Arabidopsis thaliana) seed coat mucilage, a specialized layer of the extracellular matrix composed of plant cell wall carbohydrates that is used as a model for cell wall research. The proteins identified in mucilage include those previously identified by genetic analysis, and several mucilage proteins are reduced in mucilage-deficient mutant seeds, suggesting that these proteins are genuinely associated with the mucilage. Arabidopsis mucilage has both nonadherent and adherent layers. Both layers have similar protein profiles except for proteins involved in lipid metabolism, which are present exclusively in the adherent mucilage. The most abundant mucilage proteins include a family of proteins named TESTA ABUNDANT1 (TBA1) to TBA3; a less abundant fourth homolog was named TBA-LIKE (TBAL). TBA and TBAL transcripts and promoter activities were detected in developing seed coats, and their expression requires seed coat differentiation regulators. TBA proteins are secreted to the mucilage pocket during differentiation. Although reverse genetics failed to identify a function for TBAs/TBAL, the TBA promoters are highly expressed and cell type specific and so should be very useful tools for targeting proteins to the seed coat epidermis. Altogether, these results highlight the mucilage proteome as a model for cell walls in general, as it shares similarities with other cell wall proteomes while also containing mucilage-specific features. © 2017 American Society of Plant Biologists. All Rights Reserved.

  18. EpHLA software: a timesaving and accurate tool for improving identification of acceptable mismatches for clinical purposes.

    PubMed

    Filho, Herton Luiz Alves Sales; da Mata Sousa, Luiz Claudio Demes; von Glehn, Cristina de Queiroz Carrascosa; da Silva, Adalberto Socorro; dos Santos Neto, Pedro de Alcântara; do Nascimento, Ferraz; de Castro, Adail Fonseca; do Nascimento, Liliane Machado; Kneib, Carolina; Bianchi Cazarote, Helena; Mayumi Kitamura, Daniele; Torres, Juliane Roberta Dias; da Cruz Lopes, Laiane; Barros, Aryela Loureiro; da Silva Edlin, Evelin Nildiane; de Moura, Fernanda Sá Leal; Watanabe, Janine Midori Figueiredo; do Monte, Semiramis Jamil Hadad

    2012-06-01

    The HLAMatchmaker algorithm, which allows the identification of “safe” acceptable mismatches (AMMs) for recipients of solid organ and cell allografts, is rarely used, in part because of the difficulty of using it in its current Excel format. Automating this algorithm may universalize its use to benefit the allocation of allografts. Recently, we developed new software called EpHLA, which is the first computer program automating the use of the HLAMatchmaker algorithm. Herein, we present the experimental validation of the EpHLA program by showing its time efficiency and quality of operation. The same results, obtained by a single antigen bead assay with sera from 10 sensitized patients waiting for kidney transplants, were analyzed either by conventional HLAMatchmaker or by the automated EpHLA method. Users testing these two methods were asked to record: (i) time required for completion of the analysis (in minutes); (ii) number of eplets obtained for class I and class II HLA molecules; (iii) categorization of eplets as reactive or non-reactive based on the MFI cutoff value; and (iv) determination of AMMs based on eplets' reactivities. We showed that although both methods had similar accuracy, the automated EpHLA method was over 8 times faster than the conventional HLAMatchmaker method. In particular, the EpHLA software was faster and more reliable than, and as accurate as, the conventional method in defining AMMs for allografts. The EpHLA software is an accurate and quick method for the identification of AMMs and thus may be a very useful tool in the decision-making process of organ allocation for highly sensitized patients as well as in many other applications.

  19. Identification of Phosphorylated Proteins on a Global Scale.

    PubMed

    Iliuk, Anton

    2018-05-31

    Liquid chromatography (LC) coupled with tandem mass spectrometry (MS/MS) has enabled researchers to analyze complex biological samples with unprecedented depth. It facilitates the identification and quantification of modifications within thousands of proteins in a single large-scale proteomic experiment. Analysis of phosphorylation, one of the most common and important post-translational modifications, has particularly benefited from such progress in the field. Here, detailed protocols are provided for a few well-regarded, common sample preparation methods for an effective phosphoproteomic experiment. © 2018 by John Wiley & Sons, Inc.

  20. Peptide Array X-Linking (PAX): A New Peptide-Protein Identification Approach

    PubMed Central

    Okada, Hirokazu; Uezu, Akiyoshi; Soderblom, Erik J.; Moseley, M. Arthur; Gertler, Frank B.; Soderling, Scott H.

    2012-01-01

    Many protein interaction domains bind short peptides based on canonical sequence consensus motifs. Here we report the development of a peptide array-based proteomics tool to identify proteins directly interacting with ligand peptides from cell lysates. Array-formatted bait peptides containing an amino acid-derived cross-linker are photo-induced to crosslink with interacting proteins from lysates of interest. Indirect associations are removed by high stringency washes under denaturing conditions. Covalently trapped proteins are subsequently identified by LC-MS/MS and screened by cluster analysis and domain scanning. We apply this methodology to peptides with different proline-containing consensus sequences and show successful identifications from brain lysates of known and novel proteins containing polyproline motif-binding domains such as EH, EVH1, SH3, WW domains. These results suggest the capacity of arrayed peptide ligands to capture and subsequently identify proteins by mass spectrometry is relatively broad and robust. Additionally, the approach is rapid and applicable to cell or tissue fractions from any source, making the approach a flexible tool for initial protein-protein interaction discovery. PMID:22606326

  1. WegoLoc: accurate prediction of protein subcellular localization using weighted Gene Ontology terms.

    PubMed

    Chi, Sang-Mun; Nam, Dougu

    2012-04-01

    We present an accurate and fast web server, WegoLoc, for predicting subcellular localization of proteins based on sequence similarity and weighted Gene Ontology (GO) information. A term weighting method in the text categorization process is applied to GO terms for a support vector machine classifier. As a result, WegoLoc surpasses the state-of-the-art methods for previously used test datasets. WegoLoc supports three eukaryotic kingdoms (animals, fungi and plants) and provides human-specific analysis, and covers several sets of cellular locations. In addition, WegoLoc provides (i) multiple possible localizations of input protein(s) as well as their corresponding probability scores, (ii) weights of GO terms representing the contribution of each GO term in the prediction, and (iii) a BLAST E-value for the best hit with GO terms. If the similarity score does not meet a given threshold, an amino acid composition-based prediction is applied as a backup method. Availability: WegoLoc and the User's guide are freely available at http://www.btool.org/WegoLoc. Contact: smchiks@ks.ac.kr; dougnam@unist.ac.kr. Supplementary data are available at http://www.btool.org/WegoLoc.

  2. Identification of mycobacterial surface proteins released into subcellular compartments of infected macrophages.

    PubMed

    Beatty, W L; Russell, D G

    2000-12-01

    Considerable effort has focused on the identification of proteins secreted from Mycobacterium spp. that contribute to the development of protective immunity. Little is known, however, about the release of mycobacterial proteins from the bacterial phagosome and the potential role of these molecules in chronically infected macrophages. In the present study, the release of mycobacterial surface proteins from the bacterial phagosome into subcellular compartments of infected macrophages was analyzed. Mycobacterium bovis BCG was surface labeled with fluorescein-tagged succinimidyl ester, an amine-reactive probe. The fluorescein tag was then used as a marker for the release of bacterial proteins in infected macrophages. Fractionation studies revealed bacterial proteins within subcellular compartments distinct from mycobacteria and mycobacterial phagosomes. To identify these proteins, subcellular fractions free of bacteria were probed with mycobacterium-specific antibodies. The fibronectin attachment protein and proteins of the antigen 85-kDa complex were identified among the mycobacterial proteins released from the bacterial phagosome.

  3. Identification of Mycobacterial Surface Proteins Released into Subcellular Compartments of Infected Macrophages

    PubMed Central

    Beatty, Wandy L.; Russell, David G.

    2000-01-01

    Considerable effort has focused on the identification of proteins secreted from Mycobacterium spp. that contribute to the development of protective immunity. Little is known, however, about the release of mycobacterial proteins from the bacterial phagosome and the potential role of these molecules in chronically infected macrophages. In the present study, the release of mycobacterial surface proteins from the bacterial phagosome into subcellular compartments of infected macrophages was analyzed. Mycobacterium bovis BCG was surface labeled with fluorescein-tagged succinimidyl ester, an amine-reactive probe. The fluorescein tag was then used as a marker for the release of bacterial proteins in infected macrophages. Fractionation studies revealed bacterial proteins within subcellular compartments distinct from mycobacteria and mycobacterial phagosomes. To identify these proteins, subcellular fractions free of bacteria were probed with mycobacterium-specific antibodies. The fibronectin attachment protein and proteins of the antigen 85-kDa complex were identified among the mycobacterial proteins released from the bacterial phagosome. PMID:11083824

  4. Identification of host proteins, Spata3 and Dkk2, interacting with Toxoplasma gondii micronemal protein MIC3.

    PubMed

    Wang, Yifan; Fang, Rui; Yuan, Yuan; Pan, Ming; Hu, Min; Zhou, Yanqin; Shen, Bang; Zhao, Junlong

    2016-07-01

    As an obligate intracellular protozoan, Toxoplasma gondii is a successful pathogen infecting a variety of animals, including humans. As an adhesin involved in host invasion, the micronemal protein MIC3 plays important roles in host cell attachment, as well as in modulation of the host EGFR signaling cascade. However, the specific host proteins that interact with MIC3 are unknown, and the identification of such proteins will increase our understanding of how MIC3 exerts its functions. This study was designed to identify host proteins interacting with MIC3 by yeast two-hybrid screens. Using MIC3 as bait, a library expressing mouse proteins was screened, uncovering eight mouse proteins that showed positive interactions with MIC3. Two of these, spermatogenesis-associated protein 3 (Spata3) and dickkopf-related protein 2 (Dkk2), were further confirmed to interact with MIC3 by additional protein-protein interaction tests. The results also revealed that the tandem repeat EGF domains of MIC3 were critical in mediating the interactions with the identified host proteins. This is the first study to show that MIC3 interacts with host proteins that are involved in reproduction, growth, and development. The results will provide a clearer understanding of the functions of adhesion-associated micronemal proteins in T. gondii.

  5. Identification of proteins in renaissance paintings by proteomics.

    PubMed

    Tokarski, Caroline; Martin, Elisabeth; Rolando, Christian; Cren-Olivé, Cécile

    2006-03-01

    The presented work proposes a new methodology based on proteomics techniques to identify proteins in old art paintings. The main challenges of this work were (i) to find appropriate conditions for extracting proteins from the binding media without hydrolyzing the proteins into amino acids and (ii) to develop analytical methods adapted to the small sample quantity available. Starting from microsamples of painting models (ovalbumin-based, which is the major egg white protein, and egg-based paintings), multiple extraction solutions (HCl, HCOOH, NH3, NaOH) and conditions (ultrasonic bath, mortar and pestle, grinding resin) were evaluated. The best results were obtained using a commercial kit including a synthetic resin, mortar and pestle to grind the sample in an aqueous solution acidified with 1% trifluoroacetic acid, with additional multiple steps of ultrasonic baths. The resulting supernatant was analyzed by MALDI-TOF in linear mode to verify the efficiency of the extraction solution. An enzymatic hydrolysis step was also performed for protein identification; the peptide mixture was analyzed by nanoLC/nanoESI/Q-q-TOF MS/MS with a chromatographic run adapted to the low sample quantity. Finally, the developed methodology was successfully applied to Renaissance art painting microsamples of approximately 10 µg from Benedetto Bonfigli's triptych, The Virgin and Child, St. John the Baptist, St. Sebastian (XVth century), and Niccolo di Pietro Gerini's painting, The Virgin and Child (XIVth century), identifying, for the first time and without ambiguity, the presence of whole egg proteins (egg yolk and egg white) in a painting binder.

  6. A knowledge-based potential with an accurate description of local interactions improves discrimination between native and near-native protein conformations.

    PubMed

    Ferrada, Evandro; Vergara, Ismael A; Melo, Francisco

    2007-01-01

    The correct discrimination between native and near-native protein conformations is essential for achieving accurate computer-based protein structure prediction. However, this has proven to be a difficult task, since currently available physical energy functions, empirical potentials and statistical scoring functions are still limited in achieving this goal consistently. In this work, we assess and compare the ability of different full atom knowledge-based potentials to discriminate between native protein structures and near-native protein conformations generated by comparative modeling. Using a benchmark of 152 near-native protein models and their corresponding native structures that encompass several different folds, we demonstrate that the incorporation of close non-bonded pairwise atom terms improves the discriminating power of the empirical potentials. Since the direct and unbiased derivation of close non-bonded terms from current experimental data is not possible, we obtained and used those terms from the corresponding pseudo-energy functions of a non-local knowledge-based potential. It is shown that this methodology significantly improves the discrimination between native and near-native protein conformations, suggesting that a proper description of close non-bonded terms is important to achieve a more complete and accurate description of native protein conformations. Some external knowledge-based energy functions that are widely used in model assessment performed poorly, indicating that the benchmark of models and the specific discrimination task tested in this work constitutes a difficult challenge.

  7. Identification of Contractile Vacuole Proteins in Trypanosoma cruzi

    PubMed Central

    Park, Miyoung; Martins, Vicente P.; Atwood, James; Moles, Kristen; Collins, Dalis; Rohloff, Peter; Tarleton, Rick; Moreno, Silvia N. J.; Orlando, Ron; Docampo, Roberto

    2011-01-01

    Contractile vacuole complexes are critical components of cell volume regulation and have been shown to have other functional roles in several free-living protists. However, very little is known about the functions of the contractile vacuole complex of the parasite Trypanosoma cruzi, the etiologic agent of Chagas disease, other than a role in osmoregulation. Identification of the protein composition of these organelles is important for understanding their physiological roles. We applied a combined proteomic and bioinformatic approach to identify proteins localized to the contractile vacuole. Proteomic analysis of a T. cruzi fraction enriched for contractile vacuoles and analyzed by one-dimensional gel electrophoresis and LC-MS/MS resulted in the addition of 109 newly detected proteins to the group of expressed proteins of epimastigotes. We also identified different peptides that map to at least 39 members of the dispersed gene family 1 (DGF-1), providing evidence that many members of this family are simultaneously expressed in epimastigotes. Of the proteins present in the fraction, we selected several homologues with known localizations in contractile vacuoles of other organisms and others that we expected to be present in these vacuoles on the basis of their potential roles. We determined the localization of each by expression as GFP-fusion proteins or with specific antibodies. Six of these putative proteins (Rab11, Rab32, AP180, ATPase subunit B, VAMP1, and phosphate transporter) predominantly localized to the vacuole bladder. TcSNARE2.1, TcSNARE2.2, and calmodulin localized to the spongiome. Calmodulin was also cytosolic. Our results demonstrate the utility of combining subcellular fractionation, proteomic analysis, and bioinformatic approaches for localization of organellar proteins that are difficult to detect with whole cell methodologies. The CV localization of the proteins investigated revealed potential novel roles of these organelles in phosphate metabolism

  8. Exploring the Universe of Protein Structures beyond the Protein Data Bank

    PubMed Central

    Cossio, Pilar; Trovato, Antonio; Pietrucci, Fabio; Seno, Flavio; Maritan, Amos; Laio, Alessandro

    2010-01-01

    It is currently believed that the atlas of existing protein structures is faithfully represented in the Protein Data Bank. However, whether this atlas covers the full universe of all possible protein structures is still a highly debated issue. By using a sophisticated numerical approach, we performed an exhaustive exploration of the conformational space of a 60 amino acid polypeptide chain described with an accurate all-atom interaction potential. We generated a database of around 30,000 compact folds with at least of secondary structure corresponding to local minima of the potential energy. This ensemble plausibly represents the universe of protein folds of similar length; indeed, all the known folds are represented in the set with good accuracy. However, we discover that the known folds form a rather small subset, which cannot be reproduced by choosing random structures in the database. Rather, natural and possible folds differ in their contact order, which is on average significantly smaller in the former. This suggests the presence of an evolutionary bias, possibly related to kinetic accessibility, towards structures with shorter loops between contacting residues. Besides their conceptual relevance, the new structures open a range of practical applications such as the development of accurate structure prediction strategies, the optimization of force fields, and the identification and design of novel folds. PMID:21079678
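
    The contact order referred to in the abstract above is commonly defined as the average sequence separation between residues that are in contact, divided by the chain length. The abstract does not state the exact definition or contact cutoff used, so the following is only a minimal sketch under that common definition; the function name and the two toy contact lists are hypothetical and are not taken from the paper.

```python
def relative_contact_order(contacts, chain_length):
    """Mean sequence separation |i - j| over contacting residue pairs,
    normalized by chain length (a common definition; assumed here)."""
    if not contacts or chain_length <= 0:
        return 0.0
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * chain_length)


# Hypothetical contact lists for a 60-residue chain (illustration only).
local_fold = [(1, 5), (10, 14), (20, 25), (40, 44)]        # short loops
nonlocal_fold = [(1, 50), (5, 45), (10, 58), (20, 55)]     # long loops

co_local = relative_contact_order(local_fold, 60)
co_nonlocal = relative_contact_order(nonlocal_fold, 60)
```

    In this toy case the fold built from short-range contacts has the lower value, which is the direction of the bias the abstract attributes to natural folds.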

  9. Genome-Wide Identification of Arabidopsis Coiled-Coil Proteins and Establishment of the ARABI-COIL Database

    PubMed Central

    Rose, Annkatrin; Manikantan, Sankaraganesh; Schraegle, Shannon J.; Maloy, Michael A.; Stahlberg, Eric A.; Meier, Iris

    2004-01-01

    Increasing evidence demonstrates the importance of long coiled-coil proteins for the spatial organization of cellular processes. Although several protein classes with long coiled-coil domains have been studied in animals and yeast, our knowledge about plant long coiled-coil proteins is very limited. The repeat nature of the coiled-coil sequence motif often prevents the simple identification of homologs of animal coiled-coil proteins by generic sequence similarity searches. As a consequence, counterparts of many animal proteins with long coiled-coil domains, like lamins, golgins, or microtubule organization center components, have not been identified yet in plants. Here, all Arabidopsis proteins predicted to contain long stretches of coiled-coil domains were identified by applying the algorithm MultiCoil to a genome-wide screen. A searchable protein database, ARABI-COIL (http://www.coiled-coil.org/arabidopsis), was established that integrates information on number, size, and position of predicted coiled-coil domains with subcellular localization signals, transmembrane domains, and available functional annotations. ARABI-COIL serves as a tool to sort and browse Arabidopsis long coiled-coil proteins to facilitate the identification and selection of candidate proteins of potential interest for specific research areas. Using the database, candidate proteins were identified for Arabidopsis membrane-bound, nuclear, and organellar long coiled-coil proteins. PMID:15020757

  10. Highly efficient classification and identification of human pathogenic bacteria by MALDI-TOF MS.

    PubMed

    Hsieh, Sen-Yung; Tseng, Chiao-Li; Lee, Yun-Shien; Kuo, An-Jing; Sun, Chien-Feng; Lin, Yen-Hsiu; Chen, Jen-Kun

    2008-02-01

    Accurate and rapid identification of pathogenic microorganisms is of critical importance in disease treatment and public health. Conventional work flows are time-consuming, and procedures are multifaceted. MS can be an alternative but is limited by low efficiency for amino acid sequencing as well as low reproducibility for spectrum fingerprinting. We systematically analyzed the feasibility of applying MS for rapid and accurate bacterial identification. Directly applying bacterial colonies without further protein extraction to MALDI-TOF MS analysis revealed rich peak contents and high reproducibility. The MS spectra derived from 57 isolates comprising six human pathogenic bacterial species were analyzed using both unsupervised hierarchical clustering and supervised model construction via the Genetic Algorithm. Hierarchical clustering analysis categorized the spectra into six groups precisely corresponding to the six bacterial species. Precise classification was also maintained in an independently prepared set of bacteria even when the number of m/z values was reduced to six. In parallel, classification models were constructed via Genetic Algorithm analysis. A model containing 18 m/z values accurately classified independently prepared bacteria and identified those species originally not used for model construction. Moreover, samples of fewer than 10^4 bacterial cells and different species in bacterial mixtures were identified using the classification model approach. In conclusion, the application of MALDI-TOF MS in combination with a suitable model construction provides a highly accurate method for bacterial classification and identification. The approach can identify bacteria with low abundance even in mixed flora, suggesting that rapid and accurate bacterial identification using MS techniques even before culture can be attained in the near future.

  11. enDNA-Prot: identification of DNA-binding proteins by applying ensemble learning.

    PubMed

    Xu, Ruifeng; Zhou, Jiyun; Liu, Bin; Yao, Lin; He, Yulan; Zou, Quan; Wang, Xiaolong

    2014-01-01

    DNA-binding proteins are crucial for various cellular processes, such as recognition of specific nucleotides, regulation of transcription, and regulation of gene expression. Developing an effective model for identifying DNA-binding proteins is an urgent research problem. Up to now, many methods have been proposed, but most of them focus on only one classifier and cannot make full use of the large number of negative samples to improve prediction performance. This study proposed a predictor called enDNA-Prot for DNA-binding protein identification by employing the ensemble learning technique. Experimental results showed that enDNA-Prot was comparable with DNA-Prot and outperformed DNAbinder and iDNA-Prot, with performance improvements in the range of 3.97-9.52% in ACC and 0.08-0.19 in MCC. Furthermore, when the benchmark dataset was expanded with negative samples, enDNA-Prot outperformed the three existing methods by 2.83-16.63% in terms of ACC and 0.02-0.16 in terms of MCC. This indicated that enDNA-Prot is an effective method for DNA-binding protein identification and that expanding the training dataset with negative samples can improve its performance. For the convenience of the vast majority of experimental scientists, we developed a user-friendly web-server for enDNA-Prot which is freely accessible to the public.
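
    The ACC and MCC figures quoted in the abstract above are standard binary-classification measures: accuracy and the Matthews correlation coefficient. As a minimal sketch of how they are computed from a confusion matrix, the snippet below uses hypothetical counts that are not taken from the paper; the function name is likewise an assumption.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy, MCC) for a 2x2 confusion matrix."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return sensitivity, specificity, accuracy, mcc


# Hypothetical counts: 100 binding and 100 non-binding proteins.
sens, spec, acc, mcc = binary_metrics(tp=84, fp=15, tn=85, fn=16)
```

    MCC stays informative when negatives greatly outnumber positives, which may be why it is reported alongside ACC when a benchmark is expanded with negative samples.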

  12. Analysis of hydraulic fracturing flowback and produced waters using accurate mass: identification of ethoxylated surfactants.

    PubMed

    Thurman, E Michael; Ferrer, Imma; Blotevogel, Jens; Borch, Thomas

    2014-10-07

    Two series of ethylene oxide (EO) surfactants, polyethylene glycols (PEGs from EO3 to EO33) and linear alkyl ethoxylates (LAEs C-9 to C-15 with EO3-EO28), were identified in hydraulic fracturing flowback and produced water using a new application of the Kendrick mass defect and liquid chromatography/quadrupole-time-of-flight mass spectrometry. The Kendrick mass defect differentiates the proton, ammonium, and sodium adducts in both singly and doubly charged forms. A structural model of adduct formation is presented, based on a spherical cagelike conformation in which the central cation (NH4+ or Na+) is coordinated with ether oxygens, and binding constants are calculated. A major purpose of the study was the identification of the ethylene oxide (EO) surfactants and the construction of a database with accurate masses and retention times in order to unravel the mass spectral complexity of surfactant mixtures used in hydraulic fracturing fluids. For example, over 500 accurate mass assignments are made in a few seconds of computer time and are then used as a fingerprint chromatogram of the water samples. This technique is applied to a series of flowback and produced water samples to illustrate the usefulness of ethoxylate "fingerprinting", in a first application to monitor water quality that results from fluids used in hydraulic fracturing.
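
    The Kendrick mass defect used in the abstract above rescales measured masses so that one repeating unit has an integer mass; members of a homologous series then share nearly the same defect. The sketch below assumes the ethylene oxide unit (C2H4O) as the Kendrick base and the "rounded nominal mass minus Kendrick mass" sign convention. The abstract specifies neither choice, and the function names are hypothetical.

```python
# Monoisotopic masses in Da (standard values, rounded).
MASS_C, MASS_H, MASS_O = 12.0, 1.00782503, 15.9949146

EO_EXACT = 2 * MASS_C + 4 * MASS_H + MASS_O   # C2H4O, about 44.0262
EO_NOMINAL = 44
WATER = 2 * MASS_H + MASS_O                   # end groups of H-(OCH2CH2)n-OH


def kendrick_mass(mass, base_exact=EO_EXACT, base_nominal=EO_NOMINAL):
    """Rescale a mass so that the base unit weighs exactly its nominal mass."""
    return mass * base_nominal / base_exact


def kendrick_mass_defect(mass):
    """Nominal (rounded) Kendrick mass minus Kendrick mass; conventions vary."""
    km = kendrick_mass(mass)
    return round(km) - km


# Neutral PEG oligomers EO3..EO6 as a toy homologous series.
peg_masses = [n * EO_EXACT + WATER for n in range(3, 7)]
defects = [kendrick_mass_defect(m) for m in peg_masses]
```

    Because every oligomer differs from the next by exactly one EO unit, the defects in this toy series agree to within floating-point error, which is the property that lets a series be picked out of a complex spectrum.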

  13. Identification of bovine sperm acrosomal proteins that interact with a 32-kDa acrosomal matrix protein.

    PubMed

    Nagdas, Subir K; Smith, Linda; Medina-Ortiz, Ilza; Hernandez-Encarnacion, Luisa; Raychoudhury, Samir

    2016-03-01

    Mammalian fertilization is accomplished by the interaction between sperm and egg. Previous studies from this laboratory have identified a stable acrosomal matrix assembly from the bovine sperm acrosome termed the outer acrosomal membrane-matrix complex (OMC). This stable matrix assembly exhibits precise binding activity for acrosin and N-acetylglucosaminidase. A highly purified OMC fraction comprises three major (54, 50, and 45 kDa) and several minor (38-19 kDa) polypeptides. The set of minor polypeptides (38-19 kDa) termed "OMCrpf polypeptides" is selectively solubilized by high-pH extraction (pH 10.5), while the three major polypeptides (55, 50, and 45 kDa) remain insoluble. Proteomic identification of the OMC32 polypeptide (32 kDa polypeptide isolated from high-pH soluble fraction of OMC) yielded two peptides that matched the NCBI database sequence of acrosin-binding protein. Anti-OMC32 recognized an antigenically related family of polypeptides (OMCrpf polypeptides) in the 38-19-kDa range with isoelectric points ranging between 4.0 and 5.1. Other than glycohydrolases, OMC32 may also be complexed to other acrosomal proteins. The present study was undertaken to identify and localize the OMC32 binding polypeptides and to elucidate the potential role of the acrosomal protein complex in sperm function. OMC32 affinity chromatography of a detergent-soluble fraction of bovine cauda sperm acrosome followed by mass spectrometry-based identification of bound proteins identified acrosin, lactadherin, SPACA3, and IZUMO1. Co-immunoprecipitation analysis also demonstrated the interaction of OMC32 with acrosin, lactadherin, SPACA3, and IZUMO1. Our immunofluorescence studies revealed the presence of SPACA3 and lactadherin over the apical segment, whereas IZUMO1 is localized over the equatorial segment of Triton X-100 permeabilized cauda sperm. Immunoblot analysis showed that a significant portion of SPACA3 was released after the lysophosphatidylcholine (LPC)-induced acrosome

  14. Identification of protein-interacting nucleotides in a RNA sequence using composition profile of tri-nucleotides.

    PubMed

    Panwar, Bharat; Raghava, Gajendra P S

    2015-04-01

    RNA-protein interactions play diverse roles in cells, so identifying the RNA-protein interface is essential for understanding their function. In the past, several methods have been developed for predicting RNA-interacting residues in proteins, but limited efforts have been made to identify protein-interacting nucleotides in RNAs. In order to discriminate protein-interacting from non-interacting nucleotides, we used various classifiers (NaiveBayes, NaiveBayesMultinomial, BayesNet, ComplementNaiveBayes, MultilayerPerceptron, J48, SMO, RandomForest and SVM(light)) to develop prediction models using various features, and achieved at best 83.92% sensitivity, 84.82% specificity, 84.62% accuracy and a Matthews correlation coefficient of 0.62 with SVM(light)-based models. We observed that certain tri-nucleotides such as ACA, ACC, AGA, CAC, CCA, GAG, UGA, and UUU are preferred in protein interaction. All the models were developed using a non-redundant dataset and evaluated using five-fold cross-validation. A web-server called RNApin has been developed for the scientific community (http://crdd.osdd.net/raghava/rnapin/). Copyright © 2015 Elsevier Inc. All rights reserved.
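
    A tri-nucleotide composition profile, as named in the title above, can be read as the relative frequency of each of the 64 possible RNA triplets in a sequence. The abstract does not describe the windowing or normalization actually used, so the following is a minimal sketch that assumes overlapping triplets counted over the whole input; the function name and the example sequence are hypothetical.

```python
from itertools import product

ALPHABET = "ACGU"
# All 64 triplets in a fixed order, so every profile has the same layout.
TRINUCLEOTIDES = ["".join(p) for p in product(ALPHABET, repeat=3)]


def trinucleotide_composition(rna):
    """Return a 64-element list of overlapping-triplet frequencies."""
    rna = rna.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in counts:          # skip triplets with ambiguous bases
            counts[triplet] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRINUCLEOTIDES)
    return [counts[t] / total for t in TRINUCLEOTIDES]


profile = trinucleotide_composition("ACACCAGAGUUU")
```

    A fixed-length vector of this kind is the sort of feature a classifier such as an SVM can take as input, for example when computed over a window centred on each nucleotide.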

  15. Fast and accurate resonance assignment of small-to-large proteins by combining automated and manual approaches.

    PubMed

    Niklasson, Markus; Ahlner, Alexandra; Andresen, Cecilia; Marsh, Joseph A; Lundström, Patrik

    2015-01-01

    The process of resonance assignment is fundamental to most NMR studies of protein structure and dynamics. Unfortunately, the manual assignment of residues is tedious and time-consuming, and can represent a significant bottleneck for further characterization. Furthermore, while automated approaches have been developed, they are often limited in their accuracy, particularly for larger proteins. Here, we address this by introducing the software COMPASS, which, by combining automated resonance assignment with manual intervention, is able to achieve accuracy approaching that from manual assignments at greatly accelerated speeds. Moreover, by including the option to compensate for isotope shift effects in deuterated proteins, COMPASS is far more accurate for larger proteins than existing automated methods. COMPASS is an open-source project licensed under GNU General Public License and is available for download from http://www.liu.se/forskning/foass/tidigare-foass/patrik-lundstrom/software?l=en. Source code and binaries for Linux, Mac OS X and Microsoft Windows are available.

  16. Fast and Accurate Resonance Assignment of Small-to-Large Proteins by Combining Automated and Manual Approaches

    PubMed Central

    Niklasson, Markus; Ahlner, Alexandra; Andresen, Cecilia; Marsh, Joseph A.; Lundström, Patrik

    2015-01-01

    The process of resonance assignment is fundamental to most NMR studies of protein structure and dynamics. Unfortunately, the manual assignment of residues is tedious and time-consuming, and can represent a significant bottleneck for further characterization. Furthermore, while automated approaches have been developed, they are often limited in their accuracy, particularly for larger proteins. Here, we address this by introducing the software COMPASS, which, by combining automated resonance assignment with manual intervention, is able to achieve accuracy approaching that from manual assignments at greatly accelerated speeds. Moreover, by including the option to compensate for isotope shift effects in deuterated proteins, COMPASS is far more accurate for larger proteins than existing automated methods. COMPASS is an open-source project licensed under GNU General Public License and is available for download from http://www.liu.se/forskning/foass/tidigare-foass/patrik-lundstrom/software?l=en. Source code and binaries for Linux, Mac OS X and Microsoft Windows are available. PMID:25569628

  17. YahO protein as a calibrant for top-down proteomic identification of Shiga toxin using MALDI-TOF-TOF-MS/MS and post-source decay

    USDA-ARS's Scientific Manuscript database

    Matrix-assisted laser desorption/ionization tandem time-of-flight (MALDI-TOF-TOF) mass spectrometry is increasingly utilized for rapid top-down proteomic identification of proteins. This identification may involve analysis of either a pure protein or a protein mixture. For analysis of a pure protein...

  18. Effective Identification of Akt Interacting Proteins by Two-Step Chemical Crosslinking, Co-Immunoprecipitation and Mass Spectrometry

    PubMed Central

    Huang, Bill X.; Kim, Hee-Yong

    2013-01-01

    Akt is a critical protein for cell survival and known to interact with various proteins. However, Akt binding partners that modulate or regulate Akt activation have not been fully elucidated. Identification of Akt-interacting proteins has been customarily achieved by co-immunoprecipitation combined with western blot and/or MS analysis. An intrinsic problem of the method is loss of interacting proteins during procedures to remove non-specific proteins. Moreover, antibody contamination often interferes with the detection of less abundant proteins. Here, we developed a novel two-step chemical crosslinking strategy to overcome these problems which resulted in a dramatic improvement in identifying Akt interacting partners. Akt antibody was first immobilized on protein A/G beads using disuccinimidyl suberate and allowed to bind to cellular Akt along with its interacting proteins. Subsequently, dithiobis[succinimidylpropionate], a cleavable crosslinker, was introduced to produce stable complexes between Akt and binding partners prior to the SDS-PAGE and nanoLC-MS/MS analysis. This approach enabled identification of ten Akt partners from cell lysates containing as low as 1.5 mg proteins, including two new potential Akt interacting partners. None of these but one protein was detectable without crosslinking procedures. The present method provides a sensitive and effective tool to probe Akt-interacting proteins. This strategy should also prove useful for other protein interactions, particularly those involving less abundant or weakly associating partners. PMID:23613850

  19. Accurate prediction of cellular co-translational folding indicates proteins can switch from post- to co-translational folding

    PubMed Central

    Nissley, Daniel A.; Sharma, Ajeet K.; Ahmed, Nabeel; Friedrich, Ulrike A.; Kramer, Günter; Bukau, Bernd; O'Brien, Edward P.

    2016-01-01

    The rates at which domains fold and codons are translated are important factors in determining whether a nascent protein will co-translationally fold and function or misfold and malfunction. Here we develop a chemical kinetic model that calculates a protein domain's co-translational folding curve during synthesis using only the domain's bulk folding and unfolding rates and codon translation rates. We show that this model accurately predicts the course of co-translational folding measured in vivo for four different protein molecules. We then make predictions for a number of different proteins in yeast and find that synonymous codon substitutions, which change translation-elongation rates, can switch some protein domains from folding post-translationally to folding co-translationally—a result consistent with previous experimental studies. Our approach explains essential features of co-translational folding curves and predicts how varying the translation rate at different codon positions along a transcript's coding sequence affects this self-assembly process. PMID:26887592
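
The idea of building a co-translational folding curve from bulk folding and unfolding rates and per-codon translation rates can be illustrated with a simplified two-state sketch. This is not the authors' model: it assumes the domain cannot fold before a hypothetical codon index `start`, that the folding and unfolding rates are the same at every nascent-chain length, and that the dwell time at each codon is exponentially distributed. All names and rate values are illustrative.

```python
def cotranslational_folding_curve(k_fold, k_unfold, codon_rates, start):
    """Expected folded probability after each codon position (two-state sketch).

    k_fold, k_unfold: bulk folding/unfolding rates (1/s), assumed length-independent.
    codon_rates:      translation rate of each codon (1/s).
    start:            first codon index at which the domain is able to fold (assumed).
    """
    k_relax = k_fold + k_unfold
    p_eq = k_fold / k_relax               # equilibrium folded probability
    p_folded = 0.0
    curve = []
    for position, k_trans in enumerate(codon_rates):
        if position >= start:
            # For an exponential dwell time with rate k_trans, the mean of
            # exp(-k_relax * t) is k_trans / (k_trans + k_relax).
            memory = k_trans / (k_trans + k_relax)
            p_folded = p_eq + (p_folded - p_eq) * memory
        curve.append(p_folded)
    return curve


# Illustrative rates: the same 50-codon transcript translated quickly or slowly.
fast = cotranslational_folding_curve(2.0, 0.02, [20.0] * 50, start=30)
slow = cotranslational_folding_curve(2.0, 0.02, [5.0] * 50, start=30)
print(f"folded at last codon: fast={fast[-1]:.3f}, slow={slow[-1]:.3f}")
```

In this sketch, slower codons give the domain more time to relax toward equilibrium before synthesis ends, which mirrors the abstract's point that changes in translation-elongation rate can shift folding from post- to co-translational.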

  20. Mass spectrometric identification of proteins in complex post-genomic projects. Soluble proteins of the metabolically versatile, denitrifying 'Aromatoleum' sp. strain EbN1.

    PubMed

    Hufnagel, Peter; Rabus, Ralf

    2006-01-01

    The rapidly developing proteomics technologies help to advance the global understanding of physiological and cellular processes. The lifestyle of a study organism determines the type and complexity of a given proteomic project. The complexity of this study is characterized by a broad collection of pathway-specific subproteomes, reflecting the metabolic versatility as well as the regulatory potential of the aromatic-degrading, denitrifying bacterium 'Aromatoleum' sp. strain EbN1. Differences in protein profiles were determined using a gel-based approach. Protein identification was based on a progressive application of MALDI-TOF-MS, MALDI-TOF-MS/MS and LC-ESI-MS/MS. This progression was result-driven and automated by software control. The identification rate was increased by the assembly of a project-specific list of background signals that was used for internal calibration of the MS spectra, and by the combination of two search engines using a dedicated MetaScoring algorithm. In total, intelligent bioinformatics could increase the identification yield from 53 to 70% of the analyzed 5,050 gel spots; a total of 556 different proteins were identified. MS identification was highly reproducible: most proteins were identified more than twice from parallel 2DE gels with an average sequence coverage of >50% and rather restrictive score thresholds (Mascot ≥95, ProFound ≥2.2, MetaScore ≥97). The MS technologies and bioinformatics tools that were implemented and integrated to handle this complex proteomic project are presented. In addition, we describe the basic principles and current developments of the applied technologies and provide an overview over the current state of microbial proteome research. Copyright (c) 2006 S. Karger AG, Basel.

  1. PconsD: ultra rapid, accurate model quality assessment for protein structure prediction.

    PubMed

    Skwark, Marcin J; Elofsson, Arne

    2013-07-15

    Clustering methods are often needed for accurately assessing the quality of modeled protein structures. Recent blind evaluation of quality assessment methods in CASP10 showed that there is little difference between many different methods as far as ranking models and selecting best model are concerned. When comparing many models, the computational cost of the model comparison can become significant. Here, we present PconsD, a fast, stream-computing method for distance-driven model quality assessment that runs on consumer hardware. PconsD is at least one order of magnitude faster than other methods of comparable accuracy. The source code for PconsD is freely available at http://d.pcons.net/. Supplementary benchmarking data are also available there. Contact: arne@bioinfo.se. Supplementary data are available at Bioinformatics online.

  2. Ribosomal subunit protein typing using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) for the identification and discrimination of Aspergillus species.

    PubMed

    Nakamura, Sayaka; Sato, Hiroaki; Tanaka, Reiko; Kusuya, Yoko; Takahashi, Hiroki; Yaguchi, Takashi

    2017-04-26

    Accurate identification of Aspergillus species is a very important subject. Mass spectral fingerprinting using matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) is generally employed for the rapid identification of fungal isolates. However, the results are based on simple mass spectral pattern-matching, with no peak assignment and no taxonomic input. We propose here a ribosomal subunit protein (RSP) typing technique using MALDI-TOF MS for the identification and discrimination of Aspergillus species. The results are concluded to be phylogenetic in that they reflect the molecular evolution of housekeeping RSPs. The amino acid sequences of RSPs of genome-sequenced strains of Aspergillus species were first verified and compared to compile a reliable biomarker list for the identification of Aspergillus species. In this process, we revealed that many amino acid sequences of RSPs (about 10-60%, depending on strain) registered in the public protein databases needed to be corrected or newly added. The verified RSPs were allocated to RSP types based on their mass. Peak assignments of RSPs of each sample strain as observed by MALDI-TOF MS were then performed to set RSP type profiles, which were then further processed by means of cluster analysis. The resulting dendrogram based on RSP types showed a relatively good concordance with the tree based on β-tubulin gene sequences. RSP typing was able to further discriminate the strains belonging to Aspergillus section Fumigati. The RSP typing method could be applied to identify Aspergillus species, even for species within section Fumigati. The discrimination power of RSP typing appears to be comparable to conventional β-tubulin gene analysis. This method would therefore be suitable for species identification and discrimination at the strain to species level. 
    Because RSP typing can characterize the strains within section Fumigati, this method has potential as a powerful and reliable tool in...

  3. Identification of Differentially Abundant Proteins of Edwardsiella ictaluri during Iron Restriction

    PubMed Central

    Dumpala, Pradeep R.; Peterson, Brian C.; Lawrence, Mark L.; Karsi, Attila

    2015-01-01

    Edwardsiella ictaluri is a Gram-negative facultative anaerobe intracellular bacterium that causes enteric septicemia in channel catfish. Iron is an essential inorganic nutrient of bacteria and is crucial for bacterial invasion. Reduced availability of iron by the host may cause significant stress for bacterial pathogens and is considered a signal that leads to significant alteration in virulence gene expression. However, the precise effect of iron-restriction on E. ictaluri protein abundance is unknown. The purpose of this study was to identify differentially abundant proteins of E. ictaluri during in vitro iron-restricted conditions. We applied two-dimensional difference in gel electrophoresis (2D-DIGE) for determining differentially abundant proteins and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF/TOF MS) for protein identification. Gene ontology and pathway-based functional modeling of differentially abundant proteins was also conducted. A total of 50 unique differentially abundant proteins at a minimum of 2-fold (p ≤ 0.05) difference in abundance due to iron-restriction were detected. The numbers of up- and down-regulated proteins were 37 and 13, respectively. We noted several proteins, including EsrB, LamB, MalM, MalE, FdaA, and TonB-dependent heme/hemoglobin receptor family proteins responded to iron restriction in E. ictaluri. PMID:26168192

  4. [Identification of proteins interacting with the circadian clock protein PER1 in tumors using bacterial two-hybrid system technique].

    PubMed

    Zhang, Yu; Yao, Youlin; Jiang, Siyuan; Lu, Yilu; Liu, Yunqiang; Tao, Dachang; Zhang, Sizhong; Ma, Yongxin

    2015-04-01

    To identify protein-protein interaction partners of PER1 (period circadian protein homolog 1), a key component of the molecular oscillation system of the circadian rhythm, in tumors using the bacterial two-hybrid system technique. A human cervical carcinoma HeLa cell library was adopted. The recombinant bait plasmid pBT-PER1 and the pTRG cDNA plasmid library were cotransformed into the two-hybrid system reporter strain, which was cultured in a special selective medium. Target clones were screened. After the positive clones were isolated, the target clones were sequenced and analyzed. Fourteen protein-coding genes were identified, 4 of which were found to contain whole coding regions, including optic atrophy 3 protein (OPA3), associated with mitochondrial dynamics, and Homo sapiens cutA divalent cation tolerance homolog of E. coli (CUTA), associated with copper metabolism. Also identified were proteins related to cellular events, proteins involved in biochemical reactions, and signal transduction-related proteins. Identification of potential PER1-interacting proteins in tumors may provide new insights into the functions of the circadian clock protein PER1 during tumorigenesis.

  5. MASCOT HTML and XML parser: an implementation of a novel object model for protein identification data.

    PubMed

    Yang, Chunguang G; Granite, Stephen J; Van Eyk, Jennifer E; Winslow, Raimond L

    2006-11-01

    Protein identification using MS is an important technique in proteomics as well as a major generator of proteomics data. We have designed the protein identification data object model (PDOM) and developed a parser based on this model to facilitate the analysis and storage of these data. The parser works with HTML or XML files saved or exported from MASCOT MS/MS ions search in peptide summary report or MASCOT PMF search in protein summary report. The program creates PDOM objects, eliminates redundancy in the input file, and has the capability to output any PDOM object to a relational database. This program facilitates additional analysis of MASCOT search results and aids the storage of protein identification information. The implementation is extensible and can serve as a template to develop parsers for other search engines. The parser can be used as a stand-alone application or can be driven by other Java programs. It is currently being used as the front end for a system that loads HTML and XML result files of MASCOT searches into a relational database. The source code is freely available at http://www.ccbm.jhu.edu and the program uses only free and open-source Java libraries.

  6. The cassava (Manihot esculenta Crantz) root proteome: protein identification and differential expression.

    PubMed

    Sheffield, Jeanne; Taylor, Nigel; Fauquet, Claude; Chen, Sixue

    2006-03-01

    Using high-resolution 2-DE, we resolved proteins extracted from fibrous and tuberous root tissues of 3-month-old cassava plants. Gel image analysis revealed an average of 1467 electrophoretically resolved spots on the fibrous gels and 1595 spots on the tuberous gels in pH 3-10 range. Protein spots from both sets of gels were digested with trypsin. The digests were subjected to nanoelectrospray quadrupole TOF tandem mass analysis. Currently, we have obtained 299 protein identifications for 292 gel spots corresponding to 237 proteins. The proteins span various functional categories from energy, primary and secondary metabolism, disease and defense, destination and storage, transport, signal transduction, protein synthesis, cell structure, and transcription to cell growth and division. Gel image analysis has shown unique, as well as up- and down-regulated proteins, present in the tuberous and the fibrous tissues. Quantitative and qualitative analysis of the cassava root proteome is an important step towards further characterization of differentially expressed proteins and the elucidation of the mechanisms underlying the development and biological functions of the two types of roots.

  7. SCPRED: accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences.

    PubMed

    Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke

    2008-05-01

    Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are

  8. SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences

    PubMed Central

    Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke

    2008-01-01

    Background: Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. Results: SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. Conclusion: The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. The high predictive accuracy achieved by SCPRED is attributed to the design of...

  9. Identification of Importin 8 (IPO8) as the most accurate reference gene for the clinicopathological analysis of lung specimens

    PubMed Central

    Nguewa, Paul A; Agorreta, Jackeline; Blanco, David; Lozano, Maria Dolores; Gomez-Roman, Javier; Sanchez, Blas A; Valles, Iñaki; Pajares, Maria J; Pio, Ruben; Rodriguez, Maria Jose; Montuenga, Luis M; Calvo, Alfonso

    2008-01-01

    Background: The accurate normalization of differentially expressed genes in lung cancer is essential for the identification of novel therapeutic targets and biomarkers by real time RT-PCR and microarrays. Although classical "housekeeping" genes, such as GAPDH, HPRT1, and beta-actin have been widely used in the past, their accuracy as reference genes for lung tissues has not been proven. Results: We have conducted a thorough analysis of a panel of 16 candidate reference genes for lung specimens and lung cell lines. Gene expression was measured by quantitative real time RT-PCR and expression stability was analyzed with the software packages GeNorm and NormFinder, the mean of |ΔCt| (= |Ct normal - Ct tumor|) ± SEM, and correlation coefficients among genes. Systematic comparison between candidates led us to the identification of a subset of suitable reference genes for clinical samples: IPO8, ACTB, POLR2A, 18S, and PPIA. Further analysis showed that IPO8 had a very low mean of |ΔCt| (0.70 ± 0.09), with no statistically significant differences between normal and malignant samples and with excellent expression stability. Conclusion: Our data show that IPO8 is the most accurate reference gene for clinical lung specimens. In addition, we demonstrate that the commonly used genes GAPDH and HPRT1 are inappropriate to normalize data derived from lung biopsies, although they are suitable as reference genes for lung cell lines. We thus propose IPO8 as a novel reference gene for lung cancer samples. PMID:19014639
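
The stability statistic used in this abstract, the mean of |ΔCt| ± SEM, can be sketched as follows. The code assumes paired normal and tumor Ct values for one candidate gene, which is one plausible reading of the abstract; the function name and the Ct values are hypothetical.

```python
import math
import statistics


def mean_abs_delta_ct(ct_normal, ct_tumor):
    """Return (mean, SEM) of |Ct normal - Ct tumor| over paired samples."""
    if len(ct_normal) != len(ct_tumor) or len(ct_normal) < 2:
        raise ValueError("need at least two paired Ct measurements")
    deltas = [abs(n - t) for n, t in zip(ct_normal, ct_tumor)]
    mean = statistics.mean(deltas)
    # SEM = sample standard deviation divided by the square root of the sample size.
    sem = statistics.stdev(deltas) / math.sqrt(len(deltas))
    return mean, sem


# Hypothetical Ct values for one candidate reference gene (not data from the paper).
ct_normal = [24.0, 25.0, 23.5, 24.5]
ct_tumor = [24.5, 24.0, 24.5, 24.0]
mean, sem = mean_abs_delta_ct(ct_normal, ct_tumor)
print(f"mean |dCt| = {mean:.2f} +/- {sem:.2f}")
```

A smaller mean |ΔCt| indicates that the gene's expression differs less between normal and tumor tissue, which is the property wanted in a reference gene.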

  10. Electrostatics of proteins in dielectric solvent continua. I. An accurate and efficient reaction field description

    NASA Astrophysics Data System (ADS)

    Bauer, Sebastian; Mathias, Gerald; Tavan, Paul

    2014-03-01

    We present a reaction field (RF) method which accurately solves the Poisson equation for proteins embedded in dielectric solvent continua at a computational effort comparable to that of an electrostatics calculation with polarizable molecular mechanics (MM) force fields. The method combines an approach originally suggested by Egwolf and Tavan [J. Chem. Phys. 118, 2039 (2003)] with concepts generalizing the Born solution [Z. Phys. 1, 45 (1920)] for a solvated ion. First, we derive an exact representation according to which the sources of the RF potential and energy are inducible atomic anti-polarization densities and atomic shielding charge distributions. Modeling these atomic densities by Gaussians leads to an approximate representation. Here, the strengths of the Gaussian shielding charge distributions are directly given in terms of the static partial charges as defined, e.g., by standard MM force fields for the various atom types, whereas the strengths of the Gaussian anti-polarization densities are calculated by a self-consistency iteration. The atomic volumes are also described by Gaussians. To account for covalently overlapping atoms, their effective volumes are calculated by another self-consistency procedure, which guarantees that the dielectric function ɛ(r) is close to one everywhere inside the protein. The Gaussian widths σi of the atoms i are parameters of the RF approximation. The remarkable accuracy of the method is demonstrated by comparison with Kirkwood's analytical solution for a spherical protein [J. Chem. Phys. 2, 351 (1934)] and with computationally expensive grid-based numerical solutions for simple model systems in dielectric continua including a di-peptide (Ac-Ala-NHMe) as modeled by a standard MM force field. The latter example shows how weakly the RF conformational free energy landscape depends on the parameters σi. A summarizing discussion highlights the achievements of the new theory and of its approximate solution particularly by

  11. Electrostatics of proteins in dielectric solvent continua. I. An accurate and efficient reaction field description.

    PubMed

    Bauer, Sebastian; Mathias, Gerald; Tavan, Paul

    2014-03-14

    We present a reaction field (RF) method which accurately solves the Poisson equation for proteins embedded in dielectric solvent continua at a computational effort comparable to that of an electrostatics calculation with polarizable molecular mechanics (MM) force fields. The method combines an approach originally suggested by Egwolf and Tavan [J. Chem. Phys. 118, 2039 (2003)] with concepts generalizing the Born solution [Z. Phys. 1, 45 (1920)] for a solvated ion. First, we derive an exact representation according to which the sources of the RF potential and energy are inducible atomic anti-polarization densities and atomic shielding charge distributions. Modeling these atomic densities by Gaussians leads to an approximate representation. Here, the strengths of the Gaussian shielding charge distributions are directly given in terms of the static partial charges as defined, e.g., by standard MM force fields for the various atom types, whereas the strengths of the Gaussian anti-polarization densities are calculated by a self-consistency iteration. The atomic volumes are also described by Gaussians. To account for covalently overlapping atoms, their effective volumes are calculated by another self-consistency procedure, which guarantees that the dielectric function ε(r) is close to one everywhere inside the protein. The Gaussian widths σ(i) of the atoms i are parameters of the RF approximation. The remarkable accuracy of the method is demonstrated by comparison with Kirkwood's analytical solution for a spherical protein [J. Chem. Phys. 2, 351 (1934)] and with computationally expensive grid-based numerical solutions for simple model systems in dielectric continua including a di-peptide (Ac-Ala-NHMe) as modeled by a standard MM force field. The latter example shows how weakly the RF conformational free energy landscape depends on the parameters σ(i). A summarizing discussion highlights the achievements of the new theory and of its approximate solution particularly by

  12. Identification of immunodominant proteins from Mannheimia haemolytica and Histophilus somni by an immunoproteomic approach.

    PubMed

    Alvarez, Angel H; Gutiérrez-Ortega, Abel; Hernández-Gutiérrez, Rodolfo

    2015-10-01

    Mannheimia haemolytica and Histophilus somni are frequently isolated from diseased cattle with bovine respiratory disease (BRD). They compromise animal lung function and the immune responses generated are not sufficient to limit infection. Identification of specific immunogenic antigens for vaccine development represents a great challenge. Immunogenic proteins were identified by immunoproteomic approach with sera from cattle immunized with a commercial cellular vaccine of M. haemolytica and H. somni. Proteins of M. haemolytica were identified as solute ABC transporter, iron-binding protein, and hypothetical protein of capsular biosynthesis. Histophilus somni proteins correspond to porin, amino acid ABC transporter, hypothetical outer membrane protein, cysteine synthase, and outer membrane protein P6. Although these antigens share strong similarities with other proteins from animal pathogens, the ABC system proteins have been associated with virulence and these proteins could be considered as potential vaccine candidates for BRD.

  13. Identification of immunodominant proteins from Mannheimia haemolytica and Histophilus somni by an immunoproteomic approach

    PubMed Central

    Alvarez, Angel H.; Gutiérrez-Ortega, Abel; Hernández-Gutiérrez, Rodolfo

    2015-01-01

    Mannheimia haemolytica and Histophilus somni are frequently isolated from diseased cattle with bovine respiratory disease (BRD). They compromise animal lung function and the immune responses generated are not sufficient to limit infection. Identification of specific immunogenic antigens for vaccine development represents a great challenge. Immunogenic proteins were identified by immunoproteomic approach with sera from cattle immunized with a commercial cellular vaccine of M. haemolytica and H. somni. Proteins of M. haemolytica were identified as solute ABC transporter, iron-binding protein, and hypothetical protein of capsular biosynthesis. Histophilus somni proteins correspond to porin, amino acid ABC transporter, hypothetical outer membrane protein, cysteine synthase, and outer membrane protein P6. Although these antigens share strong similarities with other proteins from animal pathogens, the ABC system proteins have been associated with virulence and these proteins could be considered as potential vaccine candidates for BRD. PMID:26424916

  14. Identification of new intrinsic proteins in Arabidopsis plasma membrane proteome.

    PubMed

    Marmagne, Anne; Rouet, Marie-Aude; Ferro, Myriam; Rolland, Norbert; Alcon, Carine; Joyard, Jacques; Garin, Jérome; Barbier-Brygoo, Hélène; Ephritikhine, Geneviève

    2004-07-01

    Identification and characterization of anion channel genes in plants represent a goal for a better understanding of their central role in cell signaling, osmoregulation, nutrition, and metabolism. Though channel activities have been well characterized in plasma membrane by electrophysiology, the corresponding molecular entities are little documented. Indeed, the hydrophobic protein equipment of plant plasma membrane still remains largely unknown, though several proteomic approaches have been reported. To identify new putative transport systems, we developed a new proteomic strategy based on mass spectrometry analyses of a plasma membrane fraction enriched in hydrophobic proteins. We produced from Arabidopsis cell suspensions a highly purified plasma membrane fraction and characterized it in detail by immunological and enzymatic tests. Using complementary methods for the extraction of hydrophobic proteins and mass spectrometry analyses on mono-dimensional gels, about 100 proteins have been identified, 95% of which had never been found in previous proteomic studies. The inventory of the plasma membrane proteome generated by this approach contains numerous plasma membrane integral proteins, one-third displaying at least four transmembrane segments. The plasma membrane localization was confirmed for several proteins, therefore validating such proteomic strategy. An in silico analysis shows a correlation between the putative functions of the identified proteins and the expected roles for plasma membrane in transport, signaling, cellular traffic, and metabolism. This analysis also reveals 10 proteins that display structural properties compatible with transport functions and will constitute interesting targets for further functional studies.

  15. GEPSI: A Gene Expression Profile Similarity-Based Identification Method of Bioactive Components in Traditional Chinese Medicine Formula.

    PubMed

    Zhang, Baixia; He, Shuaibing; Lv, Chenyang; Zhang, Yanling; Wang, Yun

    2018-01-01

    The identification of bioactive components in traditional Chinese medicine (TCM) is an important part of the TCM material foundation research. Recently, molecular docking technology has been extensively used for the identification of TCM bioactive components. However, target proteins that are used in molecular docking may not be the actual TCM target. For this reason, the bioactive components would likely be omitted or incorrect. To address this problem, this study proposed the GEPSI method that identified the target proteins of TCM based on the similarity of gene expression profiles. The similarity of the gene expression profiles affected by TCM and small molecular drugs was calculated. The pharmacological action of TCM may be similar to that of small molecule drugs that have a high similarity score. Indeed, the target proteins of the small molecule drugs could be considered TCM targets. Thus, we identified the bioactive components of a TCM by molecular docking and verified the reliability of this method by a literature investigation. Using the target proteins that TCM actually affected as targets, the identification of the bioactive components was more accurate. This study provides a fast and effective method for the identification of TCM bioactive components.
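
The abstract does not state which similarity measure GEPSI uses, so the following is only a sketch of the general idea: score each small-molecule drug by how closely its expression profile matches the TCM-induced profile, then rank the drugs. Pearson correlation is an assumption made for illustration, and the drug names and profile values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_drugs(tcm_profile, drug_profiles):
    """Sort drugs by similarity to the TCM profile, most similar first."""
    scores = {name: pearson(tcm_profile, profile)
              for name, profile in drug_profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented log fold-change values over the same five genes (illustration only).
tcm = [1.2, -0.8, 0.5, -1.5, 0.9]
drugs = {
    "drug_A": [1.0, -1.0, 0.4, -1.2, 1.1],
    "drug_B": [-0.9, 0.7, -0.2, 1.4, -1.0],
}
ranking = rank_drugs(tcm, drugs)
print(ranking)
```

Under the approach described in the abstract, the known targets of the top-ranked drugs would then serve as candidate TCM targets for molecular docking.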

  16. GEPSI: A Gene Expression Profile Similarity-Based Identification Method of Bioactive Components in Traditional Chinese Medicine Formula

    PubMed Central

    Zhang, Baixia; He, Shuaibing; Lv, Chenyang; Zhang, Yanling

    2018-01-01

    The identification of bioactive components in traditional Chinese medicine (TCM) is an important part of the TCM material foundation research. Recently, molecular docking technology has been extensively used for the identification of TCM bioactive components. However, target proteins that are used in molecular docking may not be the actual TCM target. For this reason, the bioactive components would likely be omitted or incorrect. To address this problem, this study proposed the GEPSI method that identified the target proteins of TCM based on the similarity of gene expression profiles. The similarity of the gene expression profiles affected by TCM and small molecular drugs was calculated. The pharmacological action of TCM may be similar to that of small molecule drugs that have a high similarity score. Indeed, the target proteins of the small molecule drugs could be considered TCM targets. Thus, we identified the bioactive components of a TCM by molecular docking and verified the reliability of this method by a literature investigation. Using the target proteins that TCM actually affected as targets, the identification of the bioactive components was more accurate. This study provides a fast and effective method for the identification of TCM bioactive components. PMID:29692857

  17. Identification of Novel Tumor-Associated Cell Surface Sialoglycoproteins in Human Glioblastoma Tumors Using Quantitative Proteomics

    PubMed Central

    Autelitano, François; Loyaux, Denis; Roudières, Sébastien; Déon, Catherine; Guette, Frédérique; Fabre, Philippe; Ping, Qinggong; Wang, Su; Auvergne, Romane; Badarinarayana, Vasudeo; Smith, Michael; Guillemot, Jean-Claude; Goldman, Steven A.; Natesan, Sridaran; Ferrara, Pascual; August, Paul

    2014-01-01

    Glioblastoma multiforme (GBM) remains a clinical indication with significant “unmet medical need”. Innovative new therapy to eliminate residual tumor cells and prevent tumor recurrences is critically needed for this deadly disease. A major challenge of GBM research has been the identification of novel molecular therapeutic targets and accurate diagnostic/prognostic biomarkers. Many of the current clinical therapeutic targets of immunotoxins and ligand-directed toxins for high-grade glioma (HGG) cells are surface sialylated glycoproteins. Therefore, methods that systematically and quantitatively analyze cell surface sialoglycoproteins in human clinical tumor samples would be useful for the identification of potential diagnostic markers and therapeutic targets for malignant gliomas. In this study, we used the bioorthogonal chemical reporter strategy (BOCR) in combination with label-free quantitative mass spectrometry (LFQ-MS) to characterize and accurately quantify the individual cell surface sialoproteome in human GBM tissues, in fetal and adult human astrocytes, and in human neural progenitor cells (NPCs). We identified and quantified a total of 843 proteins, including 801 glycoproteins. Among the 843 proteins, 606 (72%) are known cell surface or secreted glycoproteins, including 156 CD-antigens, all major classes of cell surface receptor proteins, transporters, and adhesion proteins. Our findings identified several known as well as new cell surface antigens whose expression is predominantly restricted to human GBM tumors, as confirmed by microarray transcription profiling, quantitative RT-PCR and immunohistochemical staining. This report presents the comprehensive identification of new biomarkers and therapeutic targets for the treatment of malignant gliomas using quantitative sialoglycoproteomics with clinically relevant, patient-derived primary glioma cells. PMID:25360666

  18. Identification of novel tumor-associated cell surface sialoglycoproteins in human glioblastoma tumors using quantitative proteomics.

    PubMed

    Autelitano, François; Loyaux, Denis; Roudières, Sébastien; Déon, Catherine; Guette, Frédérique; Fabre, Philippe; Ping, Qinggong; Wang, Su; Auvergne, Romane; Badarinarayana, Vasudeo; Smith, Michael; Guillemot, Jean-Claude; Goldman, Steven A; Natesan, Sridaran; Ferrara, Pascual; August, Paul

    2014-01-01

    Glioblastoma multiforme (GBM) remains a clinical indication with significant "unmet medical need". Innovative new therapy to eliminate residual tumor cells and prevent tumor recurrences is critically needed for this deadly disease. A major challenge of GBM research has been the identification of novel molecular therapeutic targets and accurate diagnostic/prognostic biomarkers. Many of the current clinical therapeutic targets of immunotoxins and ligand-directed toxins for high-grade glioma (HGG) cells are surface sialylated glycoproteins. Therefore, methods that systematically and quantitatively analyze cell surface sialoglycoproteins in human clinical tumor samples would be useful for the identification of potential diagnostic markers and therapeutic targets for malignant gliomas. In this study, we used the bioorthogonal chemical reporter strategy (BOCR) in combination with label-free quantitative mass spectrometry (LFQ-MS) to characterize and accurately quantify the individual cell surface sialoproteome in human GBM tissues, in fetal and adult human astrocytes, and in human neural progenitor cells (NPCs). We identified and quantified a total of 843 proteins, including 801 glycoproteins. Among the 843 proteins, 606 (72%) are known cell surface or secreted glycoproteins, including 156 CD-antigens, all major classes of cell surface receptor proteins, transporters, and adhesion proteins. Our findings identified several known as well as new cell surface antigens whose expression is predominantly restricted to human GBM tumors, as confirmed by microarray transcription profiling, quantitative RT-PCR and immunohistochemical staining. This report presents the comprehensive identification of new biomarkers and therapeutic targets for the treatment of malignant gliomas using quantitative sialoglycoproteomics with clinically relevant, patient-derived primary glioma cells.

  19. FAMBE-pH: A Fast and Accurate Method to Compute the Total Solvation Free Energies of Proteins

    PubMed Central

    Vorobjev, Yury N.; Vila, Jorge A.

    2009-01-01

    A fast and accurate method to compute the total solvation free energies of proteins as a function of pH is presented. The method makes use of a combination of approaches, some of which have already appeared in the literature: (i) the Poisson equation is solved with an optimized fast adaptive multigrid boundary element (FAMBE) method; (ii) the electrostatic free energies of the ionizable sites are calculated for their neutral and charged states by using a detailed model of atomic charges; (iii) a set of optimal atomic radii is used to define a precise dielectric surface interface; (iv) a multilevel adaptive tessellation of this dielectric surface interface is achieved by using multisized boundary elements; and (v) 1:1 salt effects are included. The equilibrium proton binding/release is calculated with the Tanford–Schellman integral if the proteins contain more than ∼20–25 ionizable groups; for a smaller number of ionizable groups, the ionization partition function is calculated directly. The FAMBE method is tested as a function of pH (FAMBE-pH) with three proteins, namely, bovine pancreatic trypsin inhibitor (BPTI), hen egg white lysozyme (HEWL), and bovine pancreatic ribonuclease A (RNaseA). The results are (a) the FAMBE-pH method reproduces the observed pKa's of the ionizable groups of these proteins within an average absolute value of 0.4 pK units and a maximum error of 1.2 pK units and (b) comparison of the calculated total pH-dependent solvation free energy for BPTI, between the exact calculation of the ionization partition function and the Tanford–Schellman integral method, shows agreement within 1.2 kcal/mol. These results indicate that calculation of total solvation free energies with the FAMBE-pH method can provide an accurate prediction of protein conformational stability at a given fixed pH and, if coupled with molecular mechanics or molecular dynamics methods, can also be used for more realistic studies of protein folding, unfolding, and dynamics.

  20. Accurate spectroscopic characterization of oxirane: A valuable route to its identification in Titan's atmosphere and the assignment of unidentified infrared bands

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Puzzarini, Cristina; Biczysko, Malgorzata; Bloino, Julien

    2014-04-20

    In an effort to provide an accurate spectroscopic characterization of oxirane, state-of-the-art computational methods and approaches have been employed to determine highly accurate fundamental vibrational frequencies and rotational parameters. Available experimental data were used to assess the reliability of our computations, and an accuracy on average of 10 cm⁻¹ for fundamental transitions as well as overtones and combination bands has been pointed out. Moving to rotational spectroscopy, relative discrepancies of 0.1%, 2%-3%, and 3%-4% were observed for rotational, quartic, and sextic centrifugal-distortion constants, respectively. We are therefore confident that the highly accurate spectroscopic data provided herein can be useful for identification of oxirane in Titan's atmosphere and the assignment of unidentified infrared bands. Since oxirane was already observed in the interstellar medium and some astronomical objects are characterized by very high D/H ratios, we also considered the accurate determination of the spectroscopic parameters for the mono-deuterated species, oxirane-d1. For the latter, an empirical scaling procedure allowed us to improve our computed data and to provide predictions for rotational transitions with a relative accuracy of about 0.02% (i.e., an uncertainty of about 40 MHz for a transition lying at 200 GHz).

  1. Accurate mass screening and identification of emerging contaminants in environmental samples by liquid chromatography-hybrid linear ion trap Orbitrap mass spectrometry.

    PubMed

    Hogenboom, A C; van Leerdam, J A; de Voogt, P

    2009-01-16

    The European Reach legislation will possibly drive producers to develop newly designed chemicals that will be less persistent, bioaccumulative or toxic. If this innovation leads to an increased use of more hydrophilic chemicals it may result in higher mobilities of chemicals in the aqueous environment. As a result, the drinking water companies may face stronger demands on removal processes as the hydrophilic compounds inherently are more difficult to remove. Monitoring efforts will also experience a shift in focus to more water-soluble compounds. Screening source waters on the presence of (emerging) contaminants is an essential step in the control of the water cycle from source to tap water. In this article, some of our experiences are presented with the hybrid linear ion trap (LTQ) FT Orbitrap mass spectrometer, in the area of chemical water analysis. A two-pronged strategy in mass spectrometric research was employed: (i) exploring effluent, surface, ground- and drinking-water samples searching for accurate masses corresponding to target compounds (and their product ions) known from, e.g. priority lists or the scientific literature and (ii) full-scan screening of water samples in search of 'unknown' or unexpected masses, followed by MS(n) experiments to elucidate the structure of the unknowns. Applications of both approaches to emerging water contaminants are presented and discussed. Results are presented for target analysis search for pharmaceuticals, benzotriazoles, illicit drugs and for the identification of unknown compounds in a groundwater sample and in a polar extract of a landfill soil sample (a toxicity identification evaluation bioassay sample). The applications of accurate mass screening and identification described in this article demonstrate that the LC-LTQ FT Orbitrap MS is well equipped to meet the challenges posed by newly emerging polar contaminants.

  2. Identification and characterization of moonlighting long non-coding RNAs based on RNA and protein interactome.

    PubMed

    Cheng, Lixin; Leung, Kwong-Sak

    2018-05-16

    Moonlighting proteins are a class of proteins having multiple distinct functions, which play essential roles in a variety of cellular and enzymatic functioning systems. Although there have long been calls for computational algorithms for the identification of moonlighting proteins, research on approaches to identify moonlighting long non-coding RNAs (lncRNAs) has never been undertaken. Here, we introduce a novel methodology, MoonFinder, for the identification of moonlighting lncRNAs. MoonFinder is a statistical algorithm identifying moonlighting lncRNAs without a priori knowledge through the integration of protein interactome, RNA-protein interactions, and functional annotation of proteins. We identify 155 moonlighting lncRNA candidates and uncover that they are a distinct class of lncRNAs characterized by specific sequence and cellular localization features. The non-coding genes that transcribe moonlighting lncRNAs tend to have shorter but more exons and the moonlighting lncRNAs have a variable localization pattern with a high chance of residing in the cytoplasmic compartment in comparison to the other lncRNAs. Moreover, moonlighting lncRNAs and moonlighting proteins are rather mutually exclusive in terms of both their direct interactions and interacting partners. Our results also shed light on how the moonlighting candidates and their interacting proteins are implicated in the formation and development of cancers and other diseases. The code implementing MoonFinder is supplied as an R package in the supplementary material. Contact: lxcheng@cse.cuhk.edu.hk or ksleung@cse.cuhk.edu.hk. Supplementary data are available at Bioinformatics online.

  3. NMRDSP: an accurate prediction of protein shape strings from NMR chemical shifts and sequence data.

    PubMed

    Mao, Wusong; Cong, Peisheng; Wang, Zhiheng; Lu, Longjian; Zhu, Zhongliang; Li, Tonghua

    2013-01-01

    Shape string is structural sequence and is an extremely important structure representation of protein backbone conformations. Nuclear magnetic resonance chemical shifts give a strong correlation with the local protein structure, and are exploited to predict protein structures in conjunction with computational approaches. Here we demonstrate a novel approach, NMRDSP, which can accurately predict the protein shape string based on nuclear magnetic resonance chemical shifts and structural profiles obtained from sequence data. The NMRDSP uses six chemical shifts (HA, H, N, CA, CB and C) and eight elements of structure profiles as features, a non-redundant set (1,003 entries) as the training set, and a conditional random field as a classification algorithm. For an independent testing set (203 entries), we achieved an accuracy of 75.8% for S8 (the eight states accuracy) and 87.8% for S3 (the three states accuracy). This is higher than only using chemical shifts or sequence data, and confirms that the chemical shift and the structure profile are significant features for shape string prediction and their combination prominently improves the accuracy of the predictor. We have constructed the NMRDSP web server and believe it could be employed to provide a solid platform to predict other protein structures and functions. The NMRDSP web server is freely available at http://cal.tongji.edu.cn/NMRDSP/index.jsp.

  4. Identification of a dual-specificity protein phosphatase that inactivates a MAP kinase from Arabidopsis

    NASA Technical Reports Server (NTRS)

    Gupta, R.; Huang, Y.; Kieber, J.; Luan, S.; Evans, M. L. (Principal Investigator)

    1998-01-01

    Mitogen-activated protein kinases (MAPKs) play a key role in plant responses to stress and pathogens. Activation and inactivation of MAPKs involve phosphorylation and dephosphorylation on both threonine and tyrosine residues in the kinase domain. Here we report the identification of an Arabidopsis gene encoding a dual-specificity protein phosphatase capable of hydrolysing both phosphoserine/threonine and phosphotyrosine in protein substrates. This enzyme, designated AtDsPTP1 (Arabidopsis thaliana dual-specificity protein tyrosine phosphatase), dephosphorylated and inactivated AtMPK4, a MAPK member from the same plant. Replacement of a highly conserved cysteine by serine abolished phosphatase activity of AtDsPTP1, indicating a conserved catalytic mechanism of dual-specificity protein phosphatases from all eukaryotes.

  5. Spatial Resolution Requirements for Accurate Identification of Drivers of Atrial Fibrillation

    PubMed Central

    Roney, Caroline H.; Cantwell, Chris D.; Bayer, Jason D.; Qureshi, Norman A.; Lim, Phang Boon; Tweedy, Jennifer H.; Kanagaratnam, Prapa; Vigmond, Edward J.; Ng, Fu Siong

    2017-01-01

    Background— Recent studies have demonstrated conflicting mechanisms underlying atrial fibrillation (AF), with the spatial resolution of data often cited as a potential reason for the disagreement. The purpose of this study was to investigate whether the variation in spatial resolution of mapping may lead to misinterpretation of the underlying mechanism in persistent AF. Methods and Results— Simulations of rotors and focal sources were performed to estimate the minimum number of recording points required to correctly identify the underlying AF mechanism. The effects of different data types (action potentials and unipolar or bipolar electrograms) and rotor stability on resolution requirements were investigated. We also determined the ability of clinically used endocardial catheters to identify AF mechanisms using clinically recorded and simulated data. The spatial resolution required for correct identification of rotors and focal sources is a linear function of spatial wavelength (the distance between wavefronts) of the arrhythmia. Rotor localization errors are larger for electrogram data than for action potential data. Stationary rotors are more reliably identified compared with meandering trajectories, for any given spatial resolution. All clinical high-resolution multipolar catheters are of sufficient resolution to accurately detect and track rotors when placed over the rotor core although the low-resolution basket catheter is prone to false detections and may incorrectly identify rotors that are not present. Conclusions— The spatial resolution of AF data can significantly affect the interpretation of the underlying AF mechanism. Therefore, the interpretation of human AF data must be taken in the context of the spatial resolution of the recordings. PMID:28500175

  6. A practical guide for the identification of membrane and plasma membrane proteins in human embryonic stem cells and human embryonal carcinoma cells.

    PubMed

    Dormeyer, Wilma; van Hoof, Dennis; Mummery, Christine L; Krijgsveld, Jeroen; Heck, Albert J R

    2008-10-01

    The identification of (plasma) membrane proteins in cells can provide valuable insights into the regulation of their biological processes. Pluripotent cells such as human embryonic stem cells and embryonal carcinoma cells are capable of unlimited self-renewal and share many of the biological mechanisms that regulate proliferation and differentiation. The comparison of their membrane proteomes will help unravel the biological principles of pluripotency, and the identification of biomarker proteins in their plasma membranes is considered a crucial step to fully exploit pluripotent cells for therapeutic purposes. For these tasks, membrane proteomics is the method of choice, but as indicated by the scarce identification of membrane and plasma membrane proteins in global proteomic surveys it is not an easy task. In this minireview, we first describe the general challenges of membrane proteomics. We then review current sample preparation steps and discuss protocols that we found particularly beneficial for the identification of large numbers of (plasma) membrane proteins in human tumour- and embryo-derived stem cells. Our optimized assembled protocol led to the identification of a large number of membrane proteins. However, as the composition of cells and membranes is highly variable we still recommend adapting the sample preparation protocol for each individual system.

  7. Cy5 maleimide labelling for sensitive detection of free thiols in native protein extracts: identification of seed proteins targeted by barley thioredoxin h isoforms.

    PubMed Central

    Maeda, Kenji; Finnie, Christine; Svensson, Birte

    2004-01-01

    Barley thioredoxin h isoforms HvTrxh1 and HvTrxh2 differ in temporal and spatial distribution and in kinetic properties. Target proteins of HvTrxh1 and HvTrxh2 were identified in mature seeds and in seeds after 72 h of germination. Improvement of the established method for identification of thioredoxin-targeted proteins based on two-dimensional electrophoresis and fluorescence labelling of thiol groups was achieved by application of a highly sensitive Cy5 maleimide dye and large-format two-dimensional gels, resulting in a 10-fold increase in the observed number of labelled protein spots. The technique also provided information about accessible thiol groups in the proteins identified in the barley seed proteome. In total, 16 different putative target proteins were identified from 26 spots using tryptic in-gel digestion, matrix-assisted laser-desorption ionization-time-of-flight MS and database search. HvTrxh1 and HvTrxh2 were shown to have similar target specificity. Barley alpha-amylase/subtilisin inhibitor, previously demonstrated to be reduced by both HvTrxh1 and HvTrxh2, was among the identified target proteins, confirming the suitability of the method. Several alpha-amylase/trypsin inhibitors, some of which are already known as target proteins of thioredoxin h, and cyclophilin known as a target protein of m-type thioredoxin were also identified. Lipid transfer protein, embryospecific protein, three chitinase isoenzymes, a single-domain glyoxalase-like protein and superoxide dismutase were novel identifications of putative target proteins, suggesting new physiological roles of thioredoxin h in barley seeds. PMID:14636158

  8. Identification of the protein components displaying immunomodulatory activity in aged garlic extract.

    PubMed

    Chandrashekar, P M; Venkatesh, Y P

    2009-07-30

    Traditionally, garlic (Allium sativum L.; Alliaceae) has been known to boost the immune system. Aged garlic has more potent immunomodulatory effects than raw garlic. These effects have been attributed to the transformed organosulfur compounds; the identity of the immunomodulatory proteins in aged garlic extract (AGE) is not known. The major aims are to examine the changes occurring in the protein fraction during ageing of garlic and to identify the immunomodulatory proteins. Changes occurring in garlic during ageing have been examined by protein quantitation and gel electrophoresis. Purification and identification of the immunomodulatory proteins have been achieved by Q-Sepharose chromatography and mitogenic activity. Only two major proteins (12-14 kDa range by SDS-PAGE) are observed in AGE. The purified protein components QA-1, QA-2, and QA-3 display immunomodulatory and mannose-binding activity; QA-2 shows the highest mitogenic activity. The identity of QA-2 and QA-1 proteins with the garlic lectins ASA I and ASA II, respectively, has been confirmed by hemagglutination analysis. QA-3 exhibits mitogenic activity, but no hemagglutination activity. The immunomodulatory activity of AGE is also contributed by immunomodulatory proteins. The major immunomodulatory proteins have been identified as the well-known garlic lectins.

  9. Identification of DNA-Binding Proteins Using Structural, Electrostatic and Evolutionary Features

    PubMed Central

    Nimrod, Guy; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir

    2009-01-01

    Summary: DNA binding proteins (DBPs) often take part in various crucial processes of the cell's life cycle. Therefore, the identification and characterization of these proteins are of great importance. We present here a random forests classifier for identifying DBPs among proteins with known three-dimensional structures. First, clusters of evolutionarily conserved regions (patches) on the protein's surface are detected using the PatchFinder algorithm; previous studies showed that these regions are typically the proteins' functionally important regions. Next, we train a classifier using features like the electrostatic potential, cluster-based amino acid conservation patterns and the secondary structure content of the patches, as well as features of the whole protein including its dipole moment. Using 10-fold cross validation on a dataset of 138 DNA-binding proteins and 110 proteins which do not bind DNA, the classifier achieved a sensitivity and a specificity of 0.90, which is overall better than the performance of previously published methods. Furthermore, when we tested 5 different methods on 11 new DBPs which did not appear in the original dataset, only our method annotated all correctly. The resulting classifier was applied to a collection of 757 proteins of known structure and unknown function. Of these proteins, 218 were predicted to bind DNA, and we anticipate that some of them interact with DNA using new structural motifs. The use of complementary computational tools supports the notion that at least some of them do bind DNA. PMID:19233205
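
    For readers unfamiliar with the two figures of merit quoted above, the sketch below shows how sensitivity and specificity follow from confusion-matrix counts. The counts in the example are hypothetical, chosen only to match the dataset sizes (138 binders, 110 non-binders) and to land near 0.90; they are not the paper's actual results.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts: 124 of 138 DNA-binding proteins recovered,
# 99 of 110 non-binding proteins correctly rejected.
sens, spec = sensitivity_specificity(tp=124, fn=14, tn=99, fp=11)
```

    In a 10-fold cross-validation these counts would be accumulated over the ten held-out folds before the two ratios are taken.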

  10. Proteomics meets blood banking: identification of protein targets for the improvement of platelet quality.

    PubMed

    Schubert, Peter; Devine, Dana V

    2010-01-03

    Proteomics has brought new perspectives to the fields of hematology and transfusion medicine in the last decade. The steady improvement of proteomic technology is propelling novel discoveries of molecular mechanisms by studying protein expression, post-translational modifications and protein interactions. This review article focuses on the application of proteomics to the identification of molecular mechanisms leading to the deterioration of blood platelets during storage - a critical aspect in the provision of platelet transfusion products. Several proteomic approaches have been employed to analyse changes in the platelet protein profile during storage and the obtained data now need to be translated into platelet biochemistry in order to connect the results to platelet function. Targeted biochemical applications then allow the identification of points for intervention in signal transduction pathways. Once validated and placed in a transfusion context, these data will provide further understanding of the underlying molecular mechanisms leading to platelet storage lesion. Future aspects of proteomics in blood banking will aim to make use of protein markers identified for platelet storage lesion development to monitor proteome changes when alterations such as the use of additive solutions or pathogen reduction strategies are put in place in order to improve platelet quality for patients. (c) 2009 Elsevier B.V. All rights reserved.

  11. P185-M Protein Identification and Validation of Results in Workflows that Integrate over Various Instruments, Datasets, Search Engines

    PubMed Central

    Hufnagel, P.; Glandorf, J.; Körting, G.; Jabs, W.; Schweiger-Hufnagel, U.; Hahner, S.; Lubeck, M.; Suckau, D.

    2007-01-01

    Analysis of complex proteomes often results in long protein lists, but falls short in measuring the validity of identification and quantification results on a greater number of proteins. Biological and technical replicates are mandatory, as is the combination of the MS data from various workflows (gels, 1D-LC, 2D-LC), instruments (TOF/TOF, trap, qTOF or FTMS), and search engines. We describe a database-driven study that combines two workflows, two mass spectrometers, and four search engines with protein identification following a decoy database strategy. The sample was a tryptically digested lysate (10,000 cells) of a human colorectal cancer cell line. Data from two LC-MALDI-TOF/TOF runs and a 2D-LC-ESI-trap run using capillary and nano-LC columns were submitted to the proteomics software platform ProteinScape. The combined MALDI data and the ESI data were searched using Mascot (Matrix Science), Phenyx (GeneBio), ProteinSolver (Bruker and Protagen), and Sequest (Thermo) against a decoy database generated from IPI-human in order to obtain one protein list across all workflows and search engines at a defined maximum false-positive rate of 5%. ProteinScape combined the data to one LC-MALDI and one LC-ESI dataset. The initial separate searches from the two combined datasets generated eight independent peptide lists. These were compiled into an integrated protein list using the ProteinExtractor algorithm. An initial evaluation of the generated data led to the identification of approximately 1200 proteins. Result integration on a peptide level allowed discrimination of protein isoforms that would not have been possible with a mere combination of protein lists.
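
    The decoy database strategy mentioned above can be illustrated with a generic target-decoy estimate. This is a minimal sketch under stated assumptions, not the ProteinScape or ProteinExtractor procedure: hits are assumed to be (score, is_decoy) pairs with higher scores meaning better matches, and the error rate at a cutoff is estimated as the number of decoy hits divided by the number of target hits at or above that score. The function name score_cutoff_for_fdr is hypothetical.

```python
from typing import List, Tuple


def score_cutoff_for_fdr(hits: List[Tuple[float, bool]],
                         max_fdr: float = 0.05) -> float:
    """Return the lowest score cutoff whose estimated error rate
    (decoys / targets among hits at or above the cutoff) is <= max_fdr.
    Returns float('inf') when no cutoff qualifies.
    Tied scores are processed in sorted order, not as a group."""
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    targets = decoys = 0
    cutoff = float("inf")
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = score  # everything down to this score is accepted
    return cutoff


# Toy example (hypothetical scores): one decoy among four hits.
toy_hits = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
strict = score_cutoff_for_fdr(toy_hits, max_fdr=0.05)
lenient = score_cutoff_for_fdr(toy_hits, max_fdr=0.5)
```

    With the strict 5% limit only the two hits scoring above the decoy are accepted; relaxing the limit to 50% admits all four.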

  12. Stable and accurate methods for identification of water bodies from Landsat series imagery using meta-heuristic algorithms

    NASA Astrophysics Data System (ADS)

    Gamshadzaei, Mohammad Hossein; Rahimzadegan, Majid

    2017-10-01

    Identification of water extents in Landsat images is challenging due to surfaces with similar reflectance to water extents. The objective of this study is to provide stable and accurate methods for identifying water extents in Landsat images based on meta-heuristic algorithms. Then, seven Landsat images were selected from various environmental regions in Iran. Training of the algorithms was performed using 40 water pixels and 40 nonwater pixels in operational land imager images of Chitgar Lake (one of the study regions). Moreover, high-resolution images from Google Earth were digitized to evaluate the results. Two approaches were considered: index-based and artificial intelligence (AI) algorithms. In the first approach, nine common water spectral indices were investigated. AI algorithms were utilized to acquire coefficients of optimal band combinations to extract water extents. Among the AI algorithms, the artificial neural network algorithm and also the ant colony optimization, genetic algorithm, and particle swarm optimization (PSO) meta-heuristic algorithms were implemented. Index-based methods represented different performances in various regions. Among AI methods, PSO had the best performance with average overall accuracy and kappa coefficient of 93% and 98%, respectively. The results indicated the applicability of acquired band combinations to extract accurately and stably water extents in Landsat imagery.
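
    The two evaluation measures named above, overall accuracy and the kappa coefficient, can both be computed from a water/non-water confusion matrix. The sketch below assumes a binary classification and uses hypothetical pixel counts; it is not the authors' evaluation code.

```python
from typing import Tuple


def accuracy_and_kappa(tp: int, fn: int, fp: int, tn: int) -> Tuple[float, float]:
    """Overall accuracy and Cohen's kappa for a 2x2 confusion matrix.

    tp: water pixels classified as water     fn: water classified as non-water
    fp: non-water classified as water        tn: non-water classified as non-water
    """
    n = tp + fn + fp + tn
    if n == 0:
        raise ValueError("empty confusion matrix")
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts: 100 reference pixels, 90 classified correctly.
oa, kappa = accuracy_and_kappa(tp=45, fn=5, fp=5, tn=45)
```

    By construction kappa cannot exceed the overall accuracy, which is a quick consistency check on any reported pair of values.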

  13. Identification, sequencing and expression of an integral membrane protein of the trans-Golgi network (TGN38).

    PubMed Central

    Luzio, J P; Brake, B; Banting, G; Howell, K E; Braghetta, P; Stanley, K K

    1990-01-01

    Organelle-specific integral membrane proteins were identified by a novel strategy which gives rise to monospecific antibodies to these proteins as well as to the cDNA clones encoding them. A cDNA expression library was screened with a polyclonal antiserum raised against Triton X-114-extracted organelle proteins and clones were then grouped using antibodies affinity-purified on individual fusion proteins. The identification, molecular cloning and sequencing are described of a type 1 membrane protein (TGN38) which is located specifically in the trans-Golgi network. PMID:2204342

  14. Identification and Characterization of Arabidopsis Seed Coat Mucilage Proteins

    PubMed Central

    Tsai, Allen Yi-Lun; Kunieda, Tadashi; Rogalski, Jason; Foster, Leonard J.; Ellis, Brian E.

    2017-01-01

    Plant cell wall proteins are important regulators of cell wall architecture and function. However, because cell wall proteins are difficult to extract and analyze, they are generally poorly understood. Here, we describe the identification and characterization of proteins integral to the Arabidopsis (Arabidopsis thaliana) seed coat mucilage, a specialized layer of the extracellular matrix composed of plant cell wall carbohydrates that is used as a model for cell wall research. The proteins identified in mucilage include those previously identified by genetic analysis, and several mucilage proteins are reduced in mucilage-deficient mutant seeds, suggesting that these proteins are genuinely associated with the mucilage. Arabidopsis mucilage has both nonadherent and adherent layers. Both layers have similar protein profiles except for proteins involved in lipid metabolism, which are present exclusively in the adherent mucilage. The most abundant mucilage proteins include a family of proteins named TESTA ABUNDANT1 (TBA1) to TBA3; a less abundant fourth homolog was named TBA-LIKE (TBAL). TBA and TBAL transcripts and promoter activities were detected in developing seed coats, and their expression requires seed coat differentiation regulators. TBA proteins are secreted to the mucilage pocket during differentiation. Although reverse genetics failed to identify a function for TBAs/TBAL, the TBA promoters are highly expressed and cell type specific and so should be very useful tools for targeting proteins to the seed coat epidermis. Altogether, these results highlight the mucilage proteome as a model for cell walls in general, as it shares similarities with other cell wall proteomes while also containing mucilage-specific features. PMID:28003327

  15. Estimating the Efficiency of Phosphopeptide Identification by Tandem Mass Spectrometry

    NASA Astrophysics Data System (ADS)

    Hsu, Chuan-Chih; Xue, Liang; Arrington, Justine V.; Wang, Pengcheng; Paez Paez, Juan Sebastian; Zhou, Yuan; Zhu, Jian-Kang; Tao, W. Andy

    2017-06-01

    Mass spectrometry has played a significant role in the identification of unknown phosphoproteins and sites of phosphorylation in biological samples. Analyses of protein phosphorylation, particularly large scale phosphoproteomic experiments, have recently been enhanced by efficient enrichment, fast and accurate instrumentation, and better software, but challenges remain because of the low stoichiometry of phosphorylation and poor phosphopeptide ionization efficiency and fragmentation due to neutral loss. Phosphoproteomics has become an important dimension in systems biology studies, and it is essential to have efficient analytical tools to cover a broad range of signaling events. To evaluate current mass spectrometric performance, we present here a novel method to estimate the efficiency of phosphopeptide identification by tandem mass spectrometry. Phosphopeptides were directly isolated from whole plant cell extracts, dephosphorylated, and then incubated with one of three purified kinases (casein kinase II, mitogen-activated protein kinase 6, and SNF-related protein kinase 2.6) along with 16O4- and 18O4-ATP separately for in vitro kinase reactions. Phosphopeptides were enriched and analyzed by LC-MS. The phosphopeptide identification rate was estimated by comparing phosphopeptides identified by tandem mass spectrometry with phosphopeptide pairs generated by stable isotope labeled kinase reactions. Overall, we found that current high speed and high accuracy mass spectrometers can only identify 20%-40% of total phosphopeptides primarily due to relatively poor fragmentation, additional modifications, and low abundance, highlighting the urgent need for continuous efforts to improve phosphopeptide identification efficiency.

  16. Optimization of the GBMV2 implicit solvent force field for accurate simulation of protein conformational equilibria.

    PubMed

    Lee, Kuo Hao; Chen, Jianhan

    2017-06-15

    Accurate treatment of solvent environment is critical for reliable simulations of protein conformational equilibria. Implicit treatment of solvation, such as using the generalized Born (GB) class of models arguably provides an optimal balance between computational efficiency and physical accuracy. Yet, GB models are frequently plagued by a tendency to generate overly compact structures. The physical origins of this drawback are relatively well understood, and the key to a balanced implicit solvent protein force field is careful optimization of physical parameters to achieve a sufficient level of cancellation of errors. The latter has been hampered by the difficulty of generating converged conformational ensembles of non-trivial model proteins using the popular replica exchange sampling technique. Here, we leverage improved sampling efficiency of a newly developed multi-scale enhanced sampling technique to re-optimize the generalized-Born with molecular volume (GBMV2) implicit solvent model with the CHARMM36 protein force field. Recursive optimization of key GBMV2 parameters (such as input radii) and protein torsion profiles (via the CMAP torsion cross terms) has led to a more balanced GBMV2 protein force field that recapitulates the structures and stabilities of both helical and β-hairpin model peptides. Importantly, this force field appears to be free of the over-compaction bias, and can generate structural ensembles of several intrinsically disordered proteins of various lengths that seem highly consistent with available experimental data. © 2017 Wiley Periodicals, Inc.

  17. Identification of lipopolysaccharide-interacting plasma membrane-type proteins in Arabidopsis thaliana.

    PubMed

    Vilakazi, Cornelius S; Dubery, Ian A; Piater, Lizelle A

    2017-02-01

    Lipopolysaccharide (LPS) is an amphipathic bacterial glycoconjugate found on the external membrane of Gram-negative bacteria. This endotoxin is considered as a microbe-associated molecular pattern (MAMP) molecule and has been shown to elicit defense responses in plants. Here, LPS-interacting proteins from Arabidopsis thaliana plasma membrane (PM)-type fractions were captured and identified in order to investigate those involved in LPS perception and linked to triggering of innate immune responses. A novel proteomics-based affinity-capture strategy coupled to liquid chromatography-tandem mass spectrometry (LC-MS/MS) was employed for the enrichment and identification of LPS-interacting proteins. As such, LPS isolated from Burkholderia cepacia (LPS B.cep.) was immobilized on three independent and distinct affinity-based matrices to serve as bait for interacting proteins from A. thaliana leaf and callus tissue. These were resolved by 1D electrophoresis and identified by mass spectrometry. Proteins specifically bound to LPS B.cep. have been implicated in membrane structure (e.g. COBRA-like and tubulin proteins), membrane trafficking and/or transport (e.g. soluble NSF attachment protein receptor (SNARE) proteins, patellin, aquaporin, PM intrinsic proteins (PIP) and H+-ATPase), signal transduction (receptor-like kinases and calcium-dependent protein kinases) as well as defense/stress responses (e.g. hypersensitive-induced response (HIR) proteins, jacalin-like lectin domain-containing protein and myrosinase-binding proteins). The novel affinity-capture strategy for the enrichment of LPS-interacting proteins proved to be effective, especially in the binding of proteins involved in plant defense responses, and can thus be used to elucidate LPS-mediated molecular recognition and disease mechanism(s). Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  18. Bayesian module identification from multiple noisy networks.

    PubMed

    Zamani Dadaneh, Siamak; Qian, Xiaoning

    2016-12-01

    Module identification has been studied extensively in order to gain deeper understanding of complex systems, such as social networks as well as biological networks. Modules are often defined as groups of vertices in these networks that are topologically cohesive with similar interaction patterns with the rest of the vertices. Most of the existing module identification algorithms assume that the given networks are faithfully measured without errors. However, in many real-world applications, for example, when analyzing protein-protein interaction networks from high-throughput profiling techniques, there is significant noise with both false positive and missing links between vertices. In this paper, we propose a new model for more robust module identification by taking advantage of multiple observed networks with significant noise so that signals in multiple networks can be strengthened and help improve the solution quality by combining information from various sources. We adopt a hierarchical Bayesian model to integrate multiple noisy snapshots that capture the underlying modular structure of the networks under study. By introducing a latent root assignment matrix and its relations to instantaneous module assignments in all the observed networks to capture the underlying modular structure and combine information across multiple networks, an efficient variational Bayes algorithm can be derived to accurately and robustly identify the underlying modules from multiple noisy networks. Experiments on synthetic and protein-protein interaction data sets show that our proposed model enhances both the accuracy and resolution in detecting cohesive modules, and it is less vulnerable to noise in the observed data. In addition, it shows higher power in predicting missing edges compared to individual-network methods.

  19. Proteomic platform for the identification of proteins in olive (Olea europaea) pulp.

    PubMed

    Capriotti, Anna Laura; Cavaliere, Chiara; Foglia, Patrizia; Piovesana, Susy; Samperi, Roberto; Stampachiacchiere, Serena; Laganà, Aldo

    2013-10-24

    The nutritional and cancer-protective properties of the oil extracted mechanically from the ripe fruits of Olea europaea trees are attracting constantly more attention worldwide. The preparation of high-quality protein samples from plant tissues for proteomic analysis poses many challenging problems. In this study we employed a proteomic platform based on two different extraction methods, SDS and CHAPS based protocols, followed by two precipitation protocols, TCA/acetone and MeOH precipitation, in order to increase the final number of identified proteins. The use of advanced MS techniques in combination with the Swissprot and NCBI Viridiplantae databases and TAIR10 Arabidopsis database allowed us to identify 1265 proteins, of which 22 belong to O. europaea. The application of this proteomic platform for protein extraction and identification will be useful also for other proteomic studies on recalcitrant plant/fruit tissues. Copyright © 2013. Published by Elsevier B.V.

  20. Identification of DNA-binding proteins using structural, electrostatic and evolutionary features.

    PubMed

    Nimrod, Guy; Szilágyi, András; Leslie, Christina; Ben-Tal, Nir

    2009-04-10

    DNA-binding proteins (DBPs) participate in various crucial processes in the life-cycle of the cells, and the identification and characterization of these proteins is of great importance. We present here a random forests classifier for identifying DBPs among proteins with known 3D structures. First, clusters of evolutionarily conserved regions (patches) on the surface of proteins were detected using the PatchFinder algorithm; earlier studies showed that these regions are typically the functionally important regions of proteins. Next, we trained a classifier using features like the electrostatic potential, cluster-based amino acid conservation patterns and the secondary structure content of the patches, as well as features of the whole protein, including its dipole moment. Using 10-fold cross-validation on a dataset of 138 DBPs and 110 proteins that do not bind DNA, the classifier achieved a sensitivity and a specificity of 0.90, which is overall better than the performance of published methods. Furthermore, when we tested five different methods on 11 new DBPs that did not appear in the original dataset, only our method annotated all correctly. The resulting classifier was applied to a collection of 757 proteins of known structure and unknown function. Of these proteins, 218 were predicted to bind DNA, and we anticipate that some of them interact with DNA using new structural motifs. The use of complementary computational tools supports the notion that at least some of them do bind DNA.
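
    The sensitivity and specificity quoted in the abstract above are standard confusion-matrix ratios. The following is a minimal sketch of how they are computed; the counts are hypothetical toy values chosen for illustration and are not taken from the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True-positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True-negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical toy counts (NOT from the study): 10 binders, 20 non-binders.
tp, fn = 9, 1
tn, fp = 18, 2

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

    In a cross-validation setting these ratios would typically be computed on the held-out predictions pooled over all folds.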

  1. Mass spectrometry applied to the identification of Mycobacterium tuberculosis and biomarker discovery.

    PubMed

    López-Hernández, Y; Patiño-Rodríguez, O; García-Orta, S T; Pinos-Rodríguez, J M

    2016-12-01

    An adequate and effective tuberculosis (TB) diagnosis system has been identified by the World Health Organization as a priority in the fight against this disease. Over the years, several methods have been developed to identify the bacillus, but bacterial culture remains one of the most affordable methods for most countries. For rapid and accurate identification, however, it is more feasible to implement molecular techniques, taking advantage of the availability of public databases containing protein sequences. Mass spectrometry (MS) has become an interesting technique for the identification of TB. Here, we review some of the most widely employed methods for identifying Mycobacterium tuberculosis and present an update on MS applied for the identification of mycobacterial species. © 2016 The Society for Applied Microbiology.

  2. Genome-Wide Identification and Expression of Xenopus F-Box Family of Proteins.

    PubMed

    Saritas-Yildirim, Banu; Pliner, Hannah A; Ochoa, Angelica; Silva, Elena M

    2015-01-01

    Protein degradation via the multistep ubiquitin/26S proteasome pathway is a rapid way to alter the protein profile and drive cell processes and developmental changes. Many key regulators of embryonic development are targeted for degradation by E3 ubiquitin ligases. The most studied family of E3 ubiquitin ligases is the SCF ubiquitin ligases, which use F-box adaptor proteins to recognize and recruit target proteins. Here, we used a bioinformatics screen and phylogenetic analysis to identify and annotate the family of F-box proteins in the Xenopus tropicalis genome. To shed light on the function of the F-box proteins, we analyzed expression of F-box genes during early stages of Xenopus development. Many F-box genes are broadly expressed with expression domains localized to diverse tissues including brain, spinal cord, eye, neural crest derivatives, somites, kidneys, and heart. All together, our genome-wide identification and expression profiling of the Xenopus F-box family of proteins provide a foundation for future research aimed to identify the precise role of F-box dependent E3 ubiquitin ligases and their targets in the regulatory circuits of development.

  3. Identification and characterization of intracellular proteins that bind oligonucleotides with phosphorothioate linkages

    PubMed Central

    Liang, Xue-hai; Sun, Hong; Shen, Wen; Crooke, Stanley T.

    2015-01-01

    Although the RNase H-dependent mechanism of inhibition of gene expression by chemically modified antisense oligonucleotides (ASOs) has been well characterized, little is known about the interactions between ASOs and intracellular proteins that may alter cellular localization and/or potency of ASOs. Here, we report the identification of 56 intracellular ASO-binding proteins using multi-step affinity selection approaches. Many of the tested proteins had no significant effect on ASO activity; however, some proteins, including La/SSB, NPM1, ANXA2, VARS and PC4, appeared to enhance ASO activities, likely through mechanisms related to subcellular distribution. VARS and ANXA2 co-localized with ASOs in endocytic organelles, and reduction in the level of VARS altered lysosome/ASO localization patterns, implying that these proteins may facilitate ASO release from the endocytic pathway. Depletion of La and NPM1 reduced nuclear ASO levels, suggesting potential roles in ASO nuclear accumulation. On the other hand, Ku70 and Ku80 proteins inhibited ASO activity, most likely by competition with RNase H1 for ASO/RNA duplex binding. Our results demonstrate that phosphorothioate-modified ASOs bind a set of cellular proteins that affect ASO activity via different mechanisms. PMID:25712094

  4. Identification and characterization of plastid-type proteins from sequence-attributed features using machine learning

    PubMed Central

    2013-01-01

    Background Plastids are an important component of plant cells, being the site of manufacture and storage of chemical compounds used by the cell, and contain pigments such as those used in photosynthesis, starch synthesis/storage, cell color etc. They are essential organelles of the plant cell, also present in algae. Recent advances in genomic technology and sequencing efforts are generating a huge amount of DNA sequence data every day. The predicted proteome of these genomes needs annotation at a faster pace. In view of this, one such annotation need is to develop an automated system that can distinguish between plastid and non-plastid proteins accurately, and further classify plastid-types based on their functionality. We compared the amino acid compositions of plastid proteins with those of non-plastid ones and found significant differences, which were used as a basis to develop various feature-based prediction models using similarity-search and machine learning. Results In this study, we developed separate Support Vector Machine (SVM) trained classifiers for characterizing the plastids in two steps: first distinguishing the plastid vs. non-plastid proteins, and then classifying the identified plastids into their various types based on their function (chloroplast, chromoplast, etioplast, and amyloplast). Five diverse protein features: amino acid composition, dipeptide composition, the pseudo amino acid composition, Nterminal-Center-Cterminal composition and the protein physicochemical properties are used to develop SVM models. Overall, the dipeptide composition-based module shows the best performance with an accuracy of 86.80% and Matthews Correlation Coefficient (MCC) of 0.74 in phase-I and 78.60% with a MCC of 0.44 in phase-II. On independent test data, this model also performs better with an overall accuracy of 76.58% and 74.97% in phase-I and phase-II, respectively. The similarity-based PSI-BLAST module shows very low performance with about 50% prediction
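
    Dipeptide composition and the Matthews Correlation Coefficient, both named in the abstract above, have standard definitions that can be sketched in a few lines. This is an illustrative sketch only, not the authors' pipeline: the function names are hypothetical, and the handling of non-standard residues (skipped here) is an assumption.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def dipeptide_composition(sequence: str) -> dict[str, float]:
    """Return the 400-dimensional dipeptide frequency vector of a sequence.

    Overlapping residue pairs are counted; pairs containing a non-standard
    residue are skipped (an assumption made for this sketch).
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return {pair: 0.0 for pair in pairs}
    return {pair: n / total for pair, n in counts.items()}


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


comp = dipeptide_composition("AAAC")  # toy sequence with pairs AA, AA, AC
print(len(comp), comp["AA"], comp["AC"])
```

    The resulting 400 frequencies would serve as the feature vector handed to a classifier such as an SVM.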

  5. Identification of a nuclear localization signal in the retinitis pigmentosa-mutated RP26 protein, ceramide kinase-like protein

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Inagaki, Yuichi; Mitsutake, Susumu; Igarashi, Yasuyuki

    2006-05-12

    Retinitis pigmentosa (RP) is a genetically heterogeneous disease characterized by degeneration of the retina. A mutation in a new ceramide kinase (CERK) homologous gene, named CERK-like protein (CERKL), was found to cause autosomal recessive retinitis pigmentosa (RP26). Here, we show a point mutation of one of two putative nuclear localization signal (NLS) sequences inhibited the nuclear localization of the protein. Furthermore, the tetra-GFP-tagged NLS, which cannot passively enter the nucleus, was observed not only in the nucleus but also in the nucleolus. Our results provide the first evidence of the active nuclear import of CERKL and suggest that the identified NLS might be responsible for nucleolar retention of the protein. As recent studies have shown other RP-related proteins are localized in the nucleus or the nucleolus, our identification of NLS in CERKL suggests that CERKL likely plays important roles for retinal functions in the nucleus and the nucleolus.

  6. Identification of ZASP, a novel protein associated to Zona occludens-2.

    PubMed

    Lechuga, Susana; Alarcón, Lourdes; Solano, Jesús; Huerta, Miriam; Lopez-Bayghen, Esther; González-Mariscal, Lorenza

    2010-11-15

    With the aim of discovering new molecular interactions of the tight junction protein ZO-2, a two-hybrid screen was performed on a human kidney cDNA library using as bait the middle segment of ZO-2. Through this assay we identified a 24-kDa novel protein herein named ZASP for ZO-2 associated speckle protein. ZO-2/ZASP interaction further confirmed by pull down and immunoprecipitation experiments, requires the presence of the intact PDZ binding motif SQV of ZASP and the third PDZ domain of ZO-2. ZASP mRNA and protein are present in the kidney and in several epithelial cell lines. Endogenous ZASP is expressed primarily in nuclear speckles in co-localization with splicing factor SC-35. Nocodazole treatment and wash out reveals that ZASP disappears from the nucleus during mitosis in accordance with speckle disassembly during metaphase. ZASP amino acid sequence exhibits a canonical nuclear exportation signal and in agreement the protein exits the nucleus through a process mediated by exportin/CRM1. ZASP over-expression blocks the inhibitory activity of ZO-2 on cyclin D1 gene transcription and protein expression. The identification of ZASP helps to unfold the complex nuclear molecular arrays that form on ZO-2 scaffolds. Copyright © 2010 Elsevier Inc. All rights reserved.

  7. THE IDENTIFICATION AND CHARACTERIZATION OF AN IGE-INDUCING PROTEIN IN METARHIZIUM ANISOPLIAE EXTRACT

    EPA Science Inventory

    The Identification and Characterization of an IgE-Inducing Protein in Metarhizium anisopliae Extract

    Marsha D.W. Ward1, Lisa B. Copeland1, Maura J. Donahue2, and Jody A. Shoemaker3
    1ORD, NHEERL, US EPA, RTP, NC; 2Oak Ridge Institute for Science and Education, Cincinnati...

  8. Comparative Evaluation of Small Molecular Additives and Their Effects on Peptide/Protein Identification.

    PubMed

    Gao, Jing; Zhong, Shaoyun; Zhou, Yanting; He, Han; Peng, Shuying; Zhu, Zhenyun; Liu, Xing; Zheng, Jing; Xu, Bin; Zhou, Hu

    2017-06-06

    Detergents and salts are widely used in lysis buffers to enhance protein extraction from biological samples, facilitating in-depth proteomic analysis. However, these detergents and salt additives must be efficiently removed from the digested samples prior to LC-MS/MS analysis to obtain high-quality mass spectra. Although filter-aided sample preparation (FASP), acetone precipitation (AP), followed by in-solution digestion, and strong cation exchange-based centrifugal proteomic reactors (CPRs) are commonly used for proteomic sample processing, little is known about their efficiencies at removing detergents and salt additives. In this study, we (i) developed an integrative workflow for the quantification of small molecular additives in proteomic samples, developing a multiple reaction monitoring (MRM)-based LC-MS approach for the quantification of six additives (i.e., Tris, urea, CHAPS, SDS, SDC, and Triton X-100) and (ii) systematically evaluated the relationships between the level of additive remaining in samples following sample processing and the number of peptides/proteins identified by mass spectrometry. Although FASP outperformed the other two methods, the results were complementary in terms of peptide/protein identification, as well as the GRAVY index and amino acid distributions. This is the first systematic and quantitative study of the effect of detergents and salt additives on protein identification. This MRM-based approach can be used for an unbiased evaluation of the performance of new sample preparation methods. Data are available via ProteomeXchange under identifier PXD005405.

  9. In silico re-identification of properties of drug target proteins.

    PubMed

    Kim, Baeksoo; Jo, Jihoon; Han, Jonghyun; Park, Chungoo; Lee, Hyunju

    2017-05-31

    Computational approaches in the identification of drug targets are expected to reduce time and effort in drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of druggable genomes. Although several studies have been conducted for distinguishing drug targets from non-drug targets, they mainly focus on the sequences and functional roles of proteins. Many other properties of proteins have not been fully investigated. Using the DrugBank (version 3.0) database containing nearly 6,816 drug entries including 760 FDA-approved drugs and 1822 of their targets and human UniProt/Swiss-Prot databases, we defined 1578 non-redundant drug target and 17,575 non-drug target proteins. To select these non-redundant protein datasets, we built four datasets (A, B, C, and D) by considering clustering of paralogous proteins. We first reassessed the widely used properties of drug target proteins. We confirmed and extended that drug target proteins (1) are likely to have more hydrophobic, less polar, less PEST sequences, and more signal peptide sequences higher and (2) are more involved in enzyme catalysis, oxidation and reduction in cellular respiration, and operational genes. In this study, we proposed new properties (essentiality, expression pattern, PTMs, and solvent accessibility) for effectively identifying drug target proteins. We found that (1) drug targetability and protein essentiality are decoupled, (2) druggability of proteins has high expression level and tissue specificity, and (3) functional post-translational modification residues are enriched in drug target proteins. In addition, to predict the drug targetability of proteins, we exploited two machine learning methods (Support Vector Machine and Random Forest). When we predicted drug targets by combining previously known protein properties and proposed new properties, an F-score of 0.8307 was obtained. When the newly proposed properties are integrated, the prediction performance

  10. Matrix-assisted laser desorption ionization time-of-flight mass spectrometry for fast and accurate identification of clinically relevant Aspergillus species.

    PubMed

    Alanio, A; Beretti, J-L; Dauphin, B; Mellado, E; Quesne, G; Lacroix, C; Amara, A; Berche, P; Nassif, X; Bougnoux, M-E

    2011-05-01

    New Aspergillus species have recently been described with the use of multilocus sequencing in refractory cases of invasive aspergillosis. The classical phenotypic identification methods routinely used in clinical laboratories failed to identify them adequately. Some of these Aspergillus species have specific patterns of susceptibility to antifungal agents, and misidentification may lead to inappropriate therapy. We developed a matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry (MS)-based strategy to adequately identify Aspergillus species to the species level. A database including the reference spectra of 28 clinically relevant species from seven Aspergillus sections (five common and 23 unusual species) was engineered. The profiles of young and mature colonies were analysed for each reference strain, and species-specific spectral fingerprints were identified. The performance of the database was then tested on 124 clinical and 16 environmental isolates previously characterized by partial sequencing of the β-tubulin and calmodulin genes. One hundred and thirty-eight isolates of 140 (98.6%) were correctly identified. Two atypical isolates could not be identified, but no isolate was misidentified (specificity: 100%). The database, including species-specific spectral fingerprints of young and mature colonies of the reference strains, allowed identification regardless of the maturity of the clinical isolate. These results indicate that MALDI-TOF MS is a powerful tool for rapid and accurate identification of both common and unusual species of Aspergillus. It can give better results than morphological identification in clinical laboratories. © 2010 The Authors. Clinical Microbiology and Infection © 2010 European Society of Clinical Microbiology and Infectious Diseases.
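
    The identification rate reported in the abstract above follows directly from the stated counts (138 correctly identified isolates out of 140). A small sketch of that arithmetic, with a hypothetical helper name:

```python
def identification_rate(correct: int, total: int) -> float:
    """Percentage of isolates that were correctly identified."""
    return 100 * correct / total


# Counts as stated in the abstract: 138 of 140 isolates correctly identified.
print(f"{identification_rate(138, 140):.1f}%")  # prints 98.6%
```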

  11. Identification of neuronal target genes for CCAAT/Enhancer Binding Proteins

    PubMed Central

    Kfoury, N.; Kapatos, G.

    2009-01-01

    CCAAT/Enhancer Binding Proteins (C/EBPs) play pivotal roles in development and plasticity of the nervous system. Identification of the physiological targets of C/EBPs (C/EBP target genes) should therefore provide insight into the underlying biology of these processes. We used unbiased genome-wide mapping to identify 115 C/EBPβ target genes in PC12 cells that include transcription factors, neurotransmitter receptors, ion channels, protein kinases and synaptic vesicle proteins. C/EBPβ binding sites were located primarily within introns, suggesting novel regulatory functions, and were associated with binding sites for other developmentally important transcription factors. Experiments using dominant negatives showed C/EBPβ to repress transcription of a subset of target genes. Target genes in rat brain were subsequently found to preferentially bind C/EBPα, β and δ. Analysis of the hippocampal transcriptome of C/EBPβ knockout mice revealed dysregulation of a high percentage of transcripts identified as C/EBP target genes. These results support the hypothesis that C/EBPs play non-redundant roles in the brain. PMID:19103292

  12. NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks.

    PubMed

    Hu, Jialu; Kehr, Birte; Reinert, Knut

    2014-02-15

    Owing to recent advancements in high-throughput technologies, protein-protein interaction networks of more and more species become available in public databases. The question of how to identify functionally conserved proteins across species attracts a lot of attention in computational biology. Network alignments provide a systematic way to solve this problem. However, most existing alignment tools encounter limitations in tackling this problem. Therefore, the demand for faster and more efficient alignment tools is growing. We present a fast and accurate algorithm, NetCoffee, which allows to find a global alignment of multiple protein-protein interaction networks. NetCoffee searches for a global alignment by maximizing a target function using simulated annealing on a set of weighted bipartite graphs that are constructed using a triplet approach similar to T-Coffee. To assess its performance, NetCoffee was applied to four real datasets. Our results suggest that NetCoffee remedies several limitations of previous algorithms, outperforms all existing alignment tools in terms of speed and nevertheless identifies biologically meaningful alignments. The source code and data are freely available for download under the GNU GPL v3 license at https://code.google.com/p/netcoffee/.
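
    The abstract above describes maximizing a target function by simulated annealing. The sketch below shows the generic technique on a toy assignment problem; it is not NetCoffee's implementation, and the weight matrix, cooling schedule, and function name are all assumptions made for illustration.

```python
import math
import random


def simulated_annealing(weights, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Maximize sum(weights[i][perm[i]]) over permutations using swap moves.

    Requires a square matrix with at least two rows. Returns the best
    permutation seen and its score.
    """
    rng = random.Random(seed)
    n = len(weights)
    perm = list(range(n))

    def score(p):
        return sum(weights[i][p[i]] for i in range(n))

    current = score(perm)
    best_perm, best = perm[:], current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]
        candidate = score(perm)
        delta = candidate - current
        # Always accept improvements; accept worse moves with
        # probability exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
            if current > best:
                best_perm, best = perm[:], current
        else:
            perm[i], perm[j] = perm[j], perm[i]  # undo the rejected swap
    return best_perm, best


# Hypothetical toy similarity weights between three nodes of two networks.
toy_weights = [
    [0.1, 0.9, 0.2],
    [0.8, 0.1, 0.3],
    [0.2, 0.3, 0.7],
]
best_perm, best_score = simulated_annealing(toy_weights)
print(best_perm, best_score)
```

    Tracking the best state seen so far means the returned score can never be lower than that of the starting assignment, even when worse moves are accepted along the way.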

  13. Shotgun protein sequencing: assembly of peptide tandem mass spectra from mixtures of modified proteins.

    PubMed

    Bandeira, Nuno; Clauser, Karl R; Pevzner, Pavel A

    2007-07-01

    Despite significant advances in the identification of known proteins, the analysis of unknown proteins by MS/MS still remains a challenging open problem. Although Klaus Biemann recognized the potential of MS/MS for sequencing of unknown proteins in the 1980s, low throughput Edman degradation followed by cloning still remains the main method to sequence unknown proteins. The automated interpretation of MS/MS spectra has been limited by a focus on individual spectra and has not capitalized on the information contained in spectra of overlapping peptides. Indeed the powerful shotgun DNA sequencing strategies have not been extended to automated protein sequencing. We demonstrate, for the first time, the feasibility of automated shotgun protein sequencing of protein mixtures by utilizing MS/MS spectra of overlapping and possibly modified peptides generated via multiple proteases of different specificities. We validate this approach by generating highly accurate de novo reconstructions of multiple regions of various proteins in western diamondback rattlesnake venom. We further argue that shotgun protein sequencing has the potential to overcome the limitations of current protein sequencing approaches and thus catalyze the otherwise impractical applications of proteomics methodologies in studies of unknown proteins.

  14. Identification of ADAM 31: a protein expressed in Leydig cells and specialized epithelia.

    PubMed

    Liu, L; Smith, J W

    2000-06-01

    A family of proteins containing a disintegrin and metalloproteinase domain (ADAMs) has been identified recently. Here, we report the identification of a novel member of the ADAM protein family from mouse. This protein is designated ADAM 31. The complementary DNA sequence of ADAM 31 predicts a transmembrane protein with metalloproteinase, disintegrin, cysteine-rich, and cytoplasmic domains. Messenger RNA encoding ADAM 31 was most abundant in testes, but was also detected in many other tissues. More significantly, the antibodies raised against ADAM 31 reveal that the protein has a unique and restricted expression pattern. ADAM 31 is expressed in Leydig cells of the testes, but unlike many other ADAMs, it is not found on developing sperm. Furthermore, ADAM 31 is highly expressed on four types of specialized epithelia: the cauda epididymidis, the vas deferens, the convoluted tubules of the kidney, and the parietal cells of the stomach.

  15. Modal identification of dynamic mechanical systems

    NASA Astrophysics Data System (ADS)

    Srivastava, R. K.; Kundra, T. K.

    1992-07-01

    This paper reviews modal identification techniques which are now helping designers all over the world to improve the dynamic behavior of vibrating engineering systems. In this context the need to develop more accurate and faster parameter identification is ever increasing. A new dynamic stiffness matrix based identification method which is highly accurate, fast and system-dynamic-modification compatible is presented. The technique is applicable to all those multidegree-of-freedom systems where full receptance matrix can be experimentally measured.

  16. Size-Sorting Combined with Improved Nanocapillary-LC-MS for Identification of Intact Proteins up to 80 kDa

    PubMed Central

    Vellaichamy, Adaikkalam; Tran, John C.; Catherman, Adam D.; Lee, Ji Eun; Kellie, John F.; Sweet, Steve M.M.; Zamdborg, Leonid; Thomas, Paul M.; Ahlf, Dorothy R.; Durbin, Kenneth R.; Valaskovic, Gary A.; Kelleher, Neil L.

    2010-01-01

    Despite the availability of ultra-high resolution mass spectrometers, methods for separation and detection of intact proteins for proteome-scale analyses are still in a developmental phase. Here we report robust protocols for on-line LC-MS to drive high-throughput top-down proteomics in a fashion similar to bottom-up. Comparative work on protein standards showed that a polymeric stationary phase led to superior sensitivity over a silica-based medium in reversed-phase nanocapillary-LC, with detection of proteins >50 kDa routinely accomplished in the linear ion trap of a hybrid Fourier-Transform mass spectrometer. Protein identification was enabled by nozzle-skimmer dissociation (NSD) and detection of fragment ions with <5 ppm mass accuracy for highly-specific database searching using custom software. This overall approach led to identification of proteins up to 80 kDa, with 10-60 proteins identified in single LC-MS runs of samples from yeast and human cell lines pre-fractionated by their molecular weight using a gel-based sieving system. PMID:20073486
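
    The "<5 ppm mass accuracy" criterion in this abstract follows the standard parts-per-million definition of mass error, (observed - theoretical) / theoretical x 10^6. A minimal sketch of such a check is given below; the function names and the example m/z values are illustrative assumptions, not taken from the authors' custom software.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million (ppm)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=5.0):
    """True if the absolute mass error is below tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) < tol_ppm


# Hypothetical fragment ion: 4 mDa off at m/z 1000 is about 4 ppm.
error = ppm_error(1000.004, 1000.0)
accepted = within_tolerance(1000.004, 1000.0)
rejected = within_tolerance(1000.006, 1000.0)
```

    Because the tolerance is relative, the same 5 ppm window corresponds to a wider absolute window at higher m/z.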

  17. Identification of 24h Ixodes scapularis immunogenic tick saliva proteins.

    PubMed

    Lewis, Lauren A; Radulović, Željko M; Kim, Tae K; Porter, Lindsay M; Mulenga, Albert

    2015-04-01

    Ixodes scapularis is arguably the most medically important tick species in the United States. This tick transmits 5 of the 14 human tick-borne disease (TBD) agents in the USA: Borrelia burgdorferi, Anaplasma phagocytophilum, B. miyamotoi, Babesia microti, and Powassan virus disease. Except for the Powassan virus disease, I. scapularis-vectored TBD agents require more than 24h post attachment to be transmitted. This study describes identification of 24h immunogenic I. scapularis tick saliva proteins, which could provide opportunities to develop strategies to stop tick feeding before transmission of the majority of pathogens. A 24h fed female I. scapularis phage display cDNA expression library was biopanned using rabbit antibodies to 24h fed I. scapularis female tick saliva proteins, subjected to next generation sequencing, de novo assembly, and bioinformatic analyses. A total of 182 contigs were assembled, of which ∼19% (35/182) are novel and did not show identity to any known proteins in GenBank. The remaining ∼81% (147/182) of contigs were provisionally identified based on matches in GenBank including ∼18% (27/147) that matched protein sequences previously annotated as hypothetical and putative tick saliva proteins. Others include proteases and protease inhibitors (∼3%, 5/147), transporters and/or ligand binding proteins (∼6%, 9/147), immunogenic tick saliva housekeeping enzyme-like (17%, 25/147), ribosomal protein-like (∼31%, 46/147), and those classified as miscellaneous (∼24%, 35/147). Notable among the miscellaneous class are antimicrobial peptides (microplusin and ricinusin), myosin-like proteins that have been previously found in tick saliva, and heat shock tick saliva protein. Data in this study provide the foundation for in-depth analysis of I. scapularis feeding during the first 24h, before the majority of TBD agents can be transmitted. Copyright © 2015 Elsevier GmbH. All rights reserved.
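
    The contig breakdown reported in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the abstract; the shortened category labels and the helper name percent are illustrative choices.

```python
total_contigs = 182
novel = 35
matched = total_contigs - novel  # 147 contigs with GenBank matches

# Counts as quoted in the abstract; labels are shortened for readability.
categories = {
    "hypothetical/putative tick saliva proteins": 27,
    "proteases and protease inhibitors": 5,
    "transporters and/or ligand binding proteins": 9,
    "housekeeping enzyme-like": 25,
    "ribosomal protein-like": 46,
    "miscellaneous": 35,
}


def percent(part, whole):
    """Share of part in whole, rounded to the nearest whole percent."""
    return round(100.0 * part / whole)


# The six categories account for every matched contig.
assert sum(categories.values()) == matched

shares = {name: percent(count, matched) for name, count in categories.items()}
novel_share = percent(novel, total_contigs)
matched_share = percent(matched, total_contigs)
```

    Rounded to whole percents, the computed shares agree with the approximate values given in the abstract, and the six categories sum to exactly 147 contigs.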

  18. Accurate Identification of Fear Facial Expressions Predicts Prosocial Behavior

    PubMed Central

    Marsh, Abigail A.; Kozak, Megan N.; Ambady, Nalini

    2009-01-01

    The fear facial expression is a distress cue that is associated with the provision of help and prosocial behavior. Prior psychiatric studies have found deficits in the recognition of this expression by individuals with antisocial tendencies. However, no prior study has shown accuracy for recognition of fear to predict actual prosocial or antisocial behavior in an experimental setting. In 3 studies, the authors tested the prediction that individuals who recognize fear more accurately will behave more prosocially. In Study 1, participants who identified fear more accurately also donated more money and time to a victim in a classic altruism paradigm. In Studies 2 and 3, participants’ ability to identify the fear expression predicted prosocial behavior in a novel task designed to control for confounding variables. In Study 3, accuracy for recognizing fear proved a better predictor of prosocial behavior than gender, mood, or scores on an empathy scale. PMID:17516803

  19. Accurate identification of fear facial expressions predicts prosocial behavior.

    PubMed

    Marsh, Abigail A; Kozak, Megan N; Ambady, Nalini

    2007-05-01

    The fear facial expression is a distress cue that is associated with the provision of help and prosocial behavior. Prior psychiatric studies have found deficits in the recognition of this expression by individuals with antisocial tendencies. However, no prior study has shown accuracy for recognition of fear to predict actual prosocial or antisocial behavior in an experimental setting. In 3 studies, the authors tested the prediction that individuals who recognize fear more accurately will behave more prosocially. In Study 1, participants who identified fear more accurately also donated more money and time to a victim in a classic altruism paradigm. In Studies 2 and 3, participants' ability to identify the fear expression predicted prosocial behavior in a novel task designed to control for confounding variables. In Study 3, accuracy for recognizing fear proved a better predictor of prosocial behavior than gender, mood, or scores on an empathy scale.

  20. PSSP-RFE: accurate prediction of protein structural class by recursive feature extraction from PSI-BLAST profile, physical-chemical property and functional annotations.

    PubMed

    Li, Liqi; Cui, Xiang; Yu, Sanjiu; Zhang, Yuan; Luo, Zhong; Yang, Hua; Zhou, Yue; Zheng, Xiaoqi

    2014-01-01

    Protein structure prediction is critical to functional annotation of the massively accumulated biological sequences, which prompts an imperative need for the development of high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction becomes an increasingly challenging task. Amongst most homological-based approaches, the accuracies of protein structural class prediction are sufficiently high for high similarity datasets, but still far from being satisfactory for low similarity datasets, i.e., below 40% in pairwise sequence similarity. Therefore, we present a novel method for accurate and reliable protein structural class prediction for both high and low similarity datasets. This method is based on Support Vector Machine (SVM) in conjunction with integrated features from position-specific score matrix (PSSM), PROFEAT and Gene Ontology (GO). A feature selection approach, SVM-RFE, is also used to rank the integrated feature vectors through recursively removing the feature with the lowest ranking score. The definitive top features selected by SVM-RFE are input into the SVM engines to predict the structural class of a query protein. To validate our method, jackknife tests were applied to seven widely used benchmark datasets, reaching overall accuracies between 84.61% and 99.79%, which are significantly higher than those achieved by state-of-the-art tools. These results suggest that our method could serve as an accurate and cost-effective alternative to existing methods in protein structural classification, especially for low similarity datasets.

  1. Decision peptide-driven: a free software tool for accurate protein quantification using gel electrophoresis and matrix assisted laser desorption ionization time of flight mass spectrometry.

    PubMed

    Santos, Hugo M; Reboiro-Jato, Miguel; Glez-Peña, Daniel; Nunes-Miranda, J D; Fdez-Riverola, Florentino; Carvallo, R; Capelo, J L

    2010-09-15

    The decision peptide-driven tool implements a software application for assisting the user in a protocol for accurate protein quantification based on the following steps: (1) protein separation through gel electrophoresis; (2) in-gel protein digestion; (3) direct and inverse (18)O-labeling and (4) matrix assisted laser desorption ionization time of flight mass spectrometry, MALDI analysis. The DPD software compares the MALDI results of the direct and inverse (18)O-labeling experiments and quickly identifies those peptides with paralleled losses in different sets of a typical proteomic workflow. Those peptides are used for subsequent accurate protein quantification. The interpretation of the MALDI data from direct and inverse labeling experiments is time-consuming when all comparisons are done manually. The DPD software shortens and simplifies the searching of the peptides that must be used for quantification from a week to a few minutes. To do so, it takes as input several MALDI spectra and aids the researcher in an automatic mode (i) to compare data from direct and inverse (18)O-labeling experiments, calculating the corresponding ratios to determine those peptides with paralleled losses throughout different sets of experiments; and (ii) to use those peptides as internal standards for subsequent accurate protein quantification using (18)O-labeling. In this work the DPD software is presented and explained with the quantification of protein carbonic anhydrase. Copyright (c) 2010 Elsevier B.V. All rights reserved.

  2. Accurate assessment and identification of naturally occurring cellular cobalamins.

    PubMed

    Hannibal, Luciana; Axhemi, Armend; Glushchenko, Alla V; Moreira, Edward S; Brasch, Nicola E; Jacobsen, Donald W

    2008-01-01

    Accurate assessment of cobalamin profiles in human serum, cells, and tissues may have clinical diagnostic value. However, non-alkyl forms of cobalamin undergo beta-axial ligand exchange reactions during extraction, which leads to inaccurate profiles having little or no diagnostic value. Experiments were designed to: 1) assess beta-axial ligand exchange chemistry during the extraction and isolation of cobalamins from cultured bovine aortic endothelial cells, human foreskin fibroblasts, and human hepatoma HepG2 cells, and 2) to establish extraction conditions that would provide a more accurate assessment of endogenous forms containing both exchangeable and non-exchangeable beta-axial ligands. The cobalamin profile of cells grown in the presence of [57Co]-cyanocobalamin as a source of vitamin B12 shows that the following derivatives are present: [57Co]-aquacobalamin, [57Co]-glutathionylcobalamin, [57Co]-sulfitocobalamin, [57Co]-cyanocobalamin, [57Co]-adenosylcobalamin, [57Co]-methylcobalamin, as well as other yet unidentified corrinoids. When the extraction is performed in the presence of excess cold aquacobalamin acting as a scavenger cobalamin (i.e. "cold trapping"), the recovery of both [57Co]-glutathionylcobalamin and [57Co]-sulfitocobalamin decreases to low but consistent levels. In contrast, the [57Co]-nitrocobalamin observed in the extracts prepared without excess aquacobalamin is undetected in extracts prepared with cold trapping. This demonstrates that beta-ligand exchange occurs with non-covalently bound beta-ligands. The exception to this observation is cyanocobalamin with a non-exchangeable CN- group. It is now possible to obtain accurate profiles of cellular cobalamin.

  3. Accurate assessment and identification of naturally occurring cellular cobalamins

    PubMed Central

    Hannibal, Luciana; Axhemi, Armend; Glushchenko, Alla V.; Moreira, Edward S.; Brasch, Nicola E.; Jacobsen, Donald W.

    2009-01-01

    Background: Accurate assessment of cobalamin profiles in human serum, cells, and tissues may have clinical diagnostic value. However, non-alkyl forms of cobalamin undergo β-axial ligand exchange reactions during extraction, which leads to inaccurate profiles having little or no diagnostic value. Methods: Experiments were designed to: 1) assess β-axial ligand exchange chemistry during the extraction and isolation of cobalamins from cultured bovine aortic endothelial cells, human foreskin fibroblasts, and human hepatoma HepG2 cells, and 2) to establish extraction conditions that would provide a more accurate assessment of endogenous forms containing both exchangeable and non-exchangeable β-axial ligands. Results: The cobalamin profile of cells grown in the presence of [57Co]-cyanocobalamin as a source of vitamin B12 shows that the following derivatives are present: [57Co]-aquacobalamin, [57Co]-glutathionylcobalamin, [57Co]-sulfitocobalamin, [57Co]-cyanocobalamin, [57Co]-adenosylcobalamin, [57Co]-methylcobalamin, as well as other yet unidentified corrinoids. When the extraction is performed in the presence of excess cold aquacobalamin acting as a scavenger cobalamin (i.e., “cold trapping”), the recovery of both [57Co]-glutathionylcobalamin and [57Co]-sulfitocobalamin decreases to low but consistent levels. In contrast, the [57Co]-nitrocobalamin observed in extracts prepared without excess aquacobalamin is undetectable in extracts prepared with cold trapping. Conclusions: This demonstrates that β-ligand exchange occurs with non-covalently bound β-ligands. The exception to this observation is cyanocobalamin with a non-covalent but non-exchangeable CN− group. It is now possible to obtain accurate profiles of cellular cobalamins. PMID:18973458

  4. Identification of Important Process Variables for Fiber Spinning of Protein Nanotubes Generated from Waste Materials

    DTIC Science & Technology

    2012-01-11

    nanotubes, which sold at the same current cost as carbon nanotubes, this would equate to a $788 million industry. In the USA, the potential to source eye...advantages over carbon nanotubes due to the ability to functionalized them 31. The nanotubes are a highly ordered, insoluble form of protein. Fibrils...1756 Identification of important process variables for fiber spinning of protein nanotubes generated from waste materials. Research Team (listed

  5. Proteogenomic Analysis Greatly Expands the Identification of Proteins Related to Reproduction in the Apogamous Fern Dryopteris affinis ssp. affinis.

    PubMed

    Grossmann, Jonas; Fernández, Helena; Chaubey, Pururawa M; Valdés, Ana E; Gagliardini, Valeria; Cañal, María J; Russo, Giancarlo; Grossniklaus, Ueli

    2017-01-01

    Performing proteomic studies on non-model organisms with little or no genomic information is still difficult. However, many specific processes and biochemical pathways occur only in species that are poorly characterized at the genomic level. For example, many plants can reproduce both sexually and asexually, the first one allowing the generation of new genotypes and the latter their fixation. Thus, both modes of reproduction are of great agronomic value. However, the molecular basis of asexual reproduction is not well understood in any plant. In ferns, it combines the production of unreduced spores (diplospory) and the formation of sporophytes from somatic cells (apogamy). To set the basis to study these processes, we performed transcriptomics by next-generation sequencing (NGS) and shotgun proteomics by tandem mass spectrometry in the apogamous fern D. affinis ssp. affinis. For protein identification we used the public viridiplantae database (VPDB) to identify orthologous proteins from other plant species and new transcriptomics data to generate a "species-specific transcriptome database" (SSTDB). In total 1,397 protein clusters with 5,865 unique peptide sequences were identified (13 decoy proteins out of 1,410, protFDR 0.93% on protein cluster level). We show that using the SSTDB for protein identification increases the number of identified peptides almost four times compared to using only the publicly available VPDB. We identified homologs of proteins involved in reproduction of higher plants, including proteins with a potential role in apogamy. With the increasing availability of genomic data from non-model species, similar proteogenomics approaches will improve the sensitivity in protein identification for species only distantly related to models.
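
    The protein-level false discovery rate quoted in this abstract can be reproduced from the reported counts under one common target-decoy convention, FDR = decoy hits / target hits. The abstract does not state which formula was used, so the convention and the function name decoy_fdr_percent below are assumptions.

```python
def decoy_fdr_percent(n_decoy, n_total):
    """Target-decoy FDR in percent, assuming FDR = decoys / targets."""
    n_target = n_total - n_decoy
    return 100.0 * n_decoy / n_target


# Counts from the abstract: 13 decoys among 1,410 protein clusters.
n_target = 1410 - 13                    # 1,397 target protein clusters
fdr = decoy_fdr_percent(13, 1410)       # roughly 0.93 percent
alt = 100.0 * 13 / 1410                 # decoys / all hits, roughly 0.92 percent
```

    Dividing by the 1,397 target clusters gives the reported 0.93%, whereas dividing by all 1,410 hits would give about 0.92%, which suggests the decoy-over-target convention was the one applied.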

  6. Efficient and accurate Greedy Search Methods for mining functional modules in protein interaction networks.

    PubMed

    He, Jieyue; Li, Chaojun; Ye, Baoliu; Zhong, Wei

    2012-06-25

    Most computational algorithms mainly focus on detecting highly connected subgraphs in PPI networks as protein complexes but ignore their inherent organization. Furthermore, many of these algorithms are computationally expensive. However, recent analysis indicates that experimentally detected protein complexes generally contain Core/attachment structures. In this paper, a Greedy Search Method based on Core-Attachment structure (GSM-CA) is proposed. The GSM-CA method detects densely connected regions in large protein-protein interaction networks based on the edge weight and two criteria for determining core nodes and attachment nodes. The GSM-CA method improves the prediction accuracy compared to other similar module detection approaches; however, it is computationally expensive. Many module detection approaches are based on the traditional hierarchical methods, which are also computationally inefficient because the hierarchical tree structure produced by these approaches cannot provide adequate information to identify whether a network belongs to a module structure or not. In order to speed up the computational process, the Greedy Search Method based on Fast Clustering (GSM-FC) is proposed in this work. The edge weight based GSM-FC method uses a greedy procedure to traverse all edges just once to separate the network into the suitable set of modules. The proposed methods are applied to the protein interaction network of S. cerevisiae. Experimental results indicate that many significant functional modules are detected, most of which match the known complexes. Results also demonstrate that the GSM-FC algorithm is faster and more accurate as compared to other competing algorithms. Based on the new edge weight definition, the proposed algorithm takes advantage of the greedy search procedure to separate the network into the suitable set of modules. Experimental analysis shows that the identified modules are statistically significant. The algorithm can reduce the

  7. Identification of GPCR-Interacting Cytosolic Proteins Using HDL Particles and Mass Spectrometry-Based Proteomic Approach

    PubMed Central

    Chung, Ka Young; Day, Peter W.; Vélez-Ruiz, Gisselle; Sunahara, Roger K.; Kobilka, Brian K.

    2013-01-01

    G protein-coupled receptors (GPCRs) have critical roles in various physiological and pathophysiological processes, and more than 40% of marketed drugs target GPCRs. Although the canonical downstream target of an agonist-activated GPCR is a G protein heterotrimer, there is a growing body of evidence suggesting that other signaling molecules interact, directly or indirectly, with GPCRs. However, due to the low abundance in the intact cell system and poor solubility of GPCRs, identification of these GPCR-interacting molecules remains challenging. Here, we establish a strategy to overcome these difficulties by using high-density lipoprotein (HDL) particles. We used the β2-adrenergic receptor (β2AR), a GPCR involved in regulating cardiovascular physiology, as a model system. We reconstituted purified β2AR in HDL particles, to mimic the plasma membrane environment, and used the reconstituted receptor as bait to pull-down binding partners from rat heart cytosol. A total of 293 proteins were identified in the full agonist-activated β2AR pull-down, 242 proteins in the inverse agonist-activated β2AR pull-down, and 210 proteins were commonly identified in both pull-downs. A small subset of the β2AR-interacting proteins isolated was confirmed by Western blot; three known β2AR-interacting proteins (Gsα, NHERF-2, and Grb2) and 3 newly identified known β2AR-interacting proteins (AMPKα, acetyl-CoA carboxylase, and UBC-13). Profiling of the identified proteins showed a clear bias toward intracellular signal transduction pathways, which is consistent with the role of β2AR as a cell signaling molecule. This study suggests that HDL particle-reconstituted GPCRs can provide an effective platform method for the identification of GPCR binding partners coupled with a mass spectrometry-based proteomic analysis. PMID:23372797

  8. Identification and characterization of intracellular proteins that bind oligonucleotides with phosphorothioate linkages.

    PubMed

    Liang, Xue-hai; Sun, Hong; Shen, Wen; Crooke, Stanley T

    2015-03-11

    Although the RNase H-dependent mechanism of inhibition of gene expression by chemically modified antisense oligonucleotides (ASOs) has been well characterized, little is known about the interactions between ASOs and intracellular proteins that may alter cellular localization and/or potency of ASOs. Here, we report the identification of 56 intracellular ASO-binding proteins using multi-step affinity selection approaches. Many of the tested proteins had no significant effect on ASO activity; however, some proteins, including La/SSB, NPM1, ANXA2, VARS and PC4, appeared to enhance ASO activities, likely through mechanisms related to subcellular distribution. VARS and ANXA2 co-localized with ASOs in endocytic organelles, and reduction in the level of VARS altered lysosome/ASO localization patterns, implying that these proteins may facilitate ASO release from the endocytic pathway. Depletion of La and NPM1 reduced nuclear ASO levels, suggesting potential roles in ASO nuclear accumulation. On the other hand, Ku70 and Ku80 proteins inhibited ASO activity, most likely by competition with RNase H1 for ASO/RNA duplex binding. Our results demonstrate that phosphorothioate-modified ASOs bind a set of cellular proteins that affect ASO activity via different mechanisms. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Identification of O-linked β-d-N-acetylglucosamine-Modified Proteins from Arabidopsis

    PubMed Central

    Xu, Shou-Ling; Chalkley, Robert J.; Wang, Zhi-Yong; Burlingame, Alma L.

    2013-01-01

    The posttranslational modification of proteins with O-linked β-d-N-acetylglucosamine (O-GlcNAc) on serine and threonine residues occurs in all animals and plants. This modification is dynamic and ubiquitous, and regulates many cellular processes, including transcription, signaling and cytokinesis and is associated with several diseases. Cycling of O-GlcNAc is tightly regulated by O-GlcNAc transferase (OGT) and O-GlcNAcase (OGA). Plants have two OGTs, SPINDLY (SPY) and SECRET AGENT (SEC); disruption of both causes embryo lethality. Despite O-GlcNAc modification of proteins being discovered more than 20 years ago, identification and mapping of protein GlcNAcylation is still a challenging task. Here we describe the use of lectin affinity chromatography combined with electron transfer dissociation mass spectrometry to enrich and to detect O-GlcNAc modified peptides from Arabidopsis. PMID:22576084

  10. De novo identification of highly diverged protein repeats by probabilistic consistency.

    PubMed

    Biegert, A; Söding, J

    2008-03-15

    An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID

  11. A Proof of Concept to Bridge the Gap between Mass Spectrometry Imaging, Protein Identification and Relative Quantitation: MSI~LC-MS/MS-LF.

    PubMed

    Théron, Laëtitia; Centeno, Delphine; Coudy-Gandilhon, Cécile; Pujos-Guillot, Estelle; Astruc, Thierry; Rémond, Didier; Barthelemy, Jean-Claude; Roche, Frédéric; Feasson, Léonard; Hébraud, Michel; Béchet, Daniel; Chambon, Christophe

    2016-10-26

    Mass spectrometry imaging (MSI) is a powerful tool to visualize the spatial distribution of molecules on a tissue section. The main limitation of MALDI-MSI of proteins is the lack of direct identification. Therefore, this study focuses on a MSI~LC-MS/MS-LF workflow to link the results from MALDI-MSI with potential peak identification and label-free quantitation, using only one tissue section. At first, we studied the impact of matrix deposition and laser ablation on protein extraction from the tissue section. Then, we did a back-correlation of the m/z of the proteins detected by MALDI-MSI to those identified by label-free quantitation. This allowed us to compare the label-free quantitation of proteins obtained in LC-MS/MS with the peak intensities observed in MALDI-MSI. We managed to link identification to nine peaks observed by MALDI-MSI. The results showed that the MSI~LC-MS/MS-LF workflow (i) allowed us to study a representative muscle proteome compared to a classical bottom-up workflow; and (ii) was sparsely impacted by matrix deposition and laser ablation. This workflow, performed as a proof-of-concept, suggests that a single tissue section can be used to perform MALDI-MSI and protein extraction, identification, and relative quantitation.

  12. A Proof of Concept to Bridge the Gap between Mass Spectrometry Imaging, Protein Identification and Relative Quantitation: MSI~LC-MS/MS-LF

    PubMed Central

    Théron, Laëtitia; Centeno, Delphine; Coudy-Gandilhon, Cécile; Pujos-Guillot, Estelle; Astruc, Thierry; Rémond, Didier; Barthelemy, Jean-Claude; Roche, Frédéric; Feasson, Léonard; Hébraud, Michel; Béchet, Daniel; Chambon, Christophe

    2016-01-01

    Mass spectrometry imaging (MSI) is a powerful tool to visualize the spatial distribution of molecules on a tissue section. The main limitation of MALDI-MSI of proteins is the lack of direct identification. Therefore, this study focuses on a MSI~LC-MS/MS-LF workflow to link the results from MALDI-MSI with potential peak identification and label-free quantitation, using only one tissue section. At first, we studied the impact of matrix deposition and laser ablation on protein extraction from the tissue section. Then, we did a back-correlation of the m/z of the proteins detected by MALDI-MSI to those identified by label-free quantitation. This allowed us to compare the label-free quantitation of proteins obtained in LC-MS/MS with the peak intensities observed in MALDI-MSI. We managed to link identification to nine peaks observed by MALDI-MSI. The results showed that the MSI~LC-MS/MS-LF workflow (i) allowed us to study a representative muscle proteome compared to a classical bottom-up workflow; and (ii) was sparsely impacted by matrix deposition and laser ablation. This workflow, performed as a proof-of-concept, suggests that a single tissue section can be used to perform MALDI-MSI and protein extraction, identification, and relative quantitation. PMID:28248242

  13. A targeted mass spectrometry-based approach for the identification and characterization of proteins containing α-aminoadipic and γ-glutamic semialdehyde residues

    PubMed Central

    Chavez, Juan D.; Bisson, William H.

    2011-01-01

    The site-specific identification of α-aminoadipic semialdehyde (AAS) and γ-glutamic semialdehyde (GGS) residues in proteins is reported. Semialdehydic protein modifications result from the metal-catalyzed oxidation of Lys or Arg and Pro residues, respectively. Most of the analytical methods for the analysis of protein carbonylation measure change to the global level of carbonylation and fail to provide details regarding protein identity, site, and chemical nature of the carbonylation. In this work, we used a targeted approach, which combines chemical labeling, enrichment, and tandem mass spectrometric analysis, for the site-specific identification of AAS and GGS sites in proteins. The approach is applied to in vitro oxidized glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and an untreated biological sample, namely cardiac mitochondrial proteins. The analysis of GAPDH resulted in the site-specific identification of two AAS and four GGS residues. Computational evaluation of the identified AAS and GGS sites in GAPDH indicated that these sites are located in flexible regions, show high solvent accessibility values, and are in proximity with possible metal ion binding sites. The targeted proteomic analysis of semialdehydic modifications in cardiac mitochondria yielded nine AAS modification sites which were unambiguously assigned to distinct lysine residues in the following proteins: ADP/ATP translocase isoforms 1 and 2, ubiquinol cytochrome-c reductase core protein 2, and ATP synthase α-subunit. PMID:20957471

  14. Sialome of a Generalist Lepidopteran Herbivore: Identification of Transcripts and Proteins from Helicoverpa armigera Labial Salivary Glands

    PubMed Central

    Celorio-Mancera, Maria de la Paz; Courtiade, Juliette; Muck, Alexander; Heckel, David G.; Musser, Richard O.; Vogel, Heiko

    2011-01-01

    Although the importance of insect saliva in insect-host plant interactions has been acknowledged, there is very limited information on the nature and complexity of the salivary proteome in lepidopteran herbivores. We inspected the labial salivary transcriptome and proteome of Helicoverpa armigera, an important polyphagous pest species. To identify the majority of the salivary proteins we have randomly sequenced 19,389 expressed sequence tags (ESTs) from a normalized cDNA library of salivary glands. In parallel, a non-cytosolic enriched protein fraction was obtained from labial salivary glands and subjected to two-dimensional gel electrophoresis (2-DE) and de novo peptide sequencing. This procedure allowed comparison of peptides and EST sequences and enabled us to identify 65 protein spots from the secreted labial saliva 2-DE proteome. The mass spectrometry analysis revealed ecdysone oxidase, glucose oxidase, fructosidase, carboxyl/cholinesterase and an uncharacterized protein previously detected in the H. armigera midgut proteome. Consistently, their corresponding transcripts are among the most abundant in our cDNA library. We did find redundancy of sequence identification of saliva-secreted proteins, suggesting multiple isoforms. As expected, we found several enzymes responsible for digestion and plant offense. In addition, we identified non-digestive proteins such as an arginine kinase and abundant proteins of unknown function. This identification of secreted salivary gland proteins allows a more comprehensive understanding of insect feeding and poses new challenges for the elucidation of protein function. PMID:22046331

  15. Accurate, safe, and rapid method of intraoperative tumor identification for totally laparoscopic distal gastrectomy: injection of mixed fluid of sodium hyaluronate and patent blue.

    PubMed

    Nakagawa, Masatoshi; Ehara, Kazuhisa; Ueno, Masaki; Tanaka, Tsuyoshi; Kaida, Sachiko; Udagawa, Harushi

    2014-04-01

    In totally laparoscopic distal gastrectomy, determining the resection line with safe proximal margins is often difficult, particularly for tumors located in a relatively upper area. This is because, in contrast to open surgery, identifying lesions by palpating or opening the stomach is essentially impossible. This study introduces a useful method of tumor identification that is accurate, safe, and rapid. On the operation day, after inducing general anesthesia, a mixture of sodium hyaluronate and patent blue is injected into the submucosal layer of the proximal margin. When resecting stomach, all marker spots should be on the resected side. In all cases, the proximal margin is examined histologically by using frozen sections during the operation. From October 2009 to September 2011, a prospective study that evaluated this method was performed. A total of 34 patients who underwent totally laparoscopic distal gastrectomy were enrolled in this study. Approximately 5 min was required to complete the procedure. Proximal margins were negative in all cases, and the mean ± standard deviation length of the proximal margin was 23.5 ± 12.8 mm. No side effects, such as allergy, were encountered. As a method of tumor identification for totally laparoscopic distal gastrectomy, this procedure appears accurate, safe, and rapid.

  16. Identification of two bvg-repressed surface proteins of Bordetella pertussis.

    PubMed Central

    Stenson, T H; Peppler, M S

    1995-01-01

    Bordetella pertussis, the etiological agent of whooping cough, has the ability to modulate its phenotype in response to environmental conditions by using the BvgAS sensory transduction system which is encoded by the vir locus (now known as bvg). The BvgAS system is part of a large family of two-component sensory transduction systems which are common to a number of pathogenic bacteria. Although much is known about the proteins which exist in the B. pertussis virulent (X-mode or phase I) phenotype, relatively little is known about the proteins produced in the avirulent (C-mode or phase III) phenotype. We used sodium dodecyl sulfate-polyacrylamide gel electrophoresis and isoelectric focusing techniques to demonstrate the existence of at least 22 vir-repressed molecules which are increased in the avirulent phenotype. In addition, a series of monoclonal antibodies which are specific for the surface of avirulent B. pertussis were developed. Using immunological and protein techniques, we characterized two of these antigens as surface-exposed proteins. One of these antigens is expressed only in B. pertussis but not in the related species B. parapertussis and B. bronchiseptica. The other antigen is also present in B. parapertussis and B. bronchiseptica but is expressed at lower levels which are not regulated by bvg. The identification and characterization of vir-repressed proteins (and the genes which encode and regulate them) may help elucidate a physiological role for modulation of this obligate human pathogen. PMID:7558280

  17. Microbial Protein-Antigenome Determination (MAD) Technology: A Proteomics-Based Strategy for Rapid Identification of Microbial Targets of Host Humoral Immune Responses

    USDA-ARS's Scientific Manuscript database

    Immunogenic, pathogen-specific proteins have excellent potential for development of novel management modalities. Here, we describe an innovative application of proteomics called Microbial protein-Antigenome Determination (MAD) Technology for rapid identification of native microbial proteins that el...

  18. Microbial Protein-Antigenome Determination (MAD) Technology: A Proteomics-Based Strategy for Rapid Identification of Microbial Targets of Host Humoral Immune Responses

    USDA-ARS's Scientific Manuscript database

    Immunogenic, pathogen-specific proteins have excellent potential for development of novel management modalities. Here, we describe an innovative application of proteomics called Microbial protein-Antigenome Determination (MAD) Technology for rapid identification of native microbial proteins that eli...

  19. A Deep Learning Framework for Robust and Accurate Prediction of ncRNA-Protein Interactions Using Evolutionary Information.

    PubMed

    Yi, Hai-Cheng; You, Zhu-Hong; Huang, De-Shuang; Li, Xiao; Jiang, Tong-Hai; Li, Li-Ping

    2018-06-01

    The interactions between non-coding RNAs (ncRNAs) and proteins play an important role in many biological processes, and the biological functions of ncRNAs are primarily achieved by binding with a variety of proteins. High-throughput biological techniques are used to identify protein molecules bound with specific ncRNA, but they are usually expensive and time consuming. Deep learning provides a powerful solution to computationally predict RNA-protein interactions. In this work, we propose the RPI-SAN model, which uses a deep-learning stacked auto-encoder network to mine the hidden high-level features from RNA and protein sequences and feeds them into a random forest (RF) model to predict ncRNA binding proteins. Stacked ensembling is further used to improve the accuracy of the proposed method. Four benchmark datasets, including RPI2241, RPI488, RPI1807, and NPInter v2.0, were employed for the unbiased evaluation of five prediction tools: RPI-Pred, IPMiner, RPISeq-RF, lncPro, and RPI-SAN. The experimental results show that our RPI-SAN model achieves much better performance than other methods, with accuracies of 90.77%, 89.7%, 96.1%, and 99.33%, respectively. It is anticipated that RPI-SAN can be used as an effective computational tool for future biomedical research and can accurately predict potential ncRNA-protein interaction pairs, which provides reliable guidance for biological research. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.

  20. Complex network theory for the identification and assessment of candidate protein targets.

    PubMed

    McGarry, Ken; McDonald, Sharon

    2018-06-01

    In this work we use complex network theory to provide a statistical model of the connectivity patterns of human proteins and their interaction partners. Our intention is to identify important proteins that may be predisposed to be potential candidates as drug targets for therapeutic interventions. Target proteins usually have more interaction partners than non-target proteins, but there are no hard-and-fast rules for defining the actual number of interactions. We devise a statistical measure for identifying hub proteins, and we score our target proteins with gene ontology annotations. The important druggable protein targets are likely to have similar biological functions that can be assessed for their potential therapeutic value. Our system provides a statistical analysis of the local and distant neighborhood protein interactions of the potential targets using complex network measures. This approach builds a more accurate model of drug-to-target activity and therefore the likely impact on treating diseases. We integrate high quality protein interaction data from the HINT database and disease associated proteins from the DrugTarget database. Other sources include biological knowledge from Gene Ontology and drug information from DrugBank. The problem is a very challenging one since the data are highly imbalanced between target proteins and the more numerous nontargets. We use undersampling on the training data and build Random Forest classifier models which are used to identify previously unclassified target proteins. We validate and corroborate these findings from the available literature. Copyright © 2018 Elsevier Ltd. All rights reserved.
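
    The abstract does not state which statistical measure is used to flag hub proteins. As a minimal sketch only, the following assumes a simple degree z-score rule; the cutoff value, the function names, and the toy protein labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean, pstdev


def degree_table(edges):
    """Count distinct interaction partners per protein in an undirected edge list."""
    partners = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            partners[a].add(b)
            partners[b].add(a)
    return {protein: len(found) for protein, found in partners.items()}


def hub_proteins(edges, z_cutoff=1.5):
    """Return proteins whose degree z-score exceeds z_cutoff (assumed rule)."""
    degrees = degree_table(edges)
    mu = mean(degrees.values())
    sigma = pstdev(degrees.values())
    if sigma == 0:  # every protein has the same degree, so there are no hubs
        return []
    return sorted(p for p, d in degrees.items() if (d - mu) / sigma > z_cutoff)


# Toy interaction network with made-up protein labels.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
             ("P1", "P5"), ("P2", "P3"), ("P5", "P6")]
print(hub_proteins(toy_edges))
```

    A degree-based rule is only one possible choice; measures that also account for distant neighborhoods, as the abstract mentions, would need additional network statistics.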

  1. Accurate computational design of multipass transmembrane proteins.

    PubMed

    Lu, Peilong; Min, Duyoung; DiMaio, Frank; Wei, Kathy Y; Vahey, Michael D; Boyken, Scott E; Chen, Zibo; Fallas, Jorge A; Ueda, George; Sheffler, William; Mulligan, Vikram Khipple; Xu, Wenqing; Bowie, James U; Baker, David

    2018-03-02

    The computational design of transmembrane proteins with more than one membrane-spanning region remains a major challenge. We report the design of transmembrane monomers, homodimers, trimers, and tetramers with 76 to 215 residue subunits containing two to four membrane-spanning regions and up to 860 total residues that adopt the target oligomerization state in detergent solution. The designed proteins localize to the plasma membrane in bacteria and in mammalian cells, and magnetic tweezer unfolding experiments in the membrane indicate that they are very stable. Crystal structures of the designed dimer and tetramer (a rocket-shaped structure with a wide cytoplasmic base that funnels into eight transmembrane helices) are very close to the design models. Our results pave the way for the design of multispan membrane proteins with new functions. Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

  2. A comparative proteomics method for multiple samples based on a 18O-reference strategy and a quantitation and identification-decoupled strategy.

    PubMed

    Wang, Hongbin; Zhang, Yongqian; Gui, Shuqi; Zhang, Yong; Lu, Fuping; Deng, Yulin

    2017-08-15

    Comparisons across large numbers of samples are frequently necessary in quantitative proteomics. Many quantitative methods used in proteomics are based on stable isotope labeling, but most of these are only useful for comparing two samples. For up to eight samples, the iTRAQ labeling technique can be used. For greater numbers of samples, the label-free method has been used, but this method was criticized for low reproducibility and accuracy. An ingenious strategy has been introduced, comparing each sample against an 18O-labeled reference sample that was created by pooling equal amounts of all samples. However, it is necessary to use proportion-known protein mixtures to investigate and evaluate this new strategy. Another problem for comparative proteomics of multiple samples is the poor coincidence and reproducibility in protein identification results across samples. In the present study, a method combining the 18O-reference strategy and a quantitation and identification-decoupled strategy was investigated with proportion-known protein mixtures. The results clearly demonstrated that the 18O-reference strategy had greater accuracy and reliability than other previously used comparison methods based on transferring comparison or label-free strategies. By the decoupling strategy, the quantification data acquired by LC-MS and the identification data acquired by LC-MS/MS are matched and correlated to identify differentially expressed proteins, according to retention time and accurate mass. This strategy made protein identification possible for all samples using a single pooled sample, and therefore gave good reproducibility in protein identification across multiple samples, and allowed for optimizing peptide identification separately so as to identify more proteins. Copyright © 2017 Elsevier B.V. All rights reserved.
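
    The decoupling step described above matches LC-MS quantification features to LC-MS/MS identifications by retention time and accurate mass. A minimal sketch of such a matching step follows; the tolerances (10 ppm, 0.5 min), the tuple layouts, and the peptide labels are assumptions made for illustration and do not come from the paper.

```python
def ppm_error(observed_mz, reference_mz):
    """Signed mass error of observed_mz relative to reference_mz, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_features(features, identifications, ppm_tol=10.0, rt_tol=0.5):
    """Pair quantified LC-MS features with LC-MS/MS identifications.

    features:        list of (mz, retention_time_min, sample_to_reference_ratio)
    identifications: list of (peptide, mz, retention_time_min)
    Returns (peptide, ratio) for each feature whose closest-mass identification
    lies within both the mass tolerance and the retention-time tolerance.
    """
    matches = []
    for mz, rt, ratio in features:
        best = None  # (absolute ppm error, peptide)
        for peptide, id_mz, id_rt in identifications:
            ppm = abs(ppm_error(mz, id_mz))
            if ppm <= ppm_tol and abs(rt - id_rt) <= rt_tol:
                if best is None or ppm < best[0]:
                    best = (ppm, peptide)
        if best is not None:
            matches.append((best[1], ratio))
    return matches


# Hypothetical data: only the first feature agrees in both mass and time.
features = [(500.2510, 20.1, 1.8), (650.3300, 35.0, 0.5), (720.4000, 40.0, 1.0)]
identifications = [("PEPTIDEA", 500.2500, 20.3), ("PEPTIDEB", 650.3302, 36.0)]
print(match_features(features, identifications))
```

    In practice retention times usually need alignment between runs before a fixed window is applied; that step is omitted here.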

  3. Identification and quantification of major maillard cross-links in human serum albumin and lens protein. Evidence for glucosepane as the dominant compound.

    PubMed

    Biemel, Klaus M; Friedl, D Alexander; Lederer, Markus O

    2002-07-12

    Glycation reactions leading to protein modifications (advanced glycation end products) contribute to various pathologies associated with the general aging process and long term complications of diabetes. However, only a few relevant compounds have so far been detected in vivo. We now report on the first unequivocal identification of the lysine-arginine cross-links glucosepane 5, DOGDIC 6, MODIC 7, and GODIC 8 in human material. For their accurate quantification by coupled liquid chromatography-electrospray ionization mass spectrometry, (13)C-labeled reference compounds were synthesized independently. Compounds 5-8 are formed via the alpha-dicarbonyl compounds N(6)-(2,3-dihydroxy-5,6-dioxohexyl)-l-lysinate (1a,b), 3-deoxyglucosone (2), methylglyoxal (3), and glyoxal (4), respectively. The protein-bound dideoxyosone 1a,b seems to be of prime significance for cross-linking because it presumably is not detoxified by mammalian enzymes as readily as 2-4. Hence, the follow-up product glucosepane 5 was found to be the dominant compound. Up to 42.3 pmol of 5/mg of protein was identified in human serum albumin of diabetics; the level of 5 correlates markedly with the glycated hemoglobin HbA(1c). In the water-insoluble fraction of lens proteins from normoglycemics, concentration of 5 ranges between 132.3 and 241.7 pmol/mg. The advanced glycoxidation end product GODIC 8 is elevated significantly in brunescent lenses, indicating enhanced oxidative stress in this material. Compounds 5-8 thus appear predestined as markers for pathophysiological processes.

  4. Accurate Quantification of Cardiovascular Biomarkers in Serum Using Protein Standard Absolute Quantification (PSAQ™) and Selected Reaction Monitoring*

    PubMed Central

    Huillet, Céline; Adrait, Annie; Lebert, Dorothée; Picard, Guillaume; Trauchessec, Mathieu; Louwagie, Mathilde; Dupuis, Alain; Hittinger, Luc; Ghaleh, Bijan; Le Corvoisier, Philippe; Jaquinod, Michel; Garin, Jérôme; Bruley, Christophe; Brun, Virginie

    2012-01-01

    Development of new biomarkers needs to be significantly accelerated to improve diagnostic, prognostic, and toxicity monitoring as well as therapeutic follow-up. Biomarker evaluation is the main bottleneck in this development process. Selected Reaction Monitoring (SRM) combined with stable isotope dilution has emerged as a promising option to speed this step, particularly because of its multiplexing capacities. However, analytical variabilities because of upstream sample handling or incomplete trypsin digestion still need to be resolved. In 2007, we developed the PSAQ™ method (Protein Standard Absolute Quantification), which uses full-length isotope-labeled protein standards to quantify target proteins. In the present study we used clinically validated cardiovascular biomarkers (LDH-B, CKMB, myoglobin, and troponin I) to demonstrate that the combination of PSAQ and SRM (PSAQ-SRM) allows highly accurate biomarker quantification in serum samples. A multiplex PSAQ-SRM assay was used to quantify these biomarkers in clinical samples from myocardial infarction patients. Good correlation between PSAQ-SRM and ELISA assay results was found and demonstrated the consistency between these analytical approaches. Thus, PSAQ-SRM has the capacity to improve both accuracy and reproducibility in protein analysis. This will be a major contribution to efficient biomarker development strategies. PMID:22080464
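
    Stable isotope dilution, as used with SRM here, estimates the amount of an endogenous (light) analyte from its signal ratio to a spiked isotope-labeled (heavy) standard of known concentration. The sketch below shows only that general arithmetic; taking the median ratio across transitions is an assumption for illustration, and the peak areas and units are invented.

```python
from statistics import median


def light_to_heavy_ratio(light_area, heavy_area):
    """Ratio of endogenous (light) to labeled-standard (heavy) peak area."""
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return light_area / heavy_area


def estimate_concentration(transitions, standard_conc):
    """Estimate analyte concentration by isotope dilution.

    transitions:   list of (light_area, heavy_area) pairs, one per SRM transition
    standard_conc: known concentration of the spiked labeled standard
    The median ratio across transitions is used (an assumed summary choice).
    """
    ratios = [light_to_heavy_ratio(light, heavy) for light, heavy in transitions]
    return median(ratios) * standard_conc


# Hypothetical peak areas for three transitions; standard spiked at 10 ng/mL.
areas = [(12000.0, 6000.0), (8000.0, 4000.0), (5000.0, 2000.0)]
print(estimate_concentration(areas, standard_conc=10.0))
```

    Because a full-length labeled protein standard is added before sample handling and digestion, losses in those steps affect the light and heavy forms alike, which is why the ratio remains informative.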

  5. Establishment of a protein frequency library and its application in the reliable identification of specific protein interaction partners.

    PubMed

    Boulon, Séverine; Ahmad, Yasmeen; Trinkle-Mulcahy, Laura; Verheggen, Céline; Cobley, Andy; Gregor, Peter; Bertrand, Edouard; Whitehorn, Mark; Lamond, Angus I

    2010-05-01

    The reliable identification of protein interaction partners and how such interactions change in response to physiological or pathological perturbations is a key goal in most areas of cell biology. Stable isotope labeling with amino acids in cell culture (SILAC)-based mass spectrometry has been shown to provide a powerful strategy for characterizing protein complexes and identifying specific interactions. Here, we show how SILAC can be combined with computational methods drawn from the business intelligence field for multidimensional data analysis to improve the discrimination between specific and nonspecific protein associations and to analyze dynamic protein complexes. A strategy is shown for developing a protein frequency library (PFL) that improves on previous use of static "bead proteomes." The PFL annotates the frequency of detection in co-immunoprecipitation and pulldown experiments for all proteins in the human proteome. It can provide a flexible and objective filter for discriminating between contaminants and specifically bound proteins and can be used to normalize data values and facilitate comparisons between data obtained in separate experiments. The PFL is a dynamic tool that can be filtered for specific experimental parameters to generate a customized library. It will be continuously updated as data from each new experiment are added to the library, thereby progressively enhancing its utility. The application of the PFL to pulldown experiments is especially helpful in identifying either lower abundance or less tightly bound specific components of protein complexes that are otherwise lost among the large, nonspecific background.
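
    The core idea of a frequency library, namely recording how often each protein is detected across many pulldown experiments and using that frequency as a filter, can be illustrated with a small sketch. The data layout, the 0.5 frequency threshold, and the protein labels below are hypothetical and are not the published PFL implementation.

```python
from collections import Counter


def build_frequency_library(experiments):
    """Map each protein to the fraction of experiments in which it was detected.

    experiments: list of collections of protein identifiers, one per pulldown.
    """
    counts = Counter()
    for detected in experiments:
        counts.update(set(detected))  # count each protein once per experiment
    total = len(experiments)
    return {protein: n / total for protein, n in counts.items()}


def likely_specific(candidates, library, max_frequency=0.5):
    """Keep candidates seen in at most max_frequency of past experiments."""
    return sorted(p for p in candidates if library.get(p, 0.0) <= max_frequency)


# Hypothetical detection lists from four earlier pulldowns.
history = [
    {"HSP70", "TUBB", "PARTNER_X"},
    {"HSP70", "TUBB"},
    {"HSP70", "ACTB"},
    {"HSP70", "TUBB", "ACTB"},
]
library = build_frequency_library(history)
print(likely_specific({"HSP70", "TUBB", "PARTNER_X"}, library))
```

    Adding each new experiment to `history` and rebuilding the library mirrors the continuous updating described in the abstract.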

  6. Identification of Plant Ice-binding Proteins Through Assessment of Ice-recrystallization Inhibition and Isolation Using Ice-affinity Purification.

    PubMed

    Bredow, Melissa; Tomalty, Heather E; Walker, Virginia K

    2017-05-05

    Ice-binding proteins (IBPs) belong to a family of stress-induced proteins that are synthesized by certain organisms exposed to subzero temperatures. In plants, freeze damage occurs when extracellular ice crystals grow, resulting in the rupture of plasma membranes and possible cell death. Adsorption of IBPs to ice crystals restricts further growth by a process known as ice-recrystallization inhibition (IRI), thereby reducing cellular damage. IBPs also demonstrate the ability to depress the freezing point of a solution below the equilibrium melting point, a property known as thermal hysteresis (TH) activity. These protective properties have raised interest in the identification of novel IBPs due to their potential use in industrial, medical and agricultural applications. This paper describes the identification of plant IBPs through 1) the induction and extraction of IBPs in plant tissue, 2) the screening of extracts for IRI activity, and 3) the isolation and purification of IBPs. Following the induction of IBPs by low temperature exposure, extracts are tested for IRI activity using a 'splat assay', which allows the observation of ice crystal growth using a standard light microscope. This assay requires a low protein concentration and generates results that are quickly obtained and easily interpreted, providing an initial screen for ice binding activity. IBPs can then be isolated from contaminating proteins by utilizing the property of IBPs to adsorb to ice, through a technique called 'ice-affinity purification'. Using cell lysates collected from plant extracts, an ice hemisphere can be slowly grown on a brass probe. This incorporates IBPs into the crystalline structure of the polycrystalline ice. Requiring no a priori biochemical or structural knowledge of the IBP, this method allows for recovery of active protein. Ice-purified protein fractions can be used for downstream applications including the identification of peptide sequences by mass spectrometry and the

  7. Systematic analysis of protein turnover in primary cells.

    PubMed

    Mathieson, Toby; Franken, Holger; Kosinski, Jan; Kurzawa, Nils; Zinn, Nico; Sweetman, Gavain; Poeckel, Daniel; Ratnu, Vikram S; Schramm, Maike; Becher, Isabelle; Steidel, Michael; Noh, Kyung-Min; Bergamini, Giovanna; Beck, Martin; Bantscheff, Marcus; Savitski, Mikhail M

    2018-02-15

    A better understanding of proteostasis in health and disease requires robust methods to determine protein half-lives. Here we improve the precision and accuracy of peptide ion intensity-based quantification, enabling more accurate protein turnover determination in non-dividing cells by dynamic SILAC-based proteomics. This approach allows exact determination of protein half-lives ranging from 10 to >1000 h. We identified 4000-6000 proteins in several non-dividing cell types, corresponding to 9699 unique protein identifications over the entire data set. We observed similar protein half-lives in B-cells, natural killer cells and monocytes, whereas hepatocytes and mouse embryonic neurons show substantial differences. Our data set extends and statistically validates the previous observation that subunits of protein complexes tend to have coherent turnover. Moreover, analysis of different proteasome and nuclear pore complex assemblies suggests that their turnover rate is architecture dependent. These results illustrate that our approach allows investigating protein turnover and its implications in various cell types.
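
    In non-dividing cells, a common simplifying model for dynamic SILAC data is first-order decay of the pre-existing (unlabeled) protein pool, so that the remaining fraction at time t is exp(-k t) and the half-life is ln(2)/k. The sketch below fits k by least squares on the log-transformed fractions with the line forced through the origin. This is a textbook model assumed for illustration, not the authors' quantification pipeline, and the time points are invented.

```python
import math


def degradation_rate(timepoints_h, remaining_fraction):
    """Least-squares estimate of k in f(t) = exp(-k * t), fitted through the origin."""
    if any(f <= 0 for f in remaining_fraction):
        raise ValueError("remaining fractions must be positive")
    numerator = sum(t * math.log(f) for t, f in zip(timepoints_h, remaining_fraction))
    denominator = sum(t * t for t in timepoints_h)
    return -numerator / denominator


def half_life_h(timepoints_h, remaining_fraction):
    """Half-life in hours under the first-order decay model."""
    k = degradation_rate(timepoints_h, remaining_fraction)
    if k <= 0:
        raise ValueError("no decay detected; half-life is undefined")
    return math.log(2) / k


# Synthetic, noise-free example generated with a 24 h half-life.
times = [6.0, 12.0, 24.0, 48.0]
fractions = [0.5 ** (t / 24.0) for t in times]
print(round(half_life_h(times, fractions), 3))
```

    With real data the fractions are noisy, so very long half-lives relative to the sampling window are estimated with much wider uncertainty than this noise-free example suggests.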

  8. Identification of proteins with the CDw75 epitope in human colorectal cancer

    PubMed Central

    Mariño-Crespo, Óscar; Fernández-Briera, Almudena; Gil-Martín, Emilio

    2018-01-01

    The CDw75 epitope is an α(2,6) sialylated antigen overexpressed in colorectal cancer (CRC), where its expression correlates with the progression of the disease. The CDw75 epitope is located mainly in N-glycoproteins, whose identity remains unknown. The aim of the present study was to identify proteins with the CDw75 epitope as a strategy to deepen the understanding of molecular pathogenesis of CRC and to identify novel biomarkers for this disease. For this purpose, a two-dimensional electrophoresis approach was employed. Protein spots in the gels were matched to the corresponding CDw75 positive spots in the immunoblotted polyvinylidene difluoride membranes, and further identification of the protein species was performed by mass spectrometry. Additionally, one-dimensional western blotting experiments were performed to verify the expression of these candidate proteins in the colorectal tissue and their coincidence in molecular mass with the CDw75-positive bands. The findings of the present study indicate that haptoglobin and the keratins 8 (K8) and 18 (K18) are proteins with the CDw75 epitope in the colorectal tissue from CRC patients and also suggest novel functions and cellular locations for these proteins in the colorectal tissue and in relation to CRC. PMID:29391890

  9. Identification of compound-protein interactions through the analysis of gene ontology, KEGG enrichment for proteins and molecular fragments of compounds.

    PubMed

    Chen, Lei; Zhang, Yu-Hang; Zheng, Mingyue; Huang, Tao; Cai, Yu-Dong

    2016-12-01

    Compound-protein interactions play important roles in every cell via the recognition and regulation of specific functional proteins. The correct identification of compound-protein interactions can lead to a good comprehension of this complicated system and provide useful input for the investigation of various attributes of compounds and proteins. In this study, we attempted to understand this system by extracting properties from both proteins and compounds, in which proteins were represented by gene ontology and KEGG pathway enrichment scores and compounds were represented by molecular fragments. Advanced feature selection methods, including minimum redundancy maximum relevance, incremental feature selection, and the basic machine learning algorithm random forest, were used to analyze these properties and extract core factors for the determination of actual compound-protein interactions. Compound-protein interactions reported in The Binding Databases were used as positive samples. To improve the reliability of the results, the analytic procedure was executed five times using different negative samples. Simultaneously, five optimal prediction methods based on a random forest and yielding maximum MCCs of approximately 77.55 % were constructed and may be useful tools for the prediction of compound-protein interactions. This work provides new clues to understanding the system of compound-protein interactions by analyzing extracted core features. Our results indicate that compound-protein interactions are related to biological processes involving immune, developmental and hormone-associated pathways.
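
    The Matthews correlation coefficient (MCC) reported above is computed from the four cells of a binary confusion matrix. The short sketch below shows the standard formula; the counts in the example are invented and are unrelated to the study's results.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Invented counts for illustration only.
print(round(matthews_corrcoef(tp=90, tn=85, fp=15, fn=10), 4))
```

    Unlike plain accuracy, MCC stays informative when positive and negative samples are unevenly represented, which is relevant when negative samples are drawn repeatedly as described.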

  10. Binomial probability distribution model-based protein identification algorithm for tandem mass spectrometry utilizing peak intensity information.

    PubMed

    Xiao, Chuan-Le; Chen, Xiao-Zhou; Du, Yang-Li; Sun, Xuesong; Zhang, Gong; He, Qing-Yu

    2013-01-04

    Mass spectrometry has become one of the most important technologies in proteomic analysis. Tandem mass spectrometry (LC-MS/MS) is a major tool for the analysis of peptide mixtures from protein samples. The key step of MS data processing is the identification of peptides from experimental spectra by searching public sequence databases. Although a number of algorithms to identify peptides from MS/MS data have been already proposed, e.g. Sequest, OMSSA, X!Tandem, Mascot, etc., they are mainly based on statistical models considering only peak-matches between experimental and theoretical spectra, but not peak intensity information. Moreover, different algorithms gave different results from the same MS data, implying their probable incompleteness and questionable reproducibility. We developed a novel peptide identification algorithm, ProVerB, based on a binomial probability distribution model of protein tandem mass spectrometry combined with a new scoring function, making full use of peak intensity information and, thus, enhancing the ability of identification. Compared with Mascot, Sequest, and SQID, ProVerB identified significantly more peptides from LC-MS/MS data sets than the current algorithms at 1% False Discovery Rate (FDR) and provided more confident peptide identifications. ProVerB is also compatible with various platforms and experimental data sets, showing its robustness and versatility. The open-source program ProVerB is available at http://bioinformatics.jnu.edu.cn/software/proverb/ .
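
    A binomial model for peptide-spectrum matching asks how likely it is that at least k of n theoretical fragment peaks would match the spectrum by chance, given a per-peak random match probability p. The sketch below shows that generic tail probability and a log-transformed score. It deliberately ignores peak intensities, so it is not the ProVerB scoring function; the function names and example numbers are assumptions.

```python
from math import comb, log10


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def match_score(n_theoretical, n_matched, p_random):
    """-10 * log10 of the chance probability of observing at least n_matched matches."""
    return -10.0 * log10(binomial_tail(n_theoretical, n_matched, p_random))


# More matched peaks out of the same 20 theoretical fragments gives a higher score.
print(match_score(20, 6, 0.1) < match_score(20, 12, 0.1))
```

    An intensity-aware variant could, for example, weight or stratify peaks by intensity before counting matches; how ProVerB does this is not described in the abstract.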

  11. An approach to large scale identification of non-obvious structural similarities between proteins

    PubMed Central

    Cherkasov, Artem; Jones, Steven JM

    2004-01-01

    Background: A new sequence-independent bioinformatics approach allowing genome-wide search for proteins with similar three-dimensional structures has been developed. By utilizing the numerical output of the sequence threading it establishes putative non-obvious structural similarities between proteins. When applied to the testing set of proteins with known three-dimensional structures the developed approach was able to recognize structurally similar proteins with high accuracy. Results: The method has been developed to identify pathogenic proteins with low sequence identity and high structural similarity to host analogues. Such protein structure relationships would be hypothesized to arise through convergent evolution or through ancient horizontal gene transfer events, now undetectable using current sequence alignment techniques. The pathogen proteins, which could mimic or interfere with host activities, would represent candidate virulence factors. The developed approach utilizes the numerical outputs from the sequence-structure threading. It identifies the potential structural similarity between a pair of proteins by correlating the threading scores of the corresponding two primary sequences against the library of the standard folds. This approach allowed up to 64% sensitivity and 99.9% specificity in distinguishing protein pairs with high structural similarity. Conclusion: Preliminary results obtained by comparison of the genomes of Homo sapiens and several strains of Chlamydia trachomatis have demonstrated the potential usefulness of the method in the identification of bacterial proteins with known or potential roles in virulence. PMID:15147578
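
    The approach correlates the threading scores of two sequences against a common fold library. The abstract does not specify the correlation measure, so the sketch below assumes a plain Pearson coefficient over two equal-length score vectors; the fold library size and the score values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant vector has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Invented threading scores of three proteins against the same four folds.
host_protein = [1.0, 2.0, 3.0, 4.0]
pathogen_a = [2.0, 4.0, 6.0, 8.0]   # same fold preference profile
pathogen_b = [4.0, 3.0, 2.0, 1.0]   # opposite profile
print(round(pearson(host_protein, pathogen_a), 6), round(pearson(host_protein, pathogen_b), 6))
```

    A pair whose score profiles correlate strongly would then be a candidate for structural similarity even when the two sequences themselves show little identity.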

  12. Identification of proteins from Mycobacterium tuberculosis missing in attenuated Mycobacterium bovis BCG strains.

    PubMed

    Mattow, J; Jungblut, P R; Schaible, U E; Mollenkopf, H J; Lamer, S; Zimny-Arndt, U; Hagens, K; Müller, E C; Kaufmann, S H

    2001-08-01

    A proteome approach, combining high-resolution two-dimensional electrophoresis (2-DE) with mass spectrometry, was used to compare the cellular protein composition of two virulent strains of Mycobacterium tuberculosis with two attenuated strains of Mycobacterium bovis Bacillus Calmette-Guerin (BCG), in order to identify unique proteins of these strains. Emphasis was given to the identification of M. tuberculosis specific proteins, because we consider these proteins to represent putative virulence factors and interesting candidates for vaccination and diagnosis of tuberculosis. The genome of M. tuberculosis strain H37Rv comprises nearly 4000 predicted open reading frames. In contrast, the separation of proteins from whole mycobacterial cells by 2-DE resulted in silver-stained patterns comprising about 1800 distinct protein spots. Amongst these, 96 spots were exclusively detected either in the virulent (56 spots) or in the attenuated (40 spots) mycobacterial strains. Fifty-three of these spots were analyzed by mass spectrometry, of which 41 were identified, including 32 M. tuberculosis specific spots. Twelve M. tuberculosis specific spots were identified as proteins, encoded by genes previously reported to be deleted in M. bovis BCG. The remaining 20 spots unique for M. tuberculosis were identified as proteins encoded by genes that are not known to be missing in M. bovis BCG.

  13. Fast and accurate non-sequential protein structure alignment using a new asymmetric linear sum assignment heuristic.

    PubMed

    Brown, Peter; Pullan, Wayne; Yang, Yuedong; Zhou, Yaoqi

    2016-02-01

    The three-dimensional tertiary structure of a protein at near atomic level resolution provides insight alluding to its function and evolution. As protein structure decides its functionality, similarity in structure usually implies similarity in function. As such, structure alignment techniques are often useful in the classifications of protein function. Given the rapidly growing rate of new, experimentally determined structures being made available from repositories such as the Protein Data Bank, fast and accurate computational structure comparison tools are required. This paper presents SPalignNS, a non-sequential protein structure alignment tool using a novel asymmetrical greedy search technique. The performance of SPalignNS was evaluated against existing sequential and non-sequential structure alignment methods by performing trials with commonly used datasets. These benchmark datasets used to gauge alignment accuracy include (i) 9538 pairwise alignments implied by the HOMSTRAD database of homologous proteins; (ii) a subset of 64 difficult alignments from set (i) that have low structure similarity; (iii) 199 pairwise alignments of proteins with similar structure but different topology; and (iv) a subset of 20 pairwise alignments from the RIPC set. SPalignNS is shown to achieve greater alignment accuracy (lower or comparable root-mean squared distance with increased structure overlap coverage) for all datasets, and the highest agreement with reference alignments from the challenging dataset (iv) above, when compared with both sequentially constrained alignments and other non-sequential alignments. SPalignNS was implemented in C++. The source code, binary executable, and a web server version are freely available at: http://sparks-lab.org. Contact: yaoqi.zhou@griffith.edu.au. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
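
    Non-sequential alignment can be framed as an asymmetric (rectangular) linear sum assignment: each residue of the shorter protein is paired with a distinct residue of the longer one so that the total cost is low. The sketch below is a generic greedy approximation written for illustration. It is not the SPalignNS heuristic, whose details are not given in the abstract, and the cost values are invented.

```python
def greedy_assignment(cost):
    """Greedy approximation to a rectangular linear sum assignment.

    cost[i][j] is the cost of pairing row i (residue of the shorter protein)
    with column j (residue of the longer protein). Pairs are taken in order of
    increasing cost, using each row and each column at most once.
    Returns a dict mapping row index to column index.
    """
    candidates = sorted(
        (cost[i][j], i, j) for i in range(len(cost)) for j in range(len(cost[i]))
    )
    used_rows, used_cols, pairing = set(), set(), {}
    for _, i, j in candidates:
        if i not in used_rows and j not in used_cols:
            pairing[i] = j
            used_rows.add(i)
            used_cols.add(j)
    return pairing


def total_cost(cost, pairing):
    """Sum of the costs of the chosen pairs."""
    return sum(cost[i][j] for i, j in pairing.items())


# Invented 2 x 3 cost matrix, e.g. squared distances after superposition.
costs = [[1.0, 4.0, 9.0],
         [4.0, 0.5, 3.0]]
pairs = greedy_assignment(costs)
print(pairs, total_cost(costs, pairs))
```

    A greedy pass is fast but not guaranteed to be optimal; exact solvers such as the Hungarian algorithm trade speed for optimality.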

  14. Evaluation of MALDI-TOF mass spectrometry for identification of environmental yeasts and development of supplementary database.

    PubMed

    Agustini, Bruna Carla; Silva, Luciano Paulino; Bloch, Carlos; Bonfim, Tania M B; da Silva, Gildo Almeida

    2014-06-01

    Yeast identification using traditional methods that employ morphological, physiological, and biochemical characteristics is difficult because it requires experienced microbiologists and rigorous control of culture conditions, which can otherwise lead to different outcomes. For clinical or industrial applications, the fast and accurate identification of microorganisms is in growing demand. Hence, molecular biology approaches have been extensively used and, more recently, protein profiling using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has proved to be an even more efficient tool for taxonomic purposes. Nonetheless, the mass spectrometry data available for the differentiation of yeast species for industrial purposes are limited, and commercially available reference databases comprise almost exclusively clinical microorganisms. In this context, studies focusing on environmental isolates are required to extend the existing databases. The development of a supplementary database and the assessment of a commercial database for taxonomic identification of environmental yeasts are the aims of this study. We challenged MALDI-TOF MS to create protein profiles for 845 yeast strains isolated from grape must, and 67.7% of the strains were successfully identified using the previously available manufacturer database. The remaining 32.3% of strains were not identified due to the absence of a reference spectrum. After the correct taxon for these strains was determined using molecular biology approaches, the spectra of the missing species were added to a supplementary database. This new library accurately identified species that had initially gone unidentified by MALDI-TOF MS, proving it is a powerful tool for the identification of environmental yeasts.

  15. Rapid identification of fluorochrome modification sites in proteins by LC ESI-Q-TOF mass spectrometry.

    PubMed

    Manikwar, Prakash; Zimmerman, Tahl; Blanco, Francisco J; Williams, Todd D; Siahaan, Teruna J

    2011-07-20

    Conjugation of either a fluorescent dye or a drug molecule to the ε-amino groups of lysine residues of proteins has many applications in biology and medicine. However, this type of conjugation produces a heterogeneous population of protein conjugates. Because conjugation of fluorochrome or drug molecule to a protein may have deleterious effects on protein function, the identification of conjugation sites is necessary. Unfortunately, the identification process can be time-consuming and laborious; therefore, there is a need to develop a rapid and reliable way to determine the conjugation sites of the fluorescent label or drug molecule. In this study, the sites of conjugation of fluorescein-5'-isothiocyanate and rhodamine-B-isothiocyanate to free amino groups on the insert-domain (I-domain) protein derived from the α-subunit of lymphocyte function-associated antigen-1 (LFA-1) were determined by electrospray ionization quadrupole time-of-flight mass spectrometry (ESI-Q-TOF MS) along with peptide mapping using trypsin digestion. A reporter fragment of the fluorochrome moiety that is generated in the collision cell of the Q-TOF without explicit MS/MS precursor selection was used to identify the conjugation site. Selected ion plots of the reporter ion readily mark modified peptides in chromatograms of the complex digest. Interrogation of these spectra reveals a neutral loss/precursor pair that identifies the modified peptide. The results show that one to seven fluorescein molecules or one to four rhodamine molecules were attached to the lysine residue(s) of the I-domain protein. No modifications were found in the metal ion-dependent adhesion site (MIDAS), which is an important binding region of the I-domain.

  16. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins.

    PubMed

    Jones, David T; Singh, Tanya; Kosciolek, Tomasz; Tetchner, Stuart

    2015-04-01

    Recent developments of statistical techniques to infer direct evolutionary couplings between residue pairs have rendered covariation-based contact prediction a viable means for accurate 3D modelling of proteins, with no information other than the sequence required. To extend the usefulness of contact prediction, we have designed a new meta-predictor (MetaPSICOV) which combines three distinct approaches for inferring covariation signals from multiple sequence alignments, considers a broad range of other sequence-derived features and, uniquely, a range of metrics which describe both the local and global quality of the input multiple sequence alignment. Finally, we use a two-stage predictor, where the second stage filters the output of the first stage. This two-stage predictor is additionally evaluated on its ability to accurately predict the long range network of hydrogen bonds, including correctly assigning the donor and acceptor residues. Using the original PSICOV benchmark set of 150 protein families, MetaPSICOV achieves a mean precision of 0.54 for top-L predicted long range contacts, around 60% higher than PSICOV and around 40% better than CCMpred. In de novo protein structure prediction using FRAGFOLD, MetaPSICOV is able to improve the TM-scores of models by a median of 0.05 compared with PSICOV. Lastly, for predicting long range hydrogen bonding, MetaPSICOV-HB achieves a precision of 0.69 for the top-L/10 hydrogen bonds compared with just 0.26 for the baseline MetaPSICOV. MetaPSICOV is freely available as a web server at http://bioinf.cs.ucl.ac.uk/MetaPSICOV. Raw data (predicted contact lists and 3D models) and source code can be downloaded from http://bioinf.cs.ucl.ac.uk/downloads/MetaPSICOV. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
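    The top-L and top-L/10 precision figures quoted in this abstract can be illustrated with a minimal sketch. It is not taken from MetaPSICOV: the function name, the toy data, and the sequence-separation cutoff of 24 residues for "long range" are assumptions made here for illustration.

```python
def top_l_precision(predictions, true_contacts, seq_len, min_sep=24, divisor=1):
    """Precision of the top L/divisor long-range contact predictions.

    predictions   : iterable of (i, j, score) residue pairs
    true_contacts : set of (i, j) pairs with i < j observed in the structure
    seq_len       : sequence length L
    min_sep       : assumed minimum |i - j| for a "long range" contact
    divisor       : 1 for top-L, 10 for top-L/10
    """
    # Keep only long-range pairs, then rank them by descending score.
    long_range = [(i, j, s) for i, j, s in predictions if abs(i - j) >= min_sep]
    long_range.sort(key=lambda p: p[2], reverse=True)
    top = long_range[: max(1, seq_len // divisor)]
    if not top:
        return 0.0
    hits = sum((min(i, j), max(i, j)) in true_contacts for i, j, _ in top)
    return hits / len(top)


# Toy, made-up data for a 40-residue sequence.
preds = [(1, 30, 0.9), (2, 35, 0.8), (5, 10, 0.95),
         (3, 40, 0.7), (4, 29, 0.6), (6, 38, 0.5)]
truth = {(1, 30), (3, 40), (6, 38)}
print(top_l_precision(preds, truth, seq_len=40, divisor=10))  # prints 0.5
```

    The short-range pair (5, 10) is ignored despite its high score, so the top four long-range predictions contain two true contacts.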

  17. Accurate population genetic measurements require cryptic species identification in corals

    NASA Astrophysics Data System (ADS)

    Sheets, Elizabeth A.; Warner, Patricia A.; Palumbi, Stephen R.

    2018-06-01

    Correct identification of closely related species is important for reliable measures of gene flow. Incorrectly lumping individuals of different species together has been shown to over- or underestimate population differentiation, but examples highlighting when these different results are observed in empirical datasets are rare. Using 199 single nucleotide polymorphisms, we assigned 768 individuals in the Acropora hyacinthus and A. cytherea morphospecies complexes to each of eight previously identified cryptic genetic species and measured intraspecific genetic differentiation across three geographic scales (within reefs, among reefs within an archipelago, and among Pacific archipelagos). We then compared these calculations to estimated genetic differentiation at each scale with all cryptic genetic species mixed as if we could not tell them apart. At the reef scale, correct genetic species identification yielded lower F_ST estimates and fewer significant comparisons than when species were mixed, raising estimates of short-scale gene flow. In contrast, correct genetic species identification at large spatial scales yielded higher F_ST measurements than mixed-species comparisons, lowering estimates of long-term gene flow among archipelagos. A meta-analysis of published population genetic studies in corals found similar results: F_ST estimates at small spatial scales were lower and significance was found less often in studies that controlled for cryptic species. Our results and these prior datasets controlling for cryptic species suggest that genetic differentiation among local reefs may be lower than what has generally been reported in the literature. Not properly controlling for cryptic species structure can bias population genetic analyses in different directions across spatial scales, and this has important implications for conservation strategies that rely on these estimates.

  18. Top-Down and Bottom-Up Identification of Proteins by Liquid Extraction Surface Analysis Mass Spectrometry of Healthy and Diseased Human Liver Tissue

    NASA Astrophysics Data System (ADS)

    Sarsby, Joscelyn; Martin, Nicholas J.; Lalor, Patricia F.; Bunch, Josephine; Cooper, Helen J.

    2014-09-01

    Liquid extraction surface analysis mass spectrometry (LESA MS) has the potential to become a useful tool in the spatially-resolved profiling of proteins in substrates. Here, the approach has been applied to the analysis of thin tissue sections from human liver. The aim was to determine whether LESA MS was a suitable approach for the detection of protein biomarkers of nonalcoholic liver disease (nonalcoholic steatohepatitis, NASH), with a view to the eventual development of LESA MS for imaging NASH pathology. Two approaches were considered. In the first, endogenous proteins were extracted from liver tissue sections by LESA, subjected to automated trypsin digestion, and the resulting peptide mixture was analyzed by liquid chromatography tandem mass spectrometry (LC-MS/MS) (bottom-up approach). In the second (top-down approach), endogenous proteins were extracted by LESA, and analyzed intact. Selected protein ions were subjected to collision-induced dissociation (CID) and/or electron transfer dissociation (ETD) mass spectrometry. The bottom-up approach resulted in the identification of over 500 proteins; however identification of key protein biomarkers, liver fatty acid binding protein (FABP1), and its variant (Thr→Ala, position 94), was unreliable and irreproducible. Top-down LESA MS analysis of healthy and diseased liver tissue revealed peaks corresponding to multiple (~15-25) proteins. MS/MS of four of these proteins identified them as FABP1, its variant, α-hemoglobin, and 10 kDa heat shock protein. The reliable identification of FABP1 and its variant by top-down LESA MS suggests that the approach may be suitable for imaging NASH pathology in sections from liver biopsies.

  19. Top-down and bottom-up identification of proteins by liquid extraction surface analysis mass spectrometry of healthy and diseased human liver tissue.

    PubMed

    Sarsby, Joscelyn; Martin, Nicholas J; Lalor, Patricia F; Bunch, Josephine; Cooper, Helen J

    2014-11-01

    Liquid extraction surface analysis mass spectrometry (LESA MS) has the potential to become a useful tool in the spatially-resolved profiling of proteins in substrates. Here, the approach has been applied to the analysis of thin tissue sections from human liver. The aim was to determine whether LESA MS was a suitable approach for the detection of protein biomarkers of nonalcoholic liver disease (nonalcoholic steatohepatitis, NASH), with a view to the eventual development of LESA MS for imaging NASH pathology. Two approaches were considered. In the first, endogenous proteins were extracted from liver tissue sections by LESA, subjected to automated trypsin digestion, and the resulting peptide mixture was analyzed by liquid chromatography tandem mass spectrometry (LC-MS/MS) (bottom-up approach). In the second (top-down approach), endogenous proteins were extracted by LESA, and analyzed intact. Selected protein ions were subjected to collision-induced dissociation (CID) and/or electron transfer dissociation (ETD) mass spectrometry. The bottom-up approach resulted in the identification of over 500 proteins; however identification of key protein biomarkers, liver fatty acid binding protein (FABP1), and its variant (Thr→Ala, position 94), was unreliable and irreproducible. Top-down LESA MS analysis of healthy and diseased liver tissue revealed peaks corresponding to multiple (~15-25) proteins. MS/MS of four of these proteins identified them as FABP1, its variant, α-hemoglobin, and 10 kDa heat shock protein. The reliable identification of FABP1 and its variant by top-down LESA MS suggests that the approach may be suitable for imaging NASH pathology in sections from liver biopsies.

  20. Analysis of Proteins, Protein Complexes, and Organellar Proteomes Using Sheathless Capillary Zone Electrophoresis - Native Mass Spectrometry

    NASA Astrophysics Data System (ADS)

    Belov, Arseniy M.; Viner, Rosa; Santos, Marcia R.; Horn, David M.; Bern, Marshall; Karger, Barry L.; Ivanov, Alexander R.

    2017-12-01

    Native mass spectrometry (MS) is a rapidly advancing field in the analysis of proteins, protein complexes, and macromolecular species of various types. The majority of native MS experiments reported to-date has been conducted using direct infusion of purified analytes into a mass spectrometer. In this study, capillary zone electrophoresis (CZE) was coupled online to Orbitrap mass spectrometers using a commercial sheathless interface to enable high-performance separation, identification, and structural characterization of limited amounts of purified proteins and protein complexes, the latter with preserved non-covalent associations under native conditions. The performance of both bare-fused silica and polyacrylamide-coated capillaries was assessed using mixtures of protein standards known to form non-covalent protein-protein and protein-ligand complexes. High-efficiency separation of native complexes is demonstrated using both capillary types, while the polyacrylamide neutral-coated capillary showed better reproducibility and higher efficiency for more complex samples. The platform was then evaluated for the determination of monoclonal antibody aggregation and for analysis of proteomes of limited complexity using a ribosomal isolate from E. coli. Native CZE-MS, using accurate single stage and tandem-MS measurements, enabled identification of proteoforms and non-covalent complexes at femtomole levels. This study demonstrates that native CZE-MS can serve as an orthogonal and complementary technique to conventional native MS methodologies with the advantages of low sample consumption, minimal sample processing and losses, and high throughput and sensitivity. This study presents a novel platform for analysis of ribosomes and other macromolecular complexes and organelles, with the potential for discovery of novel structural features defining cellular phenotypes (e.g., specialized ribosomes).

  1. Methods for the accurate estimation of confidence intervals on protein folding ϕ-values

    PubMed Central

    Ruczinski, Ingo; Sosnick, Tobin R.; Plaxco, Kevin W.

    2006-01-01

    ϕ-Values provide an important benchmark for the comparison of experimental protein folding studies to computer simulations and theories of the folding process. Despite the growing importance of ϕ measurements, however, formulas to quantify the precision with which ϕ is measured have seen little significant discussion. Moreover, a commonly employed method for the determination of standard errors on ϕ estimates assumes that estimates of the changes in free energy of the transition and folded states are independent. Here we demonstrate that this assumption is usually incorrect and that this typically leads to the underestimation of ϕ precision. We derive an analytical expression for the precision of ϕ estimates (assuming linear chevron behavior) that explicitly takes this dependence into account. We also describe an alternative method that implicitly corrects for the effect. By simulating experimental chevron data, we show that both methods accurately estimate ϕ confidence intervals. We also explore the effects of the commonly employed techniques of calculating ϕ from kinetics estimated at non-zero denaturant concentrations and via the assumption of parallel chevron arms. We find that these approaches can produce significantly different estimates for ϕ (again, even for truly linear chevron behavior), indicating that they are not equivalent, interchangeable measures of transition state structure. Lastly, we describe a Web-based implementation of the above algorithms for general use by the protein folding community. PMID:17008714
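    To illustrate why the independence assumption matters, the sketch below applies the generic first-order (delta-method) error propagation for a ratio to ϕ, taken here as the ratio of the transition-state free energy change to the folded-state free energy change. This is a textbook approximation, not the analytical expression derived by the authors; the function name, the numerical inputs, and the positive covariance are hypothetical.

```python
import math


def phi_with_se(ddg_ts, ddg_eq, var_ts, var_eq, cov=0.0):
    """Return (phi, standard error) for phi = ddg_ts / ddg_eq.

    Uses the first-order delta method for a ratio:
        var(phi) ~ var_ts/D^2 + N^2*var_eq/D^4 - 2*N*cov/D^3
    with N = ddg_ts and D = ddg_eq. Setting cov=0 reproduces the
    independence assumption.
    """
    phi = ddg_ts / ddg_eq
    var_phi = (var_ts / ddg_eq**2
               + ddg_ts**2 * var_eq / ddg_eq**4
               - 2.0 * ddg_ts * cov / ddg_eq**3)
    # Guard against tiny negative values caused by rounding.
    return phi, math.sqrt(max(var_phi, 0.0))


# Hypothetical free energy changes (kcal/mol) and variances, for illustration only.
phi, se_indep = phi_with_se(0.6, 1.2, 0.01, 0.01)
_, se_corr = phi_with_se(0.6, 1.2, 0.01, 0.01, cov=0.005)
print(round(phi, 3), round(se_indep, 4), round(se_corr, 4))
```

    With these made-up inputs, a positive covariance between the two estimates gives a smaller standard error than the independence assumption does, so ignoring the dependence would understate the precision of ϕ.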

  2. "Plasmo2D": an ancillary proteomic tool to aid identification of proteins from Plasmodium falciparum.

    PubMed

    Khachane, Amit; Kumar, Ranjit; Jain, Sanyam; Jain, Samta; Banumathy, Gowrishankar; Singh, Varsha; Nagpal, Saurabh; Tatu, Utpal

    2005-01-01

    Bioinformatics tools to aid gene and protein sequence analysis have become an integral part of biology in the post-genomic era. Release of the Plasmodium falciparum genome sequence has allowed biologists to define the gene and the predicted protein content as well as their sequences in the parasite. Using pI and molecular weight as characteristics unique to each protein, we have developed a bioinformatics tool to aid identification of proteins from Plasmodium falciparum. The tool makes use of a Virtual 2-DE generated by plotting all of the proteins from the Plasmodium database on a pI versus molecular weight scale. Proteins are identified by comparing the position of migration of desired protein spots from an experimental 2-DE and that on a virtual 2-DE. The procedure has been automated in the form of user-friendly software called "Plasmo2D". The tool can be downloaded from http://144.16.89.25/Plasmo2D.zip.
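    The idea of matching an experimental spot against a virtual 2-DE can be sketched as a nearest-neighbour lookup in pI versus molecular weight space. The abstract does not describe how Plasmo2D performs the comparison, so everything below (protein entries, tolerances, the log-scale mass distance, and the function name) is a hypothetical illustration rather than the tool's algorithm.

```python
import math

# Hypothetical virtual gel: (protein ID, predicted pI, predicted MW in kDa).
VIRTUAL_GEL = [
    ("protein_A", 5.2, 45.0),
    ("protein_B", 6.8, 72.0),
    ("protein_C", 9.1, 18.5),
]


def rank_candidates(spot_pi, spot_mw, proteins, pi_tol=0.5, mw_tol=0.15):
    """Rank proteins by closeness to an experimental spot.

    Distances are expressed in units of the tolerances: pi_tol is in pH
    units and mw_tol is a relative mass error compared on a log scale.
    Both tolerances are assumed values, not taken from the source.
    """
    ranked = []
    for name, pi, mw in proteins:
        d_pi = (pi - spot_pi) / pi_tol
        d_mw = math.log(mw / spot_mw) / math.log(1.0 + mw_tol)
        dist = math.hypot(d_pi, d_mw)
        if dist <= 1.0:  # keep only candidates inside the tolerance ellipse
            ranked.append((dist, name))
    return [name for dist, name in sorted(ranked)]


print(rank_candidates(5.0, 47.0, VIRTUAL_GEL))  # prints ['protein_A']
```

    Comparing mass on a log scale mirrors the roughly logarithmic migration of proteins in the second gel dimension; a linear distance would work for the sketch as well.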

  3. Study of cellular oncometabolism via multidimensional protein identification technology.

    PubMed

    Aukim-Hastie, Claire; Garbis, Spiros D

    2014-01-01

    Cellular proteomics is becoming a widespread clinical application, matching the definition of bench-to-bedside translation. Among various fields of investigation, this approach can be applied to the study of the metabolic alterations that accompany oncogenesis and tumor progression, which are globally referred to as oncometabolism. Here, we describe a multidimensional protein identification technology (MuDPIT)-based strategy that can be employed to study the cellular proteome of malignant cells and tissues. This method has previously been shown to be compatible with the reproducible, in-depth analysis of up to a thousand proteins in clinical samples. The possibility to employ this technique to study clinical specimens demonstrates its robustness. MuDPIT is advantageous as compared to other approaches because it is direct, highly sensitive, and reproducible, it provides high resolution with ultra-high mass accuracy, it allows for relative quantifications, and it is compatible with multiplexing (thus limiting costs). This method enables the direct assessment of the proteomic profile of neoplastic cells and tissues and could be employed in the near future as a high-throughput, rapid, quantitative, and cost-effective screening platform for clinical samples. © 2014 Elsevier Inc. All rights reserved.

  4. Accurate Structural Correlations from Maximum Likelihood Superpositions

    PubMed Central

    Theobald, Douglas L; Wuttke, Deborah S

    2008-01-01

    The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method (“PCA plots”) for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology. PMID:18282091

  5. Demonstration of protein-based human identification using the hair shaft proteome [Protein-based human identification: A proof of concept using the hair shaft proteome]

    DOE PAGES

    Parker, Glendon J.; Leppert, Tami; Anex, Deon S.; ...

    2016-09-07

    Human identification from biological material is largely dependent on the ability to characterize genetic polymorphisms in DNA. Unfortunately, DNA can degrade in the environment, sometimes below the level at which it can be amplified by PCR. Protein however is chemically more robust than DNA and can persist for longer periods. Protein also contains genetic variation in the form of single amino acid polymorphisms. These can be used to infer the status of non-synonymous single nucleotide polymorphism alleles. To demonstrate this, we used mass spectrometry-based shotgun proteomics to characterize hair shaft proteins in 66 European-American subjects. A total of 596 single nucleotide polymorphism alleles were correctly imputed in 32 loci from 22 genes of subjects’ DNA and directly validated using Sanger sequencing. Estimates of the probability of resulting individual non-synonymous single nucleotide polymorphism allelic profiles in the European population, using the product rule, resulted in a maximum power of discrimination of 1 in 12,500. Imputed non-synonymous single nucleotide polymorphism profiles from European–American subjects were considerably less frequent in the African population (maximum likelihood ratio = 11,000). The converse was true for hair shafts collected from an additional 10 subjects with African ancestry, where some profiles were more frequent in the African population. Genetically variant peptides were also identified in hair shaft datasets from six archaeological skeletal remains (up to 260 years old). Furthermore, this study demonstrates that quantifiable measures of identity discrimination and biogeographic background can be obtained from detecting genetically variant peptides in hair shaft protein, including hair from bioarchaeological contexts.

  6. Demonstration of protein-based human identification using the hair shaft proteome [Protein-based human identification: A proof of concept using the hair shaft proteome]

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parker, Glendon J.; Leppert, Tami; Anex, Deon S.

    Human identification from biological material is largely dependent on the ability to characterize genetic polymorphisms in DNA. Unfortunately, DNA can degrade in the environment, sometimes below the level at which it can be amplified by PCR. Protein however is chemically more robust than DNA and can persist for longer periods. Protein also contains genetic variation in the form of single amino acid polymorphisms. These can be used to infer the status of non-synonymous single nucleotide polymorphism alleles. To demonstrate this, we used mass spectrometry-based shotgun proteomics to characterize hair shaft proteins in 66 European-American subjects. A total of 596 single nucleotide polymorphism alleles were correctly imputed in 32 loci from 22 genes of subjects’ DNA and directly validated using Sanger sequencing. Estimates of the probability of resulting individual non-synonymous single nucleotide polymorphism allelic profiles in the European population, using the product rule, resulted in a maximum power of discrimination of 1 in 12,500. Imputed non-synonymous single nucleotide polymorphism profiles from European–American subjects were considerably less frequent in the African population (maximum likelihood ratio = 11,000). The converse was true for hair shafts collected from an additional 10 subjects with African ancestry, where some profiles were more frequent in the African population. Genetically variant peptides were also identified in hair shaft datasets from six archaeological skeletal remains (up to 260 years old). Furthermore, this study demonstrates that quantifiable measures of identity discrimination and biogeographic background can be obtained from detecting genetically variant peptides in hair shaft protein, including hair from bioarchaeological contexts.

  7. Decision tree for accurate infection timing in individuals newly diagnosed with HIV-1 infection.

    PubMed

    Verhofstede, Chris; Fransen, Katrien; Van Den Heuvel, Annelies; Van Laethem, Kristel; Ruelle, Jean; Vancutsem, Ellen; Stoffels, Karolien; Van den Wijngaert, Sigi; Delforge, Marie-Luce; Vaira, Dolores; Hebberecht, Laura; Schauvliege, Marlies; Mortier, Virginie; Dauwe, Kenny; Callens, Steven

    2017-11-29

    There is currently no gold standard method to accurately define the time passed since infection at HIV diagnosis. Infection timing and incidence measurement are, however, essential to better monitor the dynamics of local epidemics and the effect of prevention initiatives. Three methods for infection timing were evaluated using 237 serial samples from documented seroconversions and 566 cross sectional samples from newly diagnosed patients: identification of antibodies against the HIV p31 protein in INNO-LIA, Sedia™ BED CEIA and Sedia™ LAg-Avidity EIA. A multi-assay decision tree for infection timing was developed. Clear differences in recency window between BED CEIA, LAg-Avidity EIA and p31 antibody presence were observed with a switch from recent to long term infection a median of 169.5, 108.0 and 64.5 days after collection of the pre-seroconversion sample respectively. BED showed high reliability for identification of long term infections while LAg-Avidity is highly accurate for identification of recent infections. Using BED as the initial assay to identify the long term infections and LAg-Avidity as a confirmatory assay for those classified as recent infection by BED exploits the strengths of both while reducing the workload. The short recency window of p31 antibodies makes it possible to discriminate very early from early infections based on this marker. BED recent infection results not confirmed by LAg-Avidity are considered to reflect a period more distant from the infection time. False recency predictions in this group can be minimized by elimination of patients with a CD4 count of less than 100 cells/mm³ or without no p31 antibodies. For the 566 cross sectional samples the outcome of the decision tree confirmed the infection timing based on the results of all 3 markers but reduced the overall cost from 13.2 USD to 5.2 USD per sample. A step-wise multi assay decision tree allows accurate timing of the HIV infection at diagnosis at affordable effort and cost and can be an important

  8. Accounting for observed small angle X-ray scattering profile in the protein-protein docking server ClusPro.

    PubMed

    Xia, Bing; Mamonov, Artem; Leysen, Seppe; Allen, Karen N; Strelkov, Sergei V; Paschalidis, Ioannis Ch; Vajda, Sandor; Kozakov, Dima

    2015-07-30

    The protein-protein docking server ClusPro is used by thousands of laboratories, and models built by the server have been reported in over 300 publications. Although the structures generated by the docking include near-native ones for many proteins, selecting the best model is difficult due to the uncertainty in scoring. Small angle X-ray scattering (SAXS) is an experimental technique for obtaining low resolution structural information in solution. While not sufficient on its own to uniquely predict complex structures, accounting for SAXS data improves the ranking of models and facilitates the identification of the most accurate structure. Although SAXS profiles are currently available only for a small number of complexes, due to its simplicity the method is becoming increasingly popular. Since combining docking with SAXS experiments will provide a viable strategy for fairly high-throughput determination of protein complex structures, the option of using SAXS restraints is added to the ClusPro server. © 2015 Wiley Periodicals, Inc.

  9. Rapid glucosinolate detection and identification using accurate mass MS-MS

    USDA-ARS's Scientific Manuscript database

    Currently, there is a demand for accurate evaluation of brassica plant species for their glucosinolate content. An optimized method has been developed for detecting and identifying glucosinolates in plant extracts using MS-MS fragmentation with ion trap collision induced dissociation (CID) and higher...

  10. Nonexposure Accurate Location K-Anonymity Algorithm in LBS

    PubMed Central

    2014-01-01

    This paper tackles location privacy protection in current location-based services (LBS) where mobile users have to report their exact location information to an LBS provider in order to obtain their desired services. Location cloaking has been proposed and well studied to protect user privacy. It blurs the user's accurate coordinate and replaces it with a well-shaped cloaked region. However, to obtain such an anonymous spatial region (ASR), nearly all existing cloaking algorithms require knowing the accurate locations of all users. Therefore, location cloaking without exposing the user's accurate location to any party is urgently needed. In this paper, we present two such nonexposure accurate location cloaking algorithms. They are designed for K-anonymity, and cloaking is performed based on the identifications (IDs) of the grid areas which were reported by all the users, instead of directly on their accurate coordinates. Experimental results show that our algorithms are more secure than the existing cloaking algorithms, do not require all users to report their locations all the time, and can generate smaller ASRs. PMID:24605060
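    The general idea of cloaking from reported grid-cell IDs rather than exact coordinates can be sketched as follows. This is not either of the paper's two algorithms, which the abstract does not specify; the function name, the square-growth strategy, and the toy data are assumptions chosen only to show how a K-anonymous region can be built from cell IDs alone.

```python
from collections import Counter


def cloak(user_cell, reported_cells, k, grid_size):
    """Grow a square block of grid cells around user_cell until it holds >= k users.

    user_cell      : (row, col) cell ID of the requesting user
    reported_cells : list of (row, col) cell IDs, one per user (requester included)
    k              : anonymity level K
    grid_size      : number of rows and columns in the (square) grid
    Returns (row_min, col_min, row_max, col_max), or None if K cannot be reached.
    """
    counts = Counter(reported_cells)
    r0, c0 = user_cell
    for radius in range(grid_size):
        # Clip the candidate block to the grid boundaries.
        r_lo, r_hi = max(0, r0 - radius), min(grid_size - 1, r0 + radius)
        c_lo, c_hi = max(0, c0 - radius), min(grid_size - 1, c0 + radius)
        total = sum(n for (r, c), n in counts.items()
                    if r_lo <= r <= r_hi and c_lo <= c <= c_hi)
        if total >= k:
            return (r_lo, c_lo, r_hi, c_hi)
    return None


# Toy data: four users on a 5 x 5 grid, identified only by cell ID.
cells = [(2, 2), (2, 3), (4, 4), (0, 0)]
print(cloak((2, 2), cells, k=2, grid_size=5))  # prints (1, 1, 3, 3)
```

    A region centred on the requester's own cell reveals that cell to anyone who knows the strategy, so this sketch only shows the counting step and should not be read as a secure scheme.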

  11. Technical advance: identification of plant actin-binding proteins by F-actin affinity chromatography

    NASA Technical Reports Server (NTRS)

    Hu, S.; Brady, S. R.; Kovar, D. R.; Staiger, C. J.; Clark, G. B.; Roux, S. J.; Muday, G. K.

    2000-01-01

    Proteins that interact with the actin cytoskeleton often modulate the dynamics or organization of the cytoskeleton or use the cytoskeleton to control their localization. In plants, very few actin-binding proteins have been identified and most are thought to modulate cytoskeleton function. To identify actin-binding proteins that are unique to plants, the development of new biochemical procedures will be critical. Affinity columns using actin monomers (globular actin, G-actin) or actin filaments (filamentous actin, F-actin) have been used to identify actin-binding proteins from a wide variety of organisms. Monomeric actin from zucchini (Cucurbita pepo L.) hypocotyl tissue was purified to electrophoretic homogeneity and shown to be native and competent for polymerization to actin filaments. G-actin, F-actin and bovine serum albumin affinity columns were prepared and used to separate samples enriched in either soluble or membrane-associated actin-binding proteins. Extracts of soluble actin-binding proteins yield distinct patterns when eluted from the G-actin and F-actin columns, respectively, leading to the identification of a putative F-actin-binding protein of approximately 40 kDa. When plasma membrane-associated proteins were applied to these columns, two abundant polypeptides eluted selectively from the F-actin column and cross-reacted with antiserum against pea annexins. Additionally, a protein that binds auxin transport inhibitors, the naphthylphthalamic acid binding protein, which has been previously suggested to associate with the actin cytoskeleton, was eluted in a single peak from the F-actin column. These experiments provide a new approach that may help to identify novel actin-binding proteins from plants.

  12. Technical advance: identification of plant actin-binding proteins by F-actin affinity chromatography.

    PubMed

    Hu, S; Brady, S R; Kovar, D R; Staiger, C J; Clark, G B; Roux, S J; Muday, G K

    2000-10-01

    Proteins that interact with the actin cytoskeleton often modulate the dynamics or organization of the cytoskeleton or use the cytoskeleton to control their localization. In plants, very few actin-binding proteins have been identified and most are thought to modulate cytoskeleton function. To identify actin-binding proteins that are unique to plants, the development of new biochemical procedures will be critical. Affinity columns using actin monomers (globular actin, G-actin) or actin filaments (filamentous actin, F-actin) have been used to identify actin-binding proteins from a wide variety of organisms. Monomeric actin from zucchini (Cucurbita pepo L.) hypocotyl tissue was purified to electrophoretic homogeneity and shown to be native and competent for polymerization to actin filaments. G-actin, F-actin and bovine serum albumin affinity columns were prepared and used to separate samples enriched in either soluble or membrane-associated actin-binding proteins. Extracts of soluble actin-binding proteins yield distinct patterns when eluted from the G-actin and F-actin columns, respectively, leading to the identification of a putative F-actin-binding protein of approximately 40 kDa. When plasma membrane-associated proteins were applied to these columns, two abundant polypeptides eluted selectively from the F-actin column and cross-reacted with antiserum against pea annexins. Additionally, a protein that binds auxin transport inhibitors, the naphthylphthalamic acid binding protein, which has been previously suggested to associate with the actin cytoskeleton, was eluted in a single peak from the F-actin column. These experiments provide a new approach that may help to identify novel actin-binding proteins from plants.

  13. Identification of beer-spoilage bacteria using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry.

    PubMed

    Wieme, Anneleen D; Spitaels, Freek; Aerts, Maarten; De Bruyne, Katrien; Van Landschoot, Anita; Vandamme, Peter

    2014-08-18

    Applicability of matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) for identification of beer-spoilage bacteria was examined. To achieve this, an extensive identification database was constructed comprising more than 4200 mass spectra, including biological and technical replicates derived from 273 acetic acid bacteria (AAB) and lactic acid bacteria (LAB), covering a total of 52 species, grown on at least three growth media. Sequence analysis of protein coding genes was used to verify aberrant MALDI-TOF MS identification results and confirmed the earlier misidentification of 34 AAB and LAB strains. In total, 348 isolates were collected from culture media inoculated with 14 spoiled beer and brewery samples. Peak-based numerical analysis of MALDI-TOF MS spectra allowed a straightforward species identification of 327 (94.0%) isolates. The remaining isolates clustered separately and were assigned through sequence analysis of protein coding genes either to species not known as beer-spoilage bacteria, and thus not present in the database, or to novel AAB species. An alternative, classifier-based approach for the identification of spoilage bacteria was evaluated by combining the identification results obtained through peak-based cluster analysis and sequence analysis of protein coding genes as a standard. In total, 263 out of 348 isolates (75.6%) were correctly identified at species level and 24 isolates (6.9%) were misidentified. In addition, the identification results of 50 isolates (14.4%) were considered unreliable, and 11 isolates (3.2%) could not be identified. The present study demonstrated that MALDI-TOF MS is well-suited for the rapid, high-throughput and accurate identification of bacteria isolated from spoiled beer and brewery samples, which makes the technique appropriate for routine microbial quality control in the brewing industry. Copyright © 2014 Elsevier B.V. All rights reserved.
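
    The percentages quoted in this abstract can be re-derived from the reported isolate counts. The short check below is only an arithmetic sketch: the counts are taken from the abstract, while the helper name `percent` and the dictionary labels are illustrative assumptions.

```python
# Counts as reported in the abstract above (348 isolates in total).
TOTAL_ISOLATES = 348
PEAK_BASED_IDENTIFIED = 327

# Outcomes of the classifier-based approach, as reported in the abstract.
classifier_outcomes = {
    "correctly identified": 263,
    "misidentified": 24,
    "unreliable": 50,
    "not identified": 11,
}


def percent(count, total=TOTAL_ISOLATES):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# The four classifier outcomes should account for every isolate.
assert sum(classifier_outcomes.values()) == TOTAL_ISOLATES

print(f"peak-based: {PEAK_BASED_IDENTIFIED}/{TOTAL_ISOLATES} = {percent(PEAK_BASED_IDENTIFIED)}%")
for label, count in classifier_outcomes.items():
    print(f"{label}: {count}/{TOTAL_ISOLATES} = {percent(count)}%")
```

    Because each value is rounded independently, the four classifier percentages need not sum to exactly 100.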

  14. Identification of functional interactome of a key cell division regulatory protein CedA of E.coli.

    PubMed

    Sharma, Pankaj; Tomar, Anil Kumar; Kundu, Bishwajit

    2018-01-01

    Cell division is compromised in DnaAcos mutant Escherichia coli cells, which results in filamentous cell morphology. This is countered by over-expression of the CedA protein, which induces cytokinesis so that regular cell morphology is regained, albeit via an unknown mechanism. To understand the process systematically, the exact role of CedA should be deciphered. Protein interactions are crucial for functional organization of a cell and their identification helps in revealing exact function(s) of a protein and its binding partners. Thus, this study was intended to identify CedA binding proteins (CBPs) to gain more clues to CedA function. We isolated CBPs by pull-down assay using purified recombinant CedA and identified nine CBPs by mass spectrometric analysis (MALDI-TOF MS and LC-MS/MS), viz. PDHA1, RL2, DNAK, LPP, RPOB, G6PD, GLMS, RL3 and YBCJ. Based on the CBPs identified, we hypothesize that CedA plays a crucial and multifaceted role in cell cycle regulation, and that the specific pathways in which CedA participates may include transcription and energy metabolism. However, further validation through in-vitro and in-vivo experiments is necessary. In conclusion, identification of CBPs may help us in deciphering the mechanism of CedA-mediated cell division during chromosomal DNA over-replication. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. A survey of computational methods and error rate estimation procedures for peptide and protein identification in shotgun proteomics

    PubMed Central

    Nesvizhskii, Alexey I.

    2010-01-01

    This manuscript provides a comprehensive review of the peptide and protein identification process using tandem mass spectrometry (MS/MS) data generated in shotgun proteomic experiments. The commonly used methods for assigning peptide sequences to MS/MS spectra are critically discussed and compared, from basic strategies to advanced multi-stage approaches. Particular attention is paid to the problem of false-positive identifications. Existing statistical approaches for assessing the significance of peptide to spectrum matches are surveyed, ranging from single-spectrum approaches such as expectation values to global error rate estimation procedures such as false discovery rates and posterior probabilities. The importance of using auxiliary discriminant information (mass accuracy, peptide separation coordinates, digestion properties, etc.) is discussed, and advanced computational approaches for joint modeling of multiple sources of information are presented. This review also includes a detailed analysis of the issues affecting the interpretation of data at the protein level, including the amplification of error rates when going from peptide to protein level, and the ambiguities in inferring the identities of sample proteins in the presence of shared peptides. Commonly used methods for computing protein-level confidence scores are discussed in detail. The review concludes with a discussion of several outstanding computational issues. PMID:20816881

  16. Fitmunk: improving protein structures by accurate, automatic modeling of side-chain conformations.

    PubMed

    Porebski, Przemyslaw Jerzy; Cymborowski, Marcin; Pasenkiewicz-Gierula, Marta; Minor, Wladek

    2016-02-01

    Improvements in crystallographic hardware and software have allowed automated structure-solution pipelines to approach a near-'one-click' experience for the initial determination of macromolecular structures. However, in many cases the resulting initial model requires a laborious, iterative process of refinement and validation. A new method has been developed for the automatic modeling of side-chain conformations that takes advantage of rotamer-prediction methods in a crystallographic context. The algorithm, which is based on deterministic dead-end elimination (DEE) theory, uses new dense conformer libraries and a hybrid energy function derived from experimental data and prior information about rotamer frequencies to find the optimal conformation of each side chain. In contrast to existing methods, which incorporate the electron-density term into protein-modeling frameworks, the proposed algorithm is designed to take advantage of the highly discriminatory nature of electron-density maps. This method has been implemented in the program Fitmunk, which uses extensive conformational sampling. This improves the accuracy of the modeling and makes it a versatile tool for crystallographic model building, refinement and validation. Fitmunk was extensively tested on over 115 new structures, as well as a subset of 1100 structures from the PDB. It is demonstrated that the ability of Fitmunk to model more than 95% of side chains accurately is beneficial for improving the quality of crystallographic protein models, especially at medium and low resolutions. Fitmunk can be used for model validation of existing structures and as a tool to assess whether side chains are modeled optimally or could be better fitted into electron density. Fitmunk is available as a web service at http://kniahini.med.virginia.edu/fitmunk/server/ or at http://fitmunk.bitbucket.org/.
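
    For readers unfamiliar with dead-end elimination, the following is a minimal sketch of the general idea, using the Goldstein form of the criterion as an assumed variant: a rotamer r at a position is discarded when some other rotamer t at that position has lower energy against every possible choice at the other positions. It is not Fitmunk's implementation; the function names, the energy tables and the toy numbers are all hypothetical, and a real energy function (here replaced by plain lookup tables) would include the electron-density term described in the abstract.

```python
def pair_energy(pair_e, i, r, j, s):
    """Symmetric lookup of the pairwise energy between rotamer r at i and rotamer s at j."""
    return pair_e[(i, r, j, s)] if i < j else pair_e[(j, s, i, r)]


def is_dead_end(i, r, t, self_e, pair_e, alive):
    """Goldstein criterion: True if rotamer r at position i is dominated by rotamer t."""
    total = self_e[i][r] - self_e[i][t]
    for j in range(len(self_e)):
        if j == i:
            continue
        # Best case for r relative to t over all surviving rotamers at position j.
        total += min(
            pair_energy(pair_e, i, r, j, s) - pair_energy(pair_e, i, t, j, s)
            for s in alive[j]
        )
    return total > 0


def dead_end_elimination(self_e, pair_e):
    """Repeatedly remove dominated rotamers until no further elimination is possible."""
    alive = [set(range(len(rotamers))) for rotamers in self_e]
    changed = True
    while changed:
        changed = False
        for i in range(len(self_e)):
            for r in sorted(alive[i]):
                competitors = [t for t in sorted(alive[i]) if t != r]
                if any(is_dead_end(i, r, t, self_e, pair_e, alive) for t in competitors):
                    alive[i].discard(r)
                    changed = True
    return alive


# Toy system (hypothetical energies): two positions with two rotamers each.
self_e = [[0.0, 5.0], [1.0, 1.5]]
pair_e = {
    (0, 0, 1, 0): -1.0,
    (0, 0, 1, 1): 0.5,
    (0, 1, 1, 0): 0.0,
    (0, 1, 1, 1): 1.0,
}

survivors = dead_end_elimination(self_e, pair_e)
print(survivors)
```

    In this toy system a single rotamer survives at each position, so the pruning alone determines the minimum-energy combination; in larger systems DEE typically only shrinks the search space before an exhaustive or branch-and-bound step.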

  17. Identification of methyllysine peptides binding to chromobox protein homolog 6 chromodomain in the human proteome.

    PubMed

    Li, Nan; Stein, Richard S L; He, Wei; Komives, Elizabeth; Wang, Wei

    2013-10-01

    Methylation is one of the important post-translational modifications that play critical roles in regulating protein functions. Proteomic identification of this post-translational modification and understanding how it affects protein activity remain great challenges. We tackled this problem from the aspect of methylation mediating protein-protein interaction. Using the chromodomain of human chromobox protein homolog 6 as a model system, we developed a systematic approach that integrates structure modeling, bioinformatics analysis, and peptide microarray experiments to identify lysine residues that are methylated and recognized by the chromodomain in the human proteome. Given the important role of chromobox protein homolog 6 as a reader of histone modifications, it was interesting to find that the majority of its interacting partners identified via this approach function in chromatin remodeling and transcriptional regulation. Our study not only illustrates a novel angle for identifying methyllysines on a proteome-wide scale and elucidating their potential roles in regulating protein function, but also suggests possible strategies for engineering the chromodomain-peptide interface to enhance the recognition of and manipulate the signal transduction mediated by such interactions.

  18. Protein-based forensic identification using genetically variant peptides in human bone.

    PubMed

    Mason, Katelyn Elizabeth; Anex, Deon; Grey, Todd; Hart, Bradley; Parker, Glendon

    2018-04-22

    Bone tissue contains organic material that is useful for forensic investigations and may contain preserved endogenous protein that can persist in the environment for extended periods of time over a range of conditions. Single amino acid polymorphisms in these proteins reflect genetic information since they result from non-synonymous single nucleotide polymorphisms (SNPs) in DNA. Detection of genetically variant peptides (GVPs) - those peptides that contain amino acid polymorphisms - in digests of bone proteins allows for the corresponding SNP alleles to be inferred. Resulting genetic profiles can be used to calculate statistical measures of association between a bone sample and an individual. In this study proteomic analysis on rib cortical bone samples from 10 recently deceased individuals demonstrates this concept. A straightforward acidic demineralization protocol yielded proteins that were digested with trypsin. Tryptic digests were analyzed by liquid chromatography mass spectrometry. A total of 1736 different proteins were identified across all resulting datasets. On average, individual samples contained 454±121 (x̄±σ) proteins. Thirty-five genetically variant peptides were identified from 15 observed proteins. Overall, 134 SNP inferences were made based on proteomically detected GVPs, which were confirmed by sequencing of subject DNA. Inferred individual SNP genetic profiles ranged in random match probability (RMP) from 1/6 to 1/42,472 when calculated with European population frequencies in the 1000 Genomes Project, Phase 3. Similarly, RMPs based on African population frequencies were calculated for each SNP genetic profile and likelihood ratios (LR) were obtained by dividing each European RMP by the corresponding African RMP. Resulting LR values ranged from 1.4 to 825 with a median value of 16. GVP markers offer a basis for the identification of compromised skeletal remains independent of the presence of DNA template. Published by Elsevier B.V.
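
    The two statistics named in this abstract can be illustrated with a small sketch. The likelihood ratio follows the definition given above (European RMP divided by African RMP). The RMP calculation assumes the usual product rule over independent SNP loci, which the abstract does not spell out. The function names and genotype frequencies are hypothetical and are not data from the study.

```python
from math import prod


def random_match_probability(genotype_freqs):
    """Product-rule RMP: multiply per-SNP genotype frequencies (assumes independent loci)."""
    return prod(genotype_freqs)


def likelihood_ratio(rmp_european, rmp_african):
    """LR as described in the abstract: European RMP divided by the corresponding African RMP."""
    return rmp_european / rmp_african


# Hypothetical genotype frequencies for one three-SNP profile in two populations.
european_freqs = [0.5, 0.25, 0.4]
african_freqs = [0.25, 0.2, 0.1]

rmp_eur = random_match_probability(european_freqs)
rmp_afr = random_match_probability(african_freqs)

print(f"European RMP: 1 in {1 / rmp_eur:.0f}")
print(f"African RMP:  1 in {1 / rmp_afr:.0f}")
print(f"LR (European / African): {likelihood_ratio(rmp_eur, rmp_afr):.1f}")
```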

  19. Hot-spot identification on a broad class of proteins and RNA suggest unifying principles of molecular recognition

    PubMed Central

    Kulp, John L.; Cloudsdale, Ian S.; Kulp, John L.

    2017-01-01

    Chemically diverse fragments tend to collectively bind at localized sites on proteins, which is a cornerstone of fragment-based techniques. A central question is how general are these strategies for predicting a wide variety of molecular interactions such as small molecule-protein, protein-protein and protein-nucleic acid for both experimental and computational methods. To address this issue, we recently proposed three governing principles, (1) accurate prediction of fragment-macromolecule binding free energy, (2) accurate prediction of water-macromolecule binding free energy, and (3) locating sites on a macromolecule that have high affinity for a diversity of fragments and low affinity for water. To test the generality of these concepts we used the computational technique of Simulated Annealing of Chemical Potential to design one small fragment to break the RecA-RecA protein-protein interaction and three fragments that inhibit peptide-deformylase via water-mediated multi-body interactions. Experiments confirm the predictions that 6-hydroxydopamine potently inhibits RecA and that PDF inhibition quantitatively tracks the water-mediated binding predictions. Additionally, the principles correctly predict the essential bound waters in HIV Protease, the surprisingly extensive binding site of elastase, the pinpoint location of electron transfer in dihydrofolate reductase, the HIV TAT-TAR protein-RNA interactions, and the MDM2-MDM4 differential binding to p53. The experimental confirmations of highly non-obvious predictions combined with the precise characterization of a broad range of known phenomena lend strong support to the generality of fragment-based methods for characterizing molecular recognition. PMID:28837642

  20. Hot-spot identification on a broad class of proteins and RNA suggest unifying principles of molecular recognition.

    PubMed

    Kulp, John L; Cloudsdale, Ian S; Kulp, John L; Guarnieri, Frank

    2017-01-01

    Chemically diverse fragments tend to collectively bind at localized sites on proteins, which is a cornerstone of fragment-based techniques. A central question is how general are these strategies for predicting a wide variety of molecular interactions such as small molecule-protein, protein-protein and protein-nucleic acid for both experimental and computational methods. To address this issue, we recently proposed three governing principles, (1) accurate prediction of fragment-macromolecule binding free energy, (2) accurate prediction of water-macromolecule binding free energy, and (3) locating sites on a macromolecule that have high affinity for a diversity of fragments and low affinity for water. To test the generality of these concepts we used the computational technique of Simulated Annealing of Chemical Potential to design one small fragment to break the RecA-RecA protein-protein interaction and three fragments that inhibit peptide-deformylase via water-mediated multi-body interactions. Experiments confirm the predictions that 6-hydroxydopamine potently inhibits RecA and that PDF inhibition quantitatively tracks the water-mediated binding predictions. Additionally, the principles correctly predict the essential bound waters in HIV Protease, the surprisingly extensive binding site of elastase, the pinpoint location of electron transfer in dihydrofolate reductase, the HIV TAT-TAR protein-RNA interactions, and the MDM2-MDM4 differential binding to p53. The experimental confirmations of highly non-obvious predictions combined with the precise characterization of a broad range of known phenomena lend strong support to the generality of fragment-based methods for characterizing molecular recognition.

  1. Identification of herpesvirus proteins that contribute to G1/S arrest.

    PubMed

    Paladino, Patrick; Marcon, Edyta; Greenblatt, Jack; Frappier, Lori

    2014-04-01

    Lytic infection by herpesviruses induces cell cycle arrest at the G1/S transition. This appears to be a function of multiple herpesvirus proteins, but only a minority of herpesvirus proteins have been examined for cell cycle effects. To gain a more comprehensive understanding of the viral proteins that contribute to G1/S arrest, we screened a library of over 200 proteins from herpes simplex virus type 1, human cytomegalovirus, and Epstein-Barr virus (EBV) for effects on the G1/S interface, using HeLa fluorescent, ubiquitination-based cell cycle indicator (Fucci) cells in which G1/S can be detected colorimetrically. Proteins from each virus were identified that induce accumulation of G1/S cells, predominantly tegument, early, and capsid proteins. The identification of several capsid proteins in this screen suggests that incoming viral capsids may function to modulate cellular processes. The cell cycle effects of selected EBV proteins were further verified and examined for effects on p53 and p21 as regulators of the G1/S transition. Two EBV replication proteins (BORF2 and BMRF1) were found to induce p53 but not p21, while a previously uncharacterized tegument protein (BGLF2) was found to induce p21 protein levels in a p53-independent manner. Proteomic analyses of BGLF2-interacting proteins identified interactions with the NIMA-related protein kinase (NEK9) and GEM-interacting protein (GMIP). Silencing of either NEK9 or GMIP induced p21 without affecting p53 and abrogated the ability of BGLF2 to further induce p21. Collectively, these results suggest multiple viral proteins contribute to G1/S arrest, including BGLF2, which induces p21 levels likely by interfering with the functions of NEK9 and GMIP. Most people are infected with multiple herpesviruses, whose proteins alter the infected cells in several ways. During lytic infection, the viral proteins block cell proliferation just before the cellular DNA replicates. We used a novel screening method to identify proteins

  2. Immuno-affinity Capture Followed by TMPP N-Terminus Tagging to Study Catabolism of Therapeutic Proteins.

    PubMed

    Kullolli, Majlinda; Rock, Dan A; Ma, Ji

    2017-02-03

    Characterization of in vitro and in vivo catabolism of therapeutic proteins has increasingly become an integral part of the discovery and development process for novel proteins. Unambiguous and efficient identification of catabolites can not only facilitate accurate understanding of pharmacokinetic profiles of drug candidates, but also enable follow-up protein engineering to generate more catabolically stable molecules with improved properties (pharmacokinetics and pharmacodynamics). Immunoaffinity capture (IC) followed by top-down intact protein analysis using either matrix-assisted laser desorption/ionization or electrospray ionization mass spectrometry have been the primary methods of choice for catabolite identification. However, the sensitivity and efficiency of these methods are not always sufficient for characterization of novel proteins from complex biomatrices such as plasma or serum. In this study a novel bottom-up targeted protein workflow was optimized for analysis of proteolytic degradation of therapeutic proteins. Selective and sensitive tagging of the alpha-amine at the N-terminus of proteins of interest was performed by immunoaffinity capture of the therapeutic protein and its catabolites followed by on-bead succinimidyloxycarbonylmethyl tri-(2,4,6-trimethoxyphenyl) N-terminus (TMPP-NTT) tagging. The positively charged hydrophobic TMPP tag facilitates unambiguous sequence identification of all N-terminus peptides from complex tryptic digestion samples via data-dependent liquid chromatography-tandem mass spectrometry. Utility of the workflow was illustrated by definitive analysis of the in vitro catabolic profile of neurotensin human Fc (NTs-huFc) protein in mouse serum. The results from this study demonstrated that the IC-TMPP-NTT workflow is a simple and efficient method for studying catabolite formation in therapeutic proteins.

  3. Proteomic identification of early salicylate- and flg22-responsive redox-sensitive proteins in Arabidopsis

    PubMed Central

    Liu, Pei; Zhang, Huoming; Yu, Boying; Xiong, Liming; Xia, Yiji

    2015-01-01

    Accumulation of reactive oxygen species (ROS) is one of the early defense responses against pathogen infection in plants. The mechanism by which ROS initially and directly regulate the defense signaling pathway remains elusive. Perturbation of cellular redox homeostasis by ROS is believed to alter functions of redox-sensitive proteins through their oxidative modifications. Here we report an OxiTRAQ-based proteomic study identifying proteins whose cysteines underwent oxidative modifications in Arabidopsis cells during the early response to salicylate or flg22, two defense pathway elicitors that are known to disturb cellular redox homeostasis. Among the salicylate- and/or flg22-responsive redox-sensitive proteins are those involved in transcriptional regulation, chromatin remodeling, RNA processing, post-translational modifications, and nucleocytoplasmic shuttling. The identification of the salicylate-/flg22-responsive redox-sensitive proteins provides a foundation from which further study can be conducted toward understanding the biological significance of their oxidative modifications during the plant defense response. PMID:25720653

  4. Identification and modification of dynamical regions in proteins for alteration of enzyme catalytic effect

    DOEpatents

    Agarwal, Pratul K.

    2015-11-24

    A method for analysis, control, and manipulation for improvement of the chemical reaction rate of a protein-mediated reaction is provided. Enzymes, which typically comprise protein molecules, are very efficient catalysts that enhance chemical reaction rates by many orders of magnitude. Enzymes are widely used for a number of functions in chemical, biochemical, pharmaceutical, and other purposes. The method identifies key protein vibration modes that control the chemical reaction rate of the protein-mediated reaction, providing identification of the factors that enable the enzymes to achieve the high rate of reaction enhancement. By controlling these factors, the function of enzymes may be modulated, i.e., the activity can either be increased for faster enzyme reaction or it can be decreased when a slower enzyme is desired. This method provides an inexpensive and efficient solution by utilizing computer simulations, in combination with available experimental data, to build suitable models and investigate the enzyme activity.

  5. Identification and modification of dynamical regions in proteins for alteration of enzyme catalytic effect

    DOEpatents

    Agarwal, Pratul K.

    2013-04-09

    A method for analysis, control, and manipulation for improvement of the chemical reaction rate of a protein-mediated reaction is provided. Enzymes, which typically comprise protein molecules, are very efficient catalysts that enhance chemical reaction rates by many orders of magnitude. Enzymes are widely used for a number of functions in chemical, biochemical, pharmaceutical, and other purposes. The method identifies key protein vibration modes that control the chemical reaction rate of the protein-mediated reaction, providing identification of the factors that enable the enzymes to achieve the high rate of reaction enhancement. By controlling these factors, the function of enzymes may be modulated, i.e., the activity can either be increased for faster enzyme reaction or it can be decreased when a slower enzyme is desired. This method provides an inexpensive and efficient solution by utilizing computer simulations, in combination with available experimental data, to build suitable models and investigate the enzyme activity.

  6. Accurate determination of interfacial protein secondary structure by combining interfacial-sensitive amide I and amide III spectral signals.

    PubMed

    Ye, Shuji; Li, Hongchun; Yang, Weilai; Luo, Yi

    2014-01-29

    Accurate determination of protein structures at the interface is essential to understand the nature of interfacial protein interactions, but it can only be done with a few, very limited experimental methods. Here, we demonstrate for the first time that sum frequency generation vibrational spectroscopy can unambiguously differentiate the interfacial protein secondary structures by combining surface-sensitive amide I and amide III spectral signals. This combination offers a powerful tool to directly distinguish random-coil (disordered) and α-helical structures in proteins. From a systematic study on the interactions between several antimicrobial peptides (including LKα14, mastoparan X, cecropin P1, melittin, and pardaxin) and lipid bilayers, it is found that the spectral profiles of the random-coil and α-helical structures are well separated in the amide III spectra, appearing below and above 1260 cm⁻¹, respectively. For the peptides with a straight backbone chain, the strength ratio for the peaks of the random-coil and α-helical structures shows a distinct linear relationship with the fraction of the disordered structure deduced from independent NMR experiments reported in the literature. It is revealed that increasing the fraction of negatively charged lipids can induce a conformational change of pardaxin from random-coil to α-helical structures. This experimental protocol can be employed for determining the interfacial protein secondary structures and dynamics in situ and in real time without extraneous labels.

  7. Detection of protein-protein interactions by ribosome display and protein in situ immobilisation.

    PubMed

    He, Mingyue; Liu, Hong; Turner, Martin; Taussig, Michael J

    2009-12-31

    We describe a method for identification of protein-protein interactions by combining two cell-free protein technologies, namely ribosome display and protein in situ immobilisation. The method requires only PCR fragments as the starting material, the target proteins being made through cell-free protein synthesis, either associated with their encoding mRNA as ribosome complexes or immobilised on a solid surface. The use of ribosome complexes allows identification of interacting protein partners from their attached coding mRNA. To demonstrate the procedures, we have employed the lymphocyte signalling proteins Vav1 and Grb2 and confirmed the interaction between Grb2 and the N-terminal SH3 domain of Vav1. The method has promise for library screening of pairwise protein interactions, down to the analytical level of individual domain or motif mapping.

  8. Proteomic identification of altered cerebral proteins in the complex regional pain syndrome animal model.

    PubMed

    Nahm, Francis Sahngun; Park, Zee-Yong; Nahm, Sang-Soep; Kim, Yong Chul; Lee, Pyung Bok

    2014-01-01

    Complex regional pain syndrome (CRPS) is a rare but debilitating pain disorder. Although the exact pathophysiology of CRPS is not fully understood, central and peripheral mechanisms might be involved in the development of this disorder. To reveal the central mechanism of CRPS, we conducted a proteomic analysis of rat cerebrum using the chronic postischemia pain (CPIP) model, a novel experimental model of CRPS. After generating the CPIP animal model, we performed a proteomic analysis of the rat cerebrum using a multidimensional protein identification technology, and screened the proteins differentially expressed between the CPIP and control groups. A total of 155 proteins were differentially expressed between the CPIP and control groups: 125 increased and 30 decreased; expressions of proteins related to cell signaling, synaptic plasticity, regulation of cell proliferation, and cytoskeletal formation were increased in the CPIP group. However, proenkephalin A, cereblon, and neuroserpin were decreased in the CPIP group. Altered expression of cerebral proteins in the CPIP model indicates cerebral involvement in the pathogenesis of CRPS. Further study is required to elucidate the roles of these proteins in the development and maintenance of CRPS.

  9. Novel Accurate Bacterial Discrimination by MALDI-Time-of-Flight MS Based on Ribosomal Proteins Coding in S10-spc-alpha Operon at Strain Level S10-GERMS

    NASA Astrophysics Data System (ADS)

    Tamura, Hiroto; Hotta, Yudai; Sato, Hiroaki

    2013-08-01

    Matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is one of the most widely used mass-based approaches for bacterial identification and classification because of the simple sample preparation and extremely rapid analysis within a few minutes. To establish an accurate MALDI-TOF MS bacterial discrimination method at strain level, the ribosomal subunit proteins coded in the S10-spc-alpha operon, which encodes half of the ribosomal subunit proteins and is highly conserved in eubacterial genomes, were selected as reliable biomarkers. This method, named the S10-GERMS method, revealed that the strains of genus Pseudomonas were successfully identified and discriminated at species and strain levels, respectively; therefore, the S10-GERMS method was further applied to discriminate the pathovar of P. syringae. The eight selected biomarkers (L24, L30, S10, S12, S14, S16, S17, and S19) suggested the rapid discrimination of P. syringae at the strain (pathovar) level. The S10-GERMS method appears to be a powerful tool for rapid and reliable bacterial discrimination and successful phylogenetic characterization. In this article, an overview of the utilization of results from the S10-GERMS method is presented, highlighting the characterization of the Lactobacillus casei group and discrimination of the bacteria of genera Bacillus and Sphingopyxis despite only two and one base difference in the 16S rRNA gene sequence, respectively.

  10. Aptamer-conjugated live human immune cell based biosensors for the accurate detection of C-reactive protein

    NASA Astrophysics Data System (ADS)

    Hwang, Jangsun; Seo, Youngmin; Jo, Yeonho; Son, Jaewoo; Choi, Jonghoon

    2016-10-01

    C-reactive protein (CRP) is a pentameric protein that is present in the bloodstream during inflammatory events, e.g., liver failure, leukemia, and/or bacterial infection. The level of CRP indicates the progress and prognosis of certain diseases; it is therefore necessary to measure CRP levels in the blood accurately. The normal concentration of CRP is reported to be 1-3 mg/L. Inflammatory events increase the level of CRP by up to 500 times; accordingly, CRP is a biomarker of acute inflammatory disease. In this study, we demonstrated the preparation of DNA aptamer-conjugated peripheral blood mononuclear cells (Apt-PBMCs) that specifically capture human CRP. Live PBMCs functionalized with aptamers could detect different levels of human CRP by producing immune complexes with reporter antibody. The binding behavior of Apt-PBMCs toward highly concentrated CRP sites was also investigated. The immune responses of Apt-PBMCs were evaluated by measuring TNF-alpha secretion after stimulating the PBMCs with lipopolysaccharides. In summary, engineered Apt-PBMCs have potential applications as live cell based biosensors and for in vitro tracing of CRP secretion sites.

  11. Accurate, rapid identification of dislocation lines in coherent diffractive imaging via a min-max optimization formulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ulvestad, A.; Menickelly, M.; Wild, S. M.

    Defects such as dislocations impact materials properties and their response during external stimuli. Imaging these defects in their native operating conditions to establish the structure-function relationship and, ultimately, to improve performance via defect engineering has remained a considerable challenge for both electron-based and x-ray-based imaging techniques. While Bragg coherent x-ray diffractive imaging (BCDI) is successful in many cases, nuances in identifying the dislocations have left manual identification as the preferred method. Derivative-based methods are also used, but they can be inaccurate and are computationally inefficient. Here we demonstrate a derivative-free method that is both more accurate and more computationally efficient than either derivative- or human-based methods for identifying 3D dislocation lines in nanocrystal images produced by BCDI. We formulate the problem as a min-max optimization problem and show exceptional accuracy for experimental images. We demonstrate a 227x speedup for a typical experimental dataset with higher accuracy over current methods. We discuss the possibility of using this algorithm as part of a sparsity-based phase retrieval process. We also provide MATLAB code for use by other researchers.

  12. Accurate, rapid identification of dislocation lines in coherent diffractive imaging via a min-max optimization formulation

    NASA Astrophysics Data System (ADS)

    Ulvestad, A.; Menickelly, M.; Wild, S. M.

    2018-01-01

    Defects such as dislocations impact materials properties and their response during external stimuli. Imaging these defects in their native operating conditions to establish the structure-function relationship and, ultimately, to improve performance via defect engineering has remained a considerable challenge for both electron-based and x-ray-based imaging techniques. While Bragg coherent x-ray diffractive imaging (BCDI) is successful in many cases, nuances in identifying the dislocations have left manual identification as the preferred method. Derivative-based methods are also used, but they can be inaccurate and are computationally inefficient. Here we demonstrate a derivative-free method that is both more accurate and more computationally efficient than either derivative- or human-based methods for identifying 3D dislocation lines in nanocrystal images produced by BCDI. We formulate the problem as a min-max optimization problem and show exceptional accuracy for experimental images. We demonstrate a 227x speedup for a typical experimental dataset with higher accuracy over current methods. We discuss the possibility of using this algorithm as part of a sparsity-based phase retrieval process. We also provide MATLAB code for use by other researchers.

  13. Proteomic identification of plant proteins probed by mammalian nitric oxide synthase antibodies.

    PubMed

    Butt, Yoki Kwok-Chu; Lum, John Hon-Kei; Lo, Samuel Chun-Lap

    2003-03-01

    Several studies suggest that a mammalian-like nitric oxide synthase (NOS) exists in plants. Researchers have attempted to verify its presence using two approaches: (i) determination of NOS functional activity and (ii) probing with mammalian NOS antibodies. However, up to now, neither a NOS-like gene nor a protein has been found in plants. While there is still some controversy over whether the NOS functional activity seen is due to nitrate reductase, using the mammalian NOS antibodies in western blot analysis, several groups have reported the presence of immunoreactive protein bands in plant homogenates. Based on these results, immunohistochemical studies using these antibodies have also been used to localize NOS in plant tissues. However, plant NOS has never been positively identified or characterized. Thus, we used a proteomic approach to verify the identities of plant proteins that cross-reacted with the mammalian NOS antibodies. Proteins extracted from maize (Zea mays L.) embryonic axes were separated by two-dimensional gel electrophoresis and subjected to western blot analysis with the mammalian neuronal NOS and inducible NOS antibodies. Twenty immunoreactive protein spots recognized on a corresponding Coomassie blue-stained two-dimensional gel were subjected to tryptic digestion, followed by identification using matrix-assisted laser desorption/ionization-time of flight mass spectrometry. Fifteen proteins were successfully identified and they have described functions that are unrelated to NO metabolism. The remaining five proteins could not be identified. The amino acid sequences of these identified proteins and those used to raise the antibodies were aligned. However, no homologous region could be found. Our results demonstrate that the mammalian NOS antibodies recognize many NOS-unrelated plant proteins. Therefore, it is inappropriate to infer the presence of plant NOS using this immunological technique.

  14. Large-scale identification of target proteins of a glycosyltransferase isozyme by Lectin-IGOT-LC/MS, an LC/MS-based glycoproteomic approach

    PubMed Central

    Sugahara, Daisuke; Kaji, Hiroyuki; Sugihara, Kazushi; Asano, Masahide; Narimatsu, Hisashi

    2012-01-01

    Model organisms containing deletion or mutation in a glycosyltransferase-gene exhibit various physiological abnormalities, suggesting that specific glycan motifs on certain proteins play important roles in vivo. Identification of the target proteins of glycosyltransferase isozymes is the key to understand the roles of glycans. Here, we demonstrated the proteome-scale identification of the target proteins specific for a glycosyltransferase isozyme, β1,4-galactosyltransferase-I (β4GalT-I). Although β4GalT-I is the most characterized glycosyltransferase, its distinctive contribution to β1,4-galactosylation has been hardly described so far. We identified a large number of candidates for the target proteins specific to β4GalT-I by comparative analysis of β4GalT-I-deleted and wild-type mice using the LC/MS-based technique with the isotope-coded glycosylation site-specific tagging (IGOT) of lectin-captured N-glycopeptides. Our approach to identify the target proteins in a proteome-scale offers common features and trends in the target proteins, which facilitate understanding of the mechanism that controls assembly of a particular glycan motif on specific proteins. PMID:23002422

  15. Mass Spectrometric Identification of the Arginine and Lysine deficient Proline Rich Glutamine Rich Wheat Storage Proteins

    USDA-ARS's Scientific Manuscript database

    Tandem mass spectrometry (MS/MS) of enzymatic digest has made possible identification of a wide variety of proteins and complex samples prepared by such techniques as RP-HPLC or 2-D gel electrophoresis. Success requires peptide fragmentation to be indicative of the peptide amino acid sequence. The f...

  16. The Yeast Saccharomyces cerevisiae: a versatile model system for the identification and characterization of bacterial virulence proteins.

    PubMed

    Siggers, Keri A; Lesser, Cammie F

    2008-07-17

    Microbial pathogens utilize complex secretion systems to deliver proteins into host cells. These effector proteins target and usurp host cell processes to promote infection and cause disease. While secretion systems are conserved, each pathogen delivers its own unique set of effectors. The identification and characterization of these effector proteins has been difficult, often limited by the lack of detectable signal sequences and functional redundancy. Model systems including yeast, worms, flies, and fish are being used to circumvent these issues. This technical review details the versatility and utility of yeast Saccharomyces cerevisiae as a system to identify and characterize bacterial effectors.

  17. High-throughput screening of T7 phage display and protein microarrays as a methodological approach for the identification of IgE-reactive components.

    PubMed

    San Segundo-Acosta, Pablo; Garranzo-Asensio, María; Oeo-Santos, Carmen; Montero-Calle, Ana; Quiralte, Joaquín; Cuesta-Herranz, Javier; Villalba, Mayte; Barderas, Rodrigo

    2018-05-01

    Olive pollen and yellow mustard seeds are major allergenic sources with high clinical relevance. To aid with the identification of IgE-reactive components, the development of sensitive methodological approaches is required. Here, we have combined T7 phage display and protein microarrays for the identification of allergenic peptides and mimotopes from olive pollen and mustard seeds. The identification of these allergenic sequences involved the construction and biopanning of T7 phage display libraries of mustard seeds and olive pollen using sera from patients allergic to both biological sources, together with the construction of phage microarrays printed with 1536 monoclonal phages from the third/fourth rounds of biopanning. The screening of the phage microarrays with individual sera from allergic patients enabled the identification of 10 and 9 IgE-reactive unique amino acid sequences from olive pollen and mustard seeds, respectively. Five immunoreactive amino acid sequences displayed on phages were selected for their expression as His6-GST tag fusion proteins and validation. After immunological characterization, we assessed the IgE-reactivity of the constructs. Our results show that protein microarrays printed with T7 phages displaying peptides from allergenic sources might be used to identify allergenic components (peptides, proteins or mimotopes) through their screening with specific IgE antibodies from allergic patients. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Subpathway-GM: identification of metabolic subpathways via joint power of interesting genes and metabolites and their topologies within pathways.

    PubMed

    Li, Chunquan; Han, Junwei; Yao, Qianlan; Zou, Chendan; Xu, Yanjun; Zhang, Chunlong; Shang, Desi; Zhou, Lingyun; Zou, Chaoxia; Sun, Zeguo; Li, Jing; Zhang, Yunpeng; Yang, Haixiu; Gao, Xu; Li, Xia

    2013-05-01

    Various 'omics' technologies, including microarrays and gas chromatography mass spectrometry, can be used to identify hundreds of interesting genes, proteins and metabolites, such as differential genes, proteins and metabolites associated with diseases. Identifying metabolic pathways has become an invaluable aid to understanding the genes and metabolites associated with studying conditions. However, the classical methods used to identify pathways fail to accurately consider joint power of interesting gene/metabolite and the key regions impacted by them within metabolic pathways. In this study, we propose a powerful analytical method referred to as Subpathway-GM for the identification of metabolic subpathways. This provides a more accurate level of pathway analysis by integrating information from genes and metabolites, and their positions and cascade regions within the given pathway. We analyzed two colorectal cancer and one metastatic prostate cancer data sets and demonstrated that Subpathway-GM was able to identify disease-relevant subpathways whose corresponding entire pathways might be ignored using classical entire pathway identification methods. Further analysis indicated that the power of a joint genes/metabolites and subpathway strategy based on their topologies may play a key role in reliably recalling disease-relevant subpathways and finding novel subpathways.

  19. Subpathway-GM: identification of metabolic subpathways via joint power of interesting genes and metabolites and their topologies within pathways

    PubMed Central

    Li, Chunquan; Han, Junwei; Yao, Qianlan; Zou, Chendan; Xu, Yanjun; Zhang, Chunlong; Shang, Desi; Zhou, Lingyun; Zou, Chaoxia; Sun, Zeguo; Li, Jing; Zhang, Yunpeng; Yang, Haixiu; Gao, Xu; Li, Xia

    2013-01-01

    Various ‘omics’ technologies, including microarrays and gas chromatography mass spectrometry, can be used to identify hundreds of interesting genes, proteins and metabolites, such as differential genes, proteins and metabolites associated with diseases. Identifying metabolic pathways has become an invaluable aid to understanding the genes and metabolites associated with studying conditions. However, the classical methods used to identify pathways fail to accurately consider joint power of interesting gene/metabolite and the key regions impacted by them within metabolic pathways. In this study, we propose a powerful analytical method referred to as Subpathway-GM for the identification of metabolic subpathways. This provides a more accurate level of pathway analysis by integrating information from genes and metabolites, and their positions and cascade regions within the given pathway. We analyzed two colorectal cancer and one metastatic prostate cancer data sets and demonstrated that Subpathway-GM was able to identify disease-relevant subpathways whose corresponding entire pathways might be ignored using classical entire pathway identification methods. Further analysis indicated that the power of a joint genes/metabolites and subpathway strategy based on their topologies may play a key role in reliably recalling disease-relevant subpathways and finding novel subpathways. PMID:23482392

  20. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry as a tool for fast identification of protein binders in color layers of paintings.

    PubMed

    Hynek, Radovan; Kuckova, Stepanka; Hradilova, Janka; Kodicek, Milan

    2004-01-01

    Identification of materials in color layers of paintings is necessary for correct decisions concerning restoration procedures as well as proving the authenticity of the painting. The proteins are usually important components of the painting layers. In this paper it has been demonstrated that matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOFMS) can be used for fast and reliable identification of proteins in color layers even in old, highly aged matrices. The digestion can be easily performed directly on silica wafers which are routinely used for infrared analysis. The amount of material necessary for such an analysis is extremely small. Peptide mass mapping using digestion with trypsin followed by MALDI-TOFMS and identification of the protein was successfully used for determination of the binder from a painting of the 19th century. Copyright 2004 John Wiley & Sons, Ltd.

  1. Enhancing bioactive peptide release and identification using targeted enzymatic hydrolysis of milk proteins.

    PubMed

    Nongonierma, Alice B; FitzGerald, Richard J

    2018-06-01

    Milk proteins have been extensively studied for their ability to yield a range of bioactive peptides following enzymatic hydrolysis/digestion. However, many hurdles still exist regarding the widespread utilization of milk protein-derived bioactive peptides as health enhancing agents for humans. These mostly arise from the fact that most milk protein-derived bioactive peptides are not highly potent. In addition, they may be degraded during gastrointestinal digestion and/or have a low intestinal permeability. The targeted release of bioactive peptides during the enzymatic hydrolysis of milk proteins may allow the generation of particularly potent bioactive hydrolysates and peptides. Therefore, the development of milk protein hydrolysates capable of improving human health requires, in the first instance, optimized targeted release of specific bioactive peptides. The targeted hydrolysis of milk proteins has been aided by a range of in silico tools. These include peptide cutters and predictive modeling linking bioactivity to peptide structure [i.e., molecular docking, quantitative structure activity relationship (QSAR)], or hydrolysis parameters [design of experiments (DOE)]. Different targeted enzymatic release strategies employed during the generation of milk protein hydrolysates are reviewed herein and their limitations are outlined. In addition, specific examples are provided to demonstrate how in silico tools may help in the identification and discovery of potent milk protein-derived peptides. It is anticipated that the development of novel strategies employing a range of in silico tools may help in the generation of milk protein hydrolysates containing potent and bioavailable peptides, which in turn may be used to validate their health promoting effects in humans. Graphical abstract The targeted enzymatic hydrolysis of milk proteins may allow the generation of highly potent and bioavailable bioactive peptides.

  2. Molecular Dynamics in Mixed Solvents Reveals Protein-Ligand Interactions, Improves Docking, and Allows Accurate Binding Free Energy Predictions.

    PubMed

    Arcon, Juan Pablo; Defelipe, Lucas A; Modenutti, Carlos P; López, Elias D; Alvarez-Garcia, Daniel; Barril, Xavier; Turjanski, Adrián G; Martí, Marcelo A

    2017-04-24

    One of the most important biological processes at the molecular level is the formation of protein-ligand complexes. Therefore, determining their structure and underlying key interactions is of paramount relevance and has direct applications in drug development. Because of its low cost relative to its experimental sibling, molecular dynamics (MD) simulations in the presence of different solvent probes mimicking specific types of interactions have been increasingly used to analyze protein binding sites and reveal protein-ligand interaction hot spots. However, a systematic comparison of different probes and their real predictive power from a quantitative and thermodynamic point of view is still missing. In the present work, we have performed MD simulations of 18 different proteins in pure water as well as water mixtures of ethanol, acetamide, acetonitrile and methylammonium acetate, leading to a total of 5.4 μs simulation time. For each system, we determined the corresponding solvent sites, defined as space regions adjacent to the protein surface where the probability of finding a probe atom is higher than that in the bulk solvent. Finally, we compared the identified solvent sites with 121 different protein-ligand complexes and used them to perform molecular docking and ligand binding free energy estimates. Our results show that combining solely water and ethanol sites allows sampling over 70% of all possible protein-ligand interactions, especially those that coincide with ligand-based pharmacophoric points. Most important, we also show how the solvent sites can be used to significantly improve ligand docking in terms of both accuracy and precision, and that accurate predictions of ligand binding free energies, along with relative ranking of ligand affinity, can be performed.

  3. APRICOT: an integrated computational pipeline for the sequence-based identification and characterization of RNA-binding proteins.

    PubMed

    Sharan, Malvika; Förstner, Konrad U; Eulalio, Ana; Vogel, Jörg

    2017-06-20

    RNA-binding proteins (RBPs) have been established as core components of several post-transcriptional gene regulation mechanisms. Experimental techniques such as cross-linking and co-immunoprecipitation have enabled the identification of RBPs, RNA-binding domains (RBDs) and their regulatory roles in the eukaryotic species such as human and yeast in large-scale. In contrast, our knowledge of the number and potential diversity of RBPs in bacteria is poorer due to the technical challenges associated with the existing global screening approaches. We introduce APRICOT, a computational pipeline for the sequence-based identification and characterization of proteins using RBDs known from experimental studies. The pipeline identifies functional motifs in protein sequences using position-specific scoring matrices and Hidden Markov Models of the functional domains and statistically scores them based on a series of sequence-based features. Subsequently, APRICOT identifies putative RBPs and characterizes them by several biological properties. Here we demonstrate the application and adaptability of the pipeline on large-scale protein sets, including the bacterial proteome of Escherichia coli. APRICOT showed better performance on various datasets compared to other existing tools for the sequence-based prediction of RBPs by achieving an average sensitivity and specificity of 0.90 and 0.91 respectively. The command-line tool and its documentation are available at https://pypi.python.org/pypi/bio-apricot. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Identification of immunogenic proteins and evaluation of four recombinant proteins as potential vaccine antigens from Vibrio anguillarum in flounder (Paralichthys olivaceus).

    PubMed

    Xing, Jing; Xu, Hongsen; Wang, Yang; Tang, Xiaoqian; Sheng, Xiuzhen; Zhan, Wenbin

    2017-05-31

    Vibrio anguillarum is a severe bacterial pathogen that can infect a wide range of fish species. Identification of immunogenic proteins and development of vaccine are essential for disease prevention. In this study, immunogenic proteins were screened and identified from V. anguillarum, and then protective efficacy of the immunogenic proteins was evaluated. Immunogenic proteins in V. anguillarum whole cell were detected by Western blotting (WB) using immunized flounder (Paralichthys olivaceus) serum, and then identified by Mass spectrometry (MS). The recombinant proteins of four identified immunogenic proteins were produced and immunized to fish, and then percentages of surface membrane immunoglobulin-positive (sIg+) cells in peripheral blood lymphocytes (PBL), total antibodies, antibodies against V. anguillarum, antibodies against recombinant proteins and relative percent survival (RPS) were measured, respectively. The results showed that five immunogenic proteins, VAA, Groel, OmpU, PteF and SpK, were identified; their recombinant proteins, rOmpU, rGroel, rSpK and rVAA, could induce the proliferation of sIg+ cells in PBL and production of total antibodies, antibodies against V. anguillarum and antibodies against the recombinant proteins; their protection against V. anguillarum showed 64.86%, 72.97%, 21.62% and 78.38% RPS, respectively. The results revealed that the immunoproteomic technique using fish anti-V. anguillarum serum provided an efficient way to screen the immunogenic protein for vaccine antigen. Moreover, the rVAA, rGroel and rOmpU had potential to be vaccine candidates against V. anguillarum infection. Copyright © 2017 Elsevier Ltd. All rights reserved.
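
    The relative percent survival (RPS) values quoted above are conventionally computed as RPS = (1 - mortality in the vaccinated group / mortality in the control group) x 100. The following is a minimal Python sketch assuming that conventional definition; the function name and the fish counts are hypothetical and are not taken from the study.

```python
def relative_percent_survival(dead_vaccinated, n_vaccinated, dead_control, n_control):
    """Return RPS = (1 - vaccinated mortality / control mortality) * 100."""
    if n_vaccinated <= 0 or n_control <= 0:
        raise ValueError("group sizes must be positive")
    mortality_vaccinated = dead_vaccinated / n_vaccinated
    mortality_control = dead_control / n_control
    if mortality_control == 0:
        # With no deaths in the control group the ratio is undefined.
        raise ValueError("RPS is undefined when no control fish die")
    return (1.0 - mortality_vaccinated / mortality_control) * 100.0


# Hypothetical challenge trial (illustrative numbers only):
# 10 of 50 vaccinated fish die, 40 of 50 control fish die.
rps = relative_percent_survival(dead_vaccinated=10, n_vaccinated=50,
                                dead_control=40, n_control=50)
print(f"RPS = {rps:.2f}%")
```

    An RPS of 0% means the vaccinated group died at the same rate as the control group, and 100% means no vaccinated fish died.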

  5. Identification of DNA-binding proteins by combining auto-cross covariance transformation and ensemble learning.

    PubMed

    Liu, Bin; Wang, Shanyi; Dong, Qiwen; Li, Shumin; Liu, Xuan

    2016-04-20

    DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. With the rapid development of next-generation sequencing techniques, the number of protein sequences is increasing at an unprecedented rate. Thus it is necessary to develop computational methods to identify the DNA-binding proteins based only on the protein sequence information. In this study, a novel method called iDNA-KACC is presented, which combines the Support Vector Machine (SVM) and the auto-cross covariance transformation. The protein sequences are first converted into profile-based protein representation, and then converted into a series of fixed-length vectors by the auto-cross covariance transformation with Kmer composition. The sequence order effect can be effectively captured by this scheme. These vectors are then fed into the SVM to discriminate the DNA-binding proteins from the non-DNA-binding ones. iDNA-KACC achieves an overall accuracy of 75.16% and a Matthews correlation coefficient of 0.5 by a rigorous jackknife test. Its performance is further improved by employing an ensemble learning approach, and the improved predictor is called iDNA-KACC-EL. Experimental results on an independent dataset show that iDNA-KACC-EL outperforms all the other state-of-the-art predictors, indicating that it would be a useful computational tool for DNA-binding protein identification.
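
    To illustrate how an auto-cross covariance (ACC) transformation turns a variable-length profile into a fixed-length vector, here is a minimal Python sketch of the generic ACC transform. It is an assumption-laden illustration, not the iDNA-KACC implementation: the Kmer-composition step described in the abstract is omitted, and the function name, the toy profile, and the choice of lag are hypothetical.

```python
def acc_transform(profile, max_lag):
    """Generic auto-cross covariance transform of an L x N profile.

    profile : list of L rows, each a list of N per-position scores
    max_lag : largest residue separation to consider (must be < L)

    Returns a flat list of max_lag * N * N covariances. Entries with
    c1 == c2 are auto covariances; entries with c1 != c2 are cross
    covariances. The output length does not depend on L.
    """
    length = len(profile)
    n_cols = len(profile[0])
    if not 0 < max_lag < length:
        raise ValueError("max_lag must be between 1 and L - 1")
    # Column means over the whole sequence.
    means = [sum(row[c] for row in profile) / length for c in range(n_cols)]
    features = []
    for lag in range(1, max_lag + 1):
        span = length - lag  # number of position pairs separated by `lag`
        for c1 in range(n_cols):
            for c2 in range(n_cols):
                total = sum(
                    (profile[j][c1] - means[c1]) * (profile[j + lag][c2] - means[c2])
                    for j in range(span)
                )
                features.append(total / span)
    return features


# Toy profiles with 2 columns (a real profile would have 20 columns).
short_profile = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]]
long_profile = short_profile + [[0.5, 0.5], [1.5, 0.2]]
print(len(acc_transform(short_profile, 2)), len(acc_transform(long_profile, 2)))
```

    Both toy profiles yield vectors of the same length even though the sequences differ in length, which is what allows such vectors to be passed to a classifier such as an SVM.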

  6. Computational Identification and Comparative Analysis of Secreted and Transmembrane Proteins in Six Burkholderia Species.

    PubMed

    Nguyen, Thao Thi; Lee, Hyun-Hee; Park, Jungwook; Park, Inmyoung; Seo, Young-Su

    2017-04-01

    As a step towards discovering novel pathogenesis-related proteins, we performed a genome scale computational identification and characterization of secreted and transmembrane (TM) proteins, which are mainly responsible for bacteria-host interactions and interactions with other bacteria, in the genomes of six representative Burkholderia species. The species comprised plant pathogens (B. glumae BGR1, B. gladioli BSR3), human pathogens (B. pseudomallei K96243, B. cepacia LO6), and plant-growth promoting endophytes (Burkholderia sp. KJ006, B. phytofirmans PsJN). The proportions of putative classically secreted proteins (CSPs) and TM proteins among the species were relatively high, up to approximately 20%. Lower proportions of putative type 3 non-classically secreted proteins (T3NCSPs) (~10%) and unclassified non-classically secreted proteins (NCSPs) (~5%) were observed. The numbers of TM proteins among the three clusters (plant pathogens, human pathogens, and endophytes) were different, while the distribution of these proteins according to the number of TM domains was conserved in which TM proteins possessing 1, 2, 4, or 12 TM domains were the dominant groups in all species. In addition, we observed conservation in the protein size distribution of the secreted protein groups among the species. There were species-specific differences in the functional characteristics of these proteins in the various groups of CSPs, T3NCSPs, and unclassified NCSPs. Furthermore, we assigned the complete sets of the conserved and unique NCSP candidates of the collected Burkholderia species using sequence similarity searching. This study could provide new insights into the relationship among plant-pathogenic, human-pathogenic, and endophytic bacteria.

  7. Eyewitness identification accuracy and response latency: the unruly 10-12-second rule.

    PubMed

    Weber, Nathan; Brewer, Neil; Wells, Gary L; Semmler, Carolyn; Keast, Amber

    2004-09-01

    Data are reported from 3,213 research eyewitnesses confirming that accurate eyewitness identifications from lineups are made faster than are inaccurate identifications. However, consistent with predictions from the recognition and search literatures, the authors did not find support for the "10-12-s rule" in which lineup identifications faster than 10-12 s maximally discriminate between accurate and inaccurate identifications (D. Dunning & S. Perretta, 2002). Instead, the time frame that proved most discriminating was highly variable across experiments, ranging from 5 s to 29 s, and the maximally discriminating time was often unimpressive in its ability to sort accurate from inaccurate identifications. The authors suggest several factors that are likely to moderate the 10-12-s rule. (c) 2004 APA, all rights reserved.

  8. Signal peptide discrimination and cleavage site identification using SVM and NN.

    PubMed

    Kazemian, H B; Yusuf, S A; White, K

    2014-02-01

    About 15% of all proteins in a genome contain a signal peptide (SP) sequence, at the N-terminus, that targets the protein to intracellular secretory pathways. Once the protein is targeted correctly in the cell, the SP is cleaved, releasing the mature protein. Accurate prediction of the presence of these short amino-acid SP chains is crucial for modelling the topology of membrane proteins, since SP sequences can be confused with transmembrane domains due to similar composition of hydrophobic amino acids. This paper presents a cascaded Support Vector Machine (SVM)-Neural Network (NN) classification methodology for SP discrimination and cleavage site identification. The proposed method utilises a dual phase classification approach using SVM as a primary classifier to discriminate SP sequences from Non-SP. The methodology further employs NNs to predict the most suitable cleavage site candidates. In phase one, a SVM classification utilises hydrophobic propensities as a primary feature vector extraction using symmetric sliding window amino-acid sequence analysis for discrimination of SP and Non-SP. In phase two, a NN classification uses asymmetric sliding window sequence analysis for prediction of cleavage site identification. The proposed SVM-NN method was tested using Uni-Prot non-redundant datasets of eukaryotic and prokaryotic proteins with SP and Non-SP N-termini. Computer simulation results demonstrate an overall accuracy of 0.90 for SP and Non-SP discrimination based on Matthews Correlation Coefficient (MCC) tests using SVM. For SP cleavage site prediction, the overall accuracy is 91.5% based on cross-validation tests using the novel SVM-NN model. © 2013 Published by Elsevier Ltd.

  9. Matrix-assisted laser desorption/ionization coupled with quadrupole/orthogonal acceleration time-of-flight mass spectrometry for protein discovery, identification, and structural analysis.

    PubMed

    Baldwin, M A; Medzihradszky, K F; Lock, C M; Fisher, B; Settineri, T A; Burlingame, A L

    2001-04-15

    The design and operation of a novel UV-MALDI ionization source on a commercial QqoaTOF mass spectrometer (Applied Biosystem/MDS Sciex QSTAR Pulsar) is described. Samples are loaded on a 96-well target plate, the movement of which is under software control and can be readily automated. Unlike conventional high-energy MALDI-TOF, the ions are produced with low energies (5-10 eV) in a region of relatively low vacuum (8 mTorr). Thus, they are cooled by extensive low-energy collisions before selection in the quadrupole mass analyzer (Q1), potentially giving a quasi-continuous ion beam ideally suited to the oaTOF used for mass analysis of the fragment ions, although ion yields from individual laser shots may vary widely. Ion dissociation is induced by collisions with argon in an rf-only quadrupole cell, giving typical low-energy CID spectra for protonated peptide ions. Ions separated in the oaTOF are registered by a four-anode detector and time-to-digital converter and accumulated in "bins" that are 625 ps wide. Peak shapes depend upon the number of ion counts in adjacent bins. As expected, the accuracy of mass measurement is shown to be dependent upon the number of ions recorded for a particular peak. With internal calibration, mass accuracy better than 10 ppm is attainable for peaks that contain sufficient ions to give well-defined Gaussian profiles. By virtue of its high resolution, capability for accurate mass measurements, and sensitivity in the low-femtomole range, this instrument is ideally suited to protein identification for proteomic applications by generation of peptide tags, manual sequence interpretation, identification of modifications such as phosphorylation, and protein structural elucidation. Unlike the multiply charged ions typical of electrospray ionization, the singly charged MALDI-generated peptide ions show a linear dependence of optimal collision energy upon molecular mass, which is advantageous for automated operation.
It is shown that the novel

  10. Draft Genome Sequences of Two Species of "Difficult-to-Identify" Human-Pathogenic Corynebacteria: Implications for Better Identification Tests.

    PubMed

    Pacheco, Luis G C; Mattos-Guaraldi, Ana L; Santos, Carolina S; Veras, Adonney A O; Guimarães, Luis C; Abreu, Vinícius; Pereira, Felipe L; Soares, Siomar C; Dorella, Fernanda A; Carvalho, Alex F; Leal, Carlos G; Figueiredo, Henrique C P; Ramos, Juliana N; Vieira, Veronica V; Farfour, Eric; Guiso, Nicole; Hirata, Raphael; Azevedo, Vasco; Silva, Artur; Ramos, Rommel T J

    2015-01-01

    Non-diphtheriae Corynebacterium species have been increasingly recognized as the causative agents of infections in humans. Differential identification of these bacteria in the clinical microbiology laboratory by the most commonly used biochemical tests is challenging, and normally requires additional molecular methods. Herein, we present the annotated draft genome sequences of two isolates of "difficult-to-identify" human-pathogenic corynebacterial species: C. xerosis and C. minutissimum. The genome sequences of ca. 2.7 Mbp, with a mean number of 2,580 protein encoding genes, were also compared with the publicly available genome sequences of strains of C. amycolatum and C. striatum. These results will aid the exploration of novel biochemical reactions to improve existing identification tests as well as the development of more accurate molecular identification methods through detection of species-specific target genes for isolate identification or drug susceptibility profiling.

  11. Identification of a novel heteroglycan-interacting protein, HIP 1.3, from Arabidopsis thaliana.

    PubMed

    Fettke, Joerg; Nunes-Nesi, Adriano; Fernie, Alisdair R; Steup, Martin

    2011-08-15

    Plastidial degradation of transitory starch yields mainly maltose and glucose. Following the export into the cytosol, maltose acts as donor for a glucosyl transfer to cytosolic heteroglycans as mediated by a cytosolic transglucosidase (DPE2; EC 2.4.1.25) and the second glucosyl residue is liberated as glucose. The cytosolic phosphorylase (Pho2/PHS2; EC 2.4.1.1) also interacts with heteroglycans using the same intramolecular sites as DPE2. Thus, the two glucosyl transferases interconnect the cytosolic pools of glucose and glucose 1-phosphate. Due to the complex monosaccharide pattern, other heteroglycan-interacting proteins (HIPs) are expected to exist. Identification of those proteins was approached by using two types of affinity chromatography. Heteroglycans from leaves of Arabidopsis thaliana (Col-0) covalently bound to Sepharose served as ligands that were reacted with a complex mixture of buffer-soluble proteins from Arabidopsis leaves. Binding proteins were eluted by sodium chloride. For identification, SDS-PAGE, tryptic digestion and MALDI-TOF analyses were applied. A strongly interacting polypeptide (approximately 40 kDa; designated as HIP1.3) was observed as product of locus At1g09340. Arabidopsis mutants deficient in HIP1.3 were reduced in growth and contained heteroglycans displaying an altered monosaccharide pattern. Wild type plants express HIP1.3 most strongly in leaves. As revealed by immunofluorescence, HIP1.3 is located in the cytosol of mesophyll cells but mostly associated with the cytosolic surface of the chloroplast envelope membranes. In an HIP1.3-deficient mutant the immunosignal was undetectable. Metabolic profiles from leaves of this mutant and of wild type plants were determined by GC-MS. As compared to the wild type control, more than ten metabolites, such as ascorbic acid, fructose, fructose bisphosphate, glucose, glycine, were elevated in darkness but decreased in the light. Although the biochemical function of HIP1.3 has not yet

  12. Identification of proteins in the aqueous humor associated with cataract development using iTRAQ methodology.

    PubMed

    Xiang, Minhong; Zhang, Xingru; Li, Qingsong; Wang, Hanmin; Zhang, Zhenyong; Han, Zhumei; Ke, Meiqing; Chen, Xingxing

    2017-05-01

    Proteins in the aqueous humor (AH) are important in the induction of cataract development. The identification of cataract-associated proteins assists in identifying patients predisposed to the condition and in improving treatment efficacy. Proteomics analysis has previously been used for identifying protein markers associated with eye diseases; however, few studies have examined the proteomic alterations in cataract development due to high myopia, glaucoma and diabetes. The present study, using the isobaric tagging for relative and absolute protein quantification methodology, aimed to examine cataract-associated proteins in the AH from patients with high myopia, glaucoma or diabetes, and controls. The results revealed that 445 proteins were identified in the AH groups, compared with the control groups, and 146, 264 and 130 proteins were differentially expressed in the three groups of patients, respectively. In addition, 44 of these proteins were determined to be cataract-associated, and the alterations of five randomly selected proteins were confirmed using enzyme-linked immunosorbent assays. The biological functions of these 44 cataract-associated proteins were analyzed using Gene Ontology/pathways annotation, in addition to protein-protein interaction network analysis. The results aimed to expand current knowledge of the pathophysiologic characteristics of cataract development and provided a panel of candidates for biomarkers of the disease, which may assist in further diagnosis and the monitoring of cataract development.

  13. A peptide affinity column for the identification of integrin alpha IIb-binding proteins.

    PubMed

    Daxecker, Heide; Raab, Markus; Bernard, Elise; Devocelle, Marc; Treumann, Achim; Moran, Niamh

    2008-03-01

    To understand the regulation of integrin alpha(IIb)beta(3), a critical platelet adhesion molecule, we have developed a peptide affinity chromatography method using the known integrin regulatory motif, LAMWKVGFFKR. Using standard Fmoc chemistry, this peptide was synthesized onto a Toyopearl AF-Amino-650 M resin on a 6-aminohexanoic acid (Ahx) linker. Peptide density was controlled by acetylation of 83% of the Ahx amino groups. Four recombinant human proteins (CIB1, PP1, ICln and RN181), previously identified as binding to this integrin regulatory motif, were specifically retained by the column containing the integrin peptide but not by a column presenting an irrelevant peptide. Hemoglobin, creatine kinase, bovine serum albumin, fibrinogen and alpha-tubulin failed to bind under the chosen conditions. Immunodetection methods confirmed the binding of endogenous platelet proteins, including CIB1, PP1, ICln, RN181, AUP-1 and beta3-integrin, from a detergent-free platelet lysate. Thus, we describe a reproducible method that facilitates the reliable extraction of specific integrin-binding proteins from complex biological matrices. This methodology may enable the sensitive and specific identification of proteins that interact with linear, membrane-proximal peptide motifs such as the integrin regulatory motif LAMWKVGFFKR.

  14. Accurate Cell Division in Bacteria: How Does a Bacterium Know Where its Middle Is?

    NASA Astrophysics Data System (ADS)

    Howard, Martin; Rutenberg, Andrew

    2004-03-01

    I will discuss the physical principles lying behind the acquisition of accurate positional information in bacteria. A good application of these ideas is to the rod-shaped bacterium E. coli which divides precisely at its cellular midplane. This positioning is controlled by the Min system of proteins. These proteins coherently oscillate from end to end of the bacterium. I will present a reaction-diffusion model that describes the diffusion of the Min proteins, and their binding/unbinding from the cell membrane. The system possesses an instability that spontaneously generates the Min oscillations, which control accurate placement of the midcell division site. I will then discuss the role of fluctuations in protein dynamics, and investigate whether fluctuations set optimal protein concentration levels. Finally I will examine cell division in a different bacterium, B. subtilis, where different physical principles are used to regulate accurate cell division. See: Howard, Rutenberg, de Vet: Dynamic compartmentalization of bacteria: accurate division in E. coli. Phys. Rev. Lett. 87 278102 (2001). Howard, Rutenberg: Pattern formation inside bacteria: fluctuations due to the low copy number of proteins. Phys. Rev. Lett. 90 128102 (2003). Howard: A mechanism for polar protein localization in bacteria. J. Mol. Biol. 335 655-663 (2004).

  15. Methodology for identification of pore forming antimicrobial peptides from soy protein subunits β-conglycinin and glycinin.

    PubMed

    Xiang, Ning; Lyu, Yuan; Zhu, Xiao; Bhunia, Arun K; Narsimhan, Ganesan

    2016-11-01

    Antimicrobial peptides (AMPs) inactivate microbial cells through pore formation in the cell membrane. Because of their different mode of action compared to antibiotics, AMPs can be effectively used to combat drug resistant bacteria in human health. AMPs can also be used to replace antibiotics in animal feed and immobilized on food packaging films. In this research, we developed a methodology based on mechanistic evaluation of peptide-lipid bilayer interaction to identify AMPs from soy protein. Production of AMPs from soy protein is an attractive, cost-saving alternative for commercial consideration, because soy protein is an abundant and common protein resource. This methodology is also applicable for identification of AMPs from any protein. Initial screening of peptide segments from soy glycinin (11S) and soy β-conglycinin (7S) subunits was based on their hydrophobicity, hydrophobic moment and net charge. A delicate balance between hydrophilic and hydrophobic interactions is necessary for pore formation. High hydrophobicity decreases the peptide solubility in the aqueous phase whereas high hydrophilicity limits binding of the peptide to the bilayer. Out of several candidates chosen from the initial screening, two peptides satisfied the criteria for antimicrobial activity, viz. (i) lipid-peptide binding in surface state and (ii) pore formation in transmembrane state of the aggregate. This method of identification of antimicrobial activity via molecular dynamics simulation was shown to be robust in that it is insensitive to the number of peptides employed in the simulation, initial peptide structure and force field. Their antimicrobial activity against Listeria monocytogenes and Escherichia coli was further confirmed by spot-on-lawn test. Copyright © 2016 Elsevier Inc. All rights reserved.
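
    The initial screening step described above (hydrophobicity and net charge) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which hydrophobicity scale or cutoffs were used, so the Kyte-Doolittle scale, the simplified charge rule (K/R = +1, D/E = -1, histidine and termini ignored) and the thresholds in `passes_screen` are hypothetical choices, and the hydrophobic moment is omitted.

```python
# Kyte-Doolittle hydropathy values (a standard published scale; the scale
# actually used in the study is not stated in the abstract).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charges near neutral pH (assumption: His and the
# peptide termini are ignored).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(peptide: str) -> int:
    """Approximate net charge from charged side chains only."""
    return sum(CHARGE.get(aa, 0) for aa in peptide)


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues."""
    return sum(KD[aa] for aa in peptide) / len(peptide)


def passes_screen(peptide: str, min_charge: int = 2,
                  hydropathy_range: tuple = (-1.0, 2.0)) -> bool:
    """Keep cationic peptides of intermediate hydrophobicity.

    The cutoffs are hypothetical placeholders, not values from the paper.
    """
    low, high = hydropathy_range
    return (net_charge(peptide) >= min_charge
            and low <= mean_hydropathy(peptide) <= high)


if __name__ == "__main__":
    for candidate in ["GKKLLAKL", "DDDD"]:  # hypothetical segments
        print(candidate, net_charge(candidate),
              round(mean_hydropathy(candidate), 3), passes_screen(candidate))
```

    The range check on hydropathy mirrors the balance the abstract describes: too hydrophobic and the peptide is poorly soluble, too hydrophilic and it binds the bilayer weakly.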

  16. Identification of Naegleria fowleri proteins linked to primary amoebic meningoencephalitis.

    PubMed

    Jamerson, Melissa; Schmoyer, Jacqueline A; Park, Jay; Marciano-Cabral, Francine; Cabral, Guy A

    2017-03-01

    Naegleria fowleri (N. fowleri) causes primary amoebic meningoencephalitis, a rapidly fatal disease of the central nervous system. N. fowleri can exist in cyst, flagellate or amoebic forms, depending on environmental conditions. The amoebic form can invade the brain following introduction into the nasal passages. When applied intranasally to a mouse model, cultured N. fowleri amoebae exhibit low virulence. However, upon serial passage in mouse brain, the amoebae acquire a highly virulent state. In the present study, a proteomics approach was applied to the identification of N. fowleri amoeba proteins whose expression was associated with the highly virulent state in mice. Mice were inoculated intranasally with axenically cultured amoebae or with mouse-passaged amoebae. Examination by light and electron microscopy revealed no morphological differences. However, mouse-passaged amoebae were more virulent in mice as indicated by exhibiting a two log10 titre decrease in median infective dose 50 (ID50). Scatter plot analysis of amoebic lysates revealed a subset of proteins, the expression of which was associated with highly virulent amoebae. MS-MS indicated that this subset contained proteins that shared homology with those linked to cytoskeletal rearrangement and the invasion process. Invasion assays were performed in the presence of a select inhibitor to expand on the findings. The collective results suggest that N. fowleri gene products linked to cytoskeletal rearrangement and invasion may be candidate targets in the management of primary amoebic meningoencephalitis.

  17. A Simple and Practical Dictionary-based Approach for Identification of Proteins in Medline Abstracts

    PubMed Central

    Egorov, Sergei; Yuryev, Anton; Daraselia, Nikolai

    2004-01-01

    Objective: The aim of this study was to develop a practical and efficient protein identification system for biomedical corpora. Design: The developed system, called ProtScan, utilizes a carefully constructed dictionary of mammalian proteins in conjunction with a specialized tokenization algorithm to identify and tag protein name occurrences in biomedical texts and also takes advantage of Medline “Name-of-Substance” (NOS) annotation. The dictionaries for ProtScan were constructed in a semi-automatic way from various public-domain sequence databases followed by an intensive expert curation step. Measurements: The recall and precision of the system have been determined using 1,000 randomly selected and hand-tagged Medline abstracts. Results: The developed system is capable of identifying protein occurrences in Medline abstracts with a 98% precision and 88% recall. It was also found to be capable of processing approximately 300 abstracts per second. Without utilization of NOS annotation, precision and recall were found to be 98.5% and 84%, respectively. Conclusion: The developed system appears to be well suited for protein-based Medline indexing and can help to improve biomedical information retrieval. Further approaches to ProtScan's recall improvement also are discussed. PMID:14764613
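
    As a rough illustration of dictionary-based tagging of the kind described above, the sketch below tokenizes a sentence and greedily matches the longest dictionary entry at each position. It is not ProtScan's algorithm: the tokenizer, the longest-match strategy and the tiny example dictionary are all assumptions made for illustration.

```python
import re


def tokenize(text: str) -> list:
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", text.lower())


def tag_proteins(text: str, dictionary: dict) -> list:
    """Return canonical names for dictionary entries found in the text.

    `dictionary` maps a tuple of tokens to a canonical protein name.
    At each position the longest matching entry wins.
    """
    tokens = tokenize(text)
    max_len = max(len(key) for key in dictionary)
    hits = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in dictionary:
                hits.append(dictionary[key])
                i += n  # skip past the matched name
                break
        else:
            i += 1  # no entry starts at this token
    return hits


if __name__ == "__main__":
    # Hypothetical three-entry dictionary.
    names = {
        ("insulin",): "INS",
        ("protein", "kinase"): "PK",
        ("protein", "kinase", "c"): "PKC",
    }
    print(tag_proteins("Insulin activates protein kinase C.", names))
```

    Preferring the longest match keeps "protein kinase C" from being tagged as the shorter entry "protein kinase", which is one simple way a dictionary approach can protect precision.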

  18. Use of a tandem affinity purification assay to detect interactions between West Nile and dengue viral proteins and proteins of the mosquito vector

    PubMed Central

    Colpitts, Tonya M.; Cox, Jonathan; Nguyen, Annie; Feitosa, Fabiana; Krishnan, Manoj N.; Fikrig, Erol

    2011-01-01

    West Nile and dengue viruses are (re)emerging mosquito-borne flaviviruses that cause significant morbidity and mortality in man. The identification of mosquito proteins that associate with flaviviruses may provide novel targets to inhibit infection of the vector or block transmission to humans. Here, a tandem affinity purification (TAP) assay was used to identify 18 mosquito proteins that interact with dengue and West Nile capsid, envelope, NS2A or NS2B proteins. We further analyzed the interaction of mosquito cadherin with dengue and West Nile virus envelope protein using co-immunoprecipitation and immunofluorescence. Blocking the function of select mosquito factors, including actin, myosin, PI3-kinase and myosin light chain kinase, reduced both dengue and West Nile virus infection in mosquito cells. We show that the TAP method may be used in insect cells to accurately identify flaviviral-host protein interactions. Our data also provides several targets for interrupting flavivirus infection in mosquito vectors. PMID:21700306

  19. Identification of the Kelch Family Protein Nd1-L as a Novel Molecular Interactor of KRIT1

    PubMed Central

    Cutano, Valentina; Martino, Chiara

    2012-01-01

    Loss-of-function mutations of the KRIT1 gene (CCM1) have been associated with the Cerebral Cavernous Malformation (CCM) disease, which is characterized by serious alterations of brain capillary architecture. The KRIT1 protein contains multiple interaction domains and motifs, suggesting that it might act as a scaffold for the assembly of functional protein complexes involved in signaling networks. In previous work, we defined structure-function relationships underlying KRIT1 intramolecular and intermolecular interactions and nucleocytoplasmic shuttling, and found that KRIT1 plays an important role in molecular mechanisms involved in the maintenance of the intracellular Reactive Oxygen Species (ROS) homeostasis to prevent oxidative cellular damage. Here we report the identification of the Kelch family protein Nd1-L as a novel molecular interactor of KRIT1. This interaction was discovered through yeast two-hybrid screening of a mouse embryo cDNA library, and confirmed by pull-down and co-immunoprecipitation assays of recombinant proteins, as well as by co-immunoprecipitation of endogenous proteins in human endothelial cells. Furthermore, using distinct KRIT1 isoforms and mutants, we defined the role of KRIT1 domains in the Nd1-L/KRIT1 interaction. Finally, functional assays showed that Nd1-L may contribute to the regulation of KRIT1 nucleocytoplasmic shuttling and cooperate with KRIT1 in modulating the expression levels of the antioxidant protein SOD2, opening a novel avenue for future mechanistic studies. The identification of Nd1-L as a novel KRIT1 interacting protein provides a novel piece of the molecular puzzle involving KRIT1 and suggests a potential functional cooperation in cellular responses to oxidative stress, thus expanding the framework of molecular complexes and mechanisms that may underlie the pathogenesis of CCM disease. PMID:22970292

  20. Hidden Markov models incorporating fuzzy measures and integrals for protein sequence identification and alignment.

    PubMed

    Bidargaddi, Niranjan P; Chetty, Madhu; Kamruzzaman, Joarder

    2008-06-01

    Profile hidden Markov models (HMMs) based on classical HMMs have been widely applied for protein sequence identification. The formulation of the forward and backward variables in profile HMMs is made under the statistical independence assumption of probability theory. We propose a fuzzy profile HMM to overcome the limitations of that assumption and to achieve an improved alignment for protein sequences belonging to a given family. The proposed model fuzzifies the forward and backward variables by incorporating Sugeno fuzzy measures and Choquet integrals, thus further extending the generalized HMM. Based on the fuzzified forward and backward variables, we propose a fuzzy Baum-Welch parameter estimation algorithm for profiles. The strong correlations and the sequence preference involved in the protein structures make this fuzzy-architecture-based model a suitable candidate for building profiles of a given family, since the fuzzy set can handle uncertainties better than classical methods.
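
    For orientation, the classical forward variable that the fuzzy model generalizes can be sketched as below. This is the standard probabilistic forward recursion for a plain HMM, not the fuzzified version and not a full profile HMM with match, insert and delete states; the two-state parameters in the usage example are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """Classical HMM forward algorithm.

    obs   : list of observation indices
    start : start[s]     = P(first state is s)
    trans : trans[r][s]  = P(next state is s | current state is r)
    emit  : emit[s][o]   = P(observing o | state s)
    Returns P(obs) summed over all hidden state paths.
    """
    n_states = len(start)
    # Initialisation: alpha_1(s) = start(s) * emit(s, o_1)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    # Recursion: alpha_t(s) = [sum_r alpha_{t-1}(r) * trans(r, s)] * emit(s, o_t)
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    # Termination: sum over the final states
    return sum(alpha)


if __name__ == "__main__":
    # Hypothetical two-state, two-symbol model.
    start = [0.6, 0.4]
    trans = [[0.7, 0.3], [0.4, 0.6]]
    emit = [[0.9, 0.1], [0.2, 0.8]]
    print(forward_likelihood([0, 1, 0], start, trans, emit))
```

    The sum over previous states in the recursion is where the additivity of probability enters; according to the abstract, it is this classical formulation that the fuzzy forward variable replaces using Sugeno measures and Choquet integrals.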

  1. On plate graphite supported sample processing for simultaneous lipid and protein identification by matrix assisted laser desorption ionization mass spectrometry.

    PubMed

    Calvano, Cosima Damiana; van der Werf, Inez Dorothé; Sabbatini, Luigia; Palmisano, Francesco

    2015-05-01

    The simultaneous identification of lipids and proteins by matrix assisted laser desorption ionization-mass spectrometry (MALDI-MS) after direct on-plate processing of micro-samples supported on colloidal graphite is demonstrated. Taking advantages of large surface area and thermal conductivity, graphite provided an ideal substrate for on-plate proteolysis and lipid extraction. Indeed proteins could be efficiently digested on-plate within 15 min, providing sequence coverages comparable to those obtained by conventional in-solution overnight digestion. Interestingly, detection of hydrophilic phosphorylated peptides could be easily achieved without any further enrichment step. Furthermore, lipids could be simultaneously extracted/identified without any additional treatment/processing step as demonstrated for model complex samples such as milk and egg. The present approach is simple, efficient, of large applicability and offers great promise for protein and lipid identification in very small samples. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Maximizing Selective Cleavages at Aspartic Acid and Proline Residues for the Identification of Intact Proteins

    NASA Astrophysics Data System (ADS)

    Foreman, David J.; Dziekonski, Eric T.; McLuckey, Scott A.

    2018-04-01

    A new approach for the identification of intact proteins has been developed that relies on the generation of relatively few abundant products from specific cleavage sites. This strategy is intended to complement standard approaches that seek to generate many fragments relatively non-selectively. Specifically, this strategy seeks to maximize selective cleavage at aspartic acid and proline residues via collisional activation of precursor ions formed via electrospray ionization (ESI) under denaturing conditions. A statistical analysis of the SWISS-PROT database was used to predict the number of arginine residues for a given intact protein mass and predict a m/z range where the protein carries a similar charge to the number of arginine residues thereby enhancing cleavage at aspartic acid residues by limiting proton mobility. Cleavage at aspartic acid residues is predicted to be most favorable in the m/z range of 1500-2500, a range higher than that normally generated by ESI at low pH. Gas-phase proton transfer ion/ion reactions are therefore used for precursor ion concentration from relatively high charge states followed by ion isolation and subsequent generation of precursor ions within the optimal m/z range via a second proton transfer reaction step. It is shown that the majority of product ion abundance is concentrated into cleavages C-terminal to aspartic acid residues and N-terminal to proline residues for ions generated by this process. Implementation of a scoring system that weights both ion fragment type and ion fragment area demonstrated identification of standard proteins, ranging in mass from 8.5 to 29.0 kDa.
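
    The two favored cleavage positions named above (C-terminal to aspartic acid, N-terminal to proline) can be enumerated for a candidate sequence with a short sketch like the one below. The function name and the example sequence are hypothetical, and the sketch only lists candidate backbone bonds; it does not reproduce the authors' scoring system.

```python
def selective_cleavage_sites(sequence: str) -> list:
    """List backbone bonds expected to cleave preferentially.

    Bond i (1-based) lies between residue i and residue i + 1.
    A bond is reported if residue i is Asp (cleavage C-terminal to D)
    or residue i + 1 is Pro (cleavage N-terminal to P).
    """
    sites = []
    for i in range(1, len(sequence)):
        left, right = sequence[i - 1], sequence[i]
        if left == "D" or right == "P":
            sites.append(i)
    return sites


if __name__ == "__main__":
    seq = "MDKAPL"  # hypothetical sequence
    for bond in selective_cleavage_sites(seq):
        # Show the N-terminal and C-terminal pieces produced at each bond.
        print(bond, seq[:bond], seq[bond:])
```

    A D-P pair contributes a single bond, since the bond C-terminal to the aspartic acid is the same one that is N-terminal to the proline.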

  3. Maximizing Selective Cleavages at Aspartic Acid and Proline Residues for the Identification of Intact Proteins.

    PubMed

    Foreman, David J; Dziekonski, Eric T; McLuckey, Scott A

    2018-04-30

    A new approach for the identification of intact proteins has been developed that relies on the generation of relatively few abundant products from specific cleavage sites. This strategy is intended to complement standard approaches that seek to generate many fragments relatively non-selectively. Specifically, this strategy seeks to maximize selective cleavage at aspartic acid and proline residues via collisional activation of precursor ions formed via electrospray ionization (ESI) under denaturing conditions. A statistical analysis of the SWISS-PROT database was used to predict the number of arginine residues for a given intact protein mass and predict a m/z range where the protein carries a similar charge to the number of arginine residues thereby enhancing cleavage at aspartic acid residues by limiting proton mobility. Cleavage at aspartic acid residues is predicted to be most favorable in the m/z range of 1500-2500, a range higher than that normally generated by ESI at low pH. Gas-phase proton transfer ion/ion reactions are therefore used for precursor ion concentration from relatively high charge states followed by ion isolation and subsequent generation of precursor ions within the optimal m/z range via a second proton transfer reaction step. It is shown that the majority of product ion abundance is concentrated into cleavages C-terminal to aspartic acid residues and N-terminal to proline residues for ions generated by this process. Implementation of a scoring system that weights both ion fragment type and ion fragment area demonstrated identification of standard proteins, ranging in mass from 8.5 to 29.0 kDa.

  4. Rapid detection, classification and accurate alignment of up to a million or more related protein sequences.

    PubMed

    Neuwald, Andrew F

    2009-08-01

    The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical. This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences. A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu. Supplementary data are available at Bioinformatics online.

  5. Chapter 01: Wood identification and pattern recognition

    Treesearch

    Alex Wiedenhoeft

    2011-01-01

    Wood identification is a combination of art and science. Although the bulk of this manual focuses on the scientific characteristics used to make accurate field identifications of wood, the contribution of the artistic component to the identification process should be neither overlooked nor understated. Though the accumulation of scientific knowledge and experience is...

  6. Evaluation of mass spectrometric data using principal component analysis for determination of the effects of organic lakes on protein binder identification.

    PubMed

    Hrdlickova Kuckova, Stepanka; Rambouskova, Gabriela; Hynek, Radovan; Cejnar, Pavel; Oltrogge, Doris; Fuchs, Robert

    2015-11-01

    Matrix-assisted laser desorption/ionisation-time of flight (MALDI-TOF) mass spectrometry is commonly used for the identification of proteinaceous binders and their mixtures in artworks. The determination of protein binders is based on a comparison between the m/z values of tryptic peptides in the unknown sample and a reference one (egg, casein, animal glues etc.), but this method has greater potential to study changes due to ageing and the influence of organic/inorganic components on protein identification. However, it is necessary to then carry out statistical evaluation on the obtained data. Before now, it has been complicated to routinely convert the mass spectrometric data into a statistical programme, to extract and match the appropriate peaks. Only several 'homemade' computer programmes without user-friendly interfaces are available for these purposes. In this paper, we would like to present our completely new, publicly available, non-commercial software, ms-alone and multiMS-toolbox, for principal component analyses of MALDI-TOF MS data for R software, and their application to the study of the influence of heterogeneous matrices (organic lakes) for protein identification. Using this new software, we determined the main factors that influence the protein analyses of artificially aged model mixtures of organic lakes and fish glue, prepared according to historical recipes that were used for book illumination, using MALDI-TOF peptide mass mapping. Copyright © 2015 John Wiley & Sons, Ltd.

  7. Evaluation of techniques for increasing recall in a dictionary approach to gene and protein name identification.

    PubMed

    Schuemie, Martijn J; Mons, Barend; Weeber, Marc; Kors, Jan A

    2007-06-01

    Gene and protein name identification in text requires a dictionary approach to relate synonyms to the same gene or protein, and to link names to external databases. However, existing dictionaries are incomplete. We investigate two complementary methods for automatic generation of a comprehensive dictionary: combination of information from existing gene and protein databases and rule-based generation of spelling variations. Both methods have been reported in literature before, but have hitherto not been combined and evaluated systematically. We combined gene and protein names from several existing databases of four different organisms. The combined dictionaries showed a substantial increase in recall on three different test sets, as compared to any single database. Application of 23 spelling variation rules to the combined dictionaries further increased recall. However, many rules appeared to have no effect and some appear to have a detrimental effect on precision.

  8. Microorganism Identification Based On MALDI-TOF-MS Fingerprints

    NASA Astrophysics Data System (ADS)

    Elssner, Thomas; Kostrzewa, Markus; Maier, Thomas; Kruppa, Gary

    Advances in MALDI-TOF mass spectrometry have enabled the development of a rapid, accurate and specific method for the identification of bacteria directly from colonies picked from culture plates, which we have named the MALDI Biotyper. The picked colonies are placed on a target plate, a drop of matrix solution is added, and a pattern of protein molecular weights and intensities, "the protein fingerprint" of the bacteria, is produced by the MALDI-TOF mass spectrometer. The obtained protein mass fingerprint representing a molecular signature of the microorganism is then matched against a database containing a library of previously measured protein mass fingerprints, and scores for the match to every library entry are produced. An ID is obtained if a score is returned over a pre-set threshold. The sensitivity of the technique is such that only approximately 10^4 bacterial cells are needed, meaning that an overnight culture is sufficient, and the results are obtained in minutes after culture. The improvement in time to result over biochemical methods, and the capability to perform a non-targeted identification of bacteria and spores, potentially makes this method suitable for use in the detect-to-treat timeframe in a bioterrorism event. In the case of white-powder samples, the infectious spore is present in sufficient quantity in the powder so that the MALDI Biotyper result can be obtained directly from the white powder, without the need for culture. While spores produce very different patterns from the vegetative colonies of the corresponding bacteria, this problem is overcome by simply including protein fingerprints of the spores in the library. Results on spores can be returned within minutes, making the method suitable for use in the "detect-to-protect" timeframe.
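
    The match-and-threshold workflow described above can be sketched in a few lines. The scoring function here is a deliberately simple stand-in (the fraction of reference peaks found in the query within a mass tolerance); it is not the MALDI Biotyper's actual scoring algorithm, and the tolerance, the threshold and the library entries are hypothetical.

```python
def match_score(query_peaks, reference_peaks, tol_ppm=500.0):
    """Fraction of reference peaks matched by some query peak.

    A reference peak counts as matched if any query peak lies within
    `tol_ppm` parts per million of it. Intensities are ignored here.
    """
    matched = 0
    for ref in reference_peaks:
        if any(abs(q - ref) / ref * 1e6 <= tol_ppm for q in query_peaks):
            matched += 1
    return matched / len(reference_peaks)


def identify(query_peaks, library, threshold=0.7):
    """Score the query against every library entry.

    Returns (best_name_or_None, scores); an identification is reported
    only if the best score reaches the pre-set threshold.
    """
    scores = {name: match_score(query_peaks, peaks)
              for name, peaks in library.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores


if __name__ == "__main__":
    # Hypothetical library of peak masses (Da) for two organisms.
    library = {
        "organism A": [4000.0, 5000.0, 6000.0],
        "organism B": [4500.0, 5500.0, 6500.0],
    }
    print(identify([4000.5, 5000.2, 6000.1, 7000.0], library))
```

    Handling spores, as the abstract notes, needs no change to the matching logic in this sketch: their fingerprints would simply be added as further library entries.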

  9. Identification of membrane-associated proteins with pathogenic potential expressed by Corynebacterium pseudotuberculosis grown in animal serum.

    PubMed

    Raynal, José Tadeu; Bastos, Bruno Lopes; Vilas-Boas, Priscilla Carolinne Bagano; Sousa, Thiago de Jesus; Costa-Silva, Marcos; de Sá, Maria da Conceição Aquino; Portela, Ricardo Wagner; Moura-Costa, Lília Ferreira; Azevedo, Vasco; Meyer, Roberto

    2018-01-25

    Previous works defining antigens that might be used as vaccine targets against Corynebacterium pseudotuberculosis, which is the causative agent of sheep and goat caseous lymphadenitis, have focused on secreted proteins produced in a chemically defined culture medium. Considering that such antigens might not reflect the repertoire of proteins expressed during infection conditions, this experiment aimed to investigate the membrane-associated proteins with pathogenic potential expressed by C. pseudotuberculosis grown directly in animal serum. Its membrane-associated proteins have been extracted using an organic solvent enrichment methodology, followed by LC-MS/MS and bioinformatics analysis for protein identification and classification. The results revealed 22 membrane-associated proteins characterized as potentially pathogenic. An interaction network analysis indicated that the four potentially pathogenic proteins ciuA, fagA, OppA4 and OppCD were biologically connected within two distinct network pathways, which were both associated with the ABC Transporters KEGG pathway. These results suggest that C. pseudotuberculosis pathogenesis might be associated with the transport and uptake of nutrients; seven other identified potentially pathogenic membrane proteins also suggest that pathogenesis might involve events of bacterial resistance and adhesion. The proteins herein reported potentially reflect part of the protein repertoire expressed during real infection conditions and might be tested as vaccine antigens.

  10. Demonstration of Protein-Based Human Identification Using the Hair Shaft Proteome

    PubMed Central

    Leppert, Tami; Anex, Deon S.; Hilmer, Jonathan K.; Matsunami, Nori; Baird, Lisa; Stevens, Jeffery; Parsawar, Krishna; Durbin-Johnson, Blythe P.; Rocke, David M.; Nelson, Chad; Fairbanks, Daniel J.; Wilson, Andrew S.; Rice, Robert H.; Woodward, Scott R.; Bothner, Brian; Hart, Bradley R.; Leppert, Mark

    2016-01-01

    Human identification from biological material is largely dependent on the ability to characterize genetic polymorphisms in DNA. Unfortunately, DNA can degrade in the environment, sometimes below the level at which it can be amplified by PCR. Protein however is chemically more robust than DNA and can persist for longer periods. Protein also contains genetic variation in the form of single amino acid polymorphisms. These can be used to infer the status of non-synonymous single nucleotide polymorphism alleles. To demonstrate this, we used mass spectrometry-based shotgun proteomics to characterize hair shaft proteins in 66 European-American subjects. A total of 596 single nucleotide polymorphism alleles were correctly imputed in 32 loci from 22 genes of subjects’ DNA and directly validated using Sanger sequencing. Estimates of the probability of resulting individual non-synonymous single nucleotide polymorphism allelic profiles in the European population, using the product rule, resulted in a maximum power of discrimination of 1 in 12,500. Imputed non-synonymous single nucleotide polymorphism profiles from European–American subjects were considerably less frequent in the African population (maximum likelihood ratio = 11,000). The converse was true for hair shafts collected from an additional 10 subjects with African ancestry, where some profiles were more frequent in the African population. Genetically variant peptides were also identified in hair shaft datasets from six archaeological skeletal remains (up to 260 years old). This study demonstrates that quantifiable measures of identity discrimination and biogeographic background can be obtained from detecting genetically variant peptides in hair shaft protein, including hair from bioarchaeological contexts. PMID:27603779
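
    The product rule mentioned in this abstract multiplies per-locus genotype frequencies to estimate how common a whole profile is, under the assumption that the loci are statistically independent. A minimal sketch follows; the function names and the frequencies are hypothetical and are not data from the study.

```python
from math import prod


def profile_probability(genotype_frequencies):
    """Product rule: probability of a multi-locus profile, assuming the
    loci are independent. Each input is the population frequency of the
    observed genotype at one locus."""
    freqs = list(genotype_frequencies)
    for f in freqs:
        if not 0.0 < f <= 1.0:
            raise ValueError(f"frequency out of range (0, 1]: {f}")
    return prod(freqs)


def one_in(probability):
    """Express a profile probability as '1 in N'."""
    return 1.0 / probability


# Made-up genotype frequencies at three loci, for illustration only.
p = profile_probability([0.5, 0.25, 0.1])
n = one_in(p)
```

    Comparing the same profile's probability in two populations, as a ratio, gives a likelihood ratio of the kind the abstract reports for European versus African population frequencies.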

  11. Identification of Tyrosine Phosphorylated Proteins by SH2 Domain Affinity Purification and Mass Spectrometry.

    PubMed

    Buhs, Sophia; Gerull, Helwe; Nollau, Peter

    2017-01-01

    Phosphotyrosine signaling plays a major role in the control of many important biological functions such as cell proliferation and apoptosis. Deciphering phosphotyrosine-dependent signaling is therefore of great interest, paving the way for the understanding of physiological and pathological processes of signal transduction. On the basis of the specific binding of SH2 domains to phosphotyrosine residues, we here present an experimental workflow for affinity purification and subsequent identification of tyrosine phosphorylated proteins by mass spectrometry. In combination with SH2 profiling, a broadly applicable platform for the characterization of phosphotyrosine profiles in cell extracts, our pull-down strategy now enables researchers to identify proteins in signaling cascades that are differentially phosphorylated and selectively recognized by distinct SH2 domains.

  12. Identification of a novel protein for memory regulation in the hippocampus.

    PubMed

    Zhang, Xue-Han; Zhang, Hui; Tu, Yanyang; Gao, Xiang; Zhou, Changfu; Jin, Meilei; Zhao, Guoping; Jing, Naihe; Li, Bao-Ming; Yu, Lei

    2005-08-26

    Memory formation, maintenance, and retrieval are a dynamic process, reflecting a combined outcome of new memory formation on one hand, and older memory suppression/clearance on the other. Although much knowledge has been gained regarding new memory formation, less is known about the molecular components and processes that serve the function of memory suppression/clearance. Here, we report the identification of a novel protein, termed hippyragranin (HGN), that is expressed in the rat hippocampus and its expression is reduced by hippocampal denervation. Inhibition of HGN by antisense oligonucleotide in area CA1 results in enhanced performance in Morris water maze, as well as elevated long-term potentiation. These results suggest that HGN is involved in negative memory regulation.

  13. Quantitative identification of proteins that influence miRNA biogenesis by RNA pull-down-SILAC mass spectrometry (RP-SMS).

    PubMed

    Choudhury, Nila Roy; Michlewski, Gracjan

    2018-06-08

    RNA-binding proteins mediate and control gene expression. As some examples, they regulate pre-mRNA synthesis and processing; mRNA localisation, translation and decay; and microRNA (miRNA) biogenesis and function. Here, we present a detailed protocol for RNA pull-down coupled to stable isotope labelling by amino acids in cell culture (SILAC) mass spectrometry (RP-SMS) that enables quantitative, fast and specific detection of RNA-binding proteins that regulate miRNA biogenesis. In general, this method allows for the identification of RNA-protein complexes formed using in vitro or chemically synthesized RNAs and protein extracts derived from cultured cells. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  14. A Non-parametric Cutout Index for Robust Evaluation of Identified Proteins*

    PubMed Central

    Serang, Oliver; Paulo, Joao; Steen, Hanno; Steen, Judith A.

    2013-01-01

    This paper proposes a novel, automated method for evaluating sets of proteins identified using mass spectrometry. The remaining peptide-spectrum match score distributions of protein sets are compared to an empirical absent peptide-spectrum match score distribution, and a Bayesian non-parametric method reminiscent of the Dirichlet process is presented to accurately perform this comparison. Thus, for a given protein set, the process computes the likelihood that the proteins identified are correctly identified. First, the method is used to evaluate protein sets chosen using different protein-level false discovery rate (FDR) thresholds, assigning each protein set a likelihood. The protein set assigned the highest likelihood is used to choose a non-arbitrary protein-level FDR threshold. Because the method can be used to evaluate any protein identification strategy (and is not limited to mere comparisons of different FDR thresholds), we subsequently use the method to compare and evaluate multiple simple methods for merging peptide evidence over replicate experiments. The general statistical approach can be applied to other types of data (e.g. RNA sequencing) and generalizes to multivariate problems. PMID:23292186

  15. Investigation and identification of functional post-translational modification sites associated with drug binding and protein-protein interactions.

    PubMed

    Su, Min-Gang; Weng, Julia Tzu-Ya; Hsu, Justin Bo-Kai; Huang, Kai-Yao; Chi, Yu-Hsiang; Lee, Tzong-Yi

    2017-12-21

    tools for exploring the structural characteristics of PTMs, is presented. In addition, all tertiary structures of PTM sites on proteins can be visualized using the JSmol program. Resolving the function of PTM sites is important for understanding the role that proteins play in biological mechanisms. Our work attempted to delineate the structural correlation between PTM sites and PPI or drug-target binding. CruxPTM could help scientists narrow the scope of their PTM research and enhance the efficiency of PTM identification in the face of big proteome data. CruxPTM is now available at http://csb.cse.yzu.edu.tw/CruxPTM/.

  16. Nuclear Magnetic Resonance Spectroscopy-Based Identification of Yeast.

    PubMed

    Himmelreich, Uwe; Sorrell, Tania C; Daniel, Heide-Marie

    2017-01-01

    Rapid and robust high-throughput identification of environmental, industrial, or clinical yeast isolates is important whenever relatively large numbers of samples need to be processed in a cost-efficient way. Nuclear magnetic resonance (NMR) spectroscopy generates complex data based on metabolite profiles, chemical composition and possibly on medium consumption, which can not only be used for the assessment of metabolic pathways but also for accurate identification of yeast down to the subspecies level. Initial results on NMR-based yeast identification were comparable with conventional and DNA-based identification. Potential advantages of NMR spectroscopy in mycological laboratories include not only accurate identification but also the potential of automated sample delivery, automated analysis using computer-based methods, rapid turnaround time, high throughput, and low running costs. We describe here the sample preparation, data acquisition and analysis for NMR-based yeast identification. In addition, a roadmap for the development of classification strategies is given that will result in the acquisition of a database and analysis algorithms for yeast identification in different environments.

  17. Targeted Identification of SUMOylation Sites in Human Proteins Using Affinity Enrichment and Paralog-specific Reporter Ions*

    PubMed Central

    Lamoliatte, Frederic; Bonneil, Eric; Durette, Chantal; Caron-Lizotte, Olivier; Wildemann, Dirk; Zerweck, Johannes; Wenshuk, Holger; Thibault, Pierre

    2013-01-01

    Protein modification by small ubiquitin-like modifier (SUMO) modulates the activities of numerous proteins involved in different cellular functions such as gene transcription, cell cycle, and DNA repair. Comprehensive identification of SUMOylated sites is a prerequisite to determine how SUMOylation regulates protein function. However, mapping SUMOylated Lys residues by mass spectrometry (MS) is challenging because of the dynamic nature of this modification, the existence of three functionally distinct human SUMO paralogs, and the large SUMO chain remnant that remains attached to tryptic peptides. To overcome these problems, we created HEK293 cell lines that stably express functional SUMO paralogs with an N-terminal His6-tag and an Arg residue near the C terminus that leave a short five amino acid SUMO remnant upon tryptic digestion. We determined the fragmentation patterns of our short SUMO remnant peptides by collisional activation and electron transfer dissociation using synthetic peptide libraries. Activation using higher energy collisional dissociation on the LTQ-Orbitrap Elite identified SUMO paralog-specific fragment ions and neutral losses of the SUMO remnant with high mass accuracy (< 5 ppm). We exploited these features to detect SUMO modified tryptic peptides in complex cell extracts by correlating mass measurements of precursor and fragment ions using a data independent acquisition method. We also generated bioinformatics tools to retrieve MS/MS spectra containing characteristic fragment ions to the identification of SUMOylated peptide by conventional Mascot database searches. In HEK293 cell extracts, this MS approach uncovered low abundance SUMOylated peptides and 37 SUMO3-modified Lys residues in target proteins, most of which were previously unknown. Interestingly, we identified mixed SUMO-ubiquitin chains with ubiquitylated SUMO proteins (K20 and K32) and SUMOylated ubiquitin (K63), suggesting a complex crosstalk between these two modifications. 

  18. Combining Structural Modeling with Ensemble Machine Learning to Accurately Predict Protein Fold Stability and Binding Affinity Effects upon Mutation

    PubMed Central

    Garcia Lopez, Sebastian; Kim, Philip M.

    2014-01-01

    Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both, domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing the protein instability, and help us better understand the molecular causes of diseases. PMID:25243403

  19. Multi-level machine learning prediction of protein-protein interactions in Saccharomyces cerevisiae.

    PubMed

    Zubek, Julian; Tatjewski, Marcin; Boniecki, Adam; Mnich, Maciej; Basu, Subhadip; Plewczynski, Dariusz

    2015-01-01

    Accurate identification of protein-protein interactions (PPI) is the key step in understanding proteins' biological functions, which are typically context-dependent. Many existing PPI predictors rely on aggregated features from protein sequences, however only a few methods exploit local information about specific residue contacts. In this work we present a two-stage machine learning approach for prediction of protein-protein interactions. We start with the carefully filtered data on protein complexes available for Saccharomyces cerevisiae in the Protein Data Bank (PDB) database. First, we build linear descriptions of interacting and non-interacting sequence segment pairs based on their inter-residue distances. Secondly, we train machine learning classifiers to predict binary segment interactions for any two short sequence fragments. The final prediction of the protein-protein interaction is done using the 2D matrix representation of all-against-all possible interacting sequence segments of both analysed proteins. The level-I predictor achieves 0.88 AUC for micro-scale, i.e., residue-level prediction. The level-II predictor improves the results further by a more complex learning paradigm. We perform 30-fold macro-scale, i.e., protein-level cross-validation experiment. The level-II predictor using PSIPRED-predicted secondary structure reaches 0.70 precision, 0.68 recall, and 0.70 AUC, whereas other popular methods provide results below 0.6 threshold (recall, precision, AUC). Our results demonstrate that multi-scale sequence features aggregation procedure is able to improve the machine learning results by more than 10% as compared to other sequence representations. Prepared datasets and source code for our experimental pipeline are freely available for download from: http://zubekj.github.io/mlppi/ (open source Python implementation, OS independent).

  20. Semi-supervised protein subcellular localization.

    PubMed

    Xu, Qian; Hu, Derek Hao; Xue, Hong; Yu, Weichuan; Yang, Qiang

    2009-01-30

    Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational methods. The location information can indicate key functionalities of proteins. Accurate predictions of subcellular localizations of proteins can aid the prediction of protein function and genome annotation, as well as the identification of drug targets. Computational methods based on machine learning, such as support vector machine approaches, have already been widely used in the prediction of protein subcellular localization. However, a major drawback of these machine learning-based approaches is that a large amount of data should be labeled in order to let the prediction system learn a classifier of good generalization ability. In real-world cases, however, it is laborious, expensive and time-consuming to experimentally determine the subcellular localization of a protein and prepare instances of labeled data. In this paper, we present an approach based on a new learning framework, semi-supervised learning, which can use far fewer labeled instances to construct a high quality prediction model. We construct an initial classifier using a small set of labeled examples first, and then use unlabeled instances to refine the classifier for future predictions. Experimental results show that our methods can effectively reduce the workload for labeling data using the unlabeled data. Our method is shown to enhance the state-of-the-art prediction results of SVM classifiers by more than 10%.

  1. Improved Recovery and Identification of Membrane Proteins from Rat Hepatic Cells using a Centrifugal Proteomic Reactor*

    PubMed Central

    Zhou, Hu; Wang, Fangjun; Wang, Yuwei; Ning, Zhibin; Hou, Weimin; Wright, Theodore G.; Sundaram, Meenakshi; Zhong, Shumei; Yao, Zemin; Figeys, Daniel

    2011-01-01

    Despite their importance in many biological processes, membrane proteins are underrepresented in proteomic analysis because of their poor solubility (hydrophobicity) and often low abundance. We describe a novel approach for the identification of plasma membrane proteins and intracellular microsomal proteins that combines membrane fractionation, a centrifugal proteomic reactor for streamlined protein extraction, protein digestion and fractionation by centrifugation, and high performance liquid chromatography-electrospray ionization-tandem MS. The performance of this approach was illustrated for the study of the proteome of ER and Golgi microsomal membranes in rat hepatic cells. The centrifugal proteomic reactor identified 945 plasma membrane proteins and 955 microsomal membrane proteins, of which 63 and 47% were predicted as bona fide membrane proteins, respectively. Among these proteins, >800 proteins were undetectable by the conventional in-gel digestion approach. The majority of the membrane proteins only identified by the centrifugal proteomic reactor were proteins with ≥2 transmembrane segments or proteins with high molecular mass (e.g. >150 kDa) and hydrophobicity. The improved proteomic reactor allowed the detection of a group of endocytic and/or signaling receptor proteins on the plasma membrane, as well as apolipoproteins and glycerolipid synthesis enzymes that play a role in the assembly and secretion of apolipoprotein B100-containing very low density lipoproteins. Thus, the centrifugal proteomic reactor offers a new analytical tool for structure and function studies of membrane proteins involved in lipid and lipoprotein metabolism. PMID:21749988

  2. Separation and identification of Musa acuminate Colla (banana) leaf proteins by two-dimensional gel electrophoresis and mass spectrometry.

    PubMed

    Lu, Y; Qi, Y X; Zhang, H; Zhang, H Q; Pu, J J; Xie, Y X

    2013-12-19

    To establish a proteomic reference map of Musa acuminate Colla (banana) leaf, we separated and identified leaf proteins using two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) and mass spectrometry (MS). Tryptic digests of 44 spots were subjected to peptide mass fingerprinting (PMF) by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) MS. Three spots that were not identified by MALDI-TOF MS analysis were identified by searching against the NCBInr, SwissProt, and expressed sequence tag (EST) databases. We identified 41 unique proteins. The majority of the identified leaf proteins were found to be involved in energy metabolism. The results indicate that 2D-PAGE is a sensitive and powerful technique for the separation and identification of Musa leaf proteins. A summary of the identified proteins and their putative functions is discussed.

  3. Purification, identification and preliminary crystallographic studies of Pru du amandin, an allergenic protein from Prunus dulcis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gaur, Vineet; Sethi, Dhruv K.; Salunke, Dinakar M., E-mail: dinakar@nii.res.in

    The purification, identification, crystallization and preliminary crystallographic studies of an allergy-related protein, Pru du amandin, from P. dulcis nuts are reported. Food allergies appear to be one of the foremost causes of hypersensitivity reactions. Nut allergies account for most food allergies and are often permanent. The 360 kDa hexameric protein Pru du amandin, a known allergen, was purified from almonds (Prunus dulcis) by ammonium sulfate fractionation and ion-exchange chromatography. The protein was identified by a BLAST homology search against the nonredundant sequence database. Pru du amandin belongs to the 11S legumin family of seed storage proteins characterized by the presence of a cupin motif. Crystals were obtained by the hanging-drop vapour-diffusion method. The crystals belong to space group P4₁ (or P4₃), with unit-cell parameters a = b = 150.7, c = 164.9 Å.

  4. Experimental strategies for the identification and characterization of adhesive proteins in animals: a review

    PubMed Central

    Hennebert, Elise; Maldonado, Barbara; Ladurner, Peter; Flammang, Patrick; Santos, Romana

    2015-01-01

    Adhesive secretions occur in both aquatic and terrestrial animals, in which they perform diverse functions. Biological adhesives can therefore be remarkably complex and involve a large range of components with different functions and interactions. However, being mainly protein based, biological adhesives can be characterized by classical molecular methods. This review compiles experimental strategies that were successfully used to identify, characterize and obtain the full-length sequence of adhesive proteins from nine biological models: echinoderms, barnacles, tubeworms, mussels, sticklebacks, slugs, velvet worms, spiders and ticks. A brief description and practical examples are given for a variety of tools used to study adhesive molecules at different levels from genes to secreted proteins. In most studies, proteins, extracted from secreted materials or from adhesive organs, are analysed for the presence of post-translational modifications and submitted to peptide sequencing. The peptide sequences are then used directly for a BLAST search in genomic or transcriptomic databases, or to design degenerate primers to perform RT-PCR, both allowing the recovery of the sequence of the cDNA coding for the investigated protein. These sequences can then be used for functional validation and recombinant production. In recent years, the dual proteomic and transcriptomic approach has emerged as the best way leading to the identification of novel adhesive proteins and retrieval of their complete sequences. PMID:25657842

  5. Accurate and exact CNV identification from targeted high-throughput sequence data.

    PubMed

    Nord, Alex S; Lee, Ming; King, Mary-Claire; Walsh, Tom

    2011-04-12

    Massively parallel sequencing of barcoded DNA samples significantly increases screening efficiency for clinically important genes. Short read aligners are well suited to single nucleotide and indel detection. However, methods for CNV detection from targeted enrichment are lacking. We present a method combining coverage with map information for the identification of deletions and duplications in targeted sequence data. Sequencing data is first scanned for gains and losses using a comparison of normalized coverage data between samples. CNV calls are confirmed by testing for a signature of sequences that span the CNV breakpoint. With our method, CNVs can be identified regardless of whether breakpoints are within regions targeted for sequencing. For CNVs where at least one breakpoint is within targeted sequence, exact CNV breakpoints can be identified. In a test data set of 96 subjects sequenced across ~1 Mb genomic sequence using multiplexing technology, our method detected mutations as small as 31 bp, predicted quantitative copy count, and had a low false-positive rate. Application of this method allows for identification of gains and losses in targeted sequence data, providing comprehensive mutation screening when combined with a short read aligner.
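
    The first step described in this abstract, scanning for gains and losses by comparing normalized coverage between samples, can be sketched as follows. This is an illustration under stated assumptions and not the authors' implementation: each sample's per-region coverage is normalized by that sample's total, the per-region reference is the median across samples, and the ratio cut-offs (0.75 and 1.25) are hypothetical. The breakpoint-confirmation step is not shown.

```python
from statistics import median


def call_cnvs(coverage, loss_below=0.75, gain_above=1.25):
    """coverage maps sample name -> list of read depths, one per target region.
    Returns sample -> list of (region_index, 'loss' | 'gain', ratio)."""
    samples = list(coverage)
    n_regions = len(coverage[samples[0]])
    # Normalize by total depth so samples sequenced more deeply are comparable.
    norm = {s: [d / sum(coverage[s]) for d in coverage[s]] for s in samples}
    calls = {s: [] for s in samples}
    for r in range(n_regions):
        reference = median(norm[s][r] for s in samples)
        if reference == 0:
            continue  # region not covered in most samples; nothing to compare
        for s in samples:
            ratio = norm[s][r] / reference
            if ratio < loss_below:
                calls[s].append((r, "loss", ratio))
            elif ratio > gain_above:
                calls[s].append((r, "gain", ratio))
    return calls


# Made-up depths for four samples over four regions, for illustration only.
coverage = {
    "s1": [100, 100, 100, 100],
    "s2": [200, 200, 200, 200],  # same profile at twice the depth
    "s3": [100, 50, 100, 100],   # half the depth in region 1
    "s4": [100, 100, 100, 100],
}
calls = call_cnvs(coverage)
```

    A median reference is used here so that a single aberrant sample does not shift the baseline; with many samples carrying the same CNV, a different reference would be needed.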

  6. Electro-Optic Identification (EOID) Research Program

    DTIC Science & Technology

    2002-09-30

    The goal of this research is to provide computer-assisted identification of underwater mines in electro-optic imagery. Identification algorithms will...greatly reduce the time and risk to reacquire mine-like-objects for positive classification and identification. The objectives are to collect electro ... optic data under a wide range of operating and environmental conditions and develop precise algorithms that can provide accurate target recognition on this data for all possible conditions.

  7. Computational prediction of human salivary proteins from blood circulation and application to diagnostic biomarker identification.

    PubMed

    Wang, Jiaxin; Liang, Yanchun; Wang, Yan; Cui, Juan; Liu, Ming; Du, Wei; Xu, Ying

    2013-01-01

    Proteins can move from blood circulation into salivary glands through active transportation, passive diffusion or ultrafiltration, some of which are then released into saliva and hence can potentially serve as biomarkers for diseases if accurately identified. We present a novel computational method for predicting salivary proteins that come from circulation. The basis for the prediction is a set of physiochemical and sequence features we found to be discerning between human proteins known to be movable from circulation to saliva and proteins deemed to be not in saliva. A classifier was trained based on these features using a support-vector machine to predict protein secretion into saliva. The classifier achieved 88.56% average recall and 90.76% average precision in 10-fold cross-validation on the training data, indicating that the selected features are informative. Considering the possibility that our negative training data may not be highly reliable (i.e., proteins predicted to be not in saliva), we have also trained a ranking method, aiming to rank the known salivary proteins from circulation as the highest among the proteins in the general background, based on the same features. This prediction capability can be used to predict potential biomarker proteins for specific human diseases when coupled with the information of differentially expressed proteins in diseased versus healthy control tissues and a prediction capability for blood-secretory proteins. Using such integrated information, we predicted 31 candidate biomarker proteins in saliva for breast cancer.

  8. Computational Prediction of Human Salivary Proteins from Blood Circulation and Application to Diagnostic Biomarker Identification

    PubMed Central

    Wang, Jiaxin; Liang, Yanchun; Wang, Yan; Cui, Juan; Liu, Ming; Du, Wei; Xu, Ying

    2013-01-01

    Proteins can move from blood circulation into salivary glands through active transportation, passive diffusion or ultrafiltration, some of which are then released into saliva and hence can potentially serve as biomarkers for diseases if accurately identified. We present a novel computational method for predicting salivary proteins that come from circulation. The basis for the prediction is a set of physiochemical and sequence features we found to be discerning between human proteins known to be movable from circulation to saliva and proteins deemed to be not in saliva. A classifier was trained based on these features using a support-vector machine to predict protein secretion into saliva. The classifier achieved 88.56% average recall and 90.76% average precision in 10-fold cross-validation on the training data, indicating that the selected features are informative. Considering the possibility that our negative training data may not be highly reliable (i.e., proteins predicted to be not in saliva), we have also trained a ranking method, aiming to rank the known salivary proteins from circulation as the highest among the proteins in the general background, based on the same features. This prediction capability can be used to predict potential biomarker proteins for specific human diseases when coupled with the information of differentially expressed proteins in diseased versus healthy control tissues and a prediction capability for blood-secretory proteins. Using such integrated information, we predicted 31 candidate biomarker proteins in saliva for breast cancer. PMID:24324552

  9. Combinatorial Approach for Large-scale Identification of Linked Peptides from Tandem Mass Spectrometry Spectra*

    PubMed Central

    Wang, Jian; Anania, Veronica G.; Knott, Jeff; Rush, John; Lill, Jennie R.; Bourne, Philip E.; Bandeira, Nuno

    2014-01-01

    The combination of chemical cross-linking and mass spectrometry has recently been shown to constitute a powerful tool for studying protein–protein interactions and elucidating the structure of large protein complexes. However, computational methods for interpreting the complex MS/MS spectra from linked peptides are still in their infancy, making the high-throughput application of this approach largely impractical. Because of the lack of large annotated datasets, most current approaches do not capture the specific fragmentation patterns of linked peptides and therefore are not optimal for the identification of cross-linked peptides. Here we propose a generic approach to address this problem and demonstrate it using disulfide-bridged peptide libraries to (i) efficiently generate large mass spectral reference data for linked peptides at a low cost and (ii) automatically train an algorithm that can efficiently and accurately identify linked peptides from MS/MS spectra. We show that using this approach we were able to identify thousands of MS/MS spectra from disulfide-bridged peptides through comparison with proteome-scale sequence databases and significantly improve the sensitivity of cross-linked peptide identification. This allowed us to identify 60% more direct pairwise interactions between the protein subunits in the 20S proteasome complex than existing tools on cross-linking studies of the proteasome complexes. The basic framework of this approach and the MS/MS reference dataset generated should be valuable resources for the future development of new tools for the identification of linked peptides. PMID:24493012

  10. MOWGLI: prediction of protein-MannOse interacting residues With ensemble classifiers usinG evoLutionary Information.

    PubMed

    Pai, Priyadarshini P; Mondal, Sukanta

    2016-10-01

    Proteins interact with carbohydrates to perform various cellular interactions. Of the many carbohydrate ligands that proteins bind with, mannose constitutes an important class, playing important roles in host defense mechanisms. Accurate identification of mannose-interacting residues (MIR) may provide important clues to decipher the underlying mechanisms of protein-mannose interactions during infections. This study proposes an approach using an ensemble of base classifiers for prediction of MIR using their evolutionary information in the form of position-specific scoring matrix. The base classifiers are random forests trained by different subsets of training data set Dset128 using 10-fold cross-validation. The optimized ensemble of base classifiers, MOWGLI, is then used to predict MIR on protein chains of the test data set Dtestset29, where it showed a promising performance with 92.0% accurate prediction. An overall improvement of 26.6% in precision was observed upon comparison with the state of the art. It is hoped that this approach, yielding enhanced predictions, could be eventually used for applications in drug design and vaccine development.

  11. Bacillus anthracis secretome time course under host-simulated conditions and identification of immunogenic proteins.

    PubMed

    Walz, Alexander; Mujer, Cesar V; Connolly, Joseph P; Alefantis, Tim; Chafin, Ryan; Dake, Clarissa; Whittington, Jessica; Kumar, Srikanta P; Khan, Akbar S; DelVecchio, Vito G

    2007-07-27

    progression of pathogenicity, identification of therapeutics and diagnostic markers, and vaccine development. This study also adds to the continuously growing list of identified Bacillus anthracis secretome proteins.

  12. Bacillus anthracis secretome time course under host-simulated conditions and identification of immunogenic proteins

    PubMed Central

    Walz, Alexander; Mujer, Cesar V; Connolly, Joseph P; Alefantis, Tim; Chafin, Ryan; Dake, Clarissa; Whittington, Jessica; Kumar, Srikanta P; Khan, Akbar S; DelVecchio, Vito G

    2007-01-01

    relevant in elucidation of the progression of pathogenicity, identification of therapeutics and diagnostic markers, and vaccine development. This study also adds to the continuously growing list of identified Bacillus anthracis secretome proteins. PMID:17662140

  13. A coevolution analysis for identifying protein-protein interactions by Fourier transform.

    PubMed

    Yin, Changchuan; Yau, Stephen S-T

    2017-01-01

    Protein-protein interactions (PPIs) play key roles in life processes such as signal transduction, transcription regulation, and immune response. Identification of PPIs enables better understanding of the functional networks within a cell. Common experimental methods for identifying PPIs are time-consuming and expensive. However, recent developments in computational approaches for inferring PPIs from protein sequences based on coevolution theory avoid these problems. In the coevolution theory model, interacting proteins may show coevolutionary mutations and have similar phylogenetic trees. The existing coevolution methods depend on multiple sequence alignments (MSA); however, the MSA-based coevolution methods often produce many false positive interactions. In this paper, we present a computational method using an alignment-free approach to accurately detect PPIs and reduce false positives. In the method, protein sequences are numerically represented by biochemical properties of amino acids, which reflect the structural and functional differences of proteins. Fourier transform is applied to the numerical representation of protein sequences to capture the dissimilarities of protein sequences in biophysical context. The method is assessed for predicting PPIs in Ebola virus. The results indicate strong coevolution between the protein pairs (NP-VP24, NP-VP30, NP-VP40, VP24-VP30, VP24-VP40, and VP30-VP40). The method is also validated for PPIs in influenza and E. coli genomes. Since our method can reduce false positives and increase the specificity of PPI prediction, it offers an effective tool to understand mechanisms of disease pathogens and find potential targets for drug design. The Python programs in this study are available to the public at URL (https://github.com/cyinbox/PPI).

  14. A coevolution analysis for identifying protein-protein interactions by Fourier transform

    PubMed Central

    Yin, Changchuan; Yau, Stephen S. -T.

    2017-01-01

    Protein-protein interactions (PPIs) play key roles in life processes such as signal transduction, transcription regulation, and immune response. Identification of PPIs enables better understanding of the functional networks within a cell. Common experimental methods for identifying PPIs are time-consuming and expensive. However, recent developments in computational approaches for inferring PPIs from protein sequences based on coevolution theory avoid these problems. In the coevolution theory model, interacting proteins may show coevolutionary mutations and have similar phylogenetic trees. The existing coevolution methods depend on multiple sequence alignments (MSA); however, the MSA-based coevolution methods often produce many false positive interactions. In this paper, we present a computational method using an alignment-free approach to accurately detect PPIs and reduce false positives. In the method, protein sequences are numerically represented by biochemical properties of amino acids, which reflect the structural and functional differences of proteins. Fourier transform is applied to the numerical representation of protein sequences to capture the dissimilarities of protein sequences in biophysical context. The method is assessed for predicting PPIs in Ebola virus. The results indicate strong coevolution between the protein pairs (NP-VP24, NP-VP30, NP-VP40, VP24-VP30, VP24-VP40, and VP30-VP40). The method is also validated for PPIs in influenza and E. coli genomes. Since our method can reduce false positives and increase the specificity of PPI prediction, it offers an effective tool to understand mechanisms of disease pathogens and find potential targets for drug design. The Python programs in this study are available to the public at URL (https://github.com/cyinbox/PPI). PMID:28430779
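
    The alignment-free idea described in this abstract (encode each sequence numerically, apply a Fourier transform, compare spectra) can be illustrated with a minimal sketch. It is not the authors' implementation. The Kyte-Doolittle hydropathy scale, the fixed frequency grid used to compare sequences of different lengths, and the names encode, power_spectrum and spectral_distance are all assumptions chosen for illustration.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values. Using this particular scale is an
# assumption; the abstract only says "biochemical properties of amino acids".
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode(seq):
    """Map a protein sequence to a list of hydropathy values.

    Raises KeyError for letters outside the 20 standard amino acids.
    """
    return [HYDROPATHY[aa] for aa in seq.upper()]


def power_spectrum(signal, n_freq=16):
    """Length-normalised power at n_freq frequencies in (0, pi].

    Sampling the transform on a fixed grid (a simplification assumed here)
    gives spectra of equal size for sequences of different lengths.
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]  # remove the zero-frequency offset
    spectrum = []
    for k in range(1, n_freq + 1):
        omega = math.pi * k / n_freq
        coeff = sum(x * cmath.exp(-1j * omega * t) for t, x in enumerate(centred))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def spectral_distance(seq_a, seq_b, n_freq=16):
    """Euclidean distance between the power spectra of two sequences."""
    pa = power_spectrum(encode(seq_a), n_freq)
    pb = power_spectrum(encode(seq_b), n_freq)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pa, pb)))
```

    In a coevolution setting, distances of this kind would be computed between orthologs of each protein across species, and the resulting distance profiles of two proteins compared; that step is omitted here.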

  15. Incorporation of unnatural sugars for the identification of glycoproteins.

    PubMed

    Zaro, Balyn W; Hang, Howard C; Pratt, Matthew R

    2013-01-01

    Glycosylation is an abundant post-translational modification that alters the fate and function of its substrate proteins. To aid in understanding the significance of protein glycosylation, identification of target proteins is key. As with all proteomics experiments, mass spectrometry has been established as the desired method for substrate identification. However, these approaches require selective enrichment and purification of modified proteins. Chemical reporters in combination with bioorthogonal reactions have emerged as robust tools for identifying post-translational modifications including glycosylation. We provide here a method for the use of bioorthogonal chemical reporters for isolation and identification of glycosylated proteins. More specifically, this protocol is a representative procedure from our own work using an alkyne-bearing O-GlcNAc chemical reporter (GlcNAlk) and a chemically cleavable azido-azo-biotin probe for the identification of O-GlcNAc-modified proteins.

  16. Transmembrane proteins in the Protein Data Bank: identification and classification.

    PubMed

    Tusnády, Gábor E; Dosztányi, Zsuzsanna; Simon, István

    2004-11-22

    Integral membrane proteins play important roles in living cells. Although these proteins are estimated to constitute 25% of proteins at a genomic scale, the Protein Data Bank (PDB) contains only a few hundred membrane proteins due to the difficulties with experimental techniques. The presence of transmembrane proteins in the structure data bank, however, is quite invisible, as the annotation of these entries is rather poor. Even if a protein is identified as a transmembrane one, the possible location of the lipid bilayer is not indicated in the PDB because these proteins are crystallized without their natural lipid bilayer, and currently no method is publicly available to detect the possible membrane plane using the atomic coordinates of membrane proteins. Here, we present a new geometrical approach to distinguish between transmembrane and globular proteins using structural information only and to locate the most likely position of the lipid bilayer. An automated algorithm (TMDET) is given to determine the membrane planes relative to the position of atomic coordinates, together with a discrimination function which is able to separate transmembrane and globular proteins even in cases of low resolution or incomplete structures such as fragments or parts of large multi chain complexes. This method can be used for the proper annotation of protein structures containing transmembrane segments and paves the way to an up-to-date database containing the structure of all known transmembrane proteins and fragments (PDB_TM) which can be automatically updated. The algorithm is equally important for the purpose of constructing databases purely of globular proteins.

  17. Identification of Putative ORF5 Protein of Porcine Circovirus Type 2 and Functional Analysis of GFP-Fused ORF5 Protein

    PubMed Central

    Xu, Han; Wang, Tao; Zhang, Yanming

    2015-01-01

    Porcine circovirus type 2 (PCV2) is the essential infectious agent responsible for causing porcine circovirus-associated diseases in pigs. To date, eleven RNAs and five viral proteins of PCV2 have been detected. Here, we identified a novel viral gene within the PCV2 genome, termed ORF5, that exists at both the transcriptional and translational level during productive infection of PCV2 in porcine alveolar macrophages 3D4/2 (PAMs). Northern blot analysis was used to demonstrate that the ORF5 gene measures 180 bp in length and overlaps completely with ORF1 when read in the same direction. Site-directed mutagenesis was used to show that the ORF5 protein is not essential for PCV2 replication. To investigate the biological functions of the novel protein, we constructed a recombinant eukaryotic expression plasmid capable of expressing PCV2 ORF5. The results show that the GFP-tagged PCV2 ORF5 protein localizes to the endoplasmic reticulum (ER), is degraded via the proteasome, inhibits PAM growth and prolongs the S-phase of the cell cycle. Further studies show that the GFP-tagged PCV2 ORF5 protein induces ER stress and activates NF-κB, which was further confirmed by a significant upregulation in IL-6, IL-8 and COX-2 expression. In addition, five cellular proteins (GPNMB, CYP1A1, YWHAB, ZNF511 and SRSF3) were found to interact with ORF5 via yeast two-hybrid assay. These findings provide novel information on the identification and functional analysis of the PCV2 ORF5 protein and are likely to be of benefit in elucidating the molecular mechanisms of PCV2 pathogenicity. However, additional experiments are needed to validate the expression and function of the ORF5 protein during PCV2 infection in vitro before any definitive conclusion can be drawn. PMID:26035722

  18. Accurate high-throughput structure mapping and prediction with transition metal ion FRET

    PubMed Central

    Yu, Xiaozhen; Wu, Xiongwu; Bermejo, Guillermo A.; Brooks, Bernard R.; Taraska, Justin W.

    2013-01-01

    Mapping the landscape of a protein’s conformational space is essential to understanding its functions and regulation. The limitations of many structural methods have made this process challenging for most proteins. Here, we report that transition metal ion FRET (tmFRET) can be used in a rapid, highly parallel screen, to determine distances from multiple locations within a protein at extremely low concentrations. The distances generated through this screen for the protein Maltose Binding Protein (MBP) match distances from the crystal structure to within a few angstroms. Furthermore, energy transfer accurately detects structural changes during ligand binding. Finally, fluorescence-derived distances can be used to guide molecular simulations to find low energy states. Our results open the door to rapid, accurate mapping and prediction of protein structures at low concentrations, in large complex systems, and in living cells. PMID:23273426
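
    FRET-derived distances of the kind reported here are conventionally obtained from the Förster relation E = 1 / (1 + (R/R0)^6). The sketch below shows that conversion only. The function names and the R0 value in the usage note are hypothetical, and the abstract does not state which corrections the authors applied.

```python
def fret_efficiency(distance, r0):
    """Energy-transfer efficiency for a donor-acceptor distance (same units as r0)."""
    return 1.0 / (1.0 + (distance / r0) ** 6)


def fret_distance(efficiency, r0):
    """Invert the Förster relation: distance from a measured efficiency in (0, 1)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)
```

    For example, with a hypothetical R0 of 12 angstroms, an efficiency of 0.5 corresponds to a distance of 12 angstroms, because R0 is by definition the distance at half-maximal transfer.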

  19. Discerning cultural identification from a thinly sliced behavioral sample.

    PubMed

    Hamamura, Takeshi; Li, Liman Man Wai

    2012-12-01

    This research examined whether individual differences in cultural identification can be discerned at zero acquaintance. This issue was examined in Hong Kong, where the idiosyncrasy of cultural identification is a salient social-psychological issue. The participants were able to perceive accurately the targets' identification with Western culture from a video clip and from a still image. Findings also indicated that a stereotype of Western cultural identity (i.e., extraversion and particular hairstyle) facilitated these perceptions. Specifically, (a) the participants with a stronger stereotype were more accurate in perceiving Western cultural identification, (b) the targets who were experimentally manipulated to appear extraverted were rated as more strongly identifying with Western culture, and (c) the participants relatively unfamiliar with these stereotypes did not correctly perceive Western cultural identification. Implications of these findings on research on multiculturalism are also discussed.

  20. [Electrophoretic patterns of cell wall protein as a criterion for the identification and classification of Corynebacteria].

    PubMed

    Mykhal's'kyĭ, L O; Furtat, I M; Dem'ianenko, F P; Kostiuchyk, A A

    2001-01-01

    Electrophoretic patterns of cell wall proteins from three industrial strains used for lysine production and eight collection strains of the genus Corynebacterium were studied to analyze their similarity and to assess whether this parameter could serve as an additional criterion for the identification and classification of corynebacteria. Similarity coefficients for the overall and main cell wall protein electrophoretic patterns were determined with a purpose-built computer program. Electrophoretic analysis showed that every species had an individual protein profile. Among the major and minor components of the overall patterns, biopolymers common to the species, to the genus, and to individual strains were identified. The results showed that the patterns of main proteins were more conserved and more informative than those of overall proteins. Similarity coefficients based on the main protein patterns correlated with the protein profile characteristics of every analyzed strain and sorted the strains into separate groups. Because the similarity coefficient based on main protein patterns separates one species or strain from another, this parameter could be used as an additional criterion for differentiating corynebacteria and assigning them to a particular taxonomic group.

  1. Comparison of identification methods for oral asaccharolytic Eubacterium species.

    PubMed

    Wade, W G; Slayne, M A; Aldred, M J

    1990-12-01

    Thirty one strains of oral, asaccharolytic Eubacterium spp. and the type strains of E. brachy, E. nodatum and E. timidum were subjected to three identification techniques--protein-profile analysis, determination of metabolic end-products, and the API ATB32A identification kit. Five clusters were obtained from numerical analysis of protein profiles and excellent correlations were seen with the other two methods. Protein profiles alone allowed unequivocal identification.

  2. A Multifaceted Study of Scedosporium boydii Cell Wall Changes during Germination and Identification of GPI-Anchored Proteins

    PubMed Central

    Ghamrawi, Sarah; Gastebois, Amandine; Zykwinska, Agata; Vandeputte, Patrick; Marot, Agnès; Mabilleau, Guillaume; Cuenot, Stéphane; Bouchara, Jean-Philippe

    2015-01-01

    Scedosporium boydii is a pathogenic filamentous fungus that causes a wide range of human infections, notably respiratory infections in patients with cystic fibrosis. The development of new therapeutic strategies targeting S. boydii necessitates a better understanding of the physiology of this fungus and the identification of new molecular targets. In this work, we studied the conidium-to-germ tube transition using a variety of techniques including scanning and transmission electron microscopy, atomic force microscopy, two-phase partitioning, microelectrophoresis and cationized ferritin labeling, chemical force spectroscopy, lectin labeling, and nanoLC-MS/MS for cell wall GPI-anchored protein analysis. We demonstrated that the cell wall undergoes structural changes with germination accompanied with a lower hydrophobicity, electrostatic charge and binding capacity to cationized ferritin. Changes during germination also included a higher accessibility of some cell wall polysaccharides to lectins and less CH3/CH3 interactions (hydrophobic adhesion forces mainly due to glycoproteins). We also extracted and identified 20 GPI-anchored proteins from the cell wall of S. boydii, among which one was detected only in the conidial wall extract and 12 only in the mycelial wall extract. The identified sequences belonged to protein families involved in virulence in other fungi like Gelp/Gasp, Crhp, Bglp/Bgtp families and a superoxide dismutase. These results highlighted the cell wall remodeling during germination in S. boydii with the identification of a substantial number of cell wall GPI-anchored conidial or hyphal specific proteins, which provides a basis to investigate the role of these molecules in the host-pathogen interaction and fungal virulence. PMID:26038837

  3. A Multifaceted Study of Scedosporium boydii Cell Wall Changes during Germination and Identification of GPI-Anchored Proteins.

    PubMed

    Ghamrawi, Sarah; Gastebois, Amandine; Zykwinska, Agata; Vandeputte, Patrick; Marot, Agnès; Mabilleau, Guillaume; Cuenot, Stéphane; Bouchara, Jean-Philippe

    2015-01-01

    Scedosporium boydii is a pathogenic filamentous fungus that causes a wide range of human infections, notably respiratory infections in patients with cystic fibrosis. The development of new therapeutic strategies targeting S. boydii necessitates a better understanding of the physiology of this fungus and the identification of new molecular targets. In this work, we studied the conidium-to-germ tube transition using a variety of techniques including scanning and transmission electron microscopy, atomic force microscopy, two-phase partitioning, microelectrophoresis and cationized ferritin labeling, chemical force spectroscopy, lectin labeling, and nanoLC-MS/MS for cell wall GPI-anchored protein analysis. We demonstrated that the cell wall undergoes structural changes with germination accompanied with a lower hydrophobicity, electrostatic charge and binding capacity to cationized ferritin. Changes during germination also included a higher accessibility of some cell wall polysaccharides to lectins and less CH3/CH3 interactions (hydrophobic adhesion forces mainly due to glycoproteins). We also extracted and identified 20 GPI-anchored proteins from the cell wall of S. boydii, among which one was detected only in the conidial wall extract and 12 only in the mycelial wall extract. The identified sequences belonged to protein families involved in virulence in other fungi like Gelp/Gasp, Crhp, Bglp/Bgtp families and a superoxide dismutase. These results highlighted the cell wall remodeling during germination in S. boydii with the identification of a substantial number of cell wall GPI-anchored conidial or hyphal specific proteins, which provides a basis to investigate the role of these molecules in the host-pathogen interaction and fungal virulence.

  4. Identification of Immunoreactive Leishmania infantum Protein Antigens to Asymptomatic Dog Sera through Combined Immunoproteomics and Bioinformatics Analysis

    PubMed Central

    Samiotaki, Martina; Panayotou, George; Karagouni, Evdokia

    2016-01-01

    Leishmania infantum is the etiologic agent of zoonotic visceral leishmaniasis (VL) in countries in the Mediterranean basin, where dogs are the domestic reservoirs and represent important elements in the transmission of the disease. Since the major focal areas of human VL exhibit a high prevalence of seropositive dogs, the control of canine VL could reduce the infection rate in humans. Efforts toward this have focused on the improvement of diagnostic tools, as well as on vaccine development. The identification of parasite antigens including suitable major histocompatibility complex (MHC) class I- and/or II-restricted epitopes is very important since disease protection is characterized by strong and long-lasting CD8+ T and CD4+ Th1 cell-dominated immunity. In the present study, total protein extract from late-log phase L. infantum promastigotes was analyzed by two-dimensional western blots and probed with sera from asymptomatic and symptomatic dogs. A total of 42 protein spots were found to differentially react with IgG from asymptomatic dogs, while 17 of these identified by Coomassie stain were extracted and analyzed. Of these, 21 proteins were identified by mass spectrometry; they were mainly involved in metabolism and stress responses. An in silico analysis predicted that the chaperonin HSP60, dihydrolipoamide dehydrogenase, enolase, cyclophilin 2, cyclophilin 40, and one hypothetical protein contain promiscuous MHCI and/or MHCII epitopes. Our results suggest that the combination of immunoproteomics and bioinformatics analyses is a promising method for the identification of novel candidate antigens for vaccine development or with potential use in the development of sensitive diagnostic tests. PMID:26906226

  5. ChIP-seq Accurately Predicts Tissue-Specific Activity of Enhancers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Visel, Axel; Blow, Matthew J.; Li, Zirong

    2009-02-01

    A major yet unresolved quest in decoding the human genome is the identification of the regulatory sequences that control the spatial and temporal expression of genes. Distant-acting transcriptional enhancers are particularly challenging to uncover since they are scattered amongst the vast non-coding portion of the genome. Evolutionary sequence constraint can facilitate the discovery of enhancers, but fails to predict when and where they are active in vivo. Here, we performed chromatin immunoprecipitation with the enhancer-associated protein p300, followed by massively-parallel sequencing, to map several thousand in vivo binding sites of p300 in mouse embryonic forebrain, midbrain, and limb tissue. We tested 86 of these sequences in a transgenic mouse assay, which in nearly all cases revealed reproducible enhancer activity in those tissues predicted by p300 binding. Our results indicate that in vivo mapping of p300 binding is a highly accurate means for identifying enhancers and their associated activities and suggest that such datasets will be useful to study the role of tissue-specific enhancers in human biology and disease on a genome-wide scale.

  6. Identification of TOEFAZ1-interacting proteins reveals key regulators of Trypanosoma brucei cytokinesis.

    PubMed

    Hilton, Nicholas A; Sladewski, Thomas E; Perry, Jenna A; Pataki, Zemplen; Sinclair-Davis, Amy N; Muniz, Richard S; Tran, Holly L; Wurster, Jenna I; Seo, Jiwon; de Graffenried, Christopher L

    2018-05-21

    The protist parasite Trypanosoma brucei is an obligate extracellular pathogen that retains its highly-polarized morphology during cell division and has evolved a novel cytokinetic process independent of non-muscle myosin II. The polo-like kinase homolog TbPLK is essential for transmission of cell polarity during division and for cytokinesis. We previously identified a putative TbPLK substrate named Tip of the Extending FAZ 1 (TOEFAZ1) as an essential kinetoplastid-specific component of the T. brucei cytokinetic machinery. We performed a proximity-dependent biotinylation (BioID) screen using TOEFAZ1 as a means to identify additional proteins that are involved in cytokinesis. Using quantitative proteomic methods, we identified nearly 500 TOEFAZ1-proximal proteins and characterized 59 in further detail. Among the candidates, we identified an essential putative phosphatase that regulates the expression level and localization of both TOEFAZ1 and TbPLK, a previously uncharacterized protein that is necessary for the assembly of a new cell posterior, and a microtubule plus-end directed orphan kinesin that is required for completing cleavage furrow ingression. The identification of these proteins provides new insight into T. brucei cytokinesis and establishes TOEFAZ1 as a key component of this essential and uniquely-configured process in kinetoplastids. This article is protected by copyright. All rights reserved. © 2018 John Wiley & Sons Ltd.

  7. Identification of small peptides arising from hydrolysis of meat proteins in dry fermented sausages.

    PubMed

    López, Constanza M; Bru, Elena; Vignolo, Graciela M; Fadda, Silvina G

    2015-06-01

    In this study, proteolysis and low molecular weight (LMW) peptides (<3kDa) from commercial Argentinean fermented sausages were characterized by applying a peptidomic approach. Protein profiles and peptides obtained by Tricine-SDS-PAGE and RP-HPLC-MS, respectively, allowed distinguishing two different types of fermented sausages, although no specific biomarkers relating to commercial brands or quality were recognized. From electrophoresis, α-actin, myoglobin, creatine kinase M-type and L-lactate dehydrogenase were degraded at different intensities. In addition, a partial characterization of fermented sausage peptidome through the identification of 36 peptides, in the range of 1000-2100 Da, arising from sarcoplasmic (28) and myofibrillar (8) proteins was achieved. These peptides had been originated from α-actin, myoglobin, and creatine kinase M-type, but also from the hydrolysis of other proteins not previously reported. Although muscle enzymes exerted a major role on peptidogenesis, microbial contribution cannot be excluded as it was postulated herein. This work represents a first peptidomic approach for fermented sausages, thereby providing a baseline to define key peptides acting as potential biomarkers. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Identification of Sequence Specificity of 5-Methylcytosine Oxidation by Tet1 Protein with High-Throughput Sequencing.

    PubMed

    Kizaki, Seiichiro; Chandran, Anandhakumar; Sugiyama, Hiroshi

    2016-03-02

    Tet (ten-eleven translocation) family proteins have the ability to oxidize 5-methylcytosine (mC) to 5-hydroxymethylcytosine (hmC), 5-formylcytosine (fC), and 5-carboxycytosine (caC). However, the oxidation reaction of Tet is not understood completely. Evaluation of genomic-level epigenetic changes by Tet protein requires unbiased identification of the highly selective oxidation sites. In this study, we used high-throughput sequencing to investigate the sequence specificity of mC oxidation by Tet1. A 6.6×10^4-member mC-containing random DNA-sequence library was constructed. The library was subjected to Tet-reactive pulldown followed by high-throughput sequencing. Analysis of the obtained sequence data identified the Tet1-reactive sequences. We identified mCpG as a highly reactive sequence of Tet1 protein. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Identification of S-glutathionylation sites in species-specific proteins by incorporating five sequence-derived features into the general pseudo-amino acid composition.

    PubMed

    Zhao, Xiaowei; Ning, Qiao; Ai, Meiyue; Chai, Haiting; Yang, Guifu

    2016-06-07

    As a selective and reversible protein post-translational modification, S-glutathionylation generates mixed disulfides between glutathione (GSH) and cysteine residues, and plays an important role in regulating protein activity, stability, and redox regulation. To fully understand S-glutathionylation mechanisms, identification of substrates and specific S-glutathionylated sites is crucial. Experimental identification of S-glutathionylated sites is labor-intensive and time-consuming, so an effective computational method is highly desirable for its convenience and speed. Therefore, in this study, a new bioinformatics tool named SSGlu (Species-Specific identification of Protein S-glutathionylation Sites) was developed to identify species-specific protein S-glutathionylated sites, utilizing support vector machines that combine multiple sequence-derived features with a two-step feature selection. By 5-fold cross validation, the performance of SSGlu was measured with an AUC of 0.8105 and 0.8041 for Homo sapiens and Mus musculus, respectively. Additionally, SSGlu was compared with the existing methods, and its higher MCC and AUC demonstrated that SSGlu is very promising for predicting S-glutathionylated sites. Furthermore, a site-specific analysis showed that S-glutathionylation correlated closely with the features derived from its surrounding sites. The conclusions derived from this study might help to better understand the S-glutathionylation mechanism and guide related experimental validation. For public access, SSGlu is freely accessible at http://59.73.198.144:8080/SSGlu/. Copyright © 2016 Elsevier Ltd. All rights reserved.
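
    The two figures of merit quoted in this abstract, MCC and AUC, can be computed as in the sketch below. It shows the standard definitions only and is not taken from SSGlu. The function names and the convention of returning 0.0 when the MCC denominator vanishes are assumptions.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Ties contribute 0.5. Labels are 1 (modified site) or 0 (unmodified site).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

    In a 5-fold cross-validation such as the one described, these functions would be applied to the held-out predictions of each fold and the results averaged or pooled.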

  10. Toward the accurate first-principles prediction of ionization equilibria in proteins.

    PubMed

    Khandogin, Jana; Brooks, Charles L

    2006-08-08

    The calculation of pKa values for ionizable sites in proteins has been traditionally based on numerical solutions of the Poisson-Boltzmann equation carried out using a high-resolution protein structure. In this paper, we present a method based on continuous constant pH molecular dynamics (CPHMD) simulations, which allows the first-principles description of protein ionization equilibria. Our method utilizes an improved generalized Born implicit solvent model with an approximate Debye-Hückel screening function to account for salt effects and the replica-exchange (REX) protocol for enhanced conformational and protonation state sampling. The accuracy and robustness of the present method are demonstrated by 1 ns REX-CPHMD titration simulations of 10 proteins, which exhibit anomalously large pKa shifts for the carboxylate and histidine side chains. The experimental pKa values of these proteins are reliably reproduced with a root-mean-square error ranging from 0.6 unit for proteins containing few buried ionizable side chains to 1.0 unit or slightly higher for proteins containing ionizable side chains deeply buried in the core and experiencing strong charge-charge interactions. This unprecedented level of agreement with experimental benchmarks for the de novo calculation of pKa values suggests that the CPHMD method is maturing into a practical tool for the quantitative prediction of protein ionization equilibria, and this, in turn, opens a door to atomistic simulations of a wide variety of pH-coupled conformational phenomena in biological macromolecules such as protein folding or misfolding, aggregation, ligand binding, membrane interaction, and catalysis.
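
    A titration simulation yields, for each pH, the fraction of time a site spends deprotonated, and a pKa is commonly extracted by fitting those fractions to a generalized Henderson-Hasselbalch (Hill) curve. The sketch below assumes that post-processing step, which the abstract does not spell out. The function names and the linearised least-squares fit are illustrative choices.

```python
import math


def deprotonated_fraction(ph, pka, hill=1.0):
    """Generalized Henderson-Hasselbalch curve: S = 1 / (1 + 10^(n (pKa - pH)))."""
    return 1.0 / (1.0 + 10.0 ** (hill * (pka - ph)))


def fit_pka(ph_values, fractions):
    """Fit pKa and Hill coefficient n by linear regression.

    Uses log10(S / (1 - S)) = n * pH - n * pKa, so the slope is n and the
    pKa is -intercept / slope. Points with S of exactly 0 or 1 are skipped.
    """
    xs, ys = [], []
    for ph, s in zip(ph_values, fractions):
        if 0.0 < s < 1.0:
            xs.append(ph)
            ys.append(math.log10(s / (1.0 - s)))
    if len(xs) < 2:
        raise ValueError("need at least two usable titration points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("titration points must span more than one pH")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return -intercept / slope, slope


def rmse(predicted, experimental):
    """Root-mean-square error between predicted and experimental pKa values."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / len(predicted))
```

    The rmse helper corresponds to the error measure quoted in the abstract (0.6 to about 1.0 pKa unit), applied to the fitted and experimental pKa values of a set of residues.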

  11. Animal models of protein allergenicity: potential benefits, pitfalls and challenges.

    PubMed

    Dearman, R J; Kimber, I

    2009-04-01

    Food allergy is an important health issue. With an increasing interest in novel foods derived from transgenic crop plants, there is a growing need for the development of approaches suitable for the characterization of the allergenic potential of proteins. There are methods available currently (such as homology searches and serological testing) that are very effective at identifying proteins that are likely to cross-react with known allergens. However, animal models may play a role in the identification of truly novel proteins, such as bacterial or fungal proteins, that have not been experienced previously in the diet. We consider here the potential benefits, pitfalls and challenges of the selection of various animal models, including the mouse, the rat, the dog and the neonatal swine. The advantages and disadvantages of various experimental end-points are discussed, including the measurement of specific IgE by ELISA, Western blotting or functional tests such as the passive cutaneous anaphylaxis assay, and the assessment of challenge-induced clinical symptoms in previously sensitized animals. The experimental variables of route of exposure to test proteins and the incorporation of adjuvant to increase the sensitivity of the responses are considered also. It is important to emphasize that currently none of these approaches has been validated for the purposes of hazard identification in the context of a safety assessment. However, the available evidence suggests that the judicious use of an accurate and robust animal model could provide important additional data that would contribute significantly to the assessment of the potential allergenicity of novel proteins.

  12. HubAlign: an accurate and efficient method for global alignment of protein-protein interaction networks.

    PubMed

    Hashemifar, Somaye; Xu, Jinbo

    2014-09-01

    High-throughput experimental techniques have produced a large amount of protein-protein interaction (PPI) data. The study of PPI networks, such as comparative analysis, shall benefit the understanding of life process and diseases at the molecular level. One way of comparative analysis is to align PPI networks to identify conserved or species-specific subnetwork motifs. A few methods have been developed for global PPI network alignment, but it still remains challenging in terms of both accuracy and efficiency. This paper presents a novel global network alignment algorithm, denoted as HubAlign, that makes use of both network topology and sequence homology information, based upon the observation that topologically important proteins in a PPI network usually are much more conserved and thus, more likely to be aligned. HubAlign uses a minimum-degree heuristic algorithm to estimate the topological and functional importance of a protein from the global network topology information. Then HubAlign aligns topologically important proteins first and gradually extends the alignment to the whole network. Extensive tests indicate that HubAlign greatly outperforms several popular methods in terms of both accuracy and efficiency, especially in detecting functionally similar proteins. HubAlign is available freely for non-commercial purposes at http://ttic.uchicago.edu/∼hashemifar/software/HubAlign.zip. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
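    The abstract mentions a minimum-degree heuristic for ranking proteins by topological importance. The sketch below shows only the general idea of such a heuristic: repeatedly delete a lowest-degree node, so that nodes surviving longest are treated as more central. It is a simplified assumption for illustration and is not HubAlign's actual scoring scheme; the function name and the toy network are hypothetical.

```python
def peel_order(adjacency):
    """Return nodes in the order they are removed when a minimum-degree
    node is deleted repeatedly. Later removal suggests a more central node."""
    # Work on a copy so the caller's graph is left untouched.
    graph = {u: set(vs) for u, vs in adjacency.items()}
    order = []
    while graph:
        # Break ties by node name so the result is deterministic.
        node = min(graph, key=lambda u: (len(graph[u]), u))
        for neighbour in graph[node]:
            graph[neighbour].discard(node)
        del graph[node]
        order.append(node)
    return order

# Hypothetical toy network: hub "H" linked to four proteins, plus an A-B edge.
toy = {
    "H": {"A", "B", "C", "D"},
    "A": {"H", "B"},
    "B": {"H", "A"},
    "C": {"H"},
    "D": {"H"},
}
print(peel_order(toy))
```

    In this toy case the leaf proteins are peeled first and the hub last, which matches the intuition that hubs are aligned early.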

  13. Accurate Prediction of Protein Contact Maps by Coupling Residual Two-Dimensional Bidirectional Long Short-Term Memory with Convolutional Neural Networks.

    PubMed

    Hanson, Jack; Paliwal, Kuldip; Litfin, Thomas; Yang, Yuedong; Zhou, Yaoqi

    2018-06-19

    Accurate prediction of a protein contact map depends greatly on capturing as much contextual information as possible from surrounding residues for a target residue pair. Recently, ultra-deep residual convolutional networks were found to be state-of-the-art in the latest Critical Assessment of Structure Prediction techniques (CASP12, (Schaarschmidt et al., 2018)) for protein contact map prediction by attempting to provide a protein-wide context at each residue pair. Recurrent neural networks have seen great success in recent protein residue classification problems due to their ability to propagate information through long protein sequences, especially Long Short-Term Memory (LSTM) cells. Here we propose a novel protein contact map prediction method by stacking residual convolutional networks with two-dimensional residual bidirectional recurrent LSTM networks, and using both one-dimensional sequence-based and two-dimensional evolutionary coupling-based information. We show that the proposed method achieves a robust performance over validation and independent test sets with the Area Under the receiver operating characteristic Curve (AUC)>0.95 in all tests. When compared to several state-of-the-art methods for independent testing of 228 proteins, the method yields an AUC value of 0.958, whereas the next-best method obtains an AUC of 0.909. More importantly, the improvement is over contacts at all sequence-position separations. Specifically, increases in precision of 8.95%, 5.65% and 2.84% were observed for the top L∕10 predictions over the next best for short, medium and long-range contacts, respectively. This confirms the usefulness of ResNets to congregate the short-range relations and 2D-BRLSTM to propagate the long-range dependencies throughout the entire protein contact map 'image'. SPOT-Contact server url: http://sparks-lab.org/jack/server/SPOT-Contact/. Supplementary data is available at Bioinformatics online.
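    The "top L/10" precision quoted in this record can be illustrated with a short sketch: rank the predicted residue pairs within a separation range, keep the best L/10 (L being the sequence length), and count how many are true contacts. The function name, the input layout, and the separation cut-offs (a common convention is 6-11 for short, 12-23 for medium, and 24 or more for long range) are assumptions; the abstract does not define them.

```python
def top_precision(scores, contacts, length, k=10, min_sep=24, max_sep=None):
    """Precision of the top length//k scored pairs within a separation range.

    scores:   dict mapping (i, j), i < j, to a predicted contact probability
    contacts: set of (i, j) pairs that are true contacts
    """
    candidates = [
        (pair, score) for pair, score in scores.items()
        if pair[1] - pair[0] >= min_sep
        and (max_sep is None or pair[1] - pair[0] <= max_sep)
    ]
    # Highest-scoring pairs first.
    candidates.sort(key=lambda item: item[1], reverse=True)
    top = candidates[:max(1, length // k)]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in contacts)
    return hits / len(top)

# Hypothetical 40-residue example, long-range pairs only (separation >= 24).
scores = {(0, 30): 0.9, (1, 28): 0.8, (2, 35): 0.7, (3, 39): 0.6, (4, 38): 0.5}
contacts = {(0, 30), (2, 35)}
print(top_precision(scores, contacts, length=40))
```

    With L = 40 the top four pairs are evaluated, two of which are true contacts in this made-up example.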

  14. Fluorescent protein vectors for pancreatic islet cell identification in live-cell imaging.

    PubMed

    Shuai, Hongyan; Xu, Yunjian; Yu, Qian; Gylfe, Erik; Tengholm, Anders

    2016-10-01

    The islets of Langerhans contain different types of endocrine cells, which are crucial for glucose homeostasis. β- and α-cells that release insulin and glucagon, respectively, are most abundant, whereas somatostatin-producing δ-cells and particularly pancreatic polypeptide-releasing PP-cells are more scarce. Studies of islet cell function are hampered by difficulties to identify the different cell types, especially in live-cell imaging experiments when immunostaining is unsuitable. The aim of the present study was to create a set of vectors for fluorescent protein expression with cell-type-specific promoters and evaluate their applicability in functional islet imaging. We constructed six adenoviral vectors for expression of red and green fluorescent proteins controlled by the insulin, preproglucagon, somatostatin, or pancreatic polypeptide promoters. After transduction of mouse and human islets or dispersed islet cells, a majority of the fluorescent cells also immunostained for the appropriate hormone. Recordings of the sub-plasma membrane Ca(2+) and cAMP concentrations with a fluorescent indicator and a protein biosensor, respectively, showed that labeled cells respond to glucose and other modulators of secretion and revealed a striking variability in Ca(2+) signaling among α-cells. The measurements allowed comparison of the phase relationship of Ca(2+) oscillations between different types of cells within intact islets. We conclude that the fluorescent protein vectors allow easy identification of specific islet cell types and can be used in live-cell imaging together with organic dyes and genetically encoded biosensors. This approach will facilitate studies of normal islet physiology and help to clarify molecular defects and disturbed cell interactions in diabetic islets.

  15. Identification of liver protein targets modified by tienilic acid metabolites using a two-dimensional Western blot-mass spectrometry approach

    NASA Astrophysics Data System (ADS)

    Methogo, Ruth Menque; Dansette, Patrick M.; Klarskov, Klaus

    2007-12-01

    A combined approach based on two-dimensional electrophoresis-immuno-blotting and nanoliquid chromatography coupled on-line with electrospray ionization mass spectrometry (nLC-MS/MS) was used to identify proteins modified by a reactive intermediate of tienilic acid (TA). Liver homogenates from rats exposed to TA were fractionated using ultra centrifugation; four fractions were obtained and subjected to 2D electrophoresis. Following transfer to PVDF membranes, modified proteins were visualized after India ink staining, using an anti-serum raised against TA and ECL detection. Immuno-reactive spots were localized on the PVDF membrane by superposition of the ECL image, protein spots of interest were excised, digested on the membrane with trypsin followed by nLC-MS/MS analysis and protein identification. A total of 15 proteins were identified as likely targets modified by a TA reactive metabolite. These include selenium binding protein 2, senescence marker protein SMP-30, adenosine kinase, Acy1 protein, adenosylhomocysteinase, capping protein (actin filament), protein disulfide isomerase, fumarylacetoacetase, arginase chain A, ketohexokinase, proteasome endopeptidase complex, triosephosphate isomerase, superoxide dismutase, dna-type molecular chaperone hsc73 and malate dehydrogenase.

  16. Affinity purification combined with mass spectrometry to identify herpes simplex virus protein-protein interactions.

    PubMed

    Meckes, David G

    2014-01-01

    The identification and characterization of herpes simplex virus protein interaction complexes are fundamental to understanding the molecular mechanisms governing the replication and pathogenesis of the virus. Recent advances in affinity-based methods, mass spectrometry configurations, and bioinformatics tools have greatly increased the quantity and quality of protein-protein interaction datasets. In this chapter, detailed and reliable methods that can easily be implemented are presented for the identification of protein-protein interactions using cryogenic cell lysis, affinity purification, trypsin digestion, and mass spectrometry.

  17. Affinity Proteomics for Fast, Sensitive, Quantitative Analysis of Proteins in Plasma.

    PubMed

    O'Grady, John P; Meyer, Kevin W; Poe, Derrick N

    2017-01-01

    The improving efficacy of many biological therapeutics and identification of low-level biomarkers are driving the analytical proteomics community to deal with extremely high levels of sample complexity relative to their analytes. Many protein quantitation and biomarker validation procedures utilize an immunoaffinity enrichment step to purify the sample and maximize the sensitivity of the corresponding liquid chromatography tandem mass spectrometry measurements. In order to generate surrogate peptides with better mass spectrometric properties, protein enrichment is followed by a proteolytic cleavage step. This is often a time-consuming multistep process. Presented here is a workflow which enables rapid protein enrichment and proteolytic cleavage to be performed in a single, easy-to-use reactor. Using this strategy Klotho, a low-abundance biomarker found in plasma, can be accurately quantitated using a protocol that takes under 5 h from start to finish.

  18. FastRNABindR: Fast and Accurate Prediction of Protein-RNA Interface Residues.

    PubMed

    El-Manzalawy, Yasser; Abbas, Mostafa; Malluhi, Qutaibah; Honavar, Vasant

    2016-01-01

    A wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses are mediated by RNA-protein interactions. However, experimental determination of the structures of protein-RNA complexes is expensive and technically challenging. Hence, a number of computational tools have been developed for predicting protein-RNA interfaces. Some of the state-of-the-art protein-RNA interface predictors rely on position-specific scoring matrix (PSSM)-based encoding of the protein sequences. The computational effort needed for generating PSSMs severely limits the practical utility of protein-RNA interface prediction servers. In this work, we experiment with two approaches, random sampling and sequence similarity reduction, for extracting a representative reference database of protein sequences from more than 50 million protein sequences in UniRef100. Our results suggest that random sampled databases produce better PSSM profiles (in terms of the number of hits used to generate the profile and the distance of the generated profile to the corresponding profile generated using the entire UniRef100 data as well as the accuracy of the machine learning classifier trained using these profiles). Based on our results, we developed FastRNABindR, an improved version of RNABindR for predicting protein-RNA interface residues using PSSM profiles generated using 1% of the UniRef100 sequences sampled uniformly at random. To the best of our knowledge, FastRNABindR is the only protein-RNA interface residue prediction online server that requires generation of PSSM profiles for query sequences and accepts hundreds of protein sequences per submission. Our approach for determining the optimal BLAST database for a protein-RNA interface residue classification task has the potential of substantially speeding up, and hence increasing the practical utility of, other amino acid sequence based predictors of protein-protein and protein
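    The random-sampling step described in this record, keeping roughly 1% of a large sequence collection as a reduced reference database, could be sketched as below. This is an assumption-laden illustration rather than the authors' pipeline: it keeps each FASTA record independently with probability `fraction`, so the retained share is only approximately 1%, and the function name and sample records are hypothetical.

```python
import random

def sample_fasta(lines, fraction=0.01, seed=0):
    """Yield (header, sequence) records, keeping each with probability `fraction`."""
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record.
            if header is not None and rng.random() < fraction:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    # Emit the final record if it is selected.
    if header is not None and rng.random() < fraction:
        yield header, "".join(chunks)

# Hypothetical two-record input; fraction=1.0 keeps everything.
demo = [">a", "MK", "VL", ">b", "GG"]
print(list(sample_fasta(demo, fraction=1.0)))
```

    Because the function streams over lines, it can be applied to an open file handle without loading the whole database into memory.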

  19. Evaluation of matrix-assisted laser desorption/ionization time of flight mass spectrometry for the identification of ceratopogonid and culicid larvae.

    PubMed

    Steinmann, I C; Pflüger, V; Schaffner, F; Mathis, A; Kaufmann, C

    2013-03-01

    Matrix-assisted laser desorption/ionization time of flight mass spectrometry (MALDI-TOF MS) was evaluated for the rapid identification of ceratopogonid larvae. Optimal sample preparation as evaluated with laboratory-reared biting midges Culicoides nubeculosus was the homogenization of gut-less larvae in 10% formic acid, and analysis of 0.2 mg/ml crude protein homogenate mixed with SA matrix at a ratio of 1:1.5. Using 5 larvae each of 4 ceratopogonid species (C. nubeculosus, C. obsoletus, C. decor, and Dasyhelea sp.) and of 2 culicid species (Aedes aegypti, Ae. japonicus), biomarker mass sets between 27 and 33 masses were determined. In a validation study, 67 larvae belonging to the target species were correctly identified by automated database-based identification (91%) or manual full comparison (9%). Four specimens of non-target species did not yield identification. As anticipated for holometabolous insects, the biomarker mass sets of adults cannot be used for the identification of larvae, and vice versa, because they share only very few similar masses as shown for C. nubeculosus, C. obsoletus, and Ae. japonicus. Thus, protein profiling by MALDI-TOF as a quick, inexpensive and accurate alternative tool is applicable to identify insect larvae of vector species collected in the field.

  20. Identification of T1D susceptibility genes within the MHC region by combining protein interaction networks and SNP genotyping data

    PubMed Central

    Brorsson, C.; Hansen, N. T.; Lage, K.; Bergholdt, R.; Brunak, S.; Pociot, F.

    2009-01-01

    Aim: To develop novel methods for identifying new genes that contribute to the risk of developing type 1 diabetes within the Major Histocompatibility Complex (MHC) region on chromosome 6, independently of the known linkage disequilibrium (LD) between human leucocyte antigen (HLA)-DRB1, -DQA1, -DQB1 genes. Methods: We have developed a novel method that combines single nucleotide polymorphism (SNP) genotyping data with protein–protein interaction (ppi) networks to identify disease-associated network modules enriched for proteins encoded from the MHC region. Approximately 2500 SNPs located in the 4 Mb MHC region were analysed in 1000 affected offspring trios generated by the Type 1 Diabetes Genetics Consortium (T1DGC). The most associated SNP in each gene was chosen and genes were mapped to ppi networks for identification of interaction partners. The association testing and resulting interacting protein modules were statistically evaluated using permutation. Results: A total of 151 genes could be mapped to nodes within the protein interaction network and their interaction partners were identified. Five protein interaction modules reached statistical significance using this approach. The identified proteins are well known in the pathogenesis of T1D, but the modules also contain additional candidates that have been implicated in β-cell development and diabetic complications. Conclusions: The extensive LD within the MHC region makes it important to develop new methods for analysing genotyping data for identification of additional risk genes for T1D. Combining genetic data with knowledge about functional pathways provides new insight into mechanisms underlying T1D. PMID:19143816
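    The record states that protein modules were evaluated by permutation but gives no details. As a hedged illustration of the general technique only, the sketch below scores a module by summing per-gene association scores and compares it with randomly drawn gene sets of the same size. The scoring rule, function name, and toy data are assumptions and are not taken from the paper.

```python
import random

def permutation_p_value(gene_scores, module, n_perm=10000, seed=0):
    """Empirical p-value for a module's summed score against random gene sets."""
    rng = random.Random(seed)
    genes = list(gene_scores)
    observed = sum(gene_scores[g] for g in module)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, len(module))
        if sum(gene_scores[g] for g in sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical scores for 20 genes; the module holds the three highest.
toy_scores = {f"g{i}": float(i) for i in range(20)}
print(permutation_p_value(toy_scores, ["g17", "g18", "g19"], n_perm=2000))
```

    The add-one correction is a common choice for empirical p-values, since a finite number of permutations cannot support a p-value of exactly zero.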

  1. Fatty Acid-binding Proteins Interact with Comparative Gene Identification-58 Linking Lipolysis with Lipid Ligand Shuttling

    PubMed Central

    Hofer, Peter; Boeszoermenyi, Andras; Jaeger, Doris; Feiler, Ursula; Arthanari, Haribabu; Mayer, Nicole; Zehender, Fabian; Rechberger, Gerald; Oberer, Monika; Zimmermann, Robert; Lass, Achim; Haemmerle, Guenter; Breinbauer, Rolf; Zechner, Rudolf; Preiss-Landl, Karina

    2015-01-01

    The coordinated breakdown of intracellular triglyceride (TG) stores requires the exquisitely regulated interaction of lipolytic enzymes with regulatory, accessory, and scaffolding proteins. Together they form a dynamic multiprotein network designated as the “lipolysome.” Adipose triglyceride lipase (Atgl) catalyzes the initiating step of TG hydrolysis and requires comparative gene identification-58 (Cgi-58) as a potent activator of enzyme activity. Here, we identify adipocyte-type fatty acid-binding protein (A-Fabp) and other members of the fatty acid-binding protein (Fabp) family as interaction partners of Cgi-58. Co-immunoprecipitation, microscale thermophoresis, and solid phase assays proved direct protein/protein interaction between A-Fabp and Cgi-58. Using nuclear magnetic resonance titration experiments and site-directed mutagenesis, we located a potential contact region on A-Fabp. In functional terms, A-Fabp stimulates Atgl-catalyzed TG hydrolysis in a Cgi-58-dependent manner. Additionally, transcriptional transactivation assays with a luciferase reporter system revealed that Fabps enhance the ability of Atgl/Cgi-58-mediated lipolysis to induce the activity of peroxisome proliferator-activated receptors. Our studies identify Fabps as crucial structural and functional components of the lipolysome. PMID:25953897

  2. Generic comparison of protein inference engines.

    PubMed

    Claassen, Manfred; Reiter, Lukas; Hengartner, Michael O; Buhmann, Joachim M; Aebersold, Ruedi

    2012-04-01

    Protein identifications, instead of peptide-spectrum matches, constitute the biologically relevant result of shotgun proteomics studies. How to appropriately infer and report protein identifications has triggered a still ongoing debate. This debate has so far suffered from the lack of appropriate performance measures that allow us to objectively assess protein inference approaches. This study describes an intuitive, generic and yet formal performance measure and demonstrates how it enables experimentalists to select an optimal protein inference strategy for a given collection of fragment ion spectra. We applied the performance measure to systematically explore the benefit of excluding possibly unreliable protein identifications, such as single-hit wonders. Therefore, we defined a family of protein inference engines by extending a simple inference engine by thousands of pruning variants, each excluding a different specified set of possibly unreliable identifications. We benchmarked these protein inference engines on several data sets representing different proteomes and mass spectrometry platforms. Optimally performing inference engines retained all high confidence spectral evidence, without posterior exclusion of any type of protein identifications. Despite the diversity of studied data sets consistently supporting this rule, other data sets might behave differently. In order to ensure maximal reliable proteome coverage for data sets arising in other studies we advocate abstaining from rigid protein inference rules, such as exclusion of single-hit wonders, and instead consider several protein inference approaches and assess these with respect to the presented performance measure in the specific application context.

  3. Preprocessing Significantly Improves the Peptide/Protein Identification Sensitivity of High-resolution Isobarically Labeled Tandem Mass Spectrometry Data

    PubMed Central

    Sheng, Quanhu; Li, Rongxia; Dai, Jie; Li, Qingrun; Su, Zhiduan; Guo, Yan; Li, Chen; Shyr, Yu; Zeng, Rong

    2015-01-01

    Isobaric labeling techniques coupled with high-resolution mass spectrometry have been widely employed in proteomic workflows requiring relative quantification. For each high-resolution tandem mass spectrum (MS/MS), isobaric labeling techniques can be used not only to quantify the peptide from different samples by reporter ions, but also to identify the peptide it is derived from. Because the ions related to isobaric labeling may act as noise in database searching, the MS/MS spectrum should be preprocessed before peptide or protein identification. In this article, we demonstrate that there are a lot of high-frequency, high-abundance isobaric related ions in the MS/MS spectrum, and removing isobaric related ions combined with deisotoping and deconvolution in MS/MS preprocessing procedures significantly improves the peptide/protein identification sensitivity. The user-friendly software package TurboRaw2MGF (v2.0) has been implemented for converting raw TIC data files to mascot generic format files and can be downloaded for free from https://github.com/shengqh/RCPA.Tools/releases as part of the software suite ProteomicsTools. The data have been deposited to the ProteomeXchange with identifier PXD000994. PMID:25435543

  4. Identification of a novel homolog of the Drosophila staufen protein in the chromosome 8q13-q21.1 region.

    PubMed

    Buchner, G; Bassi, M T; Andolfi, G; Ballabio, A; Franco, B

    1999-11-15

    We report the identification of a new transcript homologous to the Drosophila staufen protein. This transcript, named STAU2 (HGMW-approved gene symbol and name), maps to the chromosome 8q13-q21 region. The full-length STAU2 cDNA is 4058 bp and contains an open reading frame of 479 amino acids. Analysis of the predicted protein product indicated the presence of three double-stranded RNA-binding domains. Best-fit analysis revealed a 48.5% similarity to the Drosophila protein and a 59.9% similarity to the recently described mammalian homolog hStau, indicating that at least two different transcripts with homologies to the fly protein are present in mammals. Copyright 1999 Academic Press.

  5. Multidimensional gas chromatography in combination with accurate mass, tandem mass spectrometry, and element-specific detection for identification of sulfur compounds in tobacco smoke.

    PubMed

    Ochiai, Nobuo; Mitsui, Kazuhisa; Sasamoto, Kikuo; Yoshimura, Yuta; David, Frank; Sandra, Pat

    2014-09-05

    A method is developed for identification of sulfur compounds in tobacco smoke extract. The method is based on large volume injection (LVI) of 10 μL of tobacco smoke extract followed by selectable one-dimensional ((1)D) or two-dimensional ((2)D) gas chromatography (GC) coupled to a hybrid quadrupole time-of-flight mass spectrometer (Q-TOF-MS) using electron ionization (EI) and positive chemical ionization (PCI), with parallel sulfur chemiluminescence detection (SCD). In order to identify each individual sulfur compound, sequential heart-cuts of 28 sulfur fractions from (1)D GC to (2)D GC were performed with the three MS detection modes (SCD/EI-TOF-MS, SCD/PCI-TOF-MS, and SCD/PCI-Q-TOF-MS). Thirty sulfur compounds were positively identified by MS library search, linear retention indices (LRI), molecular mass determination using PCI accurate mass spectra, formula calculation using EI and PCI accurate mass spectra, and structure elucidation using collision activated dissociation (CAD) of the protonated molecule. Additionally, 11 molecular formulas were obtained for unknown sulfur compounds. The determined values of the identified and unknown sulfur compounds were in the range of 10-740 ng/mg total particulate matter (TPM) (RSD: 1.2-12%, n=3). Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.

  6. Rapid identification and classification of Mycobacterium spp. using whole-cell protein barcodes with matrix assisted laser desorption ionization time of flight mass spectrometry in comparison with multigene phylogenetic analysis.

    PubMed

    Wang, Jun; Chen, Wen Feng; Li, Qing X

    2012-02-24

    The need of quick diagnostics and increasing number of bacterial species isolated necessitate development of a rapid and effective phenotypic identification method. Mass spectrometry (MS) profiling of whole cell proteins has potential to satisfy the requirements. The genus Mycobacterium contains more than 154 species that are taxonomically very close and require use of multiple genes including 16S rDNA for phylogenetic identification and classification. Six strains of five Mycobacterium species were selected as model bacteria in the present study because of their 16S rDNA similarity (98.4-99.8%) and the high similarity of the concatenated 16S rDNA, rpoB and hsp65 gene sequences (95.9-99.9%), requiring high identification resolution. The classification of the six strains by MALDI TOF MS protein barcodes was consistent with, but at much higher resolution than, that of the multi-locus sequence analysis of using 16S rDNA, rpoB and hsp65. The species were well differentiated using MALDI TOF MS and MALDI BioTyper™ software after quick preparation of whole-cell proteins. Several proteins were selected as diagnostic markers for species confirmation. An integration of MALDI TOF MS, MALDI BioTyper™ software and diagnostic protein fragments provides a robust phenotypic approach for bacterial identification and classification. Copyright © 2011 Elsevier B.V. All rights reserved.

  7. HFIP Extraction Followed by 2D CTAB/SDS-PAGE Separation: A New Methodology for Protein Identification from Tissue Sections after MALDI Mass Spectrometry Profiling for Personalized Medicine Research

    PubMed Central

    Longuespée, Rémi; Tastet, Christophe; Desmons, Annie; Kerdraon, Olivier; Day, Robert

    2014-01-01

    Matrix-assisted laser desorption ionization mass spectrometry imaging (MALDI-MSI) and profiling technology have become the easiest methods for quickly accessing the protein composition of a tissue area. Unfortunately, the demand for the identification of these proteins remains unmet. To overcome this bottleneck, we combined several strategies to identify the proteins detected via MALDI profiling including on-tissue protein extraction using hexafluoroisopropanol (1,1,1,3,3,3-hexafluoro-2-propanol, HFIP) coupled with two-dimensional cetyl trimethylammonium bromide/sodium dodecyl sulfate–polyacrylamide gel electrophoresis (2D CTAB/SDS-PAGE) for separation followed by trypsin digestion and MALDI-MS analyses for identification. This strategy was compared with an on-tissue bottom-up strategy that we previously developed. The data reflect the complementarity of the approaches. An increase in the number of specific proteins identified has been established. This approach demonstrates the potential of adapted extraction procedures and the combination of parallel identification approaches for personalized medicine applications. The anatomical context provides important insight into identifying biomarkers and may be considered a first step for tissue-based biomarker research, as well as the extemporaneous examination of biopsies during surgery. PMID:24841221

  8. A novel strategy for global mapping of O-GlcNAc proteins and peptides using selective enzymatic deglycosylation, HILIC enrichment and mass spectrometry identification.

    PubMed

    Shen, Bingquan; Zhang, Wanjun; Shi, Zhaomei; Tian, Fang; Deng, Yulin; Sun, Changqing; Wang, Guangshun; Qin, Weijie; Qian, Xiaohong

    2017-07-01

    O-GlcNAcylation is a kind of dynamic O-linked glycosylation of nucleocytoplasmic and mitochondrial proteins. It serves as a major nutrient sensor to regulate numerous biological processes including transcriptional regulation, cell metabolism, cellular signaling, and protein degradation. Dysregulation of cellular O-GlcNAcylated levels contributes to the etiologies of many diseases such as diabetes, neurodegenerative disease and cancer. However, deeper insight into the biological mechanism of O-GlcNAcylation is hampered by its extremely low stoichiometry and the lack of efficient enrichment approaches for large-scale identification by mass spectrometry. Herein, we developed a novel strategy for the global identification of O-GlcNAc proteins and peptides using selective enzymatic deglycosylation, HILIC enrichment and mass spectrometry analysis. Standard O-GlcNAc peptides can be efficiently enriched even in the presence of 500-fold more abundant non-O-GlcNAc peptides and identified by mass spectrometry with a low nanogram detection sensitivity. This strategy successfully achieved the first large-scale enrichment and characterization of O-GlcNAc proteins and peptides in human urine. A total of 474 O-GlcNAc peptides corresponding to 457 O-GlcNAc proteins were identified by mass spectrometry analysis, which is at least three times more than that obtained by commonly used enrichment methods. A large number of unreported O-GlcNAc proteins related to cell cycle, biological regulation, metabolic and developmental process were found in our data. The above results demonstrated that this novel strategy is highly efficient in the global enrichment and identification of O-GlcNAc peptides. These data provide new insights into the biological function of O-GlcNAcylation in human urine, which is correlated with the physiological states and pathological changes of human body and therefore indicate the potential of this strategy for biomarker discovery from human urine.

  9. Identification of increased amounts of eppin protein complex components in sperm cells of diabetic and obese individuals by difference gel electrophoresis.

    PubMed

    Paasch, Uwe; Heidenreich, Falk; Pursche, Theresia; Kuhlisch, Eberhard; Kettner, Karina; Grunewald, Sonja; Kratzsch, Jürgen; Dittmar, Gunnar; Glander, Hans-Jürgen; Hoflack, Bernard; Kriegel, Thomas M

    2011-08-01

    Metabolic disorders like diabetes mellitus and obesity may compromise the fertility of men and women. To unveil disease-associated proteomic changes potentially affecting male fertility, the proteomes of sperm cells from type-1 diabetic, type-2 diabetic, non-diabetic obese and clinically healthy individuals were comparatively analyzed by difference gel electrophoresis. The adaptation of a general protein extraction procedure to the solubilization of proteins from sperm cells allowed for the resolution of 3187 fluorescent spots in the difference gel electrophoresis image of the master gel, which contained the entirety of solubilized sperm proteins. Comparison of the pathological and reference proteomes by applying an average abundance ratio setting of 1.6 and a p ≤ 0.05 criterion resulted in the identification of 79 fluorescent spots containing proteins that were present at significantly changed levels in the sperm cells. Biometric evaluation of the fluorescence data followed by mass spectrometric protein identification revealed altered levels of 12, 71, and 13 protein species in the proteomes of the type-1 diabetic, type-2 diabetic, and non-diabetic obese patients, respectively, with considerably enhanced amounts of the same set of one molecular form of semenogelin-1, one form of clusterin, and two forms of lactotransferrin in each group of pathologic samples. Remarkably, β-galactosidase-1-like protein was the only protein that was detected at decreased levels in all three pathologic situations. The former three proteins are part of the eppin (epididymal proteinase inhibitor) protein complex, which is thought to fulfill fertilization-related functions, such as ejaculate sperm protection, motility regulation and gain of competence for acrosome reaction, whereas the putative role of the latter protein to function as a glycosyl hydrolase during sperm maturation remains to be explored at the protein/enzyme level. The strikingly similar differences detected in the

  10. The plastid ribosomal proteins. Identification of all the proteins in the 30 S subunit of an organelle ribosome (chloroplast).

    PubMed

    Yamaguchi, K; von Knoblauch, K; Subramanian, A R

    2000-09-15

    Identification of all the protein components of a plastid (chloroplast) ribosomal 30 S subunit has been achieved, using two-dimensional gel electrophoresis, high performance liquid chromatography purification, N-terminal sequencing, polymerase chain reaction-based screening of cDNA library, nucleotide sequencing, and mass spectrometry (electrospray ionization, matrix-assisted laser desorption/ionization time-of-flight, and reversed-phase HPLC coupled with electrospray ionization mass spectrometry). 25 proteins were identified, of which 21 are orthologues of all Escherichia coli 30 S ribosomal proteins (S1-S21), and 4 are plastid-specific ribosomal proteins (PSRPs) that have no homologues in the mitochondrial, archaebacterial, or cytosolic ribosomal protein sequences in data bases. 12 of the 25 plastid 30 S ribosomal proteins (PRPs) are encoded in the plastid genome, whereas the remaining 13 are encoded by the nuclear genome. Post-translational transit peptide cleavage sites for the maturation of the 13 cytosolically synthesized PRPs, and post-translational N-terminal processing in the maturation of the 12 plastid synthesized PRPs are described. Post-translational modifications in several PRPs were observed: alpha-N-acetylation of S9, N-terminal processings leading to five mature forms of S6 and two mature forms of S10, C-terminal and/or internal modifications in S1, S14, S18, and S19, leading to two distinct forms differing in mass and/or charge (the corresponding modifications are not observed in E. coli). The four PSRPs in spinach plastid 30 S ribosomal subunit (PSRP-1, 26.8 kDa, pI 6.2; PSRP-2, 21.7 kDa, pI 5.0; PSRP-3, 13.8 kDa, pI 4.9; PSRP-4, 5.2 kDa, pI 11.8) comprise 16% (67.6 kDa) of the total protein mass of the 30 S subunit (429.3 kDa). PSRP-1 and PSRP-3 show sequence similarities with hypothetical photosynthetic bacterial proteins, indicating their possible origins in photosynthetic bacteria. We propose the hypothesis that PSRPs form a "plastid

  11. BeStSel: a web server for accurate protein secondary structure prediction and fold recognition from the circular dichroism spectra.

    PubMed

    Micsonai, András; Wien, Frank; Bulyáki, Éva; Kun, Judit; Moussong, Éva; Lee, Young-Ho; Goto, Yuji; Réfrégiers, Matthieu; Kardos, József

    2018-06-11

    Circular dichroism (CD) spectroscopy is a widely used method to study the protein secondary structure. However, for decades, the general opinion was that the correct estimation of β-sheet content is challenging because of the large spectral and structural diversity of β-sheets. Recently, we showed that the orientation and twisting of β-sheets account for the observed spectral diversity, and developed a new method to estimate accurately the secondary structure (PNAS, 112, E3095). BeStSel web server provides the Beta Structure Selection method to analyze the CD spectra recorded by conventional or synchrotron radiation CD equipment. Both normalized and measured data can be uploaded to the server either as a single spectrum or series of spectra. The originality of BeStSel is that it carries out a detailed secondary structure analysis providing information on eight secondary structure components including parallel-β structure and antiparallel β-sheets with three different groups of twist. Based on these, it predicts the protein fold down to the topology/homology level of the CATH protein fold classification. The server also provides a module to analyze the structures deposited in the PDB for BeStSel secondary structure contents in relation to Dictionary of Secondary Structure of Proteins data. The BeStSel server is freely accessible at http://bestsel.elte.hu.

  12. Identification of Open Stomata1-Interacting Proteins Reveals Interactions with Sucrose Non-fermenting1-Related Protein Kinases2 and with Type 2A Protein Phosphatases That Function in Abscisic Acid Responses

    DOE PAGES

    Waadt, Rainer; Manalansan, Bianca; Rauniyar, Navin; ...

    2015-09-04

    The plant hormone abscisic acid (ABA) controls growth and development and regulates plant water status through an established signaling pathway. In the presence of ABA, pyrabactin resistance/regulatory component of ABA receptor proteins inhibit type 2C protein phosphatases (PP2Cs). This, in turn, enables the activation of Sucrose Nonfermenting1-Related Protein Kinases2 (SnRK2). Open Stomata1 (OST1)/SnRK2.6/SRK2E is a major SnRK2-type protein kinase responsible for mediating ABA responses. Arabidopsis (Arabidopsis thaliana) expressing an epitope-tagged OST1 in the recessive ost1-3 mutant background was used for the copurification and identification of OST1-interacting proteins after osmotic stress and ABA treatments. Furthermore, these analyses, which were confirmed using bimolecular fluorescence complementation and coimmunoprecipitation, unexpectedly revealed homo- and heteromerization of OST1 with SnRK2.2, SnRK2.3, OST1, and SnRK2.8. Furthermore, several OST1-complexed proteins were identified as type 2A protein phosphatase (PP2A) subunits and as proteins involved in lipid and galactolipid metabolism. More detailed analyses suggested an interaction network between ABA-activated SnRK2-type protein kinases and several PP2A-type protein phosphatase regulatory subunits. pp2a double mutants exhibited a reduced sensitivity to ABA during seed germination and stomatal closure and an enhanced ABA sensitivity in root growth regulation. Our analyses add PP2A-type protein phosphatases as another class of protein phosphatases to the interaction network of SnRK2-type protein kinases.

  13. Cell-Free Expression and In Situ Immobilization of Parasite Proteins from Clonorchis sinensis for Rapid Identification of Antigenic Candidates

    PubMed Central

    Ju, Jung Won; Kim, Ho-Cheol; Shin, Hyun-Il; Kim, Yu Jung; Kim, Dong-Myung

    2015-01-01

    Progress towards genetic sequencing of human parasites has provided the groundwork for a post-genomic approach to develop novel antigens for the diagnosis and treatment of parasite infections. To fully utilize the genomic data, however, high-throughput methodologies are required for functional analysis of the proteins encoded in the genomic sequences. In this study, we investigated cell-free expression and in situ immobilization of parasite proteins as a novel platform for the discovery of antigenic proteins. PCR-amplified parasite DNA was immobilized on microbeads that were also functionalized to capture synthesized proteins. When the microbeads were incubated in a reaction mixture for cell-free synthesis, proteins expressed from the microbead-immobilized DNA were instantly immobilized on the same microbeads, providing a physical linkage between the genetic information and encoded proteins. This approach of in situ expression and isolation enables streamlined recovery and analysis of cell-free synthesized proteins and also allows facile identification of the genes coding antigenic proteins through direct PCR of the microbead-bound DNA. PMID:26599101

  14. Protein 3D Structure Computed from Evolutionary Sequence Variation

    PubMed Central

    Sheridan, Robert; Hopf, Thomas A.; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris

    2011-01-01

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7–4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein
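
    The "noisy set of observed correlations" mentioned in the abstract above can be illustrated with a small sketch. It computes the mutual information between two columns of a multiple sequence alignment, which is a plain pairwise correlation statistic and not the maximum entropy model the authors describe. The function name and the toy alignment are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    msa is a list of equal-length aligned sequences (strings).
    This is a naive co-variation measure, not a maximum entropy coupling.
    """
    n = len(msa)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 1 co-vary, columns 0 and 3 do not.
toy_msa = ["AKLE", "AKLD", "GRLE", "GRLD"]
covarying = column_mutual_information(toy_msa, 0, 1)
independent = column_mutual_information(toy_msa, 0, 3)
```

    Raw statistics of this kind also pick up indirect (transitive) correlations, which is one reason a global model of the whole alignment may be preferred for separating direct couplings from noise.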

  15. Mapping protein-protein interactions using yeast two-hybrid assays.

    PubMed

    Mehla, Jitender; Caufield, J Harry; Uetz, Peter

    2015-05-01

    Yeast two-hybrid (Y2H) screens are an efficient system for mapping protein-protein interactions and whole interactomes. The screens can be performed using random libraries or collections of defined open reading frames (ORFs) called ORFeomes. This protocol describes both library and array-based Y2H screening, with an emphasis on array-based assays. Array-based Y2H is commonly used to test a number of "prey" proteins for interactions with a single "bait" (target) protein or pool of proteins. The advantage of this approach is the direct identification of interacting protein pairs without further downstream experiments: The identity of the preys is known and does not require further confirmation. In contrast, constructing and screening a random prey library requires identification of individual prey clones and systematic retesting. Retesting is typically performed in an array format. © 2015 Cold Spring Harbor Laboratory Press.

  16. Protein Corona Composition Does Not Accurately Predict Hematocompatibility of Colloidal Gold Nanoparticles

    PubMed Central

    Dobrovolskaia, Marina A.; Neun, Barry W.; Man, Sonny; Ye, Xiaoying; Hansen, Matthew; Patri, Anil K.; Crist, Rachael M.; McNeil, Scott E.

    2014-01-01

    Proteins bound to nanoparticle surfaces are known to affect particle clearance by influencing immune cell uptake and distribution to the organs of the mononuclear phagocytic system. The composition of the protein corona has been described for several types of nanomaterials, but the role of the corona in nanoparticle biocompatibility is not well established. In this study we investigate the role of nanoparticle surface properties (PEGylation) and incubation times on the protein coronas of colloidal gold nanoparticles. While neither incubation time nor PEG molecular weight affected the specific proteins in the protein corona, the total amount of protein binding was governed by the molecular weight of PEG coating. Furthermore, the composition of the protein corona did not correlate with nanoparticle hematocompatibility. Specialized hematological tests should be used to deduce nanoparticle hematotoxicity. PMID:24512761

  17. Top-down proteomic identification of bacterial protein biomarkers and toxins using MALDI-TOF-TOF-MS/MS and post-source decay

    USDA-ARS's Scientific Manuscript database

    Matrix-assisted laser desorption/ionization time-of-flight-time-of-flight mass spectrometry (MALDI-TOF-TOF-MS) has provided new capabilities for the rapid identification of digested and non-digested proteins. The tandem (MS/MS) capability of TOF-TOF instruments allows precursor ion selection/isolation...

  18. Identification of a "glycine-loop"-like coiled structure in the 34 AA Pro,Gly,Met repeat domain of the biomineral-associated protein, PM27.

    PubMed

    Wustman, Brandon A; Santos, Rudolpho; Zhang, Bo; Evans, John Spencer

    2002-12-05

    Fracture resistance in biomineralized structures has been linked to the presence of proteins, some of which possess sequences that are associated with elastic behavior. One such protein superfamily, the Pro,Gly-rich sea urchin intracrystalline spicule matrix proteins, form protein-protein supramolecular assemblies that modify the microstructure and fracture-resistant properties of the calcium carbonate mineral phase within embryonic sea urchin spicules and adult sea urchin spines. In this report, we detail the identification of a repetitive keratin-like "glycine-loop"- or coil-like structure within the 34-AA (AA: amino acid) N-terminal domain, (PGMG)(8)PG, of the spicule matrix protein, PM27. The identification of this repetitive structural motif was accomplished using two capped model peptides: a 9-AA sequence, GPGMGPGMG, and a 34-AA peptide representing the entire motif. Using CD, NMR spectrometry, and molecular dynamics simulated annealing/minimization simulations, we have determined that the 9-AA model peptide adopts a loop-like structure at pH 7.4. The structure of the 34-AA polypeptide resembles a coil structure consisting of repeating loop motifs that do not exhibit long-range ordering. Given that loop structures have been associated with protein elastic behavior and protein motion, it is plausible that the 34-AA Pro,Gly,Met repeat sequence motif in PM27 represents a putative elastic or mobile domain. Copyright 2002 Wiley Periodicals, Inc.

  19. Metabolite identification of triptolide by data-dependent accurate mass spectrometric analysis in combination with online hydrogen/deuterium exchange and multiple data-mining techniques.

    PubMed

    Du, Fuying; Liu, Ting; Liu, Tian; Wang, Yongwei; Wan, Yakun; Xing, Jie

    2011-10-30

    Triptolide (TP), the primary active component of the herbal medicine Tripterygium wilfordii Hook F, has shown promising antileukemic and anti-inflammatory activity. The pharmacokinetic profile of TP indicates an extensive metabolic elimination in vivo; however, its metabolic data is rarely available partly because of the difficulty in identifying it due to the absence of appropriate ultraviolet chromophores in the structure and the presence of endogenous interferences in biological samples. In the present study, the biotransformation of TP was investigated by improved data-dependent accurate mass spectrometric analysis, using an LTQ/Orbitrap hybrid mass spectrometer in conjunction with the online hydrogen (H)/deuterium (D) exchange technique for rapid structural characterization. Accurate full-scan MS and MS/MS data were processed with multiple post-acquisition data-mining techniques, which were complementary and effective in detecting both common and uncommon metabolites from biological matrices. As a result, 38 phase I, 9 phase II and 8 N-acetylcysteine (NAC) metabolites of TP were found in rat urine. Accurate MS/MS data were used to support assignments of metabolite structures, and online H/D exchange experiments provided additional evidence for exchangeable hydrogen atoms in the structure. The results showed the main phase I metabolic pathways of TP are hydroxylation, hydrolysis and desaturation, and the resulting metabolites subsequently undergo phase II processes. The presence of NAC conjugates indicated the capability of TP to form reactive intermediate species. This study also demonstrated the effectiveness of LC/HR-MS(n) in combination with multiple post-acquisition data-mining methods and the online H/D exchange technique for the rapid identification of drug metabolites. Copyright © 2011 John Wiley & Sons, Ltd.

  20. Computational identification of binding energy hot spots in protein-RNA complexes using an ensemble approach.

    PubMed

    Pan, Yuliang; Wang, Zixiang; Zhan, Weihua; Deng, Lei

    2018-05-01

    Identifying RNA-binding residues, especially energetically favored hot spots, can provide valuable clues for understanding the mechanisms and functional importance of protein-RNA interactions. Yet, limited availability of experimentally recognized energy hot spots in protein-RNA crystal structures leads to the difficulties in developing empirical identification approaches. Computational prediction of RNA-binding hot spot residues is still in its infant stage. Here, we describe a computational method, PrabHot (Prediction of protein-RNA binding hot spots), that can effectively detect hot spot residues on protein-RNA binding interfaces using an ensemble of conceptually different machine learning classifiers. Residue interaction network features and new solvent exposure characteristics are combined together and selected for classification with the Boruta algorithm. In particular, two new reference datasets (benchmark and independent) have been generated containing 107 hot spots from 47 known protein-RNA complex structures. In 10-fold cross-validation on the training dataset, PrabHot achieves promising performances with an AUC score of 0.86 and a sensitivity of 0.78, which are significantly better than those of the pioneer RNA-binding hot spot prediction method HotSPRing. We also demonstrate the capability of our proposed method on the independent test dataset and gain a competitive advantage as a result. The PrabHot webserver is freely available at http://denglab.org/PrabHot/. Contact: leideng@csu.edu.cn. Supplementary data are available at Bioinformatics online.
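
    The two metrics quoted in the abstract above, sensitivity and AUC, can be sketched as follows. This is a minimal illustration of the standard definitions on made-up labels and scores; it does not reproduce PrabHot's classifiers or data, and all names and values below are hypothetical.

```python
def sensitivity(y_true, y_pred):
    """True positive rate: TP / (TP + FN) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive scores higher than a
    random negative, counting ties as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical residues: 1 = hot spot, 0 = non-hot spot.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

toy_sensitivity = sensitivity(labels, predicted)
toy_auc = auc(labels, scores)
```

    Sensitivity depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, so the two numbers answer different questions.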

  1. Highly accurate prediction of protein self-interactions by incorporating the average block and PSSM information into the general PseAAC.

    PubMed

    Zhai, Jing-Xuan; Cao, Tian-Jie; An, Ji-Yong; Bian, Yong-Tao

    2017-11-07

    Determining whether proteins can interact with their partners is a challenging task for fundamental research. Protein self-interaction (SIP) is a special case of PPIs, which plays a key role in the regulation of cellular functions. Due to the limitations of experimental self-interaction identification, it is very important to develop an effective biological tool for predicting SIPs based on protein sequences. In the study, we developed a novel computational method called RVM-AB that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) for detecting SIPs from protein sequences. Firstly, Average Blocks (AB) feature extraction method is employed to represent protein sequences on a Position Specific Scoring Matrix (PSSM). Secondly, Principal Component Analysis (PCA) method is used to reduce the dimension of AB vector for reducing the influence of noise. Then, by employing the Relevance Vector Machine (RVM) algorithm, the performance of RVM-AB is assessed and compared with the state-of-the-art support vector machine (SVM) classifier and other existing methods on yeast and human datasets respectively. Using the fivefold test experiment, RVM-AB model achieved very high accuracies of 93.01% and 97.72% on yeast and human datasets respectively, which are significantly better than the method based on SVM classifier and other previous methods. The experimental results proved that the RVM-AB prediction model is efficient and robust. It can be an automatic decision support tool for detecting SIPs. For facilitating extensive studies for future proteomics research, the RVMAB server is freely available for academic use at http://219.219.62.123:8888/SIP_AB. Copyright © 2017 Elsevier Ltd. All rights reserved.
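
    The Average Blocks step described in the abstract above can be sketched roughly as follows, assuming the common formulation in which the rows of the PSSM are split into contiguous blocks and each column is averaged within each block. The abstract does not state the number of blocks, so the default of 20 is an assumption, as are the function name and the toy matrix. The PCA and RVM stages are not shown.

```python
def average_blocks(pssm, n_blocks=20):
    """Fixed-length feature vector from a variable-length PSSM.

    pssm is a list of rows (one per residue), each a list of column scores.
    The rows are split into n_blocks contiguous blocks (n_blocks=20 is an
    assumed default) and every column is averaged within each block.
    """
    length = len(pssm)
    if length < n_blocks:
        raise ValueError("sequence is shorter than the number of blocks")
    n_cols = len(pssm[0])
    features = []
    for b in range(n_blocks):
        start = b * length // n_blocks
        end = (b + 1) * length // n_blocks
        block = pssm[start:end]
        for c in range(n_cols):
            features.append(sum(row[c] for row in block) / len(block))
    return features


# Hypothetical 4-residue, 2-column matrix split into 2 blocks.
toy_pssm = [[1, 2], [3, 4], [5, 6], [7, 8]]
toy_features = average_blocks(toy_pssm, n_blocks=2)
```

    With a standard 20-column PSSM and the assumed 20 blocks, this yields 400 values per protein regardless of sequence length, which is what makes a subsequent dimension reduction step applicable.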

  2. IEF-2DE Analysis and Protein Identification

    USDA-ARS's Scientific Manuscript database

    Isoelectric focusing followed by SDS-PAGE (IEF-2DE) separates proteins in a two-dimensional matrix of protein pI (Protein Isoelectric Point) and molecular weight (MW). The technique is particularly useful to distinguish protein isoforms (Radwan et al., 2012) and proteins that contain post-translatio...

  3. Oligonucleotide (GTG)5 as a marker for Mycobacterium tuberculosis strain identification.

    PubMed Central

    Wiid, I J; Werely, C; Beyers, N; Donald, P; van Helden, P D

    1994-01-01

    Culture of Mycobacterium tuberculosis provides no information on the identity of a strain or the distribution of such a strain in the community. Strain identification of M. tuberculosis can help to address important epidemiological questions, e.g., the origin of an infection in a patient's household or community, whether reactivation of infection is endogenous or exogenous in origin, and the spread and early detection of organisms with acquired antibiotic resistance. To research this problem, strain identification must be reliable and accurate. Although genetic identification techniques already exist, it is valuable to have genetic identification techniques based on a number of genetic markers to improve the accurate identification of M. tuberculosis strains. We show that oligonucleotide (GTG)5 can be successfully applied to the identification of M. tuberculosis strains. This technique may be particularly useful in cases in which M. tuberculosis strains have few or no insertion elements (e.g., IS6110) or in identifying other strains of mycobacteria when informative probes are lacking. PMID:7914207

  4. Improved Identification of Membrane Proteins by MALDI-TOF MS/MS Using Vacuum Sublimated Matrix Spots on an Ultraphobic Chip Surface

    PubMed Central

    Poetsch, Ansgar; Schlüsener, Daniela; Florizone, Christine; Eltis, Lindsay; Menzel, Christoph; Rögner, Matthias; Steinert, Kerstin; Roth, Udo

    2008-01-01

    Integral membrane proteins are notoriously difficult to identify and analyze by mass spectrometry because of their low abundance and limited number of trypsin cleavage sites. Our strategy to address this problem is based on a novel technology for MALDI-MS peptide sample preparation that increases the success rate of membrane protein identification by increasing the sensitivity of the MALDI-TOF system. For this, we used sample plates with predeposited matrix spots of CHCA crystals prepared by vacuum sublimation onto an extremely low wettable (ultraphobic) surface. In experiments using standard peptides, an up to 10-fold gain of sensitivity was found for on-chip preparations compared with classical dried-droplet preparations on a steel target. In order to assess the performance of the chips with membrane proteins, three model proteins (bacteriorhodopsin, subunit IV(a) of ATP synthase, and the cp47 subunit from photosystem II) were analyzed. To mimic realistic analysis conditions, purified proteins were separated by SDS-PAGE and digested with trypsin. The digest MALDI samples were prepared either by dried-droplet technique on steel plates using CHCA as matrix, or applied directly onto the matrix spots of the chip surface. Significantly higher signal-to-noise ratios were observed for all of the spectra resulting from on-chip preparations of different peptides. In a second series of experiments, the membrane proteome of Rhodococcus jostii RHA1 was investigated by AIEC/SDS-PAGE in combination with MALDI-TOF MS/MS. As in the first experiments, Coomassie-stained SDS-PAGE bands were digested and the two different preparation methods were compared. For preparations on the Mass·Spec·Turbo Chip, 43 of 60 proteins were identified, whereas only 30 proteins were reliably identified after classical sample preparation. Comparison of the obtained Mascot scores, which reflect the confidence level of the protein identifications, revealed that for 70% of the identified proteins

  5. HIGH-THROUGHPUT IDENTIFICATION OF CATALYTIC REDOX-ACTIVE CYSTEINE RESIDUES

    EPA Science Inventory

    Cysteine (Cys) residues often play critical roles in proteins; however, identification of their specific functions has been limited to case-by-case experimental approaches. We developed a procedure for high-throughput identification of catalytic redox-active Cys in proteins by se...

  6. Identification of Surface Protein Biomarkers of Listeria monocytogenes via Bioinformatics and Antibody-Based Protein Detection Tools

    PubMed Central

    Zhang, Cathy X. Y.; Brooks, Brian W.; Huang, Hongsheng; Pagotto, Franco

    2016-01-01

    ABSTRACT The Gram-positive bacterium Listeria monocytogenes causes a significant percentage of the fatalities among foodborne illnesses in humans. Surface proteins specifically expressed in a wide range of L. monocytogenes serotypes under selective enrichment culture conditions could serve as potential biomarkers for detection and isolation of this pathogen via antibody-based methods. Our study aimed to identify such biomarkers. Interrogation of the L. monocytogenes serotype 4b strain F2365 genome identified 130 putative or known surface proteins. The homologues of four surface proteins, LMOf2365_0578, LMOf2365_0581, LMOf2365_0639, and LMOf2365_2117, were assessed as biomarkers due to the presence of conserved regions among strains of L. monocytogenes which are variable among other Listeria species. Rabbit polyclonal antibodies against the four recombinant proteins revealed the expression of only LMOf2365_0639 on the surface of serotype 4b strain LI0521 cells despite PCR detection of mRNA transcripts for all four proteins in the organism. Three of 35 monoclonal antibodies (MAbs) to LMOf2365_0639, MAbs M3643, M3644, and M3651, specifically recognized 42 (91.3%) of 46 L. monocytogenes lineage I and II isolates grown in nonselective brain heart infusion medium. While M3644 and M3651 reacted with 14 to 15 (82.4 to 88.2%) of 17 L. monocytogenes lineage I and II isolates, M3643 reacted with 22 (91.7%) of 24 lineage I, II, and III isolates grown in selective enrichment media (UVM1, modified Fraser, Palcam, and UVM2 media). The three MAbs exhibited only weak reactivities (the optical densities at 414 nm were close to the cutoff value) to some other Listeria species grown in selective enrichment media. Collectively, the data indicate the potential of LMOf2365_0639 as a surface biomarker of L. monocytogenes, with the aid of specific MAbs, for pathogen detection, identification, and isolation in clinical, environmental, and food samples. IMPORTANCE L. monocytogenes is
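
    The reactivity percentages quoted in the abstract above follow directly from the isolate counts. A short check, using a hypothetical helper name, reproduces them:

```python
def percent(hits, total):
    """Percentage of reactive isolates, rounded to one decimal place."""
    return round(100 * hits / total, 1)


# Counts taken from the abstract (reactive isolates, isolates tested).
nonselective_medium = percent(42, 46)
enrichment_low = percent(14, 17)
enrichment_high = percent(15, 17)
m3643_all_lineages = percent(22, 24)
```

    The four values come out as 91.3, 82.4, 88.2 and 91.7, matching the figures reported for the three MAbs.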

  7. Identification of Open Stomata1-Interacting Proteins Reveals Interactions with Sucrose Non-fermenting1-Related Protein Kinases2 and with Type 2A Protein Phosphatases That Function in Abscisic Acid Responses

    PubMed Central

    Waadt, Rainer; Manalansan, Bianca; Rauniyar, Navin; Munemasa, Shintaro; Booker, Matthew A.; Brandt, Benjamin; Waadt, Christian; Nusinow, Dmitri A.; Kay, Steve A.; Kunz, Hans-Henning; Schumacher, Karin; DeLong, Alison; Yates, John R.; Schroeder, Julian I.

    2015-01-01

    The plant hormone abscisic acid (ABA) controls growth and development and regulates plant water status through an established signaling pathway. In the presence of ABA, pyrabactin resistance/regulatory component of ABA receptor proteins inhibit type 2C protein phosphatases (PP2Cs). This, in turn, enables the activation of Sucrose Nonfermenting1-Related Protein Kinases2 (SnRK2). Open Stomata1 (OST1)/SnRK2.6/SRK2E is a major SnRK2-type protein kinase responsible for mediating ABA responses. Arabidopsis (Arabidopsis thaliana) expressing an epitope-tagged OST1 in the recessive ost1-3 mutant background was used for the copurification and identification of OST1-interacting proteins after osmotic stress and ABA treatments. These analyses, which were confirmed using bimolecular fluorescence complementation and coimmunoprecipitation, unexpectedly revealed homo- and heteromerization of OST1 with SnRK2.2, SnRK2.3, OST1, and SnRK2.8. Furthermore, several OST1-complexed proteins were identified as type 2A protein phosphatase (PP2A) subunits and as proteins involved in lipid and galactolipid metabolism. More detailed analyses suggested an interaction network between ABA-activated SnRK2-type protein kinases and several PP2A-type protein phosphatase regulatory subunits. pp2a double mutants exhibited a reduced sensitivity to ABA during seed germination and stomatal closure and an enhanced ABA sensitivity in root growth regulation. These analyses add PP2A-type protein phosphatases as another class of protein phosphatases to the interaction network of SnRK2-type protein kinases. PMID:26175513

  8. Identification of a multi-protein reductive dehalogenase complex in Dehalococcoides mccartyi strain CBDB1 suggests a protein-dependent respiratory electron transport chain obviating quinone involvement.

    PubMed

    Kublik, Anja; Deobald, Darja; Hartwig, Stefanie; Schiffmann, Christian L; Andrades, Adarelys; von Bergen, Martin; Sawers, R Gary; Adrian, Lorenz

    2016-09-01

    Dehalococcoides mccartyi strain CBDB1 is an obligate organohalide-respiring bacterium using only hydrogen as electron donor and halogenated organics as electron acceptor. Here, we studied proteins involved in the respiratory chain under non-denaturing conditions. Using blue native gel electrophoresis (BN-PAGE), gel filtration and ultrafiltration an active dehalogenating protein complex with a molecular mass of 250-270 kDa was identified. The active subunit of reductive dehalogenase (RdhA) colocalised with a complex iron-sulfur molybdoenzyme (CISM) subunit (CbdbA195) and an iron-sulfur cluster containing subunit (CbdbA131) of the hydrogen uptake hydrogenase (Hup). No colocalisation between the catalytically active subunits of hydrogenase and reductive dehalogenase was found. By two-dimensional BN/SDS-PAGE the stability of the complex towards detergents was assessed, demonstrating stepwise disintegration with increasing detergent concentrations. Chemical cross-linking confirmed the presence of a higher molecular mass reductive dehalogenase protein complex composed of RdhA, CISM I and Hup hydrogenase and proved to be a potential tool for stabilising protein-protein interactions of the dehalogenating complex prior to membrane solubilisation. Taken together, the identification of the respiratory dehalogenase protein complex and the absence of indications for quinone participation in the respiration suggest a quinone-independent protein-based respiratory electron transfer chain in D. mccartyi. © 2015 Society for Applied Microbiology and John Wiley & Sons Ltd.

  9. Rapid and Accurate Molecular Identification of the Emerging Multidrug-Resistant Pathogen Candida auris

    PubMed Central

    Zhao, Yanan; Lockhart, Shawn R.; Berrio, Indira

    2017-01-01

    ABSTRACT Candida auris is an emerging multidrug-resistant fungal pathogen causing nosocomial and invasive infections associated with high mortality. C. auris is commonly misidentified as several different yeast species by commercially available phenotypic identification platforms. Thus, there is an urgent need for a reliable diagnostic method. In this paper, we present fast, robust, easy-to-perform and interpret PCR and real-time PCR assays to identify C. auris and related species: Candida duobushaemulonii, Candida haemulonii, and Candida lusitaniae. Targeting rDNA region nucleotide sequences, primers specific for C. auris only or C. auris and related species were designed. A panel of 140 clinical fungal isolates was used in both PCR and real-time PCR assays followed by electrophoresis or melting temperature analysis, respectively. The identification results from the assays were 100% concordant with DNA sequencing results. These molecular assays overcome the deficiencies of existing phenotypic tests to identify C. auris and related species. PMID:28539346

  10. Rapid and Accurate Molecular Identification of the Emerging Multidrug-Resistant Pathogen Candida auris.

    PubMed

    Kordalewska, Milena; Zhao, Yanan; Lockhart, Shawn R; Chowdhary, Anuradha; Berrio, Indira; Perlin, David S

    2017-08-01

    Candida auris is an emerging multidrug-resistant fungal pathogen causing nosocomial and invasive infections associated with high mortality. C. auris is commonly misidentified as several different yeast species by commercially available phenotypic identification platforms. Thus, there is an urgent need for a reliable diagnostic method. In this paper, we present fast, robust, easy-to-perform and interpret PCR and real-time PCR assays to identify C. auris and related species: Candida duobushaemulonii, Candida haemulonii, and Candida lusitaniae. Targeting rDNA region nucleotide sequences, primers specific for C. auris only or C. auris and related species were designed. A panel of 140 clinical fungal isolates was used in both PCR and real-time PCR assays followed by electrophoresis or melting temperature analysis, respectively. The identification results from the assays were 100% concordant with DNA sequencing results. These molecular assays overcome the deficiencies of existing phenotypic tests to identify C. auris and related species. Copyright © 2017 Kordalewska et al.

  11. Ligand screening system using fusion proteins of G protein-coupled receptors with G protein alpha subunits.

    PubMed

    Suga, Hinako; Haga, Tatsuya

    2007-01-01

    G protein-coupled receptors (GPCRs) constitute one of the largest families of genes in the human genome, and are the largest targets for drug development. Although a large number of GPCR genes have recently been identified, ligands have not yet been identified for many of them. Various assay systems have been employed to identify ligands for orphan GPCRs, but there is still no simple and general method to screen for ligands of such GPCRs, particularly of G(i)-coupled receptors. We have examined whether fusion proteins of GPCRs with G protein alpha subunit (Galpha) could be utilized for ligand screening and showed that the fusion proteins provide an effective method for the purpose. This article focuses on the following: (1) characterization of GPCR genes and GPCRs, (2) identification of ligands for orphan GPCRs, (3) characterization of GPCR-Galpha fusion proteins, and (4) identification of ligands for orphan GPCRs using GPCR-Galpha fusion proteins.

  12. MAGGIE Component 1: Identification and Purification of Native and Recombinant Multiprotein Complexes and Modified Proteins from Pyrococcus furiosus

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Adams, Michael W. W.

    2014-01-07

    Virtually all cellular processes are carried out by dynamic molecular assemblies or multiprotein complexes (PCs), the composition of which is largely unknown. Structural genomics efforts have demonstrated that less than 25% of the genes in a given prokaryotic genome will yield stable, soluble proteins when expressed using a one-ORF-at-a-time approach. We proposed that much of the remaining 75% of the genes encode proteins that are part of multiprotein complexes or are modified post-translationally, for example, with metals. The problem is that PCs and metalloproteins (MPs) cannot be accurately predicted on a genome-wide scale. The only solution to this dilemma is to experimentally determine PCs and MPs in biomass of a model organism and to develop analytical tools that can then be applied to the biomass of any other organism. In other words, organisms themselves must be analyzed to identify their PCs and MPs: “native proteomes” must be determined. This information can then be utilized to design multiple ORF expression systems to produce recombinant forms of PCs and MPs. Moreover, the information and utility of this approach can be enhanced by using a hyperthermophile, one that grows optimally at 100°C, as a model organism. By analyzing the native proteome at close to 100 °C below the optimum growth temperature, we will trap reversible and dynamic complexes, thereby enabling their identification, purification, and subsequent characterization. The model organism for the current study is Pyrococcus furiosus, a hyperthermophilic archaeon that grows optimally at 100°C. It is grown up to 600-liter scale and kg quantities of biomass are available. In this project we identified native PCs and MPs using P. furiosus biomass (with MS/MS analyses to identify proteins by component 4). In addition, we provided samples of abundant native PCs and MPs for structural characterization (using SAXS by component 5). We also designed and evaluated generic bioinformatics and

  13. Experience with a mouse intranasal test for the predictive identification of respiratory sensitization potential of proteins.

    PubMed

    Blaikie, L; Basketter, D A

    1999-08-01

    The predictive identification of respiratory allergenic potential is an important primary step in the safety evaluation of (novel) proteins, such as the enzymes used in a range of consumer laundry products. In the past this has been achieved by assessing the relative ability of proteins to give rise to the formation of anaphylactic antibody in the guinea pig. Recently, an alternative model has been proposed which assesses the formation of specific IgG1 antibody in a mouse intranasal test (MINT), the assumption being that specific IgG1 antibody is a surrogate for anaphylactic antibody in the mouse. This procedure has undergone successful initial intralaboratory and interlaboratory assessment. In the present work, the MINT has been evaluated in a more thorough intralaboratory study using eight enzymes plus ovalbumin. While the data generated with a reference enzyme protein, Alcalase, showed good reproducibility, results with the remaining eight proteins led to estimates of their relative antigenic or sensitization potential, several of which were at variance with those derived from the guinea pig/human experience. In consequence, it is concluded that the MINT requires substantial further investigation before it can be adopted as a model for the assessment of the relative ability of proteins to behave as respiratory allergens.

  14. ProtPhylo: identification of protein-phenotype and protein-protein functional associations via phylogenetic profiling.

    PubMed

    Cheng, Yiming; Perocchi, Fabiana

    2015-07-01

    ProtPhylo is a web-based tool to identify proteins that are functionally linked to either a phenotype or a protein of interest based on co-evolution. ProtPhylo infers functional associations by comparing protein phylogenetic profiles (co-occurrence patterns of orthology relationships) for more than 9.7 million non-redundant protein sequences from all three domains of life. Users can query any of 2048 fully sequenced organisms, including 1678 bacteria, 255 eukaryotes and 115 archaea. In addition, they can tailor ProtPhylo to a particular kind of biological question by choosing among four main orthology inference methods based either on pair-wise sequence comparisons (One-way Best Hits and Best Reciprocal Hits) or clustering of orthologous proteins across multiple species (OrthoMCL and eggNOG). Next, ProtPhylo ranks phylogenetic neighbors of query proteins or phenotypic properties using the Hamming distance as a measure of similarity between pairs of phylogenetic profiles. Candidate hits can be easily and flexibly prioritized by complementary clues on subcellular localization, known protein-protein interactions, membrane spanning regions and protein domains. The resulting protein list can be quickly exported into a csv text file for further analyses. ProtPhylo is freely available at http://www.protphylo.org. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
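
    The ranking step described above can be illustrated with a minimal sketch. It assumes phylogenetic profiles are encoded as equal-length presence/absence strings ('1' = ortholog present in an organism, '0' = absent); the function names and toy profiles are hypothetical and are not taken from ProtPhylo itself.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same set of organisms")
    return sum(x != y for x, y in zip(a, b))


def rank_neighbors(query: str, profiles: Dict[str, str]) -> List[Tuple[str, int]]:
    """Return (protein, distance) pairs, most similar profile first."""
    scored = [(name, hamming_distance(query, p)) for name, p in profiles.items()]
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Toy profiles over six hypothetical organisms.
toy_profiles = {
    "protA": "110101",
    "protB": "110100",
    "protC": "001010",
}
ranking = rank_neighbors("110101", toy_profiles)
print(ranking)
```

    A smaller distance means the two proteins are present and absent in largely the same organisms, which is the co-evolution signal the abstract refers to.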

  15. A Proteomic Approach for the Identification of Up-Regulated Proteins Involved in the Metabolic Process of the Leiomyoma.

    PubMed

    Ura, Blendi; Scrimin, Federica; Arrigoni, Giorgio; Franchin, Cinzia; Monasta, Lorenzo; Ricci, Giuseppe

    2016-04-09

    Uterine leiomyoma is the most common benign smooth muscle cell tumor of the uterus. Proteomics is a powerful tool for the analysis of complex mixtures of proteins. In our study, we focused on proteins that were upregulated in the leiomyoma compared to the myometrium. Paired samples of eight leiomyomas and adjacent myometrium were obtained and submitted to two-dimensional gel electrophoresis (2-DE) and mass spectrometry for protein identification and to Western blotting for 2-DE data validation. The comparison between the patterns revealed 24 significantly upregulated (p < 0.05) protein spots, 12 of which were found to be associated with the metabolic processes of the leiomyoma and not with the normal myometrium. The overexpression of seven proteins involved in the metabolic processes of the leiomyoma was further validated by Western blotting and 2D Western blotting. Four of these proteins have never been associated with the leiomyoma before. The 2-DE approach coupled with mass spectrometry, which is among the methods of choice for comparative proteomic studies, identified a number of proteins overexpressed in the leiomyoma and involved in several biological processes, including metabolic processes. A better understanding of the mechanism underlying the overexpression of these proteins may be important for therapeutic purposes.

  16. A Proteomic Approach for the Identification of Up-Regulated Proteins Involved in the Metabolic Process of the Leiomyoma

    PubMed Central

    Ura, Blendi; Scrimin, Federica; Arrigoni, Giorgio; Franchin, Cinzia; Monasta, Lorenzo; Ricci, Giuseppe

    2016-01-01

    Uterine leiomyoma is the most common benign smooth muscle cell tumor of the uterus. Proteomics is a powerful tool for the analysis of complex mixtures of proteins. In our study, we focused on proteins that were upregulated in the leiomyoma compared to the myometrium. Paired samples of eight leiomyomas and adjacent myometrium were obtained and submitted to two-dimensional gel electrophoresis (2-DE) and mass spectrometry for protein identification and to Western blotting for 2-DE data validation. The comparison between the patterns revealed 24 significantly upregulated (p < 0.05) protein spots, 12 of which were found to be associated with the metabolic processes of the leiomyoma and not with the normal myometrium. The overexpression of seven proteins involved in the metabolic processes of the leiomyoma was further validated by Western blotting and 2D Western blotting. Four of these proteins have never been associated with the leiomyoma before. The 2-DE approach coupled with mass spectrometry, which is among the methods of choice for comparative proteomic studies, identified a number of proteins overexpressed in the leiomyoma and involved in several biological processes, including metabolic processes. A better understanding of the mechanism underlying the overexpression of these proteins may be important for therapeutic purposes. PMID:27070597

  17. Identification of human microRNA targets from isolated argonaute protein complexes.

    PubMed

    Beitzinger, Michaela; Peters, Lasse; Zhu, Jia Yun; Kremmer, Elisabeth; Meister, Gunter

    2007-06-01

    MicroRNAs (miRNAs) constitute a class of small non-coding RNAs that regulate gene expression on the level of translation and/or mRNA stability. Mammalian miRNAs associate with members of the Argonaute (Ago) protein family and bind to partially complementary sequences in the 3' untranslated region (UTR) of specific target mRNAs. Computer algorithms based on factors such as free binding energy or sequence conservation have been used to predict miRNA target mRNAs. Based on such predictions, up to one third of all mammalian mRNAs seem to be under miRNA regulation. However, due to the low degree of complementarity between the miRNA and its target, such computer programs are often imprecise and therefore not very reliable. Here we report the first biochemical identification approach of miRNA targets from human cells. Using highly specific monoclonal antibodies against members of the Ago protein family, we co-immunoprecipitate Ago-bound mRNAs and identify them by cloning. Interestingly, most of the identified targets are also predicted by different computer programs. Moreover, we randomly analyzed six different target candidates and were able to experimentally validate five as miRNA targets. Our data clearly indicate that miRNA targets can be experimentally identified from Ago complexes and therefore provide a new tool to directly analyze miRNA function.

  18. Identification of a key structural element for protein folding within beta-hairpin turns.

    PubMed

    Kim, Jaewon; Brych, Stephen R; Lee, Jihun; Logan, Timothy M; Blaber, Michael

    2003-05-09

    Specific residues in a polypeptide may be key contributors to the stability and foldability of the unique native structure. Identification and prediction of such residues is, therefore, an important area of investigation in solving the protein folding problem. Atypical main-chain conformations can help identify strains within a folded protein, and by inference, positions where unique amino acids may have a naturally high frequency of occurrence due to favorable contributions to stability and folding. Non-Gly residues located near the left-handed alpha-helical region (L-alpha) of the Ramachandran plot are a potential indicator of structural strain. Although many investigators have studied mutations at such positions, no consistent energetic or kinetic contributions to stability or folding have been elucidated. Here we report a study of the effects of Gly, Ala and Asn substitutions found within the L-alpha region at a characteristic position in defined beta-hairpin turns within human acidic fibroblast growth factor, and demonstrate consistent effects upon stability and folding kinetics. The thermodynamic and kinetic data are compared to available data for similar mutations in other proteins, with excellent agreement. The results have identified that Gly at the i+3 position within a subset of beta-hairpin turns is a key contributor towards increasing the rate of folding to the native state of the polypeptide while leaving the rate of unfolding largely unchanged.

  19. Biomarker Candidates of Chlamydophila pneumoniae Proteins and Protein Fragments Identified by Affinity-Proteomics Using FTICR-MS and LC-MS/MS

    NASA Astrophysics Data System (ADS)

    Susnea, Iuliana; Bunk, Sebastian; Wendel, Albrecht; Hermann, Corinna; Przybylski, Michael

    2011-04-01

    We report here an affinity-proteomics approach that combines 2D-gel electrophoresis and immunoblotting with high performance mass spectrometry to the identification of both full length protein antigens and antigenic fragments of Chlamydophila pneumoniae (C. pneumoniae). The present affinity-mass spectrometry approach effectively utilized high resolution FTICR mass spectrometry and LC-tandem-MS for protein identification, and enabled the identification of several new highly antigenic C. pneumoniae proteins that were not hitherto reported or previously detected only in other Chlamydia species, such as Chlamydia trachomatis. Moreover, high resolution affinity-MS provided the identification of several neo-antigenic protein fragments containing N- and C-terminal, and central domains such as fragments of the membrane protein Pmp21 and the secreted chlamydial proteasome-like factor (Cpaf), representing specific biomarker candidates.

  20. An accurate proteomic quantification method: fluorescence labeling absolute quantification (FLAQ) using multidimensional liquid chromatography and tandem mass spectrometry.

    PubMed

    Liu, Junyan; Liu, Yang; Gao, Mingxia; Zhang, Xiangmin

    2012-08-01

    A facile proteomic quantification method, fluorescent labeling absolute quantification (FLAQ), was developed. Instead of using MS for quantification, the FLAQ method is a chromatography-based quantification in combination with MS for identification. Multidimensional liquid chromatography (MDLC) with laser-induced fluorescence (LIF) detection with high accuracy and tandem MS system were employed for FLAQ. Several requirements should be met for fluorescent labeling in MS identification: labeling completeness, minimum side-reactions, simple MS spectra, and no extra tandem MS fragmentations for structure elucidations. A fluorescence dye, 5-iodoacetamidofluorescein, was finally chosen to label proteins on all cysteine residues. The fluorescent dye was compatible with the process of the trypsin digestion and MALDI MS identification. Quantitative labeling was achieved with optimization of reacting conditions. A synthesized peptide and model proteins, BSA (35 cysteines) and OVA (five cysteines), were used for verifying the completeness of labeling. Proteins were separated through MDLC and quantified based on fluorescent intensities, followed by MS identification. High accuracy (RSD% < 1.58) and wide linearity of quantification (1-10^5) were achieved by LIF detection. The limit of quantitation for the model protein was as low as 0.34 amol. Parts of proteins in human liver proteome were quantified and demonstrated using FLAQ. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Biosynthetically directed fractional 13C labeling facilitates identification of Phe and Tyr aromatic signals in proteins.

    PubMed

    Jacob, Jaison; Louis, John M; Nesheiwat, Issa; Torchia, Dennis A

    2002-11-01

    Analysis of 2D [(13)C,(1)H]-HSQC spectra of biosynthetic fractionally (13)C labeled proteins is a reliable, straightforward means to obtain stereospecific assignments of Val and Leu methyl sites in proteins. Herein we show that the same fractionally labeled protein sample facilitates observation and identification of Phe and Tyr aromatic signals. This is the case, in part, because the fractional (13)C labeling yields aromatic rings in which some of the (13)C-(13)C J-couplings, present in uniformly labeled samples, are absent. Also, the number of homonuclear J-coupling partners differs for the delta-, epsilon- and zeta-carbons. This enabled us to vary their signal intensities in distinctly different ways by appropriately setting the (13)C constant-time period in 2D [(13)C,(1)H]-HSQC spectra. We illustrate the application of this approach to an 18 kDa protein, c-VIAF, a modulator of apoptosis. In addition, we show that cancellation of the aromatic (13)C CSA and (13)C-(1)H dipolar interactions can be fruitfully utilized in the case of the fractionally labeled sample to obtain high resolution (13)C constant-time spectra with good sensitivity.

  2. ContaMiner and ContaBase: a webserver and database for early identification of unwantedly crystallized protein contaminants

    PubMed Central

    Hungler, Arnaud; Momin, Afaque; Diederichs, Kay; Arold, Stefan, T.

    2016-01-01

    Solving the phase problem in protein X-ray crystallography relies heavily on the identity of the crystallized protein, especially when molecular replacement (MR) methods are used. Yet, it is not uncommon that a contaminant crystallizes instead of the protein of interest. Such contaminants may be proteins from the expression host organism, protein fusion tags or proteins added during the purification steps. Many contaminants co-purify easily, crystallize and give good diffraction data. Identification of contaminant crystals may take time, since the presence of the contaminant is unexpected and its identity unknown. A webserver (ContaMiner) and a contaminant database (ContaBase) have been established, to allow fast MR-based screening of crystallographic data against currently 62 known contaminants. The web-based ContaMiner (available at http://strube.cbrc.kaust.edu.sa/contaminer/) currently produces results in 5 min to 4 h. The program is also available in a github repository and can be installed locally. ContaMiner enables screening of novel crystals at synchrotron beamlines, and it would be valuable as a routine safety check for ‘crystallization and preliminary X-ray analysis’ publications. Thus, in addition to potentially saving X-ray crystallographers much time and effort, ContaMiner might considerably lower the risk of publishing erroneous data. PMID:27980519

  3. Identification and application of self-binding zipper-like sequences in SARS-CoV spike protein.

    PubMed

    Zhang, Si Min; Liao, Ying; Neo, Tuan Ling; Lu, Yanning; Liu, Ding Xiang; Vahlne, Anders; Tam, James P

    2018-05-22

    Self-binding peptides containing zipper-like sequences, such as the Leu/Ile zipper sequence within the coiled coil regions of proteins and the cross-β spine steric zippers within the amyloid-like fibrils, could bind to the protein-of-origin through homophilic sequence-specific zipper motifs. These self-binding sequences represent opportunities for the development of biochemical tools and/or therapeutics. Here, we report on the identification of a putative self-binding β-zipper-forming peptide within the severe acute respiratory syndrome-associated coronavirus spike (S) protein and its application in viral detection. Peptide array scanning of overlapping peptides covering the entire length of S protein identified 34 putative self-binding peptides of six clusters, five of which contained octapeptide core consensus sequences. The Cluster I consensus octapeptide sequence GINITNFR was predicted by Eisenberg's 3D profile method to have high amyloid-like fibrillation potential through steric β-zipper formation. Peptide C6 containing the Cluster I consensus sequence was shown to oligomerize and form amyloid-like fibrils. Taking advantage of this, C6 was further applied to detect the S protein expression in vitro by fluorescence staining. Meanwhile, the coiled-coil-forming Leu/Ile heptad repeat sequences within the S protein were under-represented during peptide array scanning, consistent with the observation that long peptide lengths are required to attain high helix-mediated interaction avidity. The data suggest that short β-zipper-like self-binding peptides within the S protein could be identified through combining the peptide scanning and predictive methods, and could be exploited as biochemical detection reagents for viral infection. Copyright © 2018. Published by Elsevier Ltd.
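
    As a rough illustration of how an overlapping-peptide scan tiles a protein, the sketch below slides a fixed-length window along a sequence. The abstract does not state the peptide length or offset used, so the values here, the toy sequence (which is not the S protein) and the function name are all hypothetical; only the octapeptide GINITNFR comes from the record above.

```python
from typing import List, Tuple


def overlapping_peptides(sequence: str, length: int, step: int) -> List[Tuple[int, str]]:
    """Return (start index, peptide) pairs for windows of `length` taken every `step` residues."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    last_start = len(sequence) - length
    return [(i, sequence[i:i + length]) for i in range(0, last_start + 1, step)]


# Made-up 18-residue sequence embedding the Cluster I octapeptide.
toy_sequence = "MKTAGINITNFRAILTAF"
peptides = overlapping_peptides(toy_sequence, length=10, step=2)  # assumed window and offset

# Peptides that fully contain the consensus octapeptide.
hits = [(start, pep) for start, pep in peptides if "GINITNFR" in pep]
print(hits)
```

    Because neighbouring windows overlap, a short core motif is usually covered by more than one peptide, which is what lets hits be grouped into clusters with a shared consensus.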

  4. Mitochondrial and nuclear localization of a novel pea thioredoxin: identification of its mitochondrial target proteins.

    PubMed

    Martí, María C; Olmos, Enrique; Calvete, Juan J; Díaz, Isabel; Barranco-Medina, Sergio; Whelan, James; Lázaro, Juan J; Sevilla, Francisca; Jiménez, Ana

    2009-06-01

    Plants contain several genes encoding thioredoxins (Trxs), small proteins involved in the regulation of the activity of many enzymes through dithiol-disulfide exchange. In addition to chloroplastic and cytoplasmic Trx systems, plant mitochondria contain a reduced nicotinamide adenine dinucleotide phosphate-dependent Trx reductase and a specific Trx o, and to date, there have been no reports of a gene encoding a plant nuclear Trx. We report here the presence in pea (Pisum sativum) mitochondria and nuclei of a Trx isoform (PsTrxo1) that seems to belong to the Trx o group, although it differs from this Trx type by its absence of introns in the genomic sequence. Western-blot analysis with isolated mitochondria and nuclei, immunogold labeling, and green fluorescent protein fusion constructs all indicated that PsTrxo1 is present in both cell compartments. Moreover, the identification by tandem mass spectrometry of the native mitochondrial Trx after gel filtration using the fast-protein liquid chromatography system of highly purified mitochondria and the in vitro uptake assay into isolated mitochondria also corroborated a mitochondrial location for this protein. The recombinant PsTrxo1 protein has been shown to be reduced more effectively by the Saccharomyces cerevisiae mitochondrial Trx reductase Trr2 than by the wheat (Triticum aestivum) cytoplasmic reduced nicotinamide adenine dinucleotide phosphate-dependent Trx reductase. PsTrxo1 was able to activate alternative oxidase, and it was shown to interact with a number of mitochondrial proteins, including peroxiredoxin and enzymes mainly involved in the photorespiratory process.

  5. Identification and Characterization of the UL37 Protein of Herpes Simplex Virus Type 1 and Demonstration that it Interacts with ICP8, the Major DNA Binding Protein of Herpes Simplex Virus

    DTIC Science & Technology

    1992-10-20

    Identification of ORFs HSV DNA binding proteins • 1 3 3 5 7 7 11 17 18 22 reps and its role in HSV replication 23 Biochemical properties . . 23...Figure 1 . 2. 3 • 4. 5. 6. 7. 8. Structural model of the herpesvirus virion Schematic diagram of HSV pathogenesis . Diagram of the main...vaccinia virus- 13. Autoradiogram of an immunoblot of HSV - 1 -infected cell proteins harvested at various times postinfec- 85 tioD probed with anti-UL42

  6. Identification of ovarian cancer-associated proteins in symptomatic women: A novel method for semi-quantitative plasma proteomics.

    PubMed

    Shield-Artin, Kristy L; Bailey, Mark J; Oliva, Karen; Liovic, Ana K; Barker, Gillian; Dellios, Nicole L; Reisman, Simone; Ayhan, Mustafa; Rice, Gregory E

    2012-04-01

    To evaluate the utility of an enhanced biomarker discovery approach in order to identify potential biomarkers relevant to ovarian cancer detection. We combined immuno-depletion, liquid-phase IEF, 1D-DIGE, MALDI-TOF/MS and LC-MS/MS to identify differentially expressed proteins in the plasma of symptomatic ovarian cancer patients, stratified by stage, compared to samples obtained from normal subjects. We demonstrate that this approach is a practical alternative to traditional 2D gel techniques and that it has some advantages, most notably increased protein capacity. Proteins were identified in all 76 bands excised from the gels in this project and confirmed the cancer-associated expression of several well-established biomarkers of ovarian cancer. These included C-reactive protein (CRP), haptoglobin, alpha-2 macroglobulin and A1A2. We also identified new ovarian cancer candidate biomarkers, Protein S100-A9 (S100A9) and multimerin-2. The cancer-associated differential expression of CRP and S100A9 was further confirmed by Western blot and ELISA. The methods developed in this study allow for the increased loading of plasma proteins into the analytical stream when compared to traditional 2D-DIGE. This increased protein identification sensitivity allowed us to identify new putative ovarian cancer biomarkers. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Protein identification and quantification from riverbank grape, Vitis riparia: Comparing SDS-PAGE and FASP-GPF techniques for shotgun proteomic analysis.

    PubMed

    George, Iniga S; Fennell, Anne Y; Haynes, Paul A

    2015-09-01

    Protein sample preparation optimisation is critical for establishing reproducible high throughput proteomic analysis. In this study, two different fractionation sample preparation techniques (in-gel digestion and in-solution digestion) for shotgun proteomics were used to quantitatively compare proteins identified in Vitis riparia leaf samples. The total number of proteins and peptides identified were compared between filter aided sample preparation (FASP) coupled with gas phase fractionation (GPF) and SDS-PAGE methods. There was a 24% increase in the total number of reproducibly identified proteins when FASP-GPF was used. FASP-GPF is more reproducible, less expensive and a better method than SDS-PAGE for shotgun proteomics of grapevine samples as it significantly increases protein identification across biological replicates. Total peptide and protein information from the two fractionation techniques is available in PRIDE with the identifier PXD001399 (http://proteomecentral.proteomexchange.org/dataset/PXD001399). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Development of an enzyme-linked-immunosorbent-assay technique for accurate identification of poorly preserved silks unearthed in ancient tombs.

    PubMed

    Zheng, Qin; Wu, Xiaofeng; Zheng, Hailing; Zhou, Yang

    2015-05-01

    We report the preparation of a specific fibroin antibody and its use for the identification of unearthed ancient silk relics. Based on the 12-amino-acid repeat sequence "GAGAGSGAGAGS", which is found in fibroin of the silkworm Bombyx mori, a specific antibody against fibroin was prepared in rabbits through peptide synthesis and carrier-protein coupling. This antibody was highly specific for fibroin found in silk. Using this antibody we have successfully identified four silk samples from different time periods. Our results reveal, for the first time, a method capable of detecting silk from a few milligrams of archaeological fabric that has been buried for thousands of years, confirming that the ancient practice of wearing silk products while praying for rebirth dated back to at least 400 BCE. This method also complements current approaches in silk detection, especially for the characterization of poorly preserved silks, promoting the investigation of silk origins and of ancient clothing cultures.

  9. DIGE Analysis Software and Protein Identification Approaches.

    PubMed

    Hmmier, Abduladim; Dowling, Paul

    2018-01-01

    DIGE is a high-resolution two-dimensional gel electrophoresis method, with excellent dynamic range obtained by fluorescent tag labeling of protein samples. Scanned images of DIGE gels show thousands of protein spots, each spot representing a single protein isoform or a group of isoforms. By using commercially available software, each protein spot is defined by an outline, which is digitized and correlated with the quantity of proteins present in each spot. Software packages include DeCyder, SameSpots, and Dymension 3. In addition, proteins of interest can be excised from post-stained gels and identified with conventional mass spectrometry techniques. High-throughput mass spectrometry is performed using sophisticated instrumentation including matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF), MALDI-TOF/TOF, and liquid chromatography tandem mass spectrometry (LC-MS/MS). Tandem MS (MALDI-TOF/TOF or LC-MS/MS) analyzes fragmented peptides, resulting in amino acid sequence information, which is especially useful when protein spots are of low abundance or where a mixture of proteins is present.

  10. Identification of a follistatin-related protein from the tick Haemaphysalis longicornis and its effect on tick oviposition.

    PubMed

    Zhou, Jinlin; Liao, Min; Hatta, Takeshi; Tanaka, Miho; Xuan, Xuenan; Fujisaki, Kozo

    2006-05-10

    The identification of ovary-associated molecules will lead to a better understanding of the physiology of tick reproduction and vector-pathogen interactions. A gene encoding a follistatin-related protein (FRP) was obtained by random sequencing from the ovary cDNA library of the tick Haemaphysalis longicornis. The full-length cDNA is 1157 bp, including an intact ORF encoding an expected protein with 289 amino acids. Three distinct domains were present in the deduced amino acids, namely, the follistatin-like domain, KAZAL, and two calcium-binding motifs, EFh. The sequence shows homology with the follistatin-related protein (FRP), which was thought to play some roles in the negative regulation of cellular growth. RT-PCR showed that the gene was expressed throughout the developing stages and mainly in the ovary as well as in fat body, hemocytes, salivary glands, and midgut. This gene was expressed in GST-fused recombinant protein with an expected size. The mouse antiserum against the recombinant protein recognized a 56-kDa native protein in both tick ovary and hemolymph. The recombinant proteins were found to have binding activity for both activin A and bone morphogenetic protein-2 (BMP-2). Silencing of FRP by RNAi showed a decrease in tick oviposition, which is consistent with the effect of a recombinant protein vaccine on the adult tick. These results showed that the tick FRP might be involved in tick oviposition. This is the first report of a member of follistatin family proteins in Chelicerata, which include ticks, spiders, and scorpions.

  11. Early Identification of Infants and Toddlers with Deafblindness

    ERIC Educational Resources Information Center

    Anthony, Tanni L.

    2016-01-01

    Data from the 2014 National Center on Deaf-Blindness Count show that fewer than 100 infants and toddlers are currently identified with deafblindness across the United States and that identification rates for this population vary greatly from state to state. The author presents a key rationale for timely and accurate identification of early-onset…

  12. Accurate identification of microseismic P- and S-phase arrivals using the multi-step AIC algorithm

    NASA Astrophysics Data System (ADS)

    Zhu, Mengbo; Wang, Liguan; Liu, Xiaoming; Zhao, Jiaxuan; Peng, Ping'an

    2018-03-01

    Identification of P- and S-phase arrivals is the primary work in microseismic monitoring. In this study, a new multi-step AIC algorithm is proposed. This algorithm consists of P- and S-phase arrival pickers (P-picker and S-picker). The P-picker contains three steps: in step 1, a preliminary P-phase arrival window is determined by the waveform peak. Then a preliminary P-pick is identified using the AIC algorithm. Finally, the P-phase arrival window is narrowed based on the above P-pick. Thus the P-phase arrival can be identified accurately by using the AIC algorithm again. The S-picker contains five steps: in step 1, a narrow S-phase arrival window is determined based on the P-pick and the AIC curve of amplitude biquadratic time-series. In step 2, the S-picker automatically judges whether the S-phase arrival is clear to identify. In steps 3 and 4, the AIC extreme points are extracted, and the relationship between the local minimum and the S-phase arrival is researched. In step 5, the S-phase arrival is picked based on the maximum probability criterion. To evaluate the proposed algorithm, a P- and S-picks classification criterion is also established based on a source location numerical simulation. The field data tests show a considerable improvement of the multi-step AIC algorithm in comparison with the manual picks and the original AIC algorithm. Furthermore, the technique is independent of the kind of SNR. Even in the poor-quality signal group, in which the SNRs are below 5, the effective picking rates (the corresponding location error is <15 m) of P- and S-phase arrivals are still up to 80.9% and 76.4% respectively.
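
    For orientation, the sketch below shows a single-pass AIC pick of the kind the multi-step method builds on. It is not the authors' multi-step algorithm: it assumes the commonly used variance-based form AIC(k) = k*log(var(x[0:k])) + (N-k-1)*log(var(x[k:N])) and takes the minimum as the arrival. The function names, the `margin` parameter and the synthetic trace are hypothetical.

```python
import math
import random
from typing import Optional, Sequence


def _variance(x: Sequence[float]) -> float:
    """Population variance of a non-empty sequence."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def aic_pick(signal: Sequence[float], margin: int = 5) -> Optional[int]:
    """Return the sample index that minimizes the variance-based AIC, or None."""
    n = len(signal)
    best_k, best_aic = None, math.inf
    # Skip the first and last `margin` samples so both segments have a usable variance.
    for k in range(margin, n - margin):
        var_before = _variance(signal[:k])
        var_after = _variance(signal[k:])
        if var_before <= 0.0 or var_after <= 0.0:
            continue  # log() is undefined for a constant segment
        aic = k * math.log(var_before) + (n - k - 1) * math.log(var_after)
        if aic < best_aic:
            best_k, best_aic = k, aic
    return best_k


# Synthetic trace: low-amplitude noise, then a higher-amplitude arrival at sample 200.
random.seed(0)
true_onset = 200
trace = [random.gauss(0.0, 0.1) for _ in range(true_onset)]
trace += [random.gauss(0.0, 2.0) for _ in range(200)]

pick = aic_pick(trace)
print("picked sample:", pick)
```

    The pick is only as good as the window it is computed on, which is why the abstract's method first locates a preliminary window and then narrows it before applying the AIC again.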

  13. Identification of urine protein biomarkers with the potential for early detection of lung cancer.

    PubMed

    Zhang, Hongjuan; Cao, Jing; Li, Lin; Liu, Yanbin; Zhao, Hong; Li, Nan; Li, Bo; Zhang, Aiqun; Huang, Huanwei; Chen, She; Dong, Mengqiu; Yu, Lei; Zhang, Jian; Chen, Liang

    2015-07-02

    Lung cancer is the leading cause of cancer-related deaths and has an overall 5-year survival rate lower than 15%. Large-scale clinical trials have demonstrated a significant relative reduction in mortality in high-risk individuals with low-dose computed tomography screening. However, biomarkers capable of identifying the most at-risk population and detecting lung cancer before it becomes clinically apparent are urgently needed in the clinic. Here, we report the identification of urine biomarkers capable of detecting lung cancer. Using the well-characterized inducible Kras (G12D) mouse model of lung cancer, we identified alterations in the urine proteome in tumor-bearing mice compared with sibling controls. Marked differences at the proteomic level were also detected between the urine of patients and that of healthy population controls. Importantly, we identified 7 proteins commonly found to be significantly up-regulated in both tumor-bearing mice and patients. In an independent cohort, we showed that 2 of the 7 proteins were up-regulated in urine samples from lung cancer patients but not in those from controls. The kinetics of these proteins correlated with the disease state in the mouse model. These tumor biomarkers could potentially aid in the early detection of lung cancer.

  14. Identification of lipid- and protein-based binders in paintings by direct on-plate wet chemistry and matrix-assisted laser desorption ionization mass spectrometry.

    PubMed

    Calvano, Cosima Damiana; van der Werf, Inez Dorothé; Palmisano, Francesco; Sabbatini, Luigia

    2015-01-01

    Direct on-target plate processing of small (ca. 100 μg) fragments of paint samples for MALDI-MS identification of lipid- and protein-based binders is described. Fragments were fixed on a conventional stainless steel target plate by colloidal graphite followed by in situ fast tryptic digestion and matrix addition. The new protocol was first developed on paint replicas composed of chicken egg, collagen, and cow milk mixed with inorganic pigments and then successfully applied on historical paint samples taken from a fifteenth century Italian panel painting. The present work contributes a step forward in the simplification of binder identification in very small paint samples since no conventional solvent extraction is required, speeding up the whole sample preparation to 10 min and reducing lipid/protein loss.

  15. Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis

    PubMed Central

    Tellgren-Roth, Christian; Baudo, Charles D.; Kennell, John C.; Sun, Sheng; Billmyre, R. Blake; Schröder, Markus S.; Andersson, Anna; Holm, Tina; Sigurgeirsson, Benjamin; Wu, Guangxi; Sankaranarayanan, Sundar Ram; Siddharthan, Rahul; Sanyal, Kaustuv; Lundeberg, Joakim; Nystedt, Björn; Boekhout, Teun; Dawson, Thomas L.; Heitman, Joseph

    2017-01-01

    Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies. PMID:28100699

  16. Protein-protein interface analysis and hot spots identification for chemical ligand design.

    PubMed

    Chen, Jing; Ma, Xiaomin; Yuan, Yaxia; Pei, Jianfeng; Lai, Luhua

    2014-01-01

    Rational design for chemical compounds targeting protein-protein interactions has grown from a dream to reality after a decade of efforts. There are an increasing number of successful examples, though major challenges remain in the field. In this paper, we will first give a brief review of the available methods that can be used to analyze protein-protein interface and predict hot spots for chemical ligand design. New developments of binding sites detection, ligandability and hot spots prediction from the author's group will also be described. Pocket V.3 is an improved program for identifying hot spots in protein-protein interface using only an apo protein structure. It has been developed based on Pocket V.2 that can derive receptor-based pharmacophore model for ligand binding cavity. Given similarities and differences between the essence of pharmacophore and hot spots for guiding design of chemical compounds, not only energetic but also spatial properties of protein-protein interface are used in Pocket V.3 for dealing with protein-protein interface. In order to illustrate the capability of Pocket V.3, two datasets have been used. One is taken from ASEdb and BID having experimental alanine scanning results for testing hot spots prediction. The other is taken from the 2P2I database containing complex structures of protein-ligand binding at the original protein-protein interface for testing hot spots application in ligand design.

  17. Sequence Identification, Recombinant Production, and Analysis of the Self-Assembly of Egg Stalk Silk Proteins from Lacewing Chrysoperla carnea.

    PubMed

    Neuenfeldt, Martin; Scheibel, Thomas

    2017-06-13

    Egg stalk silks of the common green lacewing Chrysoperla carnea likely comprise at least three different silk proteins. Based on the natural spinning process, it was hypothesized that these proteins self-assemble without shear stress, as adult lacewings do not use a spinneret. To examine this, the first sequence identification and determination of the gene expression profile of several silk proteins and various transcript variants thereof was conducted, and then the three major proteins were recombinantly produced in Escherichia coli encoded by their native complementary DNA (cDNA) sequences. Circular dichroism measurements indicated that the silk proteins in aqueous solutions had a mainly intrinsically disordered structure. The largest silk protein, which we named ChryC1, exhibited a lower critical solution temperature (LCST) behavior and self-assembled into fibers or film morphologies, depending on the conditions used. The second silk protein, ChryC2, self-assembled into nanofibrils and subsequently formed hydrogels. Circular dichroism and Fourier transform infrared spectroscopy confirmed conformational changes of both proteins into beta sheet rich structures upon assembly. ChryC3 did not self-assemble into any morphology under the tested conditions. Thereby, through this work, it could be shown that recombinant lacewing silk proteins can be produced and further used for studying the fiber formation of lacewing egg stalks.

  18. The Search Engine for Multi-Proteoform Complexes: An Online Tool for the Identification and Stoichiometry Determination of Protein Complexes.

    PubMed

    Skinner, Owen S; Schachner, Luis F; Kelleher, Neil L

    2016-12-08

    Recent advances in top-down mass spectrometry using native electrospray now enable the analysis of intact protein complexes with relatively small sample amounts in an untargeted mode. Here, we describe how to characterize both homo- and heteropolymeric complexes with high molecular specificity using input data produced by tandem mass spectrometry of whole protein assemblies. The tool described is a "search engine for multi-proteoform complexes," (SEMPC) and is available for free online. The output is a list of candidate multi-proteoform complexes and scoring metrics, which are used to define a distinct set of one or more unique protein subunits, their overall stoichiometry in the intact complex, and their pre- and post-translational modifications. Thus, we present an approach for the identification and characterization of intact protein complexes from native mass spectrometry data. © 2016 by John Wiley & Sons, Inc.

  19. Eyewitness Identification Accuracy and Response Latency: The Unruly 10-12-Second Rule

    ERIC Educational Resources Information Center

    Weber, Nathan; Brewer, Neil; Wells, Gary L.; Semmler, Carolyn; Keast, Amber

    2004-01-01

    Data are reported from 3,213 research eyewitnesses confirming that accurate eyewitness identifications from lineups are made faster than are inaccurate identifications. However, consistent with predictions from the recognition and search literatures, the authors did not find support for the "10-12-s rule" in which lineup identifications faster…

  20. Accurate and sensitive quantification of protein-DNA binding affinity.

    PubMed

    Rastogi, Chaitanya; Rube, H Tomas; Kribelbauer, Judith F; Crocker, Justin; Loker, Ryan E; Martini, Gabriella D; Laptenko, Oleg; Freed-Pastor, William A; Prives, Carol; Stern, David L; Mann, Richard S; Bussemaker, Harmen J

    2018-04-17

    Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. Copyright © 2018 the Author(s). Published by PNAS.

  1. Accurate and sensitive quantification of protein-DNA binding affinity

    PubMed Central

    Rastogi, Chaitanya; Rube, H. Tomas; Kribelbauer, Judith F.; Crocker, Justin; Loker, Ryan E.; Martini, Gabriella D.; Laptenko, Oleg; Freed-Pastor, William A.; Prives, Carol; Stern, David L.; Mann, Richard S.; Bussemaker, Harmen J.

    2018-01-01

    Transcription factors (TFs) control gene expression by binding to genomic DNA in a sequence-specific manner. Mutations in TF binding sites are increasingly found to be associated with human disease, yet we currently lack robust methods to predict these sites. Here, we developed a versatile maximum likelihood framework named No Read Left Behind (NRLB) that infers a biophysical model of protein-DNA recognition across the full affinity range from a library of in vitro selected DNA binding sites. NRLB predicts human Max homodimer binding in near-perfect agreement with existing low-throughput measurements. It can capture the specificity of the p53 tetramer and distinguish multiple binding modes within a single sample. Additionally, we confirm that newly identified low-affinity enhancer binding sites are functional in vivo, and that their contribution to gene expression matches their predicted affinity. Our results establish a powerful paradigm for identifying protein binding sites and interpreting gene regulatory sequences in eukaryotic genomes. PMID:29610332

  2. rTANDEM, an R/Bioconductor package for MS/MS protein identification.

    PubMed

    Fournier, Frédéric; Joly Beauparlant, Charles; Paradis, René; Droit, Arnaud

    2014-08-01

    rTANDEM is an R/Bioconductor package that interfaces the X!Tandem protein identification algorithm. The package can run the multi-threaded algorithm on proteomic data files directly from R. It also provides functions to convert search parameters and results to/from R as well as functions to manipulate parameters and automate searches. An associated R package, shinyTANDEM, provides a web-based graphical interface to visualize and interpret the results. Together, those two packages form an entry point for a general MS/MS-based proteomic pipeline in R/Bioconductor. rTANDEM and shinyTANDEM are distributed in R/Bioconductor, http://bioconductor.org/packages/release/bioc/. The packages are under open licenses (GPL-3 and Artistic-1.0). Contact: frederic.fournier@crchuq.ulaval.ca or arnaud.droit@crchuq.ulaval.ca. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  3. A pseudo MS3 approach for identification of disulfide-bonded proteins: uncommon product ions and database search.

    PubMed

    Chen, Jianzhong; Shiyanov, Pavel; Schlager, John J; Green, Kari B

    2012-02-01

    It has previously been reported that disulfide and backbone bonds of native intact proteins can be concurrently cleaved using electrospray ionization (ESI) and collision-induced dissociation (CID) tandem mass spectrometry (MS/MS). However, the cleavages of disulfide bonds result in different cysteine modifications in product ions, making it difficult to identify the disulfide-bonded proteins via database search. To solve this identification problem, we have developed a pseudo MS(3) approach by combining nozzle-skimmer dissociation (NSD) and CID on a quadrupole time-of-flight (Q-TOF) mass spectrometer using chicken lysozyme as a model. Although many of the product ions were similar to those typically seen in MS/MS spectra of enzymatically derived peptides, additional uncommon product ions were detected including c(i-1) ions (the i(th) residue being aspartic acid, arginine, lysine and dehydroalanine) as well as those from a scrambled sequence. The formation of these uncommon types of product ions, likely caused by the lack of mobile protons, were proposed to involve bond rearrangements via a six-membered ring transition state and/or salt bridge(s). A search of 20 pseudo MS(3) spectra against the Gallus gallus (chicken) database using Batch-Tag, a program originally designed for bottom up MS/MS analysis, identified chicken lysozyme as the only hit with the expectation values less than 0.02 for 12 of the spectra. The pseudo MS(3) approach may help to identify disulfide-bonded proteins and determine the associated post-translational modifications (PTMs); the confidence in the identification may be improved by incorporating the fragmentation characteristics into currently available search programs. © American Society for Mass Spectrometry, 2011

  4. Matrix-assisted laser desorption ionization-time of flight mass spectrometry for fast and accurate identification of Pseudallescheria/Scedosporium species.

    PubMed

    Sitterlé, E; Giraud, S; Leto, J; Bouchara, J P; Rougeron, A; Morio, F; Dauphin, B; Angebault, C; Quesne, G; Beretti, J L; Hassouni, N; Nassif, X; Bougnoux, M E

    2014-09-01

    An increasing number of infections due to Pseudallescheria/Scedosporium species has been reported during the past decades, both in immunocompromised and immunocompetent patients. Additionally, these fungi are now recognized worldwide as common agents of fungal colonization of the airways in cystic fibrosis patients, which represents a risk factor for disseminated infections after lung transplantation. Currently six species are described within the Pseudallescheria/Scedosporium genus, including Scedosporium prolificans and species of the Pseudallescheria/Scedosporium apiospermum complex (i.e. S. apiospermum sensu stricto, Pseudallescheria boydii, Scedosporium aurantiacum, Pseudallescheria minutispora and Scedosporium dehoogii). Precise identification of clinical isolates at the species level is required because these species differ in their antifungal drug susceptibility patterns. Matrix-assisted laser desorption ionization (MALDI)-time of flight (TOF)/mass spectrometry (MS) is a powerful tool to rapidly identify moulds at the species level. We investigated the potential of this technology to discriminate Pseudallescheria/Scedosporium species. Forty-seven reference strains were used to build a reference database library. Profiles from 3-, 5- and 7-day-old cultures of each reference strain were analysed to identify species-specific discriminating profiles. The database was tested for accuracy using a set of 64 clinical or environmental isolates previously identified by multilocus sequencing. All isolates were unequivocally identified at the species level by MALDI-TOF/MS. Our results, obtained using a simple protocol, without prior protein extraction or standardization of the culture, demonstrate that MALDI-TOF/MS is a powerful tool for rapid identification of Pseudallescheria/Scedosporium species that cannot be currently identified by morphological examination in the clinical setting. © 2014 The Authors Clinical Microbiology and Infection © 2014 European Society

  5. Large-scale identification of c-MYC-associated proteins using a combined TAP/MudPIT approach.

    PubMed

    Koch, Heike B; Zhang, Ru; Verdoodt, Berlinda; Bailey, Aaron; Zhang, Chang-Dong; Yates, John R; Menssen, Antje; Hermeking, Heiko

    2007-01-15

    The c-MYC oncogene encodes a transcription factor, which is sufficient and necessary for the induction of cellular proliferation. However, the c-MYC protein is a relatively weak transactivator suggesting that it may have other functions. To identify protein interactors which may reveal new functions or represent regulators of c-MYC we systematically identified proteins associated with c-MYC in vivo using a proteomic approach. We combined tandem affinity purification (TAP) with the mass spectral multidimensional protein identification technology (MudPIT). Thereby, 221 c-MYC-associated proteins were identified. Among them were 17 previously known c-MYC-interactors. Selected new c-MYC-associated proteins (DBC-1, FBX29, KU70, MCM7, Mi2-beta/CHD4, RNA Pol II, RFC2, RFC3, SV40 Large T Antigen, TCP1alpha, U5-116kD, ZNF281) were confirmed independently. For association with MCM7, SV40 Large T Antigen and DBC-1 the functionally important MYC-box II region was required, whereas FBX29 and Mi2-beta interacted via MYC-box II and the BR-HLH-LZ motif. In addition, regulators of c-MYC activity were identified: ectopic expression of FBX29, an E3 ubiquitin ligase, decreased c-MYC protein levels and inhibited c-MYC transactivation, whereas knock-down of FBX29 elevated the concentration of c-MYC. Furthermore, sucrose gradient analysis demonstrated that c-MYC is present in numerous complexes with varying size and composition, which may accommodate the large number of new c-MYC-associated proteins identified here and mediate the diverse functions of c-MYC. Our results suggest that c-MYC, besides acting as a mitogenic transcription factor, regulates cellular proliferation by direct association with protein complexes involved in multiple synthetic processes required for cell division, as for example DNA-replication/repair and RNA-processing. Furthermore, this first comprehensive description of the c-MYC-associated sub-proteome will facilitate further studies aimed to elucidate the biology

  6. PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

    PubMed

    Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming

    2016-07-08

    The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Identification of the DotL Coupling Protein Subcomplex of the Legionella Dot/Icm Type IV Secretion System

    PubMed Central

    Vincent, Carr D.; Friedman, Jonathan R.; Jeong, Kwang Cheol; Sutherland, Molly C.; Vogel, Joseph P.

    2012-01-01

    Legionella pneumophila, the causative agent of Legionnaires’ disease, survives in macrophages by altering the endocytic pathway of its host cell. To accomplish this, the bacterium utilizes a type IVB secretion system to deliver effector molecules into the host cell cytoplasm. In a previous report, we performed an extensive characterization of the L. pneumophila type IVB secretion system that resulted in the identification of a critical five-protein subcomplex that forms the core of the secretion apparatus. Here we describe a second Dot/Icm protein subassembly composed of the type IV coupling protein DotL, the apparatus proteins DotM and DotN, and the secretion adaptor proteins IcmS and IcmW. In the absence of IcmS or IcmW, DotL becomes destabilized at the transition from the exponential to stationary phases of growth, concurrent with the expression of many secreted substrates. Loss of DotL is dependent on ClpA, a regulator of the cytoplasmic protease ClpP. The resulting decreased levels of DotL in the icmS and icmW mutants exacerbates the intracellular defects of these strains and can be partially suppressed by overproduction of DotL. Thus, in addition to their role as chaperones for Legionella T4SS substrates, IcmS and IcmW perform a second function as part of the Dot/Icm type IV coupling protein subcomplex. PMID:22694730

  8. A new coarse-grained model for E. coli cytoplasm: accurate calculation of the diffusion coefficient of proteins and observation of anomalous diffusion.

    PubMed

    Hasnain, Sabeeha; McClendon, Christopher L; Hsu, Monica T; Jacobson, Matthew P; Bandyopadhyay, Pradipta

    2014-01-01

    A new coarse-grained model of the E. coli cytoplasm is developed by describing the proteins of the cytoplasm as flexible units consisting of one or more spheres that follow Brownian dynamics (BD), with hydrodynamic interactions (HI) accounted for by a mean-field approach. Extensive BD simulations were performed to calculate the diffusion coefficients of three different proteins in the cellular environment. The results are in close agreement with experimental or previously simulated values, where available. Control simulations without HI showed that use of HI is essential to obtain accurate diffusion coefficients. Anomalous diffusion inside the crowded cellular medium was investigated with Fractional Brownian motion analysis, and found to be present in this model. By running a series of control simulations in which various forces were removed systematically, it was found that repulsive interactions (volume exclusion) are the main cause for anomalous diffusion, with a secondary contribution from HI.
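One simple way to check a trajectory for anomalous diffusion is to fit the mean-squared displacement to a power law, MSD(t) = K t^alpha, where alpha = 1 corresponds to normal diffusion and alpha < 1 to subdiffusion. The sketch below does this with an ordinary least-squares fit in log-log space. It is a plainly swapped-in technique for illustration, not the Fractional Brownian motion analysis used in the paper, and the function name and inputs are hypothetical.

```python
import math

def anomalous_exponent(times, msd):
    """Fit MSD(t) = K * t**alpha by least squares on log-log data.

    Returns (alpha, K). All times and MSD values must be positive.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                           # slope of log(MSD) vs log(t)
    prefactor = math.exp(mean_y - alpha * mean_x)  # intercept gives K
    return alpha, prefactor
```

For normal three-dimensional diffusion the fitted prefactor equals 6D, so the diffusion coefficient can be read off as K / 6 when alpha is close to 1.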

  9. MoRFchibi SYSTEM: software tools for the identification of MoRFs in protein sequences.

    PubMed

    Malhis, Nawar; Jacobson, Matthew; Gsponer, Jörg

    2016-07-08

    Molecular recognition features, MoRFs, are short segments within longer disordered protein regions that bind to globular protein domains in a process known as disorder-to-order transition. MoRFs have been found to play a significant role in signaling and regulatory processes in cells. High-confidence computational identification of MoRFs remains an important challenge. In this work, we introduce MoRFchibi SYSTEM that contains three MoRF predictors: MoRFCHiBi, a basic predictor best suited as a component in other applications, MoRFCHiBi_Light, ideal for high-throughput predictions and MoRFCHiBi_Web, slower than the other two but best for high accuracy predictions. Results show that MoRFchibi SYSTEM provides more than double the precision of other predictors. MoRFchibi SYSTEM is available in three different forms: as HTML web server, RESTful web server and downloadable software at: http://www.chibi.ubc.ca/faculty/joerg-gsponer/gsponer-lab/software/morf_chibi/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Detecting protein-protein interactions in the intact cell of Bacillus subtilis (ATCC 6633).

    PubMed

    Winters, Michael S; Day, R A

    2003-07-01

    The salt bridge, paired group-specific reagent cyanogen (ethanedinitrile; C(2)N(2)) converts naturally occurring pairs of functional groups into covalently linked products. Cyanogen readily permeates cell walls and membranes. When the paired groups are shared between associated proteins, isolation of the covalently linked proteins allows their identity to be assigned. Examination of organisms of known genome sequence permits identification of the linked proteins by mass spectrometric techniques applied to peptides derived from them. The cyanogen-linked proteins were isolated by polyacrylamide gel electrophoresis. Digestion of the isolated proteins with proteases of known specificity afforded sets of peptides that could be analyzed by mass spectrometry. These data were compared with those derived theoretically from the Swiss Protein Database by computer-based comparisons (Protein Prospector; http://prospector.ucsf.edu). Identification of associated proteins in the ribosome of Bacillus subtilis strain ATCC 6633 showed that there is an association homology with the association patterns of the ribosomal proteins of Haloarcula marismortui and Thermus thermophilus. In addition, other proteins involved in protein biosynthesis were shown to be associated with ribosomal proteins.

  11. Detecting Protein-Protein Interactions in the Intact Cell of Bacillus subtilis (ATCC 6633)

    PubMed Central

    Winters, Michael S.; Day, R. A.

    2003-01-01

    The salt bridge, paired group-specific reagent cyanogen (ethanedinitrile; C2N2) converts naturally occurring pairs of functional groups into covalently linked products. Cyanogen readily permeates cell walls and membranes. When the paired groups are shared between associated proteins, isolation of the covalently linked proteins allows their identity to be assigned. Examination of organisms of known genome sequence permits identification of the linked proteins by mass spectrometric techniques applied to peptides derived from them. The cyanogen-linked proteins were isolated by polyacrylamide gel electrophoresis. Digestion of the isolated proteins with proteases of known specificity afforded sets of peptides that could be analyzed by mass spectrometry. These data were compared with those derived theoretically from the Swiss Protein Database by computer-based comparisons (Protein Prospector; http://prospector.ucsf.edu). Identification of associated proteins in the ribosome of Bacillus subtilis strain ATCC 6633 showed that there is an association homology with the association patterns of the ribosomal proteins of Haloarcula marismortui and Thermus thermophilus. In addition, other proteins involved in protein biosynthesis were shown to be associated with ribosomal proteins. PMID:12837803

  12. Identification of StARD3 as a Lutein-binding Protein in the Macula of the Primate Retina

    PubMed Central

    Li, Binxing; Vachali, Preejith; Frederick, Jeanne M.; Bernstein, Paul S.

    2011-01-01

    Lutein, zeaxanthin and their metabolites are the xanthophyll carotenoids that form the macular pigment of the human retina. Epidemiological evidence suggests that high levels of these carotenoids in the diet, serum and macula are associated with decreased risk of age-related macular degeneration (AMD), and the AREDS2 study is prospectively testing this hypothesis. Understanding the biochemical mechanisms underlying the selective uptakes of lutein and zeaxanthin into the human macula may provide important insights into the physiology of the human macula in health and disease. GSTP1 is the macular zeaxanthin-binding protein, but the identity of the human macular lutein-binding protein has remained elusive. Prior identification of the silkworm lutein-binding protein (CBP) as a member of the steroidogenic acute regulatory domain (StARD) protein family, and selective labeling of monkey photoreceptor inner segments by anti-CBP antibody provided an important clue toward identifying the primate retina lutein-binding protein. Homology of CBP to all 15 human StARD proteins was analyzed using database searches, western blotting and immunohistochemistry, and we here provide evidence to identify StARD3 (also known as MLN64) as a human retinal lutein-binding protein. Further, recombinant StARD3 selectively binds lutein with high affinity (KD = 0.45 micromolar) when assessed by surface plasmon resonance (SPR) binding assays. Our results demonstrate previously unrecognized, specific interactions of StARD3 with lutein and provide novel avenues to explore its roles in human macular physiology and disease. PMID:21322544

  13. Druggable orthosteric and allosteric hot spots to target protein-protein interactions.

    PubMed

    Ma, Buyong; Nussinov, Ruth

    2014-01-01

    Drug designing targeting protein-protein interactions is challenging. Because structural elucidation and computational analysis have revealed the importance of hot spot residues in stabilizing these interactions, there have been on-going efforts to develop drugs which bind the hot spots and out-compete the native protein partners. The question arises as to what are the key 'druggable' properties of hot spots in protein-protein interactions and whether these mimic the general hot spot definition. Identification of orthosteric (at the protein-protein interaction site) and allosteric (elsewhere) druggable hot spots is expected to help in discovering compounds that can more effectively modulate protein-protein interactions. For example, are there any other significant features beyond their location in pockets in the interface? The interactions of protein-protein hot spots are coupled with conformational dynamics of protein complexes. Currently increasing efforts focus on the allosteric drug discovery. Allosteric drugs bind away from the native binding site and can modulate the native interactions. We propose that identification of allosteric hot spots could similarly help in more effective allosteric drug discovery. While detection of allosteric hot spots is challenging, targeting drugs to these residues has the potential of greatly increasing the hot spot and protein druggability.

  14. Use of MALDI Mass Spectrometry for Identification of Microbes

    NASA Astrophysics Data System (ADS)

    Wilkins, C. L.; Stump, M.; Jones, J.; Lay, J. O.; Fleming, R.

    2003-12-01

    Recently, it has been demonstrated that bacteria can be characterized using whole cells and matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS). However, identification of specific bacterial proteins usually requires analysis of cellular fractions or purified extracts. This presentation will discuss the first application of Fourier transform mass spectrometry (FTMS) to analysis of bacterial proteins directly from whole cells. In this research it is seen that accurate mass MALDI-FTMS can be used to characterize specific ribosomal proteins directly from Escherichia coli cells. Using the high-accuracy mass measurements and high resolution isotope profile data thus available it is possible to confirm posttranslational modifications proposed previously on the basis of low resolution mass measurements. In our initial work, ribosomal proteins from E. coli whole cells were observed with errors of less than 27 ppm. This was accomplished directly from whole cells without fractionation, concentration, or overt overexpression of characteristic cellular proteins. More recently, by use of carbon and nitrogen isotopically-depleted growth media additional E. coli proteins have been identified with even smaller mass measurement errors. MALDI FTMS also provided information regarding E. coli lipids in the low-mass region. Although ions with m/z values below 1000 were previously observed by FTMS of whole cells, the work to be presented was the first report of detection of ions in the 5000 to 10 000 m/z range by MALDI-FTMS using whole cells. The implications of these results for genus, species, and strain assignments of such organisms will be discussed.

  15. Identification of RNAIII-binding proteins in Staphylococcus aureus using tethered RNAs and streptavidin aptamers based pull-down assay.

    PubMed

    Zhang, Xu; Zhu, Qing; Tian, Tian; Zhao, Changlong; Zang, Jianye; Xue, Ting; Sun, Baolin

    2015-05-15

    It has been widely recognized that small RNAs (sRNAs) play important roles in physiology and virulence control in bacteria. In Staphylococcus aureus, many sRNAs have been identified and some of them have been functionally studied. Since it is difficult to identify RNA-binding proteins (RBPs), very little is known about the RBPs in S. aureus, especially those associated with sRNAs. Here we adopted a tRNA scaffold streptavidin aptamer (tRSA) based pull-down assay to identify RBPs in S. aureus. The tethered RNA was successfully captured by the streptavidin magnetic beads, and proteins binding to RNAIII were isolated and analyzed by mass spectrometry. We have identified 81 proteins, and heterologously expressed 9 of them in Escherichia coli. The binding ability of the recombinant proteins with RNAIII was further analyzed by electrophoresis mobility shift assay, and the result indicates that proteins CshA, RNase J2, Era, Hu, WalR, Pyk, and FtsZ can bind to RNAIII. This study suggests that some proteins can bind to RNAIII in S. aureus and may be involved in RNAIII function. The tRSA-based pull-down assay is an effective method to search for RBPs in bacteria, which should facilitate the identification and functional study of RBPs in diverse bacterial species.

  16. Purification and Identification of Membrane Proteins from Urinary Extracellular Vesicles using Triton X-114 Phase Partitioning.

    PubMed

    Hu, Shuiwang; Musante, Luca; Tataruch, Dorota; Xu, Xiaomeng; Kretz, Oliver; Henry, Michael; Meleady, Paula; Luo, Haihua; Zou, Hequn; Jiang, Yong; Holthofer, Harry

    2018-01-05

    Urinary extracellular vesicles (uEVs) have become a promising source for biomarkers accurately reflecting biochemical changes in kidney and urogenital diseases. Characteristically, uEVs are rich in membrane proteins associated with several cellular functions like adhesion, transport, and signaling. Hence, membrane proteins of uEVs should represent an exciting protein class with unique biological properties. In this study, we utilized uEVs to optimize the Triton X-114 detergent partitioning protocol targeted for membrane proteins and proceeded to their subsequent characterization while eliminating effects of Tamm-Horsfall protein, the most abundant interfering protein in urine. This is the first report aiming to enrich and characterize the integral transmembrane proteins present in human urinary vesicles. First, uEVs were enriched using a "hydrostatic filtration dialysis" appliance, and then the enriched uEVs and lysates were verified by transmission electron microscopy. After using Triton X-114 phase partitioning, we generated an insoluble pellet fraction and aqueous phase (AP) and detergent phase (DP) fractions and analyzed them with LC-MS/MS. Both in- and off-gel protein digestion methods were used to reveal an increased number of membrane proteins of uEVs. After comparing with the identified proteins without phase separation as in our earlier publication, 199 different proteins were detected in DP. Prediction of transmembrane domains (TMDs) from these protein fractions showed that DP had more TMDs than other groups. The analyses of hydrophobicity revealed that the GRAVY score of DP was much higher than those of the other fractions. Furthermore, the analysis of proteins with lipid anchor revealed that DP proteins had more lipid anchors than other fractions. Additionally, KEGG pathway analysis showed that the DP proteins detected participate in endocytosis and signaling, which is consistent with the expected biological functions of membrane proteins. Finally

  17. Identification of bacteria isolated from veterinary clinical specimens using MALDI-TOF MS.

    PubMed

    Pavlovic, Melanie; Wudy, Corinna; Zeller-Peronnet, Veronique; Maggipinto, Marzena; Zimmermann, Pia; Straubinger, Alix; Iwobi, Azuka; Märtlbauer, Erwin; Busch, Ulrich; Huber, Ingrid

    2015-01-01

    Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) has recently emerged as a rapid and accurate identification method for bacterial species. Although it has been successfully applied for the identification of human pathogens, it has so far not been well evaluated for routine identification of veterinary bacterial isolates. This study was performed to compare and evaluate the performance of MALDI-TOF MS based identification of veterinary bacterial isolates with commercially available conventional test systems. Discrepancies of both methods were resolved by sequencing 16S rDNA and, if necessary, the infB gene for Actinobacillus isolates. A total of 375 consecutively isolated veterinary samples were collected. Among the 357 isolates (95.2%) correctly identified at the genus level by MALDI-TOF MS, 338 of them (90.1% of the total isolates) were also correctly identified at the species level. Conventional methods offered correct species identification for 319 isolates (85.1%). MALDI-TOF identification therefore offered more accurate identification of veterinary bacterial isolates. An update of the in-house mass spectra database with additional reference spectra clearly improved the identification results. In conclusion, the presented data suggest that MALDI-TOF MS is an appropriate platform for classification and identification of veterinary bacterial isolates.
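
    The identification rates quoted in this abstract can be rechecked from the reported counts. The following is a minimal sketch, not part of the original study; the variable and function names are illustrative assumptions, and only the counts (375, 357, 338, 319) come from the abstract.

```python
# Counts taken from the abstract; all names below are illustrative.
TOTAL_ISOLATES = 375
counts = {
    "MALDI-TOF MS, genus level": 357,
    "MALDI-TOF MS, species level": 338,
    "conventional methods, species level": 319,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Identification rate for each method, relative to all 375 isolates.
rates = {label: percent(n, TOTAL_ISOLATES) for label, n in counts.items()}

for label, rate in rates.items():
    print(f"{label}: {rate}%")
```

    All three rates are computed against the full set of 375 isolates, which is how the abstract states the species-level figure ("90.1% of the total isolates").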

  18. Screening and identification of resistance related proteins from apple leaves inoculated with Marssonina coronaria (EII. & J. J. Davis)

    PubMed Central

    2014-01-01

    Background Apple, an invaluable fruit crop worldwide, is often prone to infection by pathogenic fungi. Identification of potentially resistance-conferring apple proteins is one of the most important aims for studying apple resistance mechanisms and promoting the development of disease-resistant apple strains. In order to find proteins which promote resistance to Marssonina coronaria, a deadly pathogen which has been related to premature apple maturation, proteomes from apple leaves inoculated with M. coronaria were characterized at 3 and 6 days post-inoculation by two dimensional electrophoresis (2-DE). Results Overall, 59 differentially accumulated protein spots between inoculation and non-inoculation were successfully identified and aligned as 35 different proteins or protein families involved in photosynthesis, amino acid metabolism, transport, energy metabolism, carbohydrate metabolism, binding, antioxidant, defense and stress. Quantitative real-time PCR (qRT-PCR) was also used to examine the change in abundance of some defense and stress related genes under inoculated conditions. Conclusions In conclusion, different proteins in response to Marssonina coronaria were identified by proteomic analysis. Among these proteins, there are some PR proteins, for example class III endo-chitinase, beta-1,3-glucanase and thaumatine-like protein, and some antioxidant related proteins including aldo/keto reductase AKR, ascorbate peroxidase and phi class glutathione S-transferase protein that were associated with disease resistance. The transcription levels of class III endo-chitinase (L13) and beta-1,3-glucanase (L17) correlate well with the abundance of the encoded proteins; however, the mRNA abundance of thaumatine-like protein (L22) and ascorbate peroxidase (L28) is not correlated with the abundance of their encoded proteins. To elucidate the resistant mechanism, the data in the present study will promote us to investigate further the

  19. Identification of Human N-Myristoylated Proteins from Human Complementary DNA Resources by Cell-Free and Cellular Metabolic Labeling Analyses.

    PubMed

    Takamitsu, Emi; Otsuka, Motoaki; Haebara, Tatsuki; Yano, Manami; Matsuzaki, Kanako; Kobuchi, Hirotsugu; Moriya, Koko; Utsumi, Toshihiko

    2015-01-01

    To identify physiologically important human N-myristoylated proteins, 90 cDNA clones predicted to encode human N-myristoylated proteins were selected from a human cDNA resource (4,369 Kazusa ORFeome project human cDNA clones) by two bioinformatic N-myristoylation prediction systems, NMT-The MYR Predictor and Myristoylator. After database searches to exclude known human N-myristoylated proteins, 37 cDNA clones were selected as potential human N-myristoylated proteins. The susceptibility of these cDNA clones to protein N-myristoylation was first evaluated using fusion proteins in which the N-terminal ten amino acid residues were fused to an epitope-tagged model protein. Then, protein N-myristoylation of the gene products of full-length cDNAs was evaluated by metabolic labeling experiments both in an insect cell-free protein synthesis system and in transfected human cells. As a result, the products of 13 cDNA clones (FBXL7, PPM1B, SAMM50, PLEKHN, AIFM3, C22orf42, STK32A, FAM131C, DRICH1, MCC1, HID1, P2RX5, STK32B) were found to be human N-myristoylated proteins. Analysis of the role of protein N-myristoylation on the intracellular localization of SAMM50, a mitochondrial outer membrane protein, revealed that protein N-myristoylation was required for proper targeting of SAMM50 to mitochondria. Thus, the strategy used in this study is useful for the identification of physiologically important human N-myristoylated proteins from human cDNA resources.

  20. Identification of Human N-Myristoylated Proteins from Human Complementary DNA Resources by Cell-Free and Cellular Metabolic Labeling Analyses

    PubMed Central

    Takamitsu, Emi; Otsuka, Motoaki; Haebara, Tatsuki; Yano, Manami; Matsuzaki, Kanako; Kobuchi, Hirotsugu; Moriya, Koko; Utsumi, Toshihiko

    2015-01-01

    To identify physiologically important human N-myristoylated proteins, 90 cDNA clones predicted to encode human N-myristoylated proteins were selected from a human cDNA resource (4,369 Kazusa ORFeome project human cDNA clones) by two bioinformatic N-myristoylation prediction systems, NMT-The MYR Predictor and Myristoylator. After database searches to exclude known human N-myristoylated proteins, 37 cDNA clones were selected as potential human N-myristoylated proteins. The susceptibility of these cDNA clones to protein N-myristoylation was first evaluated using fusion proteins in which the N-terminal ten amino acid residues were fused to an epitope-tagged model protein. Then, protein N-myristoylation of the gene products of full-length cDNAs was evaluated by metabolic labeling experiments both in an insect cell-free protein synthesis system and in transfected human cells. As a result, the products of 13 cDNA clones (FBXL7, PPM1B, SAMM50, PLEKHN, AIFM3, C22orf42, STK32A, FAM131C, DRICH1, MCC1, HID1, P2RX5, STK32B) were found to be human N-myristoylated proteins. Analysis of the role of protein N-myristoylation on the intracellular localization of SAMM50, a mitochondrial outer membrane protein, revealed that protein N-myristoylation was required for proper targeting of SAMM50 to mitochondria. Thus, the strategy used in this study is useful for the identification of physiologically important human N-myristoylated proteins from human cDNA resources. PMID:26308446

  1. Nano-LC FTICR tandem mass spectrometry for top-down proteomics: routine baseline unit mass resolution of whole cell lysate proteins up to 72 kDa.

    PubMed

    Tipton, Jeremiah D; Tran, John C; Catherman, Adam D; Ahlf, Dorothy R; Durbin, Kenneth R; Lee, Ji Eun; Kellie, John F; Kelleher, Neil L; Hendrickson, Christopher L; Marshall, Alan G

    2012-03-06

    Current high-throughput top-down proteomic platforms provide routine identification of proteins less than 25 kDa with 4-D separations. This short communication reports the application of technological developments over the past few years that improve protein identification and characterization for masses greater than 25 kDa. Advances in separation science have allowed increased numbers of proteins to be identified, especially by nanoliquid chromatography (nLC) prior to mass spectrometry (MS) analysis. Further, a goal of high-throughput top-down proteomics is to extend the mass range for routine nLC MS analysis up to 80 kDa because gene sequence analysis predicts that ~70% of the human proteome is transcribed to be less than 80 kDa. Normally, large proteins greater than 50 kDa are identified and characterized by top-down proteomics through fraction collection and direct infusion at relatively low throughput. Further, other MS-based techniques provide top-down protein characterization, however at low resolution for intact mass measurement. Here, we present analysis of standard (up to 78 kDa) and whole cell lysate proteins by Fourier transform ion cyclotron resonance mass spectrometry (nLC electrospray ionization (ESI) FTICR MS). The separation platform reduced the complexity of the protein matrix so that, at 14.5 T, proteins from whole cell lysate up to 72 kDa are baseline mass resolved on a nano-LC chromatographic time scale. Further, the results document routine identification of proteins at improved throughput based on accurate mass measurement (less than 10 ppm mass error) of precursor and fragment ions for proteins up to 50 kDa.
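
    The "less than 10 ppm mass error" criterion mentioned above is a relative measure, so the absolute tolerance grows with protein mass. The sketch below uses the standard parts-per-million definition; the function names and the example masses are hypothetical and are not taken from the study.

```python
def ppm_error(measured_mass, theoretical_mass):
    """Signed mass error in parts per million (standard definition)."""
    return (measured_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(measured_mass, theoretical_mass, tol_ppm=10.0):
    """True if the absolute ppm error does not exceed tol_ppm."""
    return abs(ppm_error(measured_mass, theoretical_mass)) <= tol_ppm


# Hypothetical 50 kDa protein: 10 ppm corresponds to 0.5 Da.
theoretical = 50000.0
max_deviation_da = theoretical * 10.0 / 1e6

print(f"10 ppm at {theoretical:.0f} Da = {max_deviation_da} Da")
print(within_tolerance(50000.4, theoretical))  # inside the 10 ppm window
print(within_tolerance(50000.6, theoretical))  # outside the 10 ppm window
```

    At 50 kDa the 10 ppm window is only half a dalton wide, which illustrates why high resolving power is needed for intact proteins in this mass range.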

  2. Proteomics: Protein Identification Using Online Databases

    ERIC Educational Resources Information Center

    Eurich, Chris; Fields, Peter A.; Rice, Elizabeth

    2012-01-01

    Proteomics is an emerging area of systems biology that allows simultaneous study of thousands of proteins expressed in cells, tissues, or whole organisms. We have developed this activity to enable high school or college students to explore proteomic databases using mass spectrometry data files generated from yeast proteins in a college laboratory…

  3. OpenKnowledge for peer-to-peer experimentation in protein identification by MS/MS

    PubMed Central

    2011-01-01

    Background Traditional scientific workflow platforms usually run individual experiments with little evaluation and analysis of performance as required by automated experimentation in which scientists are being allowed to access numerous applicable workflows rather than being committed to a single one. Experimental protocols and data under a peer-to-peer environment could potentially be shared freely without any single point of authority to dictate how experiments should be run. In such environment it is necessary to have mechanisms by which each individual scientist (peer) can assess, locally, how he or she wants to be involved with others in experiments. This study aims to implement and demonstrate simple peer ranking under the OpenKnowledge peer-to-peer infrastructure by both simulated and real-world bioinformatics experiments involving multi-agent interactions. Methods A simulated experiment environment with a peer ranking capability was specified by the Lightweight Coordination Calculus (LCC) and automatically executed under the OpenKnowledge infrastructure. The peers such as MS/MS protein identification services (including web-enabled and independent programs) were made accessible as OpenKnowledge Components (OKCs) for automated execution as peers in the experiments. The performance of the peers in these automated experiments was monitored and evaluated by simple peer ranking algorithms. Results Peer ranking experiments with simulated peers exhibited characteristic behaviours, e.g., power law effect (a few dominant peers dominate), similar to that observed in the traditional Web. Real-world experiments were run using an interaction model in LCC involving two different types of MS/MS protein identification peers, viz., peptide fragment fingerprinting (PFF) and de novo sequencing with another peer ranking algorithm simply based on counting the successful and failed runs. This study demonstrated a novel integration and useful evaluation of specific proteomic

  4. Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy

    PubMed Central

    Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing

    2016-01-01

    Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have potential therapies for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthew’s Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc. PMID:27662651

  5. Identification of a Functional Plasmodesmal Localization Signal in a Plant Viral Cell-To-Cell-Movement Protein.

    PubMed

    Yuan, Cheng; Lazarowitz, Sondra G; Citovsky, Vitaly

    2016-01-19

    Our fundamental knowledge of the protein-sorting pathways required for plant cell-to-cell trafficking and communication via the intercellular connections termed plasmodesmata has been severely limited by the paucity of plasmodesmal targeting sequences that have been identified to date. To address this limitation, we have identified the plasmodesmal localization signal (PLS) in the Tobacco mosaic virus (TMV) cell-to-cell-movement protein (MP), which has emerged as the paradigm for dissecting the molecular details of cell-to-cell transport through plasmodesmata. We report here the identification of a bona fide functional TMV MP PLS, which encompasses amino acid residues between positions 1 and 50, with residues Val-4 and Phe-14 potentially representing critical sites for PLS function that most likely affect protein conformation or protein interactions. We then demonstrated that this PLS is both necessary and sufficient for protein targeting to plasmodesmata. Importantly, as TMV MP traffics to plasmodesmata by a mechanism that is distinct from those of the three plant cell proteins in which PLSs have been reported, our findings provide important new insights to expand our understanding of protein-sorting pathways to plasmodesmata. The science of virology began with the discovery of Tobacco mosaic virus (TMV). Since then, TMV has served as an experimental and conceptual model for studies of viruses and dissection of virus-host interactions. Indeed, the TMV cell-to-cell-movement protein (MP) has emerged as the paradigm for dissecting the molecular details of cell-to-cell transport through the plant intercellular connections termed plasmodesmata. However, one of the most fundamental and key functional features of TMV MP, its putative plasmodesmal localization signal (PLS), has not been identified. Here, we fill this gap in our knowledge and identify the TMV MP PLS. Copyright © 2016 Yuan et al.

  6. A unique charge-coupled device/xenon arc lamp based imaging system for the accurate detection and quantitation of multicolour fluorescence.

    PubMed

    Spibey, C A; Jackson, P; Herick, K

    2001-03-01

    In recent years the use of fluorescent dyes in biological applications has dramatically increased. The continual improvement in the capabilities of these fluorescent dyes demands increasingly sensitive detection systems that provide accurate quantitation over a wide linear dynamic range. In the field of proteomics, the detection, quantitation and identification of very low abundance proteins are of extreme importance in understanding cellular processes. Therefore, the instrumentation used to acquire an image of such samples, for spot picking and identification by mass spectrometry, must be sensitive enough not only to maximise the sensitivity and dynamic range of the staining dyes but, as importantly, to adapt to the ever changing portfolio of fluorescent dyes as they become available. Just as the available fluorescent probes are improving and evolving, so are the users' application requirements. Therefore, the instrumentation chosen must be flexible to address and adapt to those changing needs. As a result, a highly competitive market for the supply and production of such dyes and the instrumentation for their detection and quantitation has emerged. The instrumentation currently available is based on either laser/photomultiplier tube (PMT) scanning or lamp/charge-coupled device (CCD) based mechanisms. This review briefly discusses the advantages and disadvantages of both system types for fluorescence imaging, gives a technical overview of CCD technology and describes in detail a unique xenon arc lamp/CCD based instrument from PerkinElmer Life Sciences. The Wallac-1442 ARTHUR is unique in its ability to scan both large areas at high resolution and give accurate selectable excitation over the whole of the UV/visible range. It operates by filtering both the excitation and emission wavelengths, providing optimal and accurate measurement and quantitation of virtually any available dye and allows excellent spectral resolution between different fluorophores

  7. Accurate prediction of RNA-binding protein residues with two discriminative structural descriptors.

    PubMed

    Sun, Meijian; Wang, Xia; Zou, Chuanxin; He, Zenghui; Liu, Wei; Li, Honglin

    2016-06-07

    RNA-binding proteins participate in many important biological processes concerning RNA-mediated gene regulation, and several computational methods have been recently developed to predict the protein-RNA interactions of RNA-binding proteins. Newly developed discriminative descriptors will help to improve the prediction accuracy of these prediction methods and provide further meaningful information for researchers. In this work, we designed two structural features (residue electrostatic surface potential and triplet interface propensity) and according to the statistical and structural analysis of protein-RNA complexes, the two features were powerful for identifying RNA-binding protein residues. Using these two features and other excellent structure- and sequence-based features, a random forest classifier was constructed to predict RNA-binding residues. The area under the receiver operating characteristic curve (AUC) of five-fold cross-validation for our method on training set RBP195 was 0.900, and when applied to the test set RBP68, the prediction accuracy (ACC) was 0.868, and the F-score was 0.631. The good prediction performance of our method revealed that the two newly designed descriptors could be discriminative for inferring protein residues interacting with RNAs. To facilitate the use of our method, a web-server called RNAProSite, which implements the proposed method, was constructed and is freely available at http://lilab.ecust.edu.cn/NABind.

  8. Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis.

    PubMed

    Zhu, Yafeng; Engström, Pär G; Tellgren-Roth, Christian; Baudo, Charles D; Kennell, John C; Sun, Sheng; Billmyre, R Blake; Schröder, Markus S; Andersson, Anna; Holm, Tina; Sigurgeirsson, Benjamin; Wu, Guangxi; Sankaranarayanan, Sundar Ram; Siddharthan, Rahul; Sanyal, Kaustuv; Lundeberg, Joakim; Nystedt, Björn; Boekhout, Teun; Dawson, Thomas L; Heitman, Joseph; Scheynius, Annika; Lehtiö, Janne

    2017-03-17

    Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
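
    The relative gains in annotated genes reported above follow directly from the gene counts. This is a minimal sketch with illustrative names; it assumes the 9% curation gain is measured against the proteomics-informed set of 4113 genes, which is the baseline that reproduces the rounded figure.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Protein-coding gene counts quoted in the abstract.
TRANSCRIPTOMICS_ONLY = 3612
WITH_PROTEOMICS = 4113
AFTER_CURATION = 4493

proteomics_gain = percent_increase(TRANSCRIPTOMICS_ONLY, WITH_PROTEOMICS)
curation_gain = percent_increase(WITH_PROTEOMICS, AFTER_CURATION)

print(f"Adding proteomics evidence: +{proteomics_gain:.1f}%")
print(f"Manual curation on top:     +{curation_gain:.1f}%")
```

    Both values round to the whole-number percentages given in the abstract (14% and 9%).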

  9. CYP450 phenotyping and accurate mass identification of metabolites of the 8-aminoquinoline, anti-malarial drug primaquine.

    PubMed

    Pybus, Brandon S; Sousa, Jason C; Jin, Xiannu; Ferguson, James A; Christian, Robert E; Barnhart, Rebecca; Vuong, Chau; Sciotti, Richard J; Reichard, Gregory A; Kozar, Michael P; Walker, Larry A; Ohrt, Colin; Melendez, Victor

    2012-08-02

    The 8-aminoquinoline (8AQ) drug primaquine (PQ) is currently the only approved drug effective against the persistent liver stage of the hypnozoite forming strains Plasmodium vivax and Plasmodium ovale as well as Stage V gametocytes of Plasmodium falciparum. To date, several groups have investigated the toxicity observed in the 8AQ class, however, exact mechanisms and/or metabolic species responsible for PQ's haemotoxic and anti-malarial properties are not fully understood. In the present study, the metabolism of PQ was evaluated using in vitro recombinant metabolic enzymes from the cytochrome P450 (CYP) and mono-amine oxidase (MAO) families. Based on this information, metabolite identification experiments were performed using nominal and accurate mass measurements. Relative activity factor (RAF)-weighted intrinsic clearance values show the relative role of each enzyme to be MAO-A, 2C19, 3A4, and 2D6, with 76.1, 17.0, 5.2, and 1.7% contributions to PQ metabolism, respectively. CYP 2D6 was shown to produce at least six different oxidative metabolites along with demethylations, while MAO-A products derived from the PQ aldehyde, a pre-cursor to carboxy PQ. CYPs 2C19 and 3A4 produced only trace levels of hydroxylated species. As a result of this work, CYP 2D6 and MAO-A have been implicated as the key enzymes associated with PQ metabolism, and metabolites previously identified as potentially playing a role in efficacy and haemolytic toxicity have been attributed to production via CYP 2D6 mediated pathways.

  10. Multidimensional protein identification technology (MudPIT): technical overview of a profiling method optimized for the comprehensive proteomic investigation of normal and diseased heart tissue.

    PubMed

    Kislinger, Thomas; Gramolini, Anthony O; MacLennan, David H; Emili, Andrew

    2005-08-01

    An optimized analytical expression profiling strategy based on gel-free multidimensional protein identification technology (MudPIT) is reported for the systematic investigation of biochemical (mal)-adaptations associated with healthy and diseased heart tissue. Enhanced shotgun proteomic detection coverage and improved biological inference is achieved by pre-fractionation of excised mouse cardiac muscle into subcellular components, with each organellar fraction investigated exhaustively using multiple repeat MudPIT analyses. Functional-enrichment, high-confidence identification, and relative quantification of hundreds of organelle- and tissue-specific proteins are achieved readily, including detection of low abundance transcriptional regulators, signaling factors, and proteins linked to cardiac disease. Important technical issues relating to data validation, including minimization of artifacts stemming from biased under-sampling and spurious false discovery, together with suggestions for further fine-tuning of sample preparation, are discussed. A framework for follow-up bioinformatic examination, pattern recognition, and data mining is also presented in the context of a stringent application of MudPIT for probing fundamental aspects of heart muscle physiology as well as the discovery of perturbations associated with heart failure.

  11. A rapid and accurate method for determining protein content in dairy products based on asynchronous-injection alternating merging zone flow-injection spectrophotometry.

    PubMed

    Liang, Qin-Qin; Li, Yong-Sheng

    2013-12-01

    An accurate and rapid method and system were established to determine protein content by asynchronous-injection alternating merging zone flow-injection spectrophotometry, based on the reaction between Coomassie brilliant blue G250 (CBBG) and protein. The main merit of the approach is that it avoids interference from other nitrogen-containing compounds in samples, such as melamine and urea. The optimized conditions are as follows: concentrations of CBBG, polyvinyl alcohol (PVA), NaCl and HCl are 150 mg/l, 30 mg/l, 0.1 mol/l and 1.0% (v/v), respectively; volumes of the sample and reagent are 150 μl and 30 μl, respectively; the length of the reaction coil is 200 cm; the total flow rate is 2.65 ml/min. The linear range of the method is 0.5-15 mg/l (BSA), the detection limit is 0.05 mg/l, the relative standard deviation is less than 1.87% (n=11), and the analytical speed is 60 samples per hour. Copyright © 2013 Elsevier Ltd. All rights reserved.

  12. An Ensemble Method to Distinguish Bacteriophage Virion from Non-Virion Proteins Based on Protein Sequence Characteristics.

    PubMed

    Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao

    2015-09-09

    Bacteriophage virion proteins and non-virion proteins have distinct functions in biological processes, such as specificity determination for host bacteria, bacteriophage replication and transcription. Accurate identification of bacteriophage virion proteins from bacteriophage protein sequences is significant to understand the complex virulence mechanism in host bacteria and the influence of bacteriophages on the development of antibacterial drugs. In this study, an ensemble method for bacteriophage virion protein prediction from bacteriophage protein sequences is put forward with hybrid feature spaces incorporating CTD (composition, transition and distribution), bi-profile Bayes, PseAAC (pseudo-amino acid composition) and PSSM (position-specific scoring matrix). In 10-fold cross-validation on the training dataset, the presented method achieves a satisfactory prediction result with a sensitivity of 0.870, a specificity of 0.830, an accuracy of 0.850 and a Matthews correlation coefficient (MCC) of 0.701. To evaluate the prediction performance objectively, an independent testing dataset is used to evaluate the proposed method. Encouragingly, our proposed method performs better than previous studies with a sensitivity of 0.853, a specificity of 0.815, an accuracy of 0.831 and MCC of 0.662 on the independent testing dataset. These results suggest that the proposed method can be a potential candidate for bacteriophage virion protein prediction, which may provide a useful tool to find novel antibacterial drugs and to understand the relationship between bacteriophage and host bacteria. For the convenience of the vast majority of experimental scientists, a user-friendly and publicly-accessible web-server for the proposed ensemble method is established. (Int. J. Mol. Sci. 2015, 16, 21735)
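
    The four figures of merit quoted in this abstract all derive from a 2x2 confusion matrix. The sketch below shows the standard definitions. The function name and the counts are hypothetical: the abstract does not report class sizes, so the example assumes 100 virion and 100 non-virion proteins purely for illustration.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical balanced counts (NOT from the paper): 100 positives, 100 negatives.
    metrics = binary_metrics(tp=87, fn=13, tn=83, fp=17)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

    With these assumed balanced counts, a sensitivity of 0.870 and a specificity of 0.830 give an accuracy of 0.850 and an MCC that rounds to 0.701, matching the cross-validation figures quoted above. Accuracy and MCC depend on the class balance, so other class sizes would give different values for the same sensitivity and specificity.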

  13. Identification of Borrelia protein candidates in mouse skin for potential diagnosis of disseminated Lyme borreliosis.

    PubMed

    Grillon, Antoine; Westermann, Benoît; Cantero, Paola; Jaulhac, Benoît; Voordouw, Maarten J; Kapps, Delphine; Collin, Elody; Barthel, Cathy; Ehret-Sabatier, Laurence; Boulanger, Nathalie

    2017-12-01

    In vector-borne diseases, the skin plays an essential role in the transmission of vector-borne pathogens between the vertebrate host and blood-feeding arthropods and in pathogen persistence. Borrelia burgdorferi sensu lato is a tick-borne bacterium that causes Lyme borreliosis (LB) in humans. This pathogen may establish a long-lasting infection in its natural vertebrate host where it can persist in the skin and some other organs. Using a mouse model, we demonstrate that Borrelia targets the skin regardless of the route of inoculation, and can persist there at low densities that are difficult to detect via qPCR, but that were infective for blood-feeding ticks. Application of immunosuppressive dermocorticoids at 40 days post-infection (PI) significantly enhanced the Borrelia population size in the mouse skin. We used non-targeted (Ge-LC-MS/MS) and targeted (SRM-MS) proteomics to detect several Borrelia-specific proteins in the mouse skin at 40 days PI. Detected Borrelia proteins included flagellin, VlsE and GAPDH. An important problem in LB is the lack of diagnostic methods capable of detecting active infection in humans suffering from disseminated LB. The identification of Borrelia proteins in skin biopsies may provide new approaches for assessing active infection in disseminated manifestations.

  14. [Accurate diagnosis of Pseudomonas luteola in routine microbiology laboratory: on the occasion of two isolates].

    PubMed

    Çiçek, Muharrem; Hasçelik, Gülşen; Müştak, H Kaan; Diker, K Serdar; Şener, Burçin

    2016-10-01

    Pseudomonas luteola, previously known as Chryseomonas luteola, is a gram-negative, non-fermentative, aerobic, motile, non-spore-forming bacillus. It is frequently found as a saprophyte in soil, water and other damp environments and is an opportunistic pathogen in patients with underlying medical disorders or with indwelling catheters. It has been reported as an uncommon cause of bacteremia, sepsis, septic arthritis, meningitis, endocarditis, and peritonitis. Thus, early and accurate identification of this rare species is important for treatment and also to provide information about the epidemiology of P. luteola infections. This report aims to draw attention to the accurate identification of P. luteola in clinical samples, following its isolation and identification in two cases in the medical microbiology laboratory of a university hospital. In February 2011, a 66-year-old man with chronic obstructive pulmonary disease, coronary artery disease and aplastic anemia was admitted to our hospital due to progressive dyspnea. A chest tube was inserted on the 20th day of admission because of recurrent pleural effusion. Staphylococcus aureus and a non-fermentative gram-negative bacillus (NFGNB) with wrinkled, sticky yellow colonies were isolated from the pleural fluid sample obtained on the 9th day following the insertion of the chest tube. In February 2012, a 7-year-old male cystic fibrosis patient who had no signs or symptoms of acute pulmonary exacerbation was admitted to the hospital for a routine check-up. This patient had chronic colonization with Pseudomonas aeruginosa and S. aureus, and his sputum sample obtained at this visit yielded P. aeruginosa, S. aureus, Aspergillus fumigatus and a wrinkled, sticky yellow NFGNB. Both of these NFGNB were identified as P. luteola by the Phoenix automated microbial identification system (BD Diagnostics, USA). To evaluate the microbiological characteristics of these two isolates, the strains were

  15. Plant Aquaporins: Genome-Wide Identification, Transcriptomics, Proteomics, and Advanced Analytical Tools.

    PubMed

    Deshmukh, Rupesh K; Sonah, Humira; Bélanger, Richard R

    2016-01-01

    Aquaporins (AQPs) are channel-forming integral membrane proteins that facilitate the movement of water and many other small molecules. Compared to animals, plants contain a much higher number of AQPs in their genome. Homology-based identification of AQPs in sequenced species is feasible because of the high level of conservation of protein sequences across plant species. Genome-wide characterization of AQPs has highlighted several important aspects such as distribution, genetic organization, evolution and conserved features governing solute specificity. From a functional point of view, the understanding of AQP transport system has expanded rapidly with the help of transcriptomics and proteomics data. The efficient analysis of enormous amounts of data generated through omic scale studies has been facilitated through computational advancements. Prediction of protein tertiary structures, pore architecture, cavities, phosphorylation sites, heterodimerization, and co-expression networks has become more sophisticated and accurate with increasing computational tools and pipelines. However, the effectiveness of computational approaches is based on the understanding of physiological and biochemical properties, transport kinetics, solute specificity, molecular interactions, sequence variations, phylogeny and evolution of aquaporins. For this purpose, tools like Xenopus oocyte assays, yeast expression systems, artificial proteoliposomes, and lipid membranes have been efficiently exploited to study the many facets that influence solute transport by AQPs. In the present review, we discuss genome-wide identification of AQPs in plants in relation with recent advancements in analytical tools, and their availability and technological challenges as they apply to AQPs. An exhaustive review of omics resources available for AQP research is also provided in order to optimize their efficient utilization. Finally, a detailed catalog of computational tools and analytical pipelines is

  16. Plant Aquaporins: Genome-Wide Identification, Transcriptomics, Proteomics, and Advanced Analytical Tools

    PubMed Central

    Deshmukh, Rupesh K.; Sonah, Humira; Bélanger, Richard R.

    2016-01-01

    Aquaporins (AQPs) are channel-forming integral membrane proteins that facilitate the movement of water and many other small molecules. Compared to animals, plants contain a much higher number of AQPs in their genome. Homology-based identification of AQPs in sequenced species is feasible because of the high level of conservation of protein sequences across plant species. Genome-wide characterization of AQPs has highlighted several important aspects such as distribution, genetic organization, evolution and conserved features governing solute specificity. From a functional point of view, the understanding of AQP transport system has expanded rapidly with the help of transcriptomics and proteomics data. The efficient analysis of enormous amounts of data generated through omic scale studies has been facilitated through computational advancements. Prediction of protein tertiary structures, pore architecture, cavities, phosphorylation sites, heterodimerization, and co-expression networks has become more sophisticated and accurate with increasing computational tools and pipelines. However, the effectiveness of computational approaches is based on the understanding of physiological and biochemical properties, transport kinetics, solute specificity, molecular interactions, sequence variations, phylogeny and evolution of aquaporins. For this purpose, tools like Xenopus oocyte assays, yeast expression systems, artificial proteoliposomes, and lipid membranes have been efficiently exploited to study the many facets that influence solute transport by AQPs. In the present review, we discuss genome-wide identification of AQPs in plants in relation with recent advancements in analytical tools, and their availability and technological challenges as they apply to AQPs. An exhaustive review of omics resources available for AQP research is also provided in order to optimize their efficient utilization. Finally, a detailed catalog of computational tools and analytical pipelines is

  17. Performance and cost analysis of matrix-assisted laser desorption ionization-time of flight mass spectrometry for routine identification of yeast.

    PubMed

    Dhiman, Neelam; Hall, Leslie; Wohlfiel, Sherri L; Buckwalter, Seanne P; Wengenack, Nancy L

    2011-04-01

    Matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry was compared to phenotypic testing for yeast identification. MALDI-TOF mass spectrometry yielded 96.3% and 84.5% accurate species level identifications (spectral scores, ≥ 1.8) for 138 common and 103 archived strains of yeast. MALDI-TOF mass spectrometry is accurate, rapid (5.1 min of hands-on time/identification), and cost-effective ($0.50/sample) for yeast identification in the clinical laboratory.

  18. Live Cell Genomics: RNA Exon-Specific RNA-Binding Protein Isolation.

    PubMed

    Bell, Thomas J; Eberwine, James

    2015-01-01

    RNA-binding proteins (RBPs) are essential regulatory proteins that control all modes of RNA processing and regulation. New experimental approaches to isolate these indispensable proteins under in vivo conditions are needed to advance the field of RBP biology. Historically, in vitro biochemical approaches to isolate RBP complexes have been useful and productive, but the biological relevance of the identified RBP complexes can be imprecise or erroneous. Here we review an inventive experimental approach to isolate RBPs under in vivo conditions. The method is called peptide nucleic acid (PNA)-assisted identification of RBP (PAIR) technology, and it uses cell-penetrating peptides (CPPs) to deliver a photoactivatable RBP-capture molecule to the cytoplasm of live cells. The PAIR methodology provides two significant advantages over the most commonly used approaches: (1) it overcomes the in vitro limitation of standard biochemical approaches and (2) the PAIR RBP-capture molecule is highly selective and adaptable, which allows investigators to isolate exon-specific RBP complexes. Most importantly, the in vivo capture conditions and selectivity of the RBP-capture molecule yield biologically accurate and relevant RBP data.

  19. Optimization of Search Engines and Postprocessing Approaches to Maximize Peptide and Protein Identification for High-Resolution Mass Data.

    PubMed

    Tu, Chengjian; Sheng, Quanhu; Li, Jun; Ma, Danjun; Shen, Xiaomeng; Wang, Xue; Shyr, Yu; Yi, Zhengping; Qu, Jun

    2015-11-06

    The two key steps for analyzing proteomic data generated by high-resolution MS are database searching and postprocessing. While the two steps are interrelated, studies on their combinatory effects and the optimization of these procedures have not been adequately conducted. Here, we investigated the performance of three popular search engines (SEQUEST, Mascot, and MS Amanda) in conjunction with five filtering approaches, including respective score-based filtering, a group-based approach, local false discovery rate (LFDR), PeptideProphet, and Percolator. A total of eight data sets from various proteomes (e.g., E. coli, yeast, and human) produced by various instruments with high-accuracy survey scan (MS1) and high- or low-accuracy fragment ion scan (MS2) (LTQ-Orbitrap, Orbitrap-Velos, Orbitrap-Elite, Q-Exactive, Orbitrap-Fusion, and Q-TOF) were analyzed. It was found that combinations involving Percolator achieved markedly more peptide and protein identifications at the same FDR level than the other 12 combinations for all data sets. Among these, combinations of SEQUEST-Percolator and MS Amanda-Percolator provided slightly better performances for data sets with low-accuracy MS2 (ion trap or IT) and high-accuracy MS2 (Orbitrap or TOF), respectively, than did other methods. For approaches without Percolator, SEQUEST-group performs the best for data sets with MS2 produced by collision-induced dissociation (CID) and IT analysis; Mascot-LFDR gives more identifications for data sets generated by higher-energy collisional dissociation (HCD) and analyzed in Orbitrap (HCD-OT) and in Orbitrap Fusion (HCD-IT); MS Amanda-Group excels for the Q-TOF data set and the Orbitrap Velos HCD-OT data set. Therefore, if Percolator is not used, a specific combination should be applied for each type of data set. Moreover, a higher percentage of multiple-peptide proteins and lower variation of protein spectral counts were observed when analyzing technical replicates using Percolator

  20. Distyrylbenzene-aldehydes: identification of proteins in water.

    PubMed

    Kumpf, Jan; Freudenberg, Jan; Bunz, Uwe H F

    2015-05-07

    Three different water-soluble, aldehyde-appended distyrylbenzene (DSB) derivatives were prepared. Their interaction with different albumin variants (human, porcine, bovine, lactalbumin, ovalbumin) was investigated (pH 11). All three fluorophores exhibit graded, protein-dependent fluorescence turn-on at slightly differing wavelengths. Linear discriminant analysis (LDA) differentiated all of the investigated albumins and was used to discern commercially available protein shakes. The three DSB derivatives barely react with the constituent amino acids, except cysteine. In the proteins, significant fluorescence signals are generated, probably due to a combination of imine/N,S-aminal formation and hydrophobic interactions between the DSBs and the proteins.